Bug #3832
Scheduler: Resuming a stopped VM tries to check IMAGE datastore capacity
| Status: | Closed | Start date: | 06/09/2015 |
|---|---|---|---|
| Priority: | Normal | Due date: | |
| Assignee: | Carlos Martín | % Done: | 90% |
| Category: | Scheduler | | |
| Target version: | Release 4.14 | | |
| Resolution: | fixed | Pull request: | |
| Affected Versions: | OpenNebula 4.12 | | |
Description
I stopped a VM that is using RBD disks in a shared datastore.
When I tried to resume the VM, it stayed in the pending state and was never scheduled, because the scheduler checks for free space in the VM's image datastore.

    Tue Jun 9 18:25:30 2015 [Z0][SCHED][I]: Getting scheduled actions information. Total time: 0.00479594s
    Tue Jun 9 18:25:30 2015 [Z0][VM][D]: Found 1 pending/rescheduling VMs.
    Tue Jun 9 18:25:31 2015 [Z0][HOST][D]: Discovered 1 enabled hosts.
    Tue Jun 9 18:25:31 2015 [Z0][SCHED][I]: Getting VM and Host information. Total time: 0.00683541s
    Tue Jun 9 18:25:31 2015 [Z0][SCHED][D]: Match-making results for VM 74:
            Cannot schedule VM, image datastore does not have enough capacity.
    Tue Jun 9 18:25:31 2015 [Z0][SCHED][D]: Match Making statistics:
            Number of VMs: 1
            Total time: 0s
            Total Match time: 0s
            Total Ranking Time: 0s
    Tue Jun 9 18:25:31 2015 [Z0][SCHED][D]: Dispatching VMs to hosts:
            VMID Host System DS
            -------------------------
    Tue Jun 9 18:25:31 2015 [Z0][SCHED][I]: Dispatching VMs to hosts. Total time: 1.3103e-05s

    VIRTUAL MACHINE 74 INFORMATION
    ID          : 74
    NAME        : sqltest
    USER        : user
    GROUP       : group
    STATE       : PENDING
    LCM_STATE   : LCM_INIT
    RESCHED     : No
    START TIME  : 06/04 20:34:47
    END TIME    : -
    DEPLOY ID   : one-74

    VIRTUAL MACHINE MONITORING
    USED MEMORY : 0K
    USED CPU    : 0
    NET_TX      : 3.5G
    NET_RX      : 1.8G

    PERMISSIONS
    OWNER       : um-
    GROUP       : ---
    OTHER       : ---

    VM DISKS
    ID  TARGET  IMAGE  TYPE  SAVE  SAVE_AS
    0   hda     disk1  rbd   NO    -
    1   vda     disk2  rbd   NO    -

    VM NICS
    ID  NETWORK   VLAN  BRIDGE  IP          MAC
    0   Network2  yes   br2     10.xx.x.x0  02:00:0a:60:01:14

    VIRTUAL MACHINE HISTORY
    SEQ  HOST   ACTION  DS  START           TIME       PROLOG
    0    node4  none    0   06/04 20:35:58  3d 15h44m  0h00m04s
    1    node4  none    0   06/08 12:25:02  0d 00h50m  0h00m00s
    2    node4  stop    0   06/08 13:22:46  1d 02h41m  0h00m00s

    USER TEMPLATE
    ERROR="Mon Jun 8 12:26:19 2015 : Error attaching new VM Disk: Could not attach rbd (hdb) to one-74"
    HYPERVISOR="kvm"
    LOGO="images/logos/windowsxp.png"
    SUNSTONE_CAPACITY_SELECT="YES"
    SUNSTONE_NETWORK_SELECT="YES"

    VIRTUAL MACHINE TEMPLATE
    AUTOMATIC_REQUIREMENTS="!(PUBLIC_CLOUD = YES)"
    CPU="0.2"
    DISK=[
      CEPH_HOST="mon-1 mon-2 mon-3",
      CLONE="YES",
      CLONE_TARGET="SELF",
      DATASTORE="new",
      DATASTORE_ID="100",
      DEV_PREFIX="hd",
      DISK_ID="0",
      IMAGE="disk1",
      IMAGE_ID="31",
      IMAGE_UNAME="user",
      LN_TARGET="NONE",
      READONLY="NO",
      SAVE="NO",
      SIZE="102400",
      SOURCE="rbd/one-31",
      TARGET="hda",
      TM_MAD="ceph",
      TYPE="RBD" ]
    DISK=[
      CEPH_HOST="mon-1 mon-2 mon-3",
      CLONE="YES",
      CLONE_TARGET="SELF",
      DATASTORE="new",
      DATASTORE_ID="100",
      DEV_PREFIX="vd",
      DISK_ID="1",
      IMAGE="disk2",
      IMAGE_ID="32",
      IMAGE_UNAME="user",
      LN_TARGET="NONE",
      READONLY="NO",
      SAVE="NO",
      SELECTED_RESOURCE_ID_ATTACH_DISK="32",
      SIZE="307200",
      SOURCE="rbd/one-32",
      TARGET="vda",
      TM_MAD="ceph",
      TYPE="RBD" ]
    GRAPHICS=[
      LISTEN="0.0.0.0",
      PORT="5974",
      TYPE="VNC" ]
    MEMORY="4096"
    NIC=[
      AR_ID="0",
      BRIDGE="br2",
      IP="10.xx.x.x0",
      MAC="02:00:0a:60:01:14",
      MODEL="e1000",
      NETWORK="Network2",
      NETWORK_ID="1",
      NETWORK_UNAME="oneadmin",
      NIC_ID="0",
      VLAN="YES",
      VLAN_ID="xxxx" ]
    NIC_DEFAULT=[
      MODEL="e1000" ]
    TEMPLATE_ID="32"
    VCPU="2"
    VMID="74"

Looking through the scheduler it seems to be getting stuck here:
src/scheduler/src/sched/Scheduler.cc, lines 734-752:

    //----------------------------------------------------------------------
    // Test Image Datastore capacity, but not for migrations
    //----------------------------------------------------------------------
    if (!vm->is_resched())
    {
        if (vm->test_image_datastore_capacity(img_dspool) == false)
        {
            if (vm->is_public_cloud()) //No capacity needed for public cloud
            {
                vm->set_only_public_cloud();
            }
            else
            {
                log_match(vm->get_oid(), "Cannot schedule VM, image datastore "
                    "does not have enough capacity.");
                continue;
            }
        }
    }

Simply commenting this section out did not fix the problem, however. After recompiling and restarting the scheduler, the VM was still not scheduled.
The scheduler stopped logging the error message, but it still took no action on the VM.
The output became:

    Tue Jun 9 18:47:56 2015 [Z0][SCHED][I]: Getting scheduled actions information. Total time: 0.00433847s
    Tue Jun 9 18:47:56 2015 [Z0][VM][D]: Found 1 pending/rescheduling VMs.
    Tue Jun 9 18:47:57 2015 [Z0][HOST][D]: Discovered 1 enabled hosts.
    Tue Jun 9 18:47:57 2015 [Z0][SCHED][I]: Getting VM and Host information. Total time: 0.00669545s
    Tue Jun 9 18:47:57 2015 [Z0][SCHED][D]: Match Making statistics:
            Number of VMs: 1
            Total time: 0s
            Total Match time: 5.8248e-05s
            Total Ranking Time: 2.9626e-05s
    Tue Jun 9 18:47:57 2015 [Z0][SCHED][D]: Dispatching VMs to hosts:
            VMID Host System DS
            -------------------------
    Tue Jun 9 18:47:57 2015 [Z0][SCHED][I]: Dispatching VMs to hosts. Total time: 1.4495e-05s

Associated revisions
Bug #3832: Scheduler detects VMs to be resumed instead of first deployments
Bug #3832: Cleanup xpath calls in scheduler
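The first revision title suggests that the scheduler now tells resumed VMs apart from first deployments. The following is only a minimal sketch of that idea, not the committed change: `VmInfo`, its `resumed` flag, `needs_image_ds_check` and `filter_by_image_ds` are hypothetical names invented for illustration, and the real scheduler works on its own VM and datastore pool classes. It assumes that a resumed VM already has its disks in place and therefore needs no new space in the image datastore.

```cpp
#include <vector>

// Hypothetical, simplified stand-in for the scheduler's VM object.
struct VmInfo
{
    int  oid;          // VM ID
    bool resched;      // flagged for rescheduling (migration)
    bool resumed;      // assumption: deployed before, now being resumed
    long image_ds_mb;  // space a first deployment would need, in MB
};

// Only first deployments are assumed to need image datastore space;
// migrations and resumed VMs skip the capacity test.
bool needs_image_ds_check(const VmInfo& vm)
{
    return !vm.resched && !vm.resumed;
}

// Returns the IDs of the VMs that get past the image datastore stage,
// given the free space (in MB) of a single image datastore.
std::vector<int> filter_by_image_ds(const std::vector<VmInfo>& vms, long free_mb)
{
    std::vector<int> passed;

    for (const VmInfo& vm : vms)
    {
        if (needs_image_ds_check(vm) && vm.image_ds_mb > free_mb)
        {
            // Corresponds to the "image datastore does not have
            // enough capacity" branch: the VM is skipped.
            continue;
        }

        passed.push_back(vm.oid);
    }

    return passed;
}
```

With a guard like this, a stopped VM such as VM 74 (disks of 102400 MB and 307200 MB) would get past this stage even when the image datastore is nearly full. The sketch covers only this one check; the description above reports that disabling the check alone did not get the VM dispatched, so the actual fix presumably involves more than this guard.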
History
#1 Updated by Ruben S. Montero about 6 years ago
- Target version set to Release 4.14
#2 Updated by Ruben S. Montero almost 6 years ago
- Assignee set to Carlos Martín
#3 Updated by Carlos Martín almost 6 years ago
- Status changed from Pending to New
- % Done changed from 0 to 90
#4 Updated by Ruben S. Montero almost 6 years ago
- Status changed from New to Closed
- Resolution set to fixed