Bug #3832
Scheduler: Resuming a stopped VM tries to check IMAGE datastore capacity
| Status: | Closed | Start date: | 06/09/2015 |
|---|---|---|---|
| Priority: | Normal | Due date: | |
| Assignee: | | % Done: | 90% |
| Category: | Scheduler | | |
| Target version: | Release 4.14 | | |
| Resolution: | fixed | Pull request: | |
| Affected Versions: | OpenNebula 4.12 | | |
Description
I stopped a VM that is using RBD disks in a shared datastore.
When I tried to resume the VM, it stayed in the PENDING state and was never scheduled, because the scheduler tries to check for free space in the VM's image datastore.
Tue Jun 9 18:25:30 2015 [Z0][SCHED][I]: Getting scheduled actions information. Total time: 0.00479594s
Tue Jun 9 18:25:30 2015 [Z0][VM][D]: Found 1 pending/rescheduling VMs.
Tue Jun 9 18:25:31 2015 [Z0][HOST][D]: Discovered 1 enabled hosts.
Tue Jun 9 18:25:31 2015 [Z0][SCHED][I]: Getting VM and Host information. Total time: 0.00683541s
Tue Jun 9 18:25:31 2015 [Z0][SCHED][D]: Match-making results for VM 74:
Cannot schedule VM, image datastore does not have enough capacity.
Tue Jun 9 18:25:31 2015 [Z0][SCHED][D]: Match Making statistics:
Number of VMs: 1
Total time: 0s
Total Match time: 0s
Total Ranking Time: 0s
Tue Jun 9 18:25:31 2015 [Z0][SCHED][D]: Dispatching VMs to hosts:
VMID Host System DS
-------------------------
Tue Jun 9 18:25:31 2015 [Z0][SCHED][I]: Dispatching VMs to hosts. Total time: 1.3103e-05s
VIRTUAL MACHINE 74 INFORMATION
ID : 74
NAME : sqltest
USER : user
GROUP : group
STATE : PENDING
LCM_STATE : LCM_INIT
RESCHED : No
START TIME : 06/04 20:34:47
END TIME : -
DEPLOY ID : one-74
VIRTUAL MACHINE MONITORING
USED MEMORY : 0K
USED CPU : 0
NET_TX : 3.5G
NET_RX : 1.8G
PERMISSIONS
OWNER : um-
GROUP : ---
OTHER : ---
VM DISKS
ID TARGET IMAGE TYPE SAVE SAVE_AS
0 hda disk1 rbd NO -
1 vda disk2 rbd NO -
VM NICS
ID NETWORK VLAN BRIDGE IP MAC
0 Network2 yes br2 10.xx.x.x0 02:00:0a:60:01:14
VIRTUAL MACHINE HISTORY
SEQ HOST ACTION DS START TIME PROLOG
0 node4 none 0 06/04 20:35:58 3d 15h44m 0h00m04s
1 node4 none 0 06/08 12:25:02 0d 00h50m 0h00m00s
2 node4 stop 0 06/08 13:22:46 1d 02h41m 0h00m00s
USER TEMPLATE
ERROR="Mon Jun 8 12:26:19 2015 : Error attaching new VM Disk: Could not attach rbd (hdb) to one-74"
HYPERVISOR="kvm"
LOGO="images/logos/windowsxp.png"
SUNSTONE_CAPACITY_SELECT="YES"
SUNSTONE_NETWORK_SELECT="YES"
VIRTUAL MACHINE TEMPLATE
AUTOMATIC_REQUIREMENTS="!(PUBLIC_CLOUD = YES)"
CPU="0.2"
DISK=[
CEPH_HOST="mon-1 mon-2 mon-3",
CLONE="YES",
CLONE_TARGET="SELF",
DATASTORE="new",
DATASTORE_ID="100",
DEV_PREFIX="hd",
DISK_ID="0",
IMAGE="disk1",
IMAGE_ID="31",
IMAGE_UNAME="user",
LN_TARGET="NONE",
READONLY="NO",
SAVE="NO",
SIZE="102400",
SOURCE="rbd/one-31",
TARGET="hda",
TM_MAD="ceph",
TYPE="RBD" ]
DISK=[
CEPH_HOST="mon-1 mon-2 mon-3",
CLONE="YES",
CLONE_TARGET="SELF",
DATASTORE="new",
DATASTORE_ID="100",
DEV_PREFIX="vd",
DISK_ID="1",
IMAGE="disk2",
IMAGE_ID="32",
IMAGE_UNAME="user",
LN_TARGET="NONE",
READONLY="NO",
SAVE="NO",
SELECTED_RESOURCE_ID_ATTACH_DISK="32",
SIZE="307200",
SOURCE="rbd/one-32",
TARGET="vda",
TM_MAD="ceph",
TYPE="RBD" ]
GRAPHICS=[
LISTEN="0.0.0.0",
PORT="5974",
TYPE="VNC" ]
MEMORY="4096"
NIC=[
AR_ID="0",
BRIDGE="br2",
IP="10.xx.x.x0",
MAC="02:00:0a:60:01:14",
MODEL="e1000",
NETWORK="Network2",
NETWORK_ID="1",
NETWORK_UNAME="oneadmin",
NIC_ID="0",
VLAN="YES",
VLAN_ID="xxxx" ]
NIC_DEFAULT=[
MODEL="e1000" ]
TEMPLATE_ID="32"
VCPU="2"
VMID="74"
Looking through the scheduler it seems to be getting stuck here:
src/scheduler/src/sched/Scheduler.cc, lines 734-752:
    //----------------------------------------------------------------------
    // Test Image Datastore capacity, but not for migrations
    //----------------------------------------------------------------------
    if (!vm->is_resched())
    {
        if (vm->test_image_datastore_capacity(img_dspool) == false)
        {
            if (vm->is_public_cloud()) //No capacity needed for public cloud
            {
                vm->set_only_public_cloud();
            }
            else
            {
                log_match(vm->get_oid(), "Cannot schedule VM, image datastore "
                    "does not have enough capacity.");
                continue;
            }
        }
    }
Simply commenting out this section did not fix the problem, however. After recompiling and restarting the scheduler, it still did not schedule the VM.
The error message was no longer displayed, but the VM was still not dispatched to any host.
The output became:
Tue Jun 9 18:47:56 2015 [Z0][SCHED][I]: Getting scheduled actions information. Total time: 0.00433847s
Tue Jun 9 18:47:56 2015 [Z0][VM][D]: Found 1 pending/rescheduling VMs.
Tue Jun 9 18:47:57 2015 [Z0][HOST][D]: Discovered 1 enabled hosts.
Tue Jun 9 18:47:57 2015 [Z0][SCHED][I]: Getting VM and Host information. Total time: 0.00669545s
Tue Jun 9 18:47:57 2015 [Z0][SCHED][D]: Match Making statistics:
Number of VMs: 1
Total time: 0s
Total Match time: 5.8248e-05s
Total Ranking Time: 2.9626e-05s
Tue Jun 9 18:47:57 2015 [Z0][SCHED][D]: Dispatching VMs to hosts:
VMID Host System DS
-------------------------
Tue Jun 9 18:47:57 2015 [Z0][SCHED][I]: Dispatching VMs to hosts. Total time: 1.4495e-05s
Associated revisions
Bug #3832: Scheduler detects VMs to be resumed instead of first deployments
Bug #3832: Cleanup xpath calls in scheduler
History
#1
Updated by Ruben S. Montero about 6 years ago
- Target version set to Release 4.14
#2
Updated by Ruben S. Montero almost 6 years ago
- Assignee set to Carlos Martín
#3
Updated by Carlos Martín almost 6 years ago
- Status changed from Pending to New
- % Done changed from 0 to 90
#4
Updated by Ruben S. Montero almost 6 years ago
- Status changed from New to Closed
- Resolution set to fixed