Bug #3832

Scheduler: Resuming a stopped VM tries to check IMAGE datastore capacity

Added by Roy Keene about 6 years ago. Updated almost 6 years ago.

Status: Closed
Start date: 06/09/2015
Priority: Normal
Due date:
Assignee: Carlos Martín
% Done: 90%
Category: Scheduler
Target version: Release 4.14
Resolution: fixed
Pull request:
Affected Versions: OpenNebula 4.12

Description

I stopped a VM that is using RBD disks in a shared datastore.

When I tried to resume the VM, it stayed in the PENDING state and was never scheduled, because the scheduler checks for free space in the VM's image datastore.

Tue Jun  9 18:25:30 2015 [Z0][SCHED][I]: Getting scheduled actions information. Total time: 0.00479594s
Tue Jun  9 18:25:30 2015 [Z0][VM][D]: Found 1 pending/rescheduling VMs.
Tue Jun  9 18:25:31 2015 [Z0][HOST][D]: Discovered 1 enabled hosts.
Tue Jun  9 18:25:31 2015 [Z0][SCHED][I]: Getting VM and Host information. Total time: 0.00683541s
Tue Jun  9 18:25:31 2015 [Z0][SCHED][D]: Match-making results for VM 74:
    Cannot schedule VM, image datastore does not have enough capacity.

Tue Jun  9 18:25:31 2015 [Z0][SCHED][D]: Match Making statistics:
    Number of VMs: 1
    Total time: 0s
    Total Match time: 0s
    Total Ranking Time: 0s
Tue Jun  9 18:25:31 2015 [Z0][SCHED][D]: Dispatching VMs to hosts:
    VMID    Host    System DS
    -------------------------

Tue Jun  9 18:25:31 2015 [Z0][SCHED][I]: Dispatching VMs to hosts. Total time: 1.3103e-05s

VIRTUAL MACHINE 74 INFORMATION                                                  
ID                  : 74                  
NAME                : sqltest 
USER                : user
GROUP               : group
STATE               : PENDING             
LCM_STATE           : LCM_INIT            
RESCHED             : No                  
START TIME          : 06/04 20:34:47      
END TIME            : -                   
DEPLOY ID           : one-74              

VIRTUAL MACHINE MONITORING                                                      
USED MEMORY         : 0K                  
USED CPU            : 0                   
NET_TX              : 3.5G                
NET_RX              : 1.8G                

PERMISSIONS                                                                     
OWNER               : um-                 
GROUP               : ---                 
OTHER               : ---                 

VM DISKS                                                                        
 ID TARGET IMAGE                               TYPE SAVE SAVE_AS
  0 hda    disk1                               rbd    NO       -
  1 vda    disk2                               rbd    NO       -

VM NICS                                                                         
 ID NETWORK              VLAN BRIDGE       IP              MAC              
  0 Network2             yes br2          10.xx.x.x0      02:00:0a:60:01:14

VIRTUAL MACHINE HISTORY                                                         
SEQ HOST            ACTION             DS           START        TIME     PROLOG
  0 node4           none                0  06/04 20:35:58   3d 15h44m   0h00m04s
  1 node4           none                0  06/08 12:25:02   0d 00h50m   0h00m00s
  2 node4           stop                0  06/08 13:22:46   1d 02h41m   0h00m00s

USER TEMPLATE                                                                   
ERROR="Mon Jun  8 12:26:19 2015 : Error attaching new VM Disk: Could not attach rbd (hdb) to one-74" 
HYPERVISOR="kvm" 
LOGO="images/logos/windowsxp.png" 
SUNSTONE_CAPACITY_SELECT="YES" 
SUNSTONE_NETWORK_SELECT="YES" 

VIRTUAL MACHINE TEMPLATE                                                        
AUTOMATIC_REQUIREMENTS="!(PUBLIC_CLOUD = YES)" 
CPU="0.2" 
DISK=[
  CEPH_HOST="mon-1 mon-2 mon-3",
  CLONE="YES",
  CLONE_TARGET="SELF",
  DATASTORE="new",
  DATASTORE_ID="100",
  DEV_PREFIX="hd",
  DISK_ID="0",
  IMAGE="disk1",
  IMAGE_ID="31",
  IMAGE_UNAME="user",
  LN_TARGET="NONE",
  READONLY="NO",
  SAVE="NO",
  SIZE="102400",
  SOURCE="rbd/one-31",
  TARGET="hda",
  TM_MAD="ceph",
  TYPE="RBD" ]
DISK=[
  CEPH_HOST="mon-1 mon-2 mon-3",
  CLONE="YES",
  CLONE_TARGET="SELF",
  DATASTORE="new",
  DATASTORE_ID="100",
  DEV_PREFIX="vd",
  DISK_ID="1",
  IMAGE="disk2",
  IMAGE_ID="32",
  IMAGE_UNAME="user",
  LN_TARGET="NONE",
  READONLY="NO",
  SAVE="NO",
  SELECTED_RESOURCE_ID_ATTACH_DISK="32",
  SIZE="307200",
  SOURCE="rbd/one-32",
  TARGET="vda",
  TM_MAD="ceph",
  TYPE="RBD" ]
GRAPHICS=[
  LISTEN="0.0.0.0",
  PORT="5974",
  TYPE="VNC" ]
MEMORY="4096" 
NIC=[
  AR_ID="0",
  BRIDGE="br2",
  IP="10.xx.x.x0",
  MAC="02:00:0a:60:01:14",
  MODEL="e1000",
  NETWORK="Network2",
  NETWORK_ID="1",
  NETWORK_UNAME="oneadmin",
  NIC_ID="0",
  VLAN="YES",
  VLAN_ID="xxxx" ]
NIC_DEFAULT=[
  MODEL="e1000" ]
TEMPLATE_ID="32" 
VCPU="2" 
VMID="74" 

Looking through the scheduler, it seems to be getting stuck here (src/scheduler/src/sched/Scheduler.cc, lines 734-752):

    //----------------------------------------------------------------------
    // Test Image Datastore capacity, but not for migrations
    //----------------------------------------------------------------------
    if (!vm->is_resched())
    {
        if (vm->test_image_datastore_capacity(img_dspool) == false)
        {
            if (vm->is_public_cloud()) //No capacity needed for public cloud
            {
                vm->set_only_public_cloud();
            }
            else
            {
                log_match(vm->get_oid(), "Cannot schedule VM, image datastore "
                    "does not have enough capacity.");
                continue;
            }
        }
    }

Simply commenting this section out, however, did not fix the problem. After recompiling and restarting the scheduler, it still did not schedule the VM.

The error message was no longer logged, but the VM was still not dispatched to any host.

The output became:

Tue Jun  9 18:47:56 2015 [Z0][SCHED][I]: Getting scheduled actions information. Total time: 0.00433847s
Tue Jun  9 18:47:56 2015 [Z0][VM][D]: Found 1 pending/rescheduling VMs.
Tue Jun  9 18:47:57 2015 [Z0][HOST][D]: Discovered 1 enabled hosts.
Tue Jun  9 18:47:57 2015 [Z0][SCHED][I]: Getting VM and Host information. Total time: 0.00669545s
Tue Jun  9 18:47:57 2015 [Z0][SCHED][D]: Match Making statistics:
    Number of VMs: 1
    Total time: 0s
    Total Match time: 5.8248e-05s
    Total Ranking Time: 2.9626e-05s
Tue Jun  9 18:47:57 2015 [Z0][SCHED][D]: Dispatching VMs to hosts:
    VMID    Host    System DS
    -------------------------

Tue Jun  9 18:47:57 2015 [Z0][SCHED][I]: Dispatching VMs to hosts. Total time: 1.4495e-05s
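
The behaviour this report asks for can be illustrated with a minimal sketch: the image datastore capacity test applies only to first deployments, because a VM that is being resumed (or migrated) already has its disks cloned. This is not the actual scheduler code or the committed fix. The PendingVm struct, its fields, and both functions are hypothetical names used for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified view of a pending VM. These field names are
// illustrative assumptions and are not taken from the OpenNebula sources.
struct PendingVm
{
    bool         resched;       // VM is being rescheduled (migration)
    bool         has_history;   // VM already ran on a host (e.g. stopped, now resumed)
    std::int64_t clone_size_mb; // space its disks would need in the image datastore
};

// True only for first deployments that still have to clone disks.
bool needs_image_ds_check(const PendingVm& vm)
{
    assert(vm.clone_size_mb >= 0);

    return !vm.resched && !vm.has_history && vm.clone_size_mb > 0;
}

// Decides whether the image datastore capacity allows scheduling the VM.
bool can_schedule(const PendingVm& vm, std::int64_t image_ds_free_mb)
{
    if (!needs_image_ds_check(vm))
    {
        return true; // disks already exist, no new space is required
    }

    return vm.clone_size_mb <= image_ds_free_mb;
}
```

With the two disks from the template above (102400 MB and 307200 MB), a resumed VM would pass this check even if the image datastore reported no free space, while a first deployment of the same template would need 409600 MB available.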

Associated revisions

Revision c9ccd944
Added by Carlos Martín almost 6 years ago

Bug #3832: Scheduler detects VMs to be resumed instead of first deployments

Revision 29045fd1
Added by Carlos Martín almost 6 years ago

Bug #3832: Cleanup xpath calls in scheduler

History

#1 Updated by Ruben S. Montero about 6 years ago

  • Target version set to Release 4.14

#2 Updated by Ruben S. Montero almost 6 years ago

  • Assignee set to Carlos Martín

#3 Updated by Carlos Martín almost 6 years ago

  • Status changed from Pending to New
  • % Done changed from 0 to 90

#4 Updated by Ruben S. Montero almost 6 years ago

  • Status changed from New to Closed
  • Resolution set to fixed
