Bug #5174

termination of undeployed vm fails on ceph image DS

Added by Arnaud Abélard about 4 years ago. Updated almost 4 years ago.

Status: Closed
Start date: 06/03/2017
Priority: Normal
Due date: -
Assignee: -
% Done: 0%
Category: Drivers - Storage
Target version: -
Resolution: wontfix
Pull request: -
Affected Versions: OpenNebula 5.2

Description

We have 5 opennebula nodes, one controller, one ceph image datastore available on the 5 nodes (not on the controller), one shared system datastore available on all the nodes and controller.

When using a Ceph image datastore, if a VM is in the UNDEPLOYED state and the user tries to terminate it (hard), the operation fails:

Sat Jun  3 18:40:45 2017 [Z0][VM][I]: New state is POWEROFF
Sat Jun  3 18:40:45 2017 [Z0][VM][I]: New LCM state is LCM_INIT
Sat Jun  3 18:40:54 2017 [Z0][VM][I]: New state is ACTIVE
Sat Jun  3 18:40:54 2017 [Z0][VM][I]: New LCM state is EPILOG_UNDEPLOY
Sat Jun  3 18:40:54 2017 [Z0][VM][I]: New state is UNDEPLOYED
Sat Jun  3 18:40:54 2017 [Z0][VM][I]: New LCM state is LCM_INIT
Sat Jun  3 19:06:12 2017 [Z0][VM][I]: New state is ACTIVE
Sat Jun  3 19:06:12 2017 [Z0][VM][I]: New LCM state is EPILOG
Sat Jun  3 19:06:13 2017 [Z0][TM][I]: Command execution fail: /var/lib/one/remotes/tm/ceph/delete one-ctrl-1.dprv.univ-nantes.prive:/var/lib/one//datastores/0/762/disk.0 762 100
Sat Jun  3 19:06:13 2017 [Z0][TM][I]: delete: Deleting /var/lib/one/datastores/0/762/disk.0
Sat Jun  3 19:06:13 2017 [Z0][TM][E]: delete: Command "    RBD="rbd --id opennebula" 
Sat Jun  3 19:06:13 2017 [Z0][TM][I]: 
Sat Jun  3 19:06:13 2017 [Z0][TM][I]: if [ "$(rbd_format opennebula/one-109-762-0)" = "2" ]; then
Sat Jun  3 19:06:13 2017 [Z0][TM][I]: rbd_rm_r $(rbd_top_parent opennebula/one-109-762-0)
Sat Jun  3 19:06:13 2017 [Z0][TM][I]: 
Sat Jun  3 19:06:13 2017 [Z0][TM][I]: if [ -n "762-0" ]; then
Sat Jun  3 19:06:13 2017 [Z0][TM][I]: rbd_rm_snap opennebula/one-109 762-0
Sat Jun  3 19:06:13 2017 [Z0][TM][I]: fi
Sat Jun  3 19:06:13 2017 [Z0][TM][I]: else
Sat Jun  3 19:06:13 2017 [Z0][TM][I]: rbd --id opennebula rm opennebula/one-109-762-0
Sat Jun  3 19:06:13 2017 [Z0][TM][I]: fi" failed: bash: line 109: rbd: command not found
Sat Jun  3 19:06:13 2017 [Z0][TM][I]: bash: line 221: rbd: command not found
Sat Jun  3 19:06:13 2017 [Z0][TM][E]: Error deleting opennebula/one-109-762-0 in one-ctrl-1.dprv.univ-nantes.prive
Sat Jun  3 19:06:13 2017 [Z0][TM][I]: ExitCode: 127
Sat Jun  3 19:06:13 2017 [Z0][TM][E]: Error executing image transfer script: Error deleting opennebula/one-109-762-0 in one-ctrl-1.dprv.univ-nantes.prive
Sat Jun  3 19:06:13 2017 [Z0][VM][I]: New LCM state is EPILOG_FAILURE
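
The "rbd: command not found" lines and ExitCode: 127 show that the tm/ceph delete script was executed on the frontend (one-ctrl-1), which has no rbd binary installed. A minimal check (a sketch, not from the ticket) that can be run as oneadmin on the frontend, assuming the CEPH_USER and POOL_NAME "opennebula" from the datastore template below:

# Is the rbd client installed on the frontend, and can it reach the pool?
command -v rbd || echo "rbd client not installed"   # exit code 127 in the log above means it is missing
rbd --id opennebula ls opennebula                    # should list the pool's images (e.g. one-109-762-0 from the log)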

The ceph-image datastore:

root@one-ctrl-1:/var/log/one# onedatastore show 100
DATASTORE 100 INFORMATION                                                       
ID             : 100                 
NAME           : ceph-image          
USER           : oneadmin            
GROUP          : oneadmin            
CLUSTERS       : 0                   
TYPE           : IMAGE               
DS_MAD         : ceph                
TM_MAD         : ceph                
BASE PATH      : /var/lib/one//datastores/100
DISK_TYPE      : RBD                 
STATE          : READY               

DATASTORE CAPACITY                                                              
TOTAL:         : 57.4T               
FREE:          : 56.4T               
USED:          : 985.7G              
LIMIT:         : -                   

PERMISSIONS                                                                     
OWNER          : um-                 
GROUP          : u--                 
OTHER          : u--                 

DATASTORE TEMPLATE                                                              
BRIDGE_LIST="iaas-vm-1.u07.univ-nantes.prive iaas-vm-2.u07.univ-nantes.prive iaas-vm-3.u07.univ-nantes.prive iaas-vm-4.u07.univ-nantes.prive iaas-vm-5.u07.univ-nantes.prive iaas-vm-6.u07.univ-nantes.prive" 
CEPH_HOST="172.20.107.54:6789 172.20.106.54:6789 172.20.108.54:6789" 
CEPH_SECRET="6f5cab54-404b-4c63-b883-65ae350be8e7" 
CEPH_USER="opennebula" 
CLONE_TARGET="SELF" 
DATASTORE_CAPACITY_CHECK="YES" 
DISK_TYPE="RBD" 
DS_MAD="ceph" 
LN_TARGET="NONE" 
POOL_NAME="opennebula" 
TM_MAD="ceph" 
TYPE="IMAGE_DS" 


Related issues

Duplicated by Bug #5320: Nebula fails terminating undeployed machines using ceph (Closed, 08/24/2017)

History

#1 Updated by Javi Fontan almost 4 years ago

  • Category set to Drivers - Storage
  • Status changed from Pending to Closed
  • Resolution set to wontfix

The OpenNebula frontend also needs to be a Ceph client. We've updated the documentation about this for 5.4.
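
In practice this means installing the Ceph client tools and the cluster credentials on the frontend. A rough sketch, assuming a Debian/Ubuntu frontend, that the rbd binary comes from the ceph-common package, and that the config and keyring for the cephx user "opennebula" can be copied from a Ceph node (ceph-node-1 is a placeholder hostname):

apt-get install -y ceph-common                       # provides the rbd CLI (assumption: Debian/Ubuntu packaging)
scp ceph-node-1:/etc/ceph/ceph.conf /etc/ceph/       # ceph-node-1 is a placeholder hostname
scp ceph-node-1:/etc/ceph/ceph.client.opennebula.keyring /etc/ceph/
sudo -u oneadmin rbd --id opennebula ls opennebula   # verify oneadmin on the frontend can reach the pool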

#2 Updated by Javi Fontan almost 4 years ago

  • Duplicated by Bug #5320: Nebula fails terminating undeployed machines using ceph added
