Bug #2789

Error deploying VM with xen4 (tested on CentOS 6.5)

Added by Andrés Arnáiz over 3 years ago. Updated over 3 years ago.

Status: Closed
Start date: 03/20/2014
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: Drivers - VM
Target version: Release 4.6
Resolution: worksforme
Pull request:
Affected Versions: OpenNebula 4.4, OpenNebula 4.6

Description

Deploying a VM fails at boot in OpenNebula 4.4.1 or 4.5.8 with the xen4 driver on the hypervisor node (I also tried changing the disk driver to file:).

Wed Mar 19 15:42:07 2014 [LCM][I]: New VM state is PROLOG.
Wed Mar 19 15:42:07 2014 [VM][I]: Virtual Machine has no context
Wed Mar 19 15:49:50 2014 [LCM][I]: New VM state is BOOT
Wed Mar 19 15:49:50 2014 [VMM][I]: Generating deployment file: /var/lib/one/vms/25/deployment.1
Wed Mar 19 15:49:51 2014 [VMM][I]: ExitCode: 0
Wed Mar 19 15:49:51 2014 [VMM][I]: Successfully execute network driver operation: pre.
Wed Mar 19 15:49:52 2014 [VMM][I]: Command execution fail: cat << EOT | /var/tmp/one/vmm/xen4/deploy '/var/lib/one//datastores/100/25/deployment.1' 'hypernodo1' 25 hypernodo1
Wed Mar 19 15:49:52 2014 [VMM][I]: xc: info: VIRTUAL MEMORY ARRANGEMENT:
Wed Mar 19 15:49:52 2014 [VMM][I]: Loader: 0000000000100000->000000000019bb44
Wed Mar 19 15:49:52 2014 [VMM][I]: TOTAL: 0000000000000000->000000007f800000
Wed Mar 19 15:49:52 2014 [VMM][I]: ENTRY ADDRESS: 0000000000100000
Wed Mar 19 15:49:52 2014 [VMM][I]: xc: info: PHYSICAL MEMORY ALLOCATION:
Wed Mar 19 15:49:52 2014 [VMM][I]: 4KB PAGES: 0x0000000000000200
Wed Mar 19 15:49:52 2014 [VMM][I]: 2MB PAGES: 0x00000000000003fb
Wed Mar 19 15:49:52 2014 [VMM][I]: 1GB PAGES: 0x0000000000000000
Wed Mar 19 15:49:52 2014 [VMM][I]: DEBUG libxl__blktap_devpath 37 aio:/var/lib/one//datastores/100/25/disk.0
Wed Mar 19 15:49:52 2014 [VMM][I]: libxl: error: libxl.c:1871:device_disk_add: failed to get blktap devpath for 0x1f03c80
Wed Mar 19 15:49:52 2014 [VMM][I]:
Wed Mar 19 15:49:52 2014 [VMM][I]: libxl: error: libxl_create.c:951:domcreate_launch_dm: unable to add disk devices
Wed Mar 19 15:49:52 2014 [VMM][I]: libxl: error: libxl_dm.c:1250:libxl__destroy_device_model: could not find device-model's pid for dom 2
Wed Mar 19 15:49:52 2014 [VMM][I]: libxl: error: libxl.c:1419:libxl__destroy_domid: libxl__destroy_device_model failed for 2
Wed Mar 19 15:49:52 2014 [VMM][E]: Unable
Wed Mar 19 15:49:52 2014 [VMM][I]: ExitCode: 3
Wed Mar 19 15:49:52 2014 [VMM][I]: Failed to execute virtualization driver operation: deploy.
Wed Mar 19 15:49:52 2014 [VMM][E]: Error deploying virtual machine: Unable
Wed Mar 19 15:49:52 2014 [DiM][I]: New VM state is FAILED
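
In a log like the one above, the libxl error lines carry most of the diagnostic information. The following is a minimal sketch for pulling them out of a VM log; the function name `libxl_errors` and the regular expression are illustrative assumptions, not part of OpenNebula or Xen.

```python
import re

# Pattern for lines such as "libxl: error: libxl.c:1871:device_disk_add: message".
LIBXL_ERROR = re.compile(r"libxl: error: ([\w.]+):(\d+):(\w+): (.*)")

def libxl_errors(log_text):
    """Return (source_file, line, function, message) for each libxl error line."""
    found = []
    for line in log_text.splitlines():
        match = LIBXL_ERROR.search(line)
        if match:
            src, lineno, func, message = match.groups()
            found.append((src, int(lineno), func, message))
    return found

# Three lines copied from the log in this report.
sample_log = """\
Wed Mar 19 15:49:52 2014 [VMM][I]: DEBUG libxl__blktap_devpath 37 aio:/var/lib/one//datastores/100/25/disk.0
Wed Mar 19 15:49:52 2014 [VMM][I]: libxl: error: libxl.c:1871:device_disk_add: failed to get blktap devpath for 0x1f03c80
Wed Mar 19 15:49:52 2014 [VMM][I]: libxl: error: libxl_create.c:951:domcreate_launch_dm: unable to add disk devices
"""

for entry in libxl_errors(sample_log):
    print(entry)
```

The first error in the list (here `device_disk_add`) is usually the one to look at; the later ones are follow-up failures from tearing the domain down.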

History

#1 Updated by Ruben S. Montero over 3 years ago

Can you double check that disk.0 is properly created on the node?

#2 Updated by Andrés Arnáiz over 3 years ago

Ruben S. Montero wrote:

Can you double check that disk.0 is properly created on the node?

Yes, the file exists.

With xend started (xen3 conf):
xm create /var/lib/one//datastores/100/25/deployment.1

it works.

But with xend stopped and xl:

xl create /var/lib/one//datastores/100/25/deployment.1

the same error appears:
xc: info: VIRTUAL MEMORY ARRANGEMENT:
0000000000100000->000000000019bb44
TOTAL: 0000000000000000->000000007f800000
ENTRY ADDRESS: 0000000000100000
xc: info: PHYSICAL MEMORY ALLOCATION:
4KB PAGES: 0x0000000000000200
2MB PAGES: 0x00000000000003fb
1GB PAGES: 0x0000000000000000
DEBUG libxl__blktap_devpath 37 aio:/var/lib/one//datastores/100/25/disk.0
libxl: error: libxl.c:1871:device_disk_add: failed to get blktap devpath for 0x1f03c80

In deployment.x the disk driver type is set to file:

#3 Updated by Javi Fontan over 3 years ago

Can you send us both the file /var/lib/one/vms/25/deployment.1 and the output of:

onevm show 25 -a

#4 Updated by Andrés Arnáiz over 3 years ago

Javi Fontan wrote:

Can you send us both the file /var/lib/one/vms/25/deployment.1 and the output of:

[...]

Sorry, I deleted the folder for VM 25, but here is another one you can look at (using xen4 with file: and 'tap2:tapdisk:aio:'). I always check that the files disk.0 and disk.1 exist.

[oneadmin@opennebulafe one]# cat /var/lib/one/vms/2/deployment.0
name = 'one-2'
#O CPU_CREDITS = 256
memory = '2048'
builder = "hvm"
boot = "n"
disk = [
'file:/var/lib/one//datastores/100/2/disk.0,hdb,w',
'tap2:tapdisk:aio:/var/lib/one//datastores/100/2/disk.1,hda,r',
]
vif = [
]
vnc=1
vnclisten=0.0.0.0
vncunused=0
vncdisplay=2

VIRTUAL MACHINE TEMPLATE
AUTOMATIC_REQUIREMENTS="CLUSTER_ID = 100 & !(PUBLIC_CLOUD = YES)"
CONTEXT=[
DISK_ID="1",
NETWORK="YES",
TARGET="hda" ]
CPU="1"
GRAPHICS=[
LISTEN="0.0.0.0",
PORT="5902",
TYPE="VNC" ]
INPUT=[
BUS="xen",
TYPE="mouse" ]
INPUT=[
BUS="xen",
TYPE="tablet" ]
MEMORY="2048"
OS=[
ARCH="x86_64",
BOOT="network",
GUESTOS="windows7_64Guest" ]
TEMPLATE_ID="0"
VMID="2"

#5 Updated by Javi Fontan over 3 years ago

The problem seems to be using "tap2:tapdisk:aio" with xl. Is this set in /etc/one/vmm_exec/vmm_exec_xen4.conf?

You can try adding DRIVER="file:" in the context section to check if it fixes the problem.
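
To see which driver prefix each disk line of a generated deployment file actually carries, a small script can help. This is a minimal sketch that assumes the disk list has the same shape as the deployment.0 posted above; the helper name `disk_drivers` and the regular expressions are illustrative and not part of OpenNebula or Xen.

```python
import re

# Matches quoted disk specs such as 'file:/path/disk.0,hdb,w' (illustrative regex).
DISK_SPEC = re.compile(r"'([^']+)'")

def disk_drivers(config_text):
    """Return (driver_prefix, path, target) for each entry of the disk = [...] list."""
    block = re.search(r"disk\s*=\s*\[(.*?)\]", config_text, re.S)
    if block is None:
        return []
    result = []
    for spec in DISK_SPEC.findall(block.group(1)):
        location, target = spec.split(",")[:2]
        # The path starts at the first '/'; everything before it is the driver prefix.
        slash = location.index("/")
        result.append((location[:slash], location[slash:], target))
    return result

# Disk section copied from the deployment.0 shown in this thread.
sample = """
disk = [
'file:/var/lib/one//datastores/100/2/disk.0,hdb,w',
'tap2:tapdisk:aio:/var/lib/one//datastores/100/2/disk.1,hda,r',
]
"""

for driver, path, target in disk_drivers(sample):
    kind = "tap-based" if driver.startswith("tap") else "not tap-based"
    print(target, driver, path, kind)
```

Running it against the real deployment file on the node shows whether a changed DRIVER setting actually reached the generated configuration.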

#6 Updated by Andrés Arnáiz over 3 years ago

Javi Fontan wrote:

The problem seems to be using "tap2:tapdisk:aio" with xl. Is this set in /etc/one/vmm_exec/vmm_exec_xen4.conf?

You can try adding DRIVER="file:" in the context section to check if it fixes the problem.

Yes, I changed the driver to file: in xen4/vmm_exec_xen4.conf, but xl create still shows blktap_devpath 37 aio:/var/lib/one//datastores/100/25/disk.0

It should show file:/var/lib/one//datastores/100/25/disk.0, right?

NOTE: Today I changed one.conf to use xen3 because xen4 doesn't work for me :(

#7 Updated by Andrés Arnáiz over 3 years ago

Andrés Arnáiz wrote:

Javi Fontan wrote:

The problem seems to be using "tap2:tapdisk:aio" with xl. Is this set in /etc/one/vmm_exec/vmm_exec_xen4.conf?

You can try adding DRIVER="file:" in the context section to check if it fixes the problem.

Yes, I changed the driver to file: in xen4/vmm_exec_xen4.conf, but xl create still shows blktap_devpath 37 aio:/var/lib/one//datastores/100/25/disk.0

It should show file:/var/lib/one//datastores/100/25/disk.0, right?

NOTE: Today I changed one.conf to use xen3 because xen4 doesn't work for me :(

I will try again tomorrow.


#VCPU  = 1
#OS    = [ kernel="/vmlinuz", initrd="/initrd.img", root="sda1", kernel_cmd="ro", hvm="yes" ]
#FEATURES = [ PAE = "no", ACPI = "yes", APIC = "yes" ]

CREDIT = 256
#DISK   = [ driver = "raw:" ]
DISK   = [ driver = "file:" ]

#RAW   = [ type = "xen", data = "on_crash=destroy" ]
~

#8 Updated by Andrés Arnáiz over 3 years ago

#VCPU = 1
#OS = [ kernel="/vmlinuz", initrd="/initrd.img", root="sda1", kernel_cmd="ro", hvm="yes" ]
#FEATURES = [ PAE = "no", ACPI = "yes", APIC = "yes" ]

CREDIT = 256
#DISK = [ driver = "raw:" ]
DISK = [ driver = "file:" ]

#RAW = [ type = "xen", data = "on_crash=destroy" ]
~

#9 Updated by Andrés Arnáiz over 3 years ago

Here is more info about deploying with xl and xm.

Template in Sunstone (the datablock image was created successfully with an NTFS file system):

[oneadmin@opennebulafe ~]$ onetemplate show 0
TEMPLATE 0 INFORMATION
ID             : 0
NAME           : Windows 7 32 Bits
USER           : oneadmin
GROUP          : oneadmin
REGISTER TIME  : 03/19 18:36:17

PERMISSIONS
OWNER          : uma
GROUP          : um-
OTHER          : u--

TEMPLATE CONTENTS
CONTEXT=[
  NETWORK="YES" ]
CPU="1" 
DISK=[
  DRIVER="file:",
  IMAGE="Disco 30 Gb NTFS",
  IMAGE_UNAME="oneadmin" ]
FEATURES=[
  ACPI="yes",
  APIC="yes",
  DEVICE_MODEL="/usr/lib/xen/bin/qemu-dm",
  PAE="yes" ]
GRAPHICS=[
  TYPE="VNC" ]
INPUT=[
  BUS="xen",
  TYPE="mouse" ]
INPUT=[
  BUS="xen",
  TYPE="tablet" ]
MEMORY="2048" 
NIC=[
  NETWORK="RedLocal",
  NETWORK_UNAME="oneadmin" ]
OS=[
  ARCH="i686",
  BOOT="hd",
  GUESTOS="windows7Guest" ]
SCHED_REQUIREMENTS="ID=\"0\" | CLUSTER_ID=\"100\"" 
VCPU="2" 
[oneadmin@opennebulafe ~]$

[oneadmin@micronodo1 5]$ cat deployment.0_original
name = 'one-5'
#O CPU_CREDITS = 256
memory  = '2048'
vcpus  = '2'
builder = "hvm" 
boot = "c" 
disk = [
    'file:/var/lib/one//datastores/100/5/disk.0,hda,w',
    'tap:aio:/var/lib/one//datastores/100/5/disk.1,hdb,r',
]
vif = [
    ' mac=02:00:c0:a8:01:aa,ip=192.168.1.170,bridge=br0',
]
vnc=1
vncunused=0
vncdisplay=5
pae = 1
acpi = 1
apic = 1
device_model = '/usr/lib/xen/bin/qemu-dm'

[oneadmin@micronodo1 5]$ sudo xl -f create deployment.0_original
Parsing config from deployment.0_original
WARNING: ignoring device_model directive.
WARNING: Use "device_model_override" instead if you really want a non-default device_model
xc: info: VIRTUAL MEMORY ARRANGEMENT:
  Loader:        0000000000100000->000000000019bb44
  TOTAL:         0000000000000000->000000007f800000
  ENTRY ADDRESS: 0000000000100000
xc: info: PHYSICAL MEMORY ALLOCATION:
  4KB PAGES: 0x0000000000000200
  2MB PAGES: 0x00000000000003fb
  1GB PAGES: 0x0000000000000000
DEBUG libxl__blktap_devpath 37 aio:/var/lib/one//datastores/100/5/disk.0
libxl: error: libxl.c:1871:device_disk_add: failed to get blktap devpath for 0xbbeca0

DEBUG libxl__blktap_devpath 37 aio:/var/lib/one//datastores/100/5/disk.1
libxl: error: libxl.c:1871:device_disk_add: failed to get blktap devpath for 0xbbed30

libxl: error: libxl_create.c:951:domcreate_launch_dm: unable to add disk devices
libxl: error: libxl_dm.c:1250:libxl__destroy_device_model: could not find device-model's pid for dom 88
libxl: error: libxl.c:1419:libxl__destroy_domid: libxl__destroy_device_model failed for 88

With xm:

[oneadmin@micronodo1 5]$ sudo xm create deployment.0_original
Using config file "./deployment.0_original".
Error: Device 0 (vif) could not be connected. Hotplug scripts not working.

I think the datablock image is created in aio format, not ntfs, because when the VM is created it always tries to load disk.0 as aio:, not file:.
If I create a file with the dd command, it works.

[oneadmin@opennebulafe ~]$ oneimage list
  ID USER       GROUP      NAME            DATASTORE     SIZE TYPE PER STAT RVMS
   9 oneadmin   oneadmin   Disco 30 Gb NTF images_LUN   29.3G DB    No used    1
[oneadmin@opennebulafe ~]$ oneimage show 9
IMAGE 9 INFORMATION
ID             : 9
NAME           : Disco 30 Gb NTFS
USER           : oneadmin
GROUP          : oneadmin
DATASTORE      : images_LUN-Datos2
TYPE           : DATABLOCK
REGISTER TIME  : 03/21 11:46:16
PERSISTENT     : No
SOURCE         : /var/lib/one//datastores/101/3b1b3eb2c3669e3f4ada9b8bc5da8a91
FSTYPE         : ntfs
SIZE           : 29.3G
STATE          : used
RUNNING_VMS    : 1

PERMISSIONS
OWNER          : uma
GROUP          : um-
OTHER          : u--

IMAGE TEMPLATE
DESCRIPTION="Disco 30 Gb" 
DEV_PREFIX="hd" 
DRIVER="file:" 
TARGET="hda" 

VIRTUAL MACHINES

    ID USER     GROUP    NAME            STAT UCPU    UMEM HOST             TIME
     5 oneadmin oneadmin Windows 7 32 Bi fail    0      0K              0d 16h58

#10 Updated by Andrés Arnáiz over 3 years ago

Hello again,

I'm investigating why xl does not start deployment.0 or the VM on CentOS 6.5. I think it's a problem with the shared resource /var/lib/one when it has an iSCSI connector for the datastores.

When I copy disk.0 to a local folder on hypernode1 it works OK, but when it tries to load from /var/lib/one/datastores/100/5/.... it gives me the error and doesn't start the VM.
Can you explain how I can solve this issue? Thank you.

#11 Updated by Andrés Arnáiz over 3 years ago

Issue resolved!!

xl starts the VM on the hypervisor node after giving permissions to disk.* (chmod 777 disk.0 disk.1):


-rw-rw-r-- 1 oneadmin oneadmin         347 mar 26 18:28 deployment.0
-rw-rw-r-- 1 oneadmin oneadmin         387 mar 22 11:23 deployment.0_original
-rwxrwxrwx 1 oneadmin oneadmin 32463912960 mar 26 18:31 disk.0
-rwxrwxrwx 1 oneadmin oneadmin      372736 mar 22 05:09 disk.1
lrwxrwxrwx 1 oneadmin oneadmin          36 mar 22 05:09 disk.1.iso -> /var/lib/one/datastores/100/5/disk.1

[oneadmin@hipernodo1 5]$sudo xl list

Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0  1023     8     r-----   20211.1
one-5                                       29  2043     2     -b----      37.3
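
A quick way to spot disk files that lack read/write permission for users other than the owner (the condition that chmod 777 works around here) can be sketched in Python. The helper name `missing_other_rw` is hypothetical, and treating the "other" permission bits as the relevant ones is an assumption; the thread only establishes that chmod 777 let xl start the VM.

```python
import os
import stat
import tempfile

def missing_other_rw(paths):
    """Return the paths whose mode lacks read or write permission for 'other'."""
    wanted = stat.S_IROTH | stat.S_IWOTH
    flagged = []
    for path in paths:
        mode = stat.S_IMODE(os.stat(path).st_mode)
        if mode & wanted != wanted:
            flagged.append(path)
    return flagged

# Self-contained demo on temporary files standing in for disk.0 and disk.1.
demo_dir = tempfile.mkdtemp()
disk0 = os.path.join(demo_dir, "disk.0")
disk1 = os.path.join(demo_dir, "disk.1")
for name in (disk0, disk1):
    open(name, "w").close()
os.chmod(disk0, 0o777)   # same mode as after the chmod 777 fix
os.chmod(disk1, 0o644)   # owner-writable only, readable by everyone

print(missing_other_rw([disk0, disk1]))
```

On a real node the list of paths would be the disk.* files in the VM directory. Note that chmod 777 is broader than strictly needed; it is simply what was used in this report.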

#12 Updated by Ruben S. Montero over 3 years ago

  • Status changed from Pending to Closed
  • Resolution set to worksforme

Andrés Arnáiz wrote:

Issue resolved!!

xl starts the VM on the hypervisor node after giving permissions to disk.* (chmod 777 disk.0 disk.1):

[...]

OK closing it...

#13 Updated by Marcello Lodi over 3 years ago

Hi,
I have the same problem while creating the context (disk.1).
The file permissions are:
-rw-r--r-- 1 oneadmin oneadmin 745472 May 16 10:07 disk.1

Could you explain how to create the file with (at least) these permissions?
-rw-rw-rw- 1 oneadmin oneadmin 745472 May 16 10:07 disk.1

It seems that the context file creation does not honor
DEFAULT_UMASK in /etc/one/oned.conf
(mine is DEFAULT_UMASK 111)

#14 Updated by Ruben S. Montero over 3 years ago

DEFAULT_UMASK is for OpenNebula objects; the files themselves are created using the default umask. Update the oneadmin account configuration.
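
The distinction can be illustrated with a short sketch: on a POSIX system the permissions of a newly created file are the requested mode masked by the umask of the creating process, which is inherited from the account's environment rather than read from oned.conf. The helper name `mode_of_new_file` is hypothetical, and the file is created in a temporary directory purely for demonstration.

```python
import os
import stat
import tempfile

def mode_of_new_file(umask_value, requested=0o666):
    """Create a file under the given umask and return its permission bits."""
    old = os.umask(umask_value)
    try:
        path = os.path.join(tempfile.mkdtemp(), "disk.1")
        fd = os.open(path, os.O_CREAT | os.O_WRONLY, requested)
        os.close(fd)
        return stat.S_IMODE(os.stat(path).st_mode)
    finally:
        os.umask(old)  # restore the previous umask

print(oct(mode_of_new_file(0o022)))  # 0o644, i.e. rw-r--r--
print(oct(mode_of_new_file(0o000)))  # 0o666, i.e. rw-rw-rw-
```

With a umask of 022 the file comes out as rw-r--r--, which matches the disk.1 permissions reported above; a umask of 000 leaves the requested rw-rw-rw- intact.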

#15 Updated by Marcello Lodi over 3 years ago

You are right. Setting the umask (in my case 000 in order to have rwxrwxrwx),
the directories in datastore/0
are created with full permission, i.e.
drwxrwxrwx 2 oneadmin oneadmin 4096 Jun 6 13:00 171

Furthermore, the content of the directory is:
[root@one-frontend 171]# ls -al
total 54300
drwxrwxrwx 2 oneadmin oneadmin 4096 Jun 6 13:00 .
drwxr-xr-x 3 oneadmin oneadmin 4096 Jun 6 13:00 ..
-rw-rw-rw- 1 oneadmin oneadmin 451 Jun 6 13:00 deployment.0
lrwxrwxrwx 1 oneadmin oneadmin 21 Jun 6 13:00 disk.0 -> /dev/vg-one/lv-one-12
-rw-r--r-- 1 oneadmin oneadmin 745472 Jun 6 13:00 disk.1
lrwxrwxrwx 1 oneadmin oneadmin 36 Jun 6 13:00 disk.1.iso -> /var/lib/one/datastores/0/171/disk.1
-rw-rw-rw- 1 oneadmin oneadmin 49971712 Jun 6 13:00 initrd
-rw-rw-rw- 1 oneadmin oneadmin 4867280 Jun 6 13:00 kernel

My DEFAULT_UMASK (inside oned.conf) is 111
and it is clear that every file is created with rw-rw-rw- permissions,
except disk.1 (which is the context).

The strange thing is that from the Sunstone server, if I run the VM
with xl create -c deployment.0
everything is OK.
But if I try from another worker node, the process fails.

I am sure that there is something wrong with NFS, or in general the shared filesystem,
but it could be useful to have the context file created with full permissions.
