Bug #594

failed to restart vm

Added by Frederic Dreier about 10 years ago. Updated about 8 years ago.

Status: Closed
Start date: 04/27/2011
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: -
Target version: -
Resolution: fixed
Pull request:
Affected Versions: OpenNebula 2.2

Description

After a reboot, OpenNebula failed to restart a VM.

In oned.log:
Wed Apr 27 09:25:05 2011 [ReM][D]: VirtualMachineAction invoked
Wed Apr 27 09:25:05 2011 [DiM][D]: Restarting VM 19
Wed Apr 27 09:25:05 2011 [TM][D]: Message received: LOG - 19 tm_delete.sh: Deleting

Wed Apr 27 09:25:05 2011 [TM][D]: Message received: LOG - 19 tm_delete.sh: Executed "rm -rf ".

Wed Apr 27 09:25:05 2011 [TM][D]: Message received: TRANSFER SUCCESS 19 -

Wed Apr 27 09:25:10 2011 [ReM][D]: HostPoolInfo method invoked

This is on Ubuntu 11.04 with OpenNebula from Git (27 May 2011).

History

#1 Updated by Frederic Dreier about 10 years ago

It seems to appear only when I kill the process on the node, restart OpenNebula on the controller, wait for the VM to be set to the "unknown" state, and run the "onevm restart <id>" command.

If I kill the process and restart the VM while it is in the unknown state (skipping the daemon restart), the VM reboots properly.

I observed it once after a power problem on a controller and a node, and I was able to reproduce it twice on a test system (fresh install).

#2 Updated by Frederic Dreier about 10 years ago

I get an error when trying to resubmit the failed VM:

Wed Apr 27 22:53:01 2011 [TM][D]: Message received: LOG - 2 tm_clone.sh: DST: /srv/cloud/one/opennebula/var//2/images/disk.0
Wed Apr 27 22:53:01 2011 [TM][D]: Message received: LOG - 2 tm_clone.sh: Creating directory /srv/cloud/one/opennebula/var//2/images
Wed Apr 27 22:53:01 2011 [TM][D]: Message received: LOG - 2 tm_clone.sh: Executed "mkdir -p /srv/cloud/one/opennebula/var//2/images".
Wed Apr 27 22:53:01 2011 [TM][D]: Message received: LOG - 2 tm_clone.sh: Executed "chmod a+w /srv/cloud/one/opennebula/var//2/images".
Wed Apr 27 22:53:01 2011 [TM][D]: Message received: LOG - 2 tm_clone.sh: Cloning /srv/cloud/images/template-ubuntu10.4_64bits_10g.img
Wed Apr 27 22:53:01 2011 [TM][D]: Message received: LOG - 2 tm_clone.sh: Executed "cp -r /srv/cloud/images/template-ubuntu10.4_64bits_10g.img /srv/cloud/one/opennebula/var//2/images/disk.0".
Wed Apr 27 22:53:01 2011 [TM][D]: Message received: LOG - 2 tm_clone.sh: ERROR: Command "chmod a+rw /srv/cloud/one/opennebula/var//2/images/disk.0" failed.
Wed Apr 27 22:53:01 2011 [TM][D]: Message received: LOG - 2 tm_clone.sh: ERROR: chmod: changing permissions of `/srv/cloud/one/opennebula/var//2/images/disk.0': Operation not permitted
Wed Apr 27 22:53:01 2011 [TM][D]: Message received: TRANSFER FAILURE 2 chmod: changing permissions of `/srv/cloud/one/opennebula/var//2/images/disk.0': Operation not permitted
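For anyone triaging similar logs, here is a minimal sketch of pulling the transfer result out of oned.log lines like the ones above. The regular expression is inferred only from these excerpts, not from OpenNebula's source, and the function name `parse_transfer_result` is hypothetical.

```python
import re

# Pattern inferred from the log excerpts in this report (an assumption,
# not an official description of the oned.log format).
TRANSFER_RE = re.compile(
    r"\[TM\]\[D\]: Message received: TRANSFER (SUCCESS|FAILURE) (\d+) (.*)$"
)


def parse_transfer_result(line):
    """Return (status, vm_id, detail) for a TRANSFER line, or None otherwise."""
    match = TRANSFER_RE.search(line)
    if match is None:
        return None
    status, vm_id, detail = match.groups()
    return status, int(vm_id), detail.strip()
```

Run over the whole log, this would list which VM IDs ended in TRANSFER FAILURE together with the error text reported by the transfer script.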

The permissions actually look like this (captured before the failure, since the images directory is deleted afterwards):
/srv/cloud/one/opennebula/var/2/images:
total 1452920
-rw-r--r-- 1 oneadmin cloud 787 2011-04-27 22:47 deployment.7
-rw-rw-rw- 1 root root 1487405056 2011-04-27 22:48 disk.0
-rw-r--r-- 1 libvirt-qemu kvm 374784 2011-04-27 22:44 disk.1
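The listing suggests why the chmod fails: disk.0 is owned by root, and on POSIX systems only the file's owner or a privileged process may change its mode, regardless of the permission bits. A minimal sketch of that check follows. The helper name `can_chmod` is hypothetical, and the check is a simplification that ignores finer-grained privileges such as Linux capabilities.

```python
import os


def can_chmod(path):
    """Return True if the current process is expected to be allowed to chmod path.

    Simplified rule: root (effective UID 0) may always do so; any other user
    must own the file. Write permission on the file is not sufficient.
    """
    euid = os.geteuid()
    if euid == 0:
        return True
    return os.stat(path).st_uid == euid
```

Run as a non-root user against a root-owned file such as the disk.0 above, this returns False, which matches the "Operation not permitted" error in the log.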

#3 Updated by Ruben S. Montero about 8 years ago

  • Status changed from New to Closed
  • Resolution set to fixed
  • Affected Versions OpenNebula 2.2 added

The restart cycle has been improved and should work now.
