Bug #56

Livemigrate Bug

Added by himanshukhona-gmail-com - about 11 years ago. Updated about 11 years ago.

Status: Closed
Priority: High
Assignee: Ruben S. Montero
% Done: 0%
Category: Core & System
Target version: Release 1.2
Resolution: worksforme

Description

All -
I tried to run livemigrate and the VM itself disappeared.
I know the cause, but it seems something like this should either be fixed as a bug immediately or stated as a prerequisite.
The target system I was migrating the VM to had its root filesystem mounted on a logical volume, whereas on the system the VM was running on it was mounted on /dev/sda3.

Here is the xend.log from the target system -
[2008-12-01 13:49:59 xend 2759] DEBUG (blkif:24) exception looking up device number for sda3: [Errno 2] No such file or directory: '/dev/sda3'
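A simple pre-flight check on the target host could catch this device mismatch before the migration is issued. The sketch below is illustrative only (the function name and devices are assumptions, not part of OpenNebula or Xen); in practice it would be run on the migration target, e.g. over ssh, before `onevm livemigrate`:

```shell
# Hypothetical pre-flight check: verify that every block device the VM's
# disks are backed by actually exists before allowing a live migration.
check_devices() {
    local missing=0
    for dev in "$@"; do
        if [ -e "$dev" ]; then
            echo "OK: $dev"
        else
            # A missing device is exactly what xend reported for /dev/sda3
            echo "MISSING: $dev"
            missing=1
        fi
    done
    return $missing
}

# Example usage (run on the target host, or wrapped in ssh):
#   check_devices /dev/sda3 || echo "do not migrate here"
```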

Here is the snippet of the vm.log -
tail -f vm.log
Memory: 261976
Net_TX: 43
Net_RX: 36642

Mon Dec 1 08:15:16 2008 [VMM][I]: Monitor Information:
CPU : 0
Memory: 262004
Net_TX: 43
Net_RX: 36644

Mon Dec 1 08:15:46 2008 [VMM][I]: Monitor Information:
CPU : 0
Memory: 261900
Net_TX: 43
Net_RX: 36645

Mon Dec 1 08:16:10 2008 [LCM][I]: New VM state is MIGRATE
Mon Dec 1 08:16:12 2008 [LCM][I]: New VM state is RUNNING
Mon Dec 1 08:16:16 2008 [VMM][I]: VM running but it was not found. Assuming it is done.

Mon Dec 1 08:16:16 2008 [LCM][I]: New VM state is EPILOG
Mon Dec 1 08:16:16 2008 [TM][I]: tm_delete.sh: Deleting /opt/opennebula-server//var/10/images
Mon Dec 1 08:16:17 2008 [TM][I]: tm_delete.sh: Executed "rm -rf /opt/opennebula-server//var/10/images".
Mon Dec 1 08:16:17 2008 [DiM][I]: New VM state is DONE

onevm list
ID NAME STAT CPU MEM HOSTNAME TIME
10 vm-examp runn 0 261956 vikhnode4.netma 04 21:13:06

onevm livemigrate 10 9
onevm list
ID NAME STAT CPU MEM HOSTNAME TIME
onevm list
ID NAME STAT CPU MEM HOSTNAME TIME
onevm list
ID NAME STAT CPU MEM HOSTNAME TIME
onevm list
ID NAME STAT CPU MEM HOSTNAME TIME

onevm list
ID NAME STAT CPU MEM HOSTNAME TIME

onevm list
ID NAME STAT CPU MEM HOSTNAME TIME
onevm list
ID NAME STAT CPU MEM HOSTNAME TIME

Thanks,
Himanshu

xend.log - xend.log (22.4 KB) Redmine Admin, 12/01/2008 10:21 AM

History

#1 Updated by himanshu khona - about 11 years ago

Hi -
Is there an update on this ticket?
I think this is critical. In the event that livemigrate fails, OpenNebula should try to bring the VM back up on the original physical server. In this case, it just disappears.
Thanks,
Himanshu

#2 Updated by Ruben S. Montero about 11 years ago

Hi,
Sorry for the delay, we had a problem with the trac notification system :(.

From the logs you sent (please correct me if I am wrong):
  • You have a running VM on host A.
  • You issue a livemigrate from A to B. The VM on A is shut down and transferred (successfully, from Xen's point of view) to B.
  • The VM fails on B, as B has no /dev/sda3. I guess that you cannot deploy that VM on B at all.
  • OpenNebula looks for the VM on B, but it has failed and OpenNebula cannot find it. ONE assumes that it is done, and removes the VM.

From my point of view this is a misconfiguration of Xen on resource B, and OpenNebula handled the situation correctly: it migrated the VM, the migration succeeded at the Xen level, and later on the VM failed on the new resource, so OpenNebula removed the VM, as it no longer existed. Am I missing something here?
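The behavior described here — a VM that OpenNebula believes is RUNNING but that the hypervisor no longer reports is assumed finished — can be sketched roughly as follows. This is an illustration of the logic behind the vm.log message "VM running but it was not found. Assuming it is done.", not OpenNebula's actual source:

```shell
# Illustrative sketch (NOT OpenNebula code) of the monitoring decision:
# if ONE's recorded state is RUNNING but the hypervisor no longer reports
# the domain, the VM is assumed done and moves to EPILOG (cleanup/removal).
next_state() {
    local one_state="$1"   # state as recorded by OpenNebula
    local hv_found="$2"    # "yes" if the hypervisor still reports the domain
    if [ "$one_state" = "RUNNING" ] && [ "$hv_found" = "no" ]; then
        echo "EPILOG"
    else
        echo "$one_state"
    fi
}
```

In the ticket's timeline this corresponds to the transition at 08:16:16: the monitor cannot find the migrated domain, so the VM goes to EPILOG and its images are deleted.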

Anyway, I'd like to close this before the 1.2 release.

Regards and thanks for the feedback

#3 Updated by Ruben S. Montero about 11 years ago

  • Status changed from New to Closed
  • Resolution set to worksforme
Then there are two options:
  • Keep the current behavior: the migration failed, and in that process the VM was lost.
  • Restart the machine on host A: in this case we cannot guarantee that we are restarting the same VM, as its state will have changed since it was started, especially if the VM disk images were cloned.

I think that the current behavior is the correct one. I'll close the ticket and reopen it if we come up with a better solution for this.
