Bug #845
tm_mv.sh error with KVM in RC1 on Debian Squeeze
| Status: | Closed | Start date: | 09/28/2011 |
|---|---|---|---|
| Priority: | Normal | Due date: | |
| Assignee: | Javi Fontan | % Done: | 0% |
| Category: | Drivers - Auth | | |
| Target version: | Release 3.2 | | |
| Resolution: | fixed | Pull request: | |
| Affected Versions: | OpenNebula 3.0 | | |
Description
Hi,
First: please delete bugs no. 842 and 844 (embarrassing).
After a "cold migration" of a VM (which worked fine), the VM finally ended up in the "FAILED" state.
Excerpt from the VM-Log (call of "scp -r" leads to endless recursion):
Wed Sep 28 22:12:48 2011 [DiM][I]: New VM state is PENDING. Wed Sep 28 22:12:53 2011 [DiM][I]: New VM state is ACTIVE. Wed Sep 28 22:12:53 2011 [LCM][I]: New VM state is PROLOG. Wed Sep 28 22:13:38 2011 [TM][I]: Command execution fail: /usr/lib/one/tm_commands/ssh/tm_mv.sh atlas1:/var/lib/one/21/images atlas1:/var/lib/one//21/images Wed Sep 28 22:13:38 2011 [TM][I]: tm_mv.sh: Moving /var/lib/one/21/images Wed Sep 28 22:13:38 2011 [TM][I]: tm_mv.sh: Executed "ssh atlas1 mkdir -p /var/lib/one//21". Wed Sep 28 22:13:38 2011 [TM][E]: tm_mv.sh: Command "scp -r atlas1:/var/lib/one/21/images atlas1:/var/lib/one//21/images" failed. Wed Sep 28 22:13:38 2011 [TM][E]: tm_mv.sh: /var/lib/one/21/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images /images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/im ages/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/image s/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/images/i mages/images/images/images/images/images: name too long Wed Sep 28 22:13:38 2011 [TM][E]: Could not copy atlas1:/var/lib/one/21/images to atlas1:/var/lib/one//21/images Wed Sep 28 22:13:38 2011 [TM][I]: ExitCode: 1 Wed Sep 28 22:13:40 2011 [TM][E]: Error 
excuting image transfer script: Could not copy atlas1:/var/lib/one/21/images to atlas1:/var/lib/one//21/images Wed Sep 28 22:13:40 2011 [DiM][I]: New VM state is FAILED
Output of "onevm show 21":
VIRTUAL MACHINE 21 INFORMATION
ID          : 21
NAME        : one-21
USER        : oneadmin
GROUP       : oneadmin
STATE       : ACTIVE
LCM_STATE   : RUNNING
HOSTNAME    : atlas1
START TIME  : 09/28 20:42:48
END TIME    : 09/28 22:13:40
DEPLOY ID   : one-21

VIRTUAL MACHINE MONITORING
NET_TX      : 63761
NET_RX      : 3700240
USED MEMORY : 2097152
USED CPU    : 0

VIRTUAL MACHINE TEMPLATE
CPU=1
DISK=[
  BUS=scsi,
  DISK_ID=0,
  READONLY=no,
  SOURCE=/var/lib/one/images/opensuse11.4.img,
  TARGET=sda,
  TYPE=disk ]
ERROR=[
  MESSAGE="Error excuting image transfer script: Could not copy atlas1:/var/lib/one/21/images to atlas1:/var/lib/one//21/images",
  TIMESTAMP="Wed Sep 28 22:13:38 2011" ]
ERROR=[
  MESSAGE="Error shuting down VM: Timeout reached and VM one-21 is still alive",
  TIMESTAMP="Wed Sep 28 22:25:44 2011" ]
ERROR=[
  MESSAGE="Error shuting down VM: Timeout reached and VM one-21 is still alive",
  TIMESTAMP="Wed Sep 28 22:35:36 2011" ]
FEATURES=[
  ACPI=yes ]
MEMORY=2048
NAME=one-21
OS=[
  ARCH=x86_64,
  BOOT=hd,
  INITRD=/initrd.img,
  KERNEL=/vmlinuz,
  ROOT=sda2 ]
REQUIREMENTS="NAME = \"atlas1\""
TEMPLATE_ID=0
VMID=21

VIRTUAL MACHINE HISTORY
SEQ HOSTNAME REASON START          TIME        PTIME
  0 atlas1   user   09/28 20:42:53 00 00:10:34 00 00:00:41
  1 atlas2   stop   09/28 20:53:20 00 01:13:28 00 00:06:34
  2 atlas1   erro   09/28 22:12:53 00 00:00:47 00 00:00:47
  3 atlas1   none   09/28 22:15:23 00 00:58:14 00 00:00:41
History
#1 Updated by Javi Fontan almost 10 years ago
This problem usually happens when /var/lib/one is shared (using NFS or other means) and tm_ssh is used for transferring. Is this the case?
#2 Updated by Jochem Ippers almost 10 years ago
Hi Javi!
Well, no, it is "non-shared". ;-)
The problem is that the source host and the destination host are the same, and so are the directories, which leads to an endless loop when copying recursively. I guess nothing should happen at all in this case, especially not copying, so a test for identity of the two complete paths (i.e. including the hostnames) should be enough to prevent this.
Kind regards.
Jochem
#3 Updated by Jochem Ippers almost 10 years ago
Correction: The two paths are not completely identical. There are double slashes in the second directory path, which of course have to be reduced to single ones before the paths are compared.
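The check proposed in the two comments above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the fix that was actually committed: the helper names `normalize_path` and `same_location` are hypothetical, and the sketch only collapses repeated slashes (it does not handle trailing slashes, symlinks, or host aliases).

```shell
#!/bin/sh
# Hypothetical helpers, not taken from the OpenNebula source.

# Collapse runs of slashes in a path into a single slash.
normalize_path() {
    printf '%s\n' "$1" | tr -s '/'
}

# Return 0 when two HOST:PATH arguments name the same location.
# The body runs in a subshell so the variables stay local.
same_location() (
    src_host="${1%%:*}"; src_path="${1#*:}"
    dst_host="${2%%:*}"; dst_path="${2#*:}"

    [ "$src_host" = "$dst_host" ] &&
        [ "$(normalize_path "$src_path")" = "$(normalize_path "$dst_path")" ]
)

# Arguments as they appear in the VM log of this report.
SRC="atlas1:/var/lib/one/21/images"
DST="atlas1:/var/lib/one//21/images"

if same_location "$SRC" "$DST"; then
    echo "source and destination are identical, skipping copy"
else
    echo "would run: scp -r $SRC $DST"
fi
```

With the arguments from the log, the two locations compare equal once the double slash is squeezed, so the recursive `scp -r` would be skipped.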
#4 Updated by Ruben S. Montero almost 10 years ago
- Target version deleted (Release 3.0)
#5 Updated by Javi Fontan over 9 years ago
The bug should be fixed in the master and one-3.0 repositories. Can you confirm this?
#6 Updated by Jochem Ippers over 9 years ago
Hello Javi,
Yes, I can confirm this.
#7 Updated by Javi Fontan over 9 years ago
- Category changed from CLI to Drivers - Auth
- Status changed from New to Closed
- Assignee set to Javi Fontan
- Target version set to Release 3.4
- Resolution set to fixed
Thanks, I am closing these tickets.
#8 Updated by Ruben S. Montero over 9 years ago
- Target version changed from Release 3.4 to Release 3.2