Bug #4299

Ceph live snapshot - domfsfreeze returns false

Added by Guillaume Michaud over 5 years ago. Updated about 5 years ago.

Status: Closed
Start date: 01/21/2016
Priority: Normal
Due date:
Assignee: Javi Fontan
% Done: 0%
Category: Drivers - Storage
Target version: Release 5.0
Resolution: fixed
Pull request:
Affected Versions: OpenNebula 4.14

Description

Freeze/thaw doesn't work on Ceph live snapshots because the if statement below returns false:

https://github.com/OpenNebula/one/blob/release-4.14.2/src/tm_mad/ceph/snap_create_live#L94

I tested with the code below. "before if" was written to the temp file, but "inside if" never was.

SNAP_CREATE_CMD=$(cat <<EOF
set -e

echo "before if" > /tmp/freeze-test
if virsh -c $LIBVIRT_URI domfsfreeze $DEPLOY_ID ; then
echo "inside if" > /tmp/freeze-test
trap "virsh -c $LIBVIRT_URI domfsthaw $DEPLOY_ID" EXIT TERM INT HUP
fi
[...]
EOF
)

It works if domfsfreeze is called directly on the host:
virsh -c qemu+tcp://localhost/system domfsfreeze one-546
Froze 1 filesystem(s)

(result in the guest)
[root@CentOS tmp]# cat freeze
I'm frozen

ONE-4.14.2
Frontends and Hosts on CentOS 7.2
qemu-kvm-ev-2.3.0-31.el7_2.4.1
libvirtd 1.2.17

0001-suggestion-for-bug-4299.patch (2.03 KB) Anton Todorov, 03/20/2016 09:11 PM

Associated revisions

Revision d8faab4f
Added by Javi Fontan about 5 years ago

bug #4299: bugs freezing FS for ceph and qcow2 snaps

Revision 914b8884
Added by Javi Fontan about 5 years ago

bug #4299: use LIBVIRT_URI in qcow2 snapshot

History

#1 Updated by Ruben S. Montero over 5 years ago

Hi Guillaume,

Note that domfsfreeze is called directly on the host. At the end of the script, the line at

https://github.com/OpenNebula/one/blob/release-4.14.2/src/tm_mad/ceph/snap_create_live#L107

is executed on $SRC_HOST via ssh.

Could your setup use a different libvirt connection method than our tests, i.e. a different $LIBVIRT_URI?

#2 Updated by Guillaume Michaud over 5 years ago

Hi Ruben,

The only thing I see about LIBVIRT_URI is that I changed it to "qemu+tcp://localhost/system" to support multiple actions per host.

The ssh_exec_and_log_stdin command and the SNAP_CREATE_CMD block should work, because the snapshot is taken correctly. Only the virsh line seems not to work. Maybe it is just the if syntax or something like that. I will run more tests tonight.

rbd snap ls one/one-0-546-0
SNAPID NAME SIZE
99 0 10240 MB
100 1 10240 MB
101 2 10240 MB

#3 Updated by Ruben S. Montero over 5 years ago

OK... keep us updated :)

#4 Updated by Anton Todorov over 5 years ago

Hi,

I was just checking the issues when I hit this one. I am sorry for not noticing it earlier.

You can find a proposed patch attached.

Honestly, I don't know the exact reason, but in our addon I escape LIBVIRT_URI to avoid variable substitution on the front-end.

$LIBVIRT_URI --> \$LIBVIRT_URI

The patch fixes two separate issues, including one in qcow2/snap_create_live (a hard-coded connection that should use LIBVIRT_URI too, if I am not wrong).

Cheers,
Anton Todorov

P.S. It is too late at night to split this into two patches. Please excuse me ;)
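
The effect of the escape mentioned in this comment can be reproduced with a plain here-document, independent of OpenNebula. This is a minimal sketch: the URI is the sample value from this report, the EXPANDED and DEFERRED names are made up for illustration, and the remote side is only simulated with a local child shell rather than ssh.

```shell
#!/bin/sh
# Sample value, as it would be set on the front-end (taken from this report).
LIBVIRT_URI="qemu+tcp://localhost/system"

# Unescaped: the variable is substituted while the string is being built.
EXPANDED=$(cat <<EOF
echo "uri=$LIBVIRT_URI"
EOF
)

# Escaped: a literal $LIBVIRT_URI is kept and only expanded by the shell
# that eventually runs the string.
DEFERRED=$(cat <<EOF
echo "uri=\$LIBVIRT_URI"
EOF
)

echo "$EXPANDED"
echo "$DEFERRED"

# Hypothetical stand-in for the remote shell: a child shell in which
# LIBVIRT_URI is not defined.
REMOTE_EXPANDED=$(env -u LIBVIRT_URI sh -c "$EXPANDED")
REMOTE_DEFERRED=$(env -u LIBVIRT_URI sh -c "$DEFERRED")
echo "expanded early: $REMOTE_EXPANDED"
echo "expanded late:  $REMOTE_DEFERRED"
```

In this simulation the early-expanded command carries the full URI with it, while the escaped one prints an empty value, so escaping only helps when the variable is defined in the shell that finally runs the command.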

#5 Updated by Ruben S. Montero over 5 years ago

  • Category set to Drivers - Storage
  • Status changed from Pending to New
  • Target version set to Release 5.0

#6 Updated by Ruben S. Montero over 5 years ago

Great!!!! Thanks for the feedback, we'll take a look at it.....

#7 Updated by Ruben S. Montero about 5 years ago

  • Assignee set to Javi Fontan

#8 Updated by Javi Fontan about 5 years ago

  • Status changed from New to Closed
  • Resolution set to fixed

I don't understand how this could work, as LIBVIRT_URI is not defined on the remote host. There was a bug in those drivers: they were not reading kvmrc from the vmm driver, so LIBVIRT_URI was undefined. The ceph driver also lacked DEPLOY_ID.

It's now solved in master. Thanks for the tips. Anton, I've also applied the other fix you have in the patch.
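
To illustrate the failure mode described in this comment: when both variables are empty while the command string is built, the virsh call loses its arguments, so the command fails and the if branch is never taken. The following is a minimal sketch, not the fix applied in master. build_freeze_cmd is a hypothetical helper, and the sample values come from this report.

```shell
#!/bin/sh
# Both variables empty, as when the driver settings were never loaded.
LIBVIRT_URI=""
DEPLOY_ID=""

# Unguarded: the here-document silently produces a command without arguments.
UNGUARDED=$(cat <<EOF
virsh -c $LIBVIRT_URI domfsfreeze $DEPLOY_ID
EOF
)
echo "unguarded: [$UNGUARDED]"

# Hypothetical helper: refuse to build the command if a variable is missing.
build_freeze_cmd() {
    # Subshell, so a failed ${VAR:?} check only aborts this call.
    (
        : "${LIBVIRT_URI:?LIBVIRT_URI is not set}"
        : "${DEPLOY_ID:?DEPLOY_ID is not set}"
        cat <<EOF
virsh -c $LIBVIRT_URI domfsfreeze $DEPLOY_ID
EOF
    )
}

# With empty variables the helper fails instead of emitting a broken command.
if ! build_freeze_cmd 2>/dev/null; then
    echo "guarded: refused to build the command"
fi

# With the sample values from this report the command is built normally.
LIBVIRT_URI="qemu+tcp://localhost/system"
DEPLOY_ID="one-546"
echo "guarded: [$(build_freeze_cmd)]"
```

The guard only makes the missing configuration visible earlier; it does not replace loading the variables in the first place.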
