Bug #3937
After disk snapshot with suspend / resume vlan does not get (re-)tagged on openvswitch
| Status: | Closed | Start date: | 08/14/2015 |
|---|---|---|---|
| Priority: | Normal | Due date: | |
| Assignee: | - | % Done: | 0% |
| Category: | Drivers - VM | | |
| Target version: | Release 4.14 | | |
| Resolution: | fixed | Pull request: | |
| Affected Versions: | OpenNebula 4.12 | | |
Description
The VLAN does not get (re-)tagged after the VM is resumed following a disk snapshot create action on a VM with a RAW image.
ovs-vsctl show reports:
Port "vnet1"
Interface "vnet1"
While it was
Port "vnet1"
tag: 228
Interface "vnet1"
before the "DISK_SNAPSHOT" / suspend action
Associated revisions
Bug #3937: After disk snapshot with suspend / resume vlan does not get
(re-)tagged on openvswitch
Bug #3937: Apply network drivers after disk-snapshot-revert
History
#1 Updated by Javi Fontan almost 6 years ago
- Category set to Drivers - VM
- Target version set to Release 4.14
#2 Updated by Jaime Melis almost 6 years ago
- Status changed from Pending to Closed
- Resolution set to fixed
- Affected Versions OpenNebula 4.12 added
#3 Updated by Stefan Kooman almost 6 years ago
I replaced /var/lib/one/remotes/vmm/one_vmm_exec.rb with this new version and did a "onehost sync --force" afterwards. The tag does not get re-applied. Besides that, it looks like the VM does not get a "poweroff --hard" before the revert ... leading to an OS crash. Is this the right way to test this fix?
#4 Updated by Stefan Kooman almost 6 years ago
I just checked out master and recompiled / reinstalled (/usr/lib/one/mads/one_vmm_exec.rb is also the new version). The VM is shut down but seems to be resumed, as there is no normal boot sequence (BIOS -> boot) when the VM is running again. After a reboot the VM ends up in a stack trace with ext4 inode errors ...
#5 Updated by Jaime Melis almost 6 years ago
- Status changed from Closed to Assigned
I have updated it and added a part that was missing, so that the drivers are also re-applied on revert, not only on create. It should not be necessary to run onehost sync or to reinstall; just replace one_vmm_exec.rb and restart OpenNebula.
I don't understand exactly what you are doing. My workflow is as follows:
- VM is running
- onevm disk-snapshot-create <vmid> <diskid> <snapshot name>
- I observe how the VM disappears from libvirt for a second as it's being suspended, and then reappears and the vnm drivers are reapplied (testing with ovswitch)
- onevm disk-snapshot-revert ...
- I observe the same thing as with disk-snapshot-create
Note that there is no poweroff --hard involved here, I'm doing this while the VM is running, and after the operation the VM is running again. OpenNebula does the suspend behind the scenes, I only need to instruct it to do disk-snapshot-create.
Can you clarify what you mean by your previous comments? Maybe posting your workflow will help.
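For reference, a minimal command sketch of that workflow (vmid, diskid, snapshot name and snapid are placeholders; the exact revert argument depends on the OpenNebula version):
# VM keeps running; OpenNebula suspends, snapshots and resumes it behind the scenes
onevm disk-snapshot-create <vmid> <diskid> <snapshot name>
# check that the VLAN tag on the VM's vnet port is re-applied
ovs-vsctl show
# revert to the snapshot; the same suspend/resume cycle and the same check apply
onevm disk-snapshot-revert <vmid> <diskid> <snapid>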
#6 Updated by Stefan Kooman almost 6 years ago
> Note that there is no poweroff --hard involved here, I'm doing this while the VM is running, and after the operation the VM is running again. OpenNebula does the suspend behind the scenes, I only need to instruct it to do disk-snapshot-create.
I think that's the problem: if you replace the root disk of a VM (rootfs) with a previous snapshot and resume the VM again, you will end up with a corrupted filesystem. The system expects files / inodes / fscache at certain places ... and all of a sudden they are gone or somewhere else. I believe the correct way to do a "snapshot_revert_while_running" is to do a "poweroff --hard" -> onevm disk-snapshot-revert -> poweron.
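A sketch of that safer sequence (assuming onevm resume is the matching "poweron" step after a hard poweroff; vmid, diskid and snapid are placeholders):
# stop the guest so its filesystem is no longer mounted
onevm poweroff --hard <vmid>
# revert the disk to the snapshot while the VM is powered off
onevm disk-snapshot-revert <vmid> <diskid> <snapid>
# boot the VM again from the reverted disk
onevm resume <vmid>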
#7 Updated by Stefan Kooman almost 6 years ago
TL;DR The VLAN tags get applied nicely, so the bug is fixed.
I assumed a "poweroff --hard" would have been issued ... to avoid corrupting the running filesystem. But apparently that's not how it's designed. Maybe a warning should be added that reverting a mounted filesystem is very dangerous and will lead to data loss.
#8 Updated by Jaime Melis almost 6 years ago
- Status changed from Assigned to Closed