Bug #4048
Disk snapshots that fail delete the existing snapshot instead of the new (failed) snapshot
Status: | Closed | Start date: | 10/10/2015
---|---|---|---
Priority: | Normal | Due date: |
Assignee: | Ruben S. Montero | % Done: | 0%
Category: | Drivers - Storage | |
Target version: | Release 4.14.2 | |
Resolution: | fixed | Pull request: |
Affected Versions: | OpenNebula 4.14 | |
Description
The rollback action for creating disk snapshots deletes the existing (base) snapshot rather than the snapshot that failed to be created.
src/vmm/VirtualMachineManager.cc (lines 2156-2164):

    rc = tm->snapshot_transfer_command( vm, "SNAP_CREATE", os);
    snap_cmd = os.str();
    os.str("");
    rc += tm->snapshot_transfer_command( vm, "SNAP_DELETE", os);
    snap_cmd_rollback = os.str();
The snapshot_transfer_command() function (in src/tm/TransferManager.cc) inserts the snap_id value into the command and eventually passes it on to the transfer manager.
However, since the snap_id value is computed before any action is taken, it's the same snap_id as passed to "SNAP_CREATE".
The result is that if you try to create a snapshot on top of disk 0's snapshot 0 and it fails, disk 0 snapshot 0 is deleted, while the new snapshot (disk 0 snapshot 1, which refers to disk 0 snapshot 0) is left in place, broken and with nothing to refer to.
I'm really not sure how this was ever expected to work.
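To make the problem above concrete, here is a minimal, self-contained sketch of the same call pattern. It is not OpenNebula code: `FakeVm`, `build_snapshot_command()`, `build_create_and_rollback()` and the command string format are all hypothetical stand-ins, assumed only to model a helper that reads the snapshot id from the VM at the time the command string is built.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-in for the VM state: the disk and the snapshot id
// recorded on it before any driver action has run.
struct FakeVm
{
    int disk_id;
    int snap_id;
};

// Simplified model (an assumption, not the real function) of a helper like
// snapshot_transfer_command(): it copies the snap_id found on the VM into
// the command at the moment the string is built.
int build_snapshot_command(const FakeVm& vm, const std::string& action,
                           std::ostringstream& os)
{
    os << action << " disk=" << vm.disk_id << " snap=" << vm.snap_id;
    return 0;
}

struct SnapshotCommands
{
    std::string create;
    std::string rollback;
};

// Same sequence as the snippet quoted from VirtualMachineManager.cc:
// both commands are built back to back, before SNAP_CREATE has run.
SnapshotCommands build_create_and_rollback(const FakeVm& vm)
{
    std::ostringstream os;
    SnapshotCommands cmds;

    build_snapshot_command(vm, "SNAP_CREATE", os);
    cmds.create = os.str();

    os.str(""); // reuse the stream for the rollback command

    build_snapshot_command(vm, "SNAP_DELETE", os);
    cmds.rollback = os.str();

    return cmds;
}
```

Calling `build_create_and_rollback(FakeVm{0, 0})` in this model yields a create command and a rollback command that name the same disk and the same snap_id, so the rollback has no way to target whatever SNAP_CREATE produced.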
Associated revisions
bug #4048: Removed snap_cmd_rollback to prevent VM disk corruption
bug #4048: Removed snap_cmd_rollback to prevent VM disk corruption
(cherry picked from commit bcf061f1ec4a068fd1e42d49882dfe4ce017ef5e)
bug #4048: take out TM ROLLBACK from snap create
bug #4048: take out TM ROLLBACK from snap create
(cherry picked from commit a35b4d483c1abb6e1c40aa8f1f72d407152be16b)
History
#1 Updated by Ruben S. Montero over 5 years ago
- Category set to Drivers - Storage
- Status changed from Pending to New
- Assignee set to Javi Fontan
- Target version set to Release 4.14.2
Hi Roy,
Thanks for the feedback. This may have escaped the integration tests, because each driver handles the snap_id logic in a different way. Just to confirm, are you using the qcow2 drivers?
Cheers
#2 Updated by Roy Keene over 5 years ago
Yes, this was observed specifically in the qcow2 driver -- however the Transfer Manager always sends the SNAP_DELETE for the same snap_id as was sent for the SNAP_CREATE, without any way for SNAP_DELETE to know what SNAP_CREATE did (since at the time SNAP_DELETE is constructed, SNAP_CREATE has not been run yet).
#3 Updated by Ruben S. Montero over 5 years ago
- Assignee changed from Javi Fontan to Ruben S. Montero
#4 Updated by Ruben S. Montero over 5 years ago
- Status changed from New to Closed
- Resolution set to fixed
Removed the rollback operation from core. Rollback must be addressed inside the snap_create operation itself, as there are multiple possible failure points that may need different rollback operations.
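The idea of handling rollback inside the operation, with a different undo per failure point, can be sketched generically as follows. This is only an illustration under stated assumptions: the actual snap_create drivers are not shown in this ticket, and `Step` and `run_with_rollback()` are hypothetical names invented for the example.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical description of one stage of a snapshot-create operation.
struct Step
{
    std::string name;
    std::function<bool()> run;  // returns false on failure
    std::function<void()> undo; // reverts only what `run` itself did
};

// Runs the steps in order. If one fails, only the steps that already
// completed are undone, in reverse order, so anything that existed before
// the operation started (such as the base snapshot) is never touched.
bool run_with_rollback(const std::vector<Step>& steps,
                       std::vector<std::string>& log)
{
    std::vector<const Step*> done;

    for (const Step& step : steps)
    {
        if (!step.run())
        {
            log.push_back("failed: " + step.name);

            for (auto it = done.rbegin(); it != done.rend(); ++it)
            {
                (*it)->undo();
                log.push_back("undone: " + (*it)->name);
            }

            return false;
        }

        done.push_back(&step);
    }

    return true;
}
```

Because each step carries its own undo action, a failure at a later point reverts more work than a failure at an earlier one, which is something a single rollback command prepared in advance by the core cannot express.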