Disk snapshots that fail delete the existing snapshot instead of the new (failed) snapshot
|Assignee:||Ruben S. Montero||% Done:|
|Category:||Drivers - Storage|
|Target version:||Release 4.14.2|
|Affected Versions:||OpenNebula 4.14|
The rollback action for a failed disk snapshot creation deletes the existing (base) snapshot rather than the snapshot that failed to be created.
    rc = tm->snapshot_transfer_command( vm, "SNAP_CREATE", os);
    snap_cmd = os.str();
    os.str("");
    rc += tm->snapshot_transfer_command( vm, "SNAP_DELETE", os);
    snap_cmd_rollback = os.str();
The snapshot_transfer_command() function (in src/tm/TransferManager.cc) inserts the snap_id value into the command and eventually passes it on to the transfer manager.
However, since the snap_id value is computed before any action is taken, the snap_id passed to "SNAP_DELETE" is the same one passed to "SNAP_CREATE".
The result is that if you try to create a snapshot of disk 0's snapshot 0 and it fails, disk 0 snapshot 0 is deleted, while the new snapshot (disk 0 snapshot 1, which refers to disk 0 snapshot 0) is left behind, broken and with nothing to refer to.
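To illustrate the ordering problem, here is a minimal, self-contained sketch. It is not the OpenNebula code: build_snapshot_command() is a hypothetical stand-in for snapshot_transfer_command(), and the "<ACTION> <disk_id> <snap_id>" string format is an assumption made only for this example. It shows that both command strings are built before anything runs, so they necessarily carry the same snap_id.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-in for snapshot_transfer_command(). It only formats
// "<ACTION> <disk_id> <snap_id>"; the real function builds a full command.
void build_snapshot_command(int disk_id, int snap_id,
                            const std::string& action, std::ostringstream& os)
{
    os << action << " " << disk_id << " " << snap_id;
}

struct SnapshotCommands
{
    std::string create;    // command to run first
    std::string rollback;  // command to run if create fails
};

// Mirrors the order of operations in the excerpt above: both strings are
// built up front, before SNAP_CREATE has run, so the rollback string cannot
// reflect anything that SNAP_CREATE did.
SnapshotCommands build_commands(int disk_id, int snap_id)
{
    std::ostringstream os;
    SnapshotCommands cmds;

    build_snapshot_command(disk_id, snap_id, "SNAP_CREATE", os);
    cmds.create = os.str();

    os.str("");  // reset the buffer before building the second command

    build_snapshot_command(disk_id, snap_id, "SNAP_DELETE", os);
    cmds.rollback = os.str();

    return cmds;
}
```

With this sketch, build_commands(0, 0) produces the same disk and snapshot identifiers in both strings. Whatever the driver does with that snap_id while creating the snapshot, the pre-built rollback command has no way to target the result.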
I'm really not sure how this was ever expected to work.
bug #4048: Removed snap_cmd_rollback to prevent VM disk corruption
(cherry picked from commit bcf061f1ec4a068fd1e42d49882dfe4ce017ef5e)
#1 Updated by Ruben S. Montero almost 5 years ago
- Category set to Drivers - Storage
- Status changed from Pending to New
- Assignee set to Javi Fontan
- Target version set to Release 4.14.2
Thanks for the feedback. This may have escaped the integration tests because each driver handles the snap_id logic in a different way. Just to confirm, are you using the qcow2 drivers?
#2 Updated by Roy Keene almost 5 years ago
Yes, this was observed specifically in the qcow2 driver -- however the Transfer Manager always sends the SNAP_DELETE for the same snap_id as was sent for the SNAP_CREATE, without any way for SNAP_DELETE to know what SNAP_CREATE did (since at the time SNAP_DELETE is constructed, SNAP_CREATE has not been run yet).
#4 Updated by Ruben S. Montero almost 5 years ago
- Status changed from New to Closed
- Resolution set to fixed
Removed the rollback operation from the core. Rollback must be handled inside the snap_create operation itself, as there are multiple possible failure points that may each need a different rollback operation.
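As a rough illustration of why the rollback belongs inside the create operation, the sketch below pairs each step with its own undo action and reverts only the steps that actually completed. Everything here is hypothetical (Step, run_with_rollback, the step names); the real transfer manager drivers are scripts rather than C++, so this only shows the idea, not how any driver implements it.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// One step of a hypothetical snapshot-create operation, paired with the
// action that undoes that step and nothing else.
struct Step
{
    std::string name;
    std::function<bool()> run;   // returns false on failure
    std::function<void()> undo;  // reverts this step only
};

// Runs the steps in order. On the first failure, undoes the steps that
// already completed, in reverse order, and returns false. The failed step
// itself is not undone, since it never completed.
bool run_with_rollback(const std::vector<Step>& steps,
                       std::vector<std::string>& log)
{
    for (std::size_t i = 0; i < steps.size(); ++i)
    {
        if (steps[i].run())
        {
            log.push_back("done " + steps[i].name);
            continue;
        }

        log.push_back("failed " + steps[i].name);

        for (std::size_t j = i; j-- > 0; )
        {
            steps[j].undo();
            log.push_back("undo " + steps[j].name);
        }

        return false;
    }

    return true;
}
```

The point of the design is that the rollback is decided at the moment of failure, with knowledge of which steps ran, instead of being fixed in advance by the caller.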