Bug #4
Wrong Failure Handling
| Status: | Closed | Start date: | |
|---|---|---|---|
| Priority: | High | Due date: | |
| Assignee: | Ruben S. Montero | % Done: | 0% |
| Category: | Core & System | | |
| Target version: | Release 1.0 | | |
| Resolution: | fixed | Pull request: | |
| Affected Versions: | | | |
Description
When a migration or other action is performed, VM failures should be detected through the monitor command:
- Trigger a monitor action when something fails
- Rollback actions needed (remove history records...)
- Include state monitoring in VM drivers
Associated revisions
Simplified history handling in VM
Solved some history timer issues
Improved life-cycle (address ticket #4)
Solved a couple of deadlocks in some RequestManager methods
git-svn-id: http://svn.opennebula.org/trunk@9 3034c82b-c49b-4eb3-8279-a7acafdc01c0
Improved life-cycle (address ticket #4), new action when monitor returns paused or error states. Reason attribute for history records is now updated
git-svn-id: http://svn.opennebula.org/trunk@10 3034c82b-c49b-4eb3-8279-a7acafdc01c0
History
#1 Updated by Ruben S. Montero about 13 years ago
Changes in the OpenNebula core include:
- The failure events of the DispatchManager will be unified. The LifeCycleManager (LCM) will send just one failure notification when an unrecoverable failure occurs
- When the VirtualMachineManager (VMM) triggers a failure event, the LCM will:
- Return to RUNNING state
- Trigger a Monitor action on the VMM
- If the failure occurs during a migration, a new record with the original resource will be added to the history (reason of migration = ERROR). Capacity and running VMs will be adjusted for both hosts
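The failure handling described above can be sketched roughly as follows. This is only an illustration in Python, not the OpenNebula core code: the `Host` and `VM` classes, the `on_vmm_failure` function, and the `monitor` callback are hypothetical names. It also assumes that the VM had already been accounted to the target host when the migration started, and it stores the ERROR reason on the new history record, which is one possible reading of the note.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    running_vms: int = 0
    used_cpu: int = 0


@dataclass
class VM:
    cpu: int
    host: Host
    state: str = "RUNNING"
    history: list = field(default_factory=list)  # (host name, reason) records


def on_vmm_failure(vm, monitor, migrating_from=None):
    """Handle a failure event reported by the VMM for `vm` (hypothetical API)."""
    if migrating_from is not None:
        failed_target = vm.host
        # Assumption: the VM was already counted on the target host,
        # so release the capacity reserved there ...
        failed_target.running_vms -= 1
        failed_target.used_cpu -= vm.cpu
        # ... and account for the VM on the original host again.
        migrating_from.running_vms += 1
        migrating_from.used_cpu += vm.cpu
        vm.host = migrating_from
        # New history record pointing at the original resource.
        vm.history.append((migrating_from.name, "ERROR"))
    # Return to RUNNING and let a monitor action report the real state.
    vm.state = "RUNNING"
    monitor(vm)
```

Returning to RUNNING and immediately triggering a monitor action keeps the failure path simple: the LCM does not try to guess what happened, and the next poll result decides whether the VM is alive, paused, or in error.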
#2 Updated by Javi Fontan about 13 years ago
- a: alive, xen states r, b, s
- p: paused, xen state p
- e: error, any other xen state
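As an illustration, the mapping above could be written as a small function. This is a sketch in Python rather than the actual driver code; the function name `xen_to_monitor_state` is hypothetical, and it assumes the Xen state arrives as a string of flag characters that may be padded with `-` (for example `r-----`).

```python
# Xen states that the driver reports as "alive", per the list above.
ALIVE_XEN_STATES = {"r", "b", "s"}


def xen_to_monitor_state(xen_state: str) -> str:
    """Map a Xen state string to the one-letter monitor state (a, p or e)."""
    # Assumption: '-' characters are placeholders and carry no meaning.
    flags = xen_state.replace("-", "")
    if flags == "p":
        return "p"  # paused
    if flags in ALIVE_XEN_STATES:
        return "a"  # alive
    return "e"      # error: any other Xen state
```

With this sketch, `xen_to_monitor_state("-b----")` gives `a` and `xen_to_monitor_state("--p---")` gives `p`; anything that is not exactly one of the listed states falls through to `e`.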
#3 Updated by Ruben S. Montero about 13 years ago
- Handle state callbacks from polling actions
- If a VM does not exist, the driver should return an error.
#4 Updated by Ruben S. Montero about 13 years ago
Changeset r10 includes new actions for monitor callbacks (paused & error).
Also the reason attribute for history records is now updated. I'll leave the ticket open till we test the whole life-cycle transitions.
#5 Updated by Javi Fontan about 13 years ago
Replying to [comment:4 ruben]:
The DispatchManager events have been simplified. Also, the life-cycle of the VM has been modified so that when a failure occurs it returns to the RUNNING state (changeset r9). Still missing:
- Handle state callbacks from polling actions
- If a VM does not exist, the driver should return an error.
The driver now returns STATE=d if the VM does not exist. Added in r16.
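A minimal sketch of that behaviour, in Python for illustration only: the `poll` function and the `domain_states` dictionary are hypothetical, and only the `STATE=d` result for a missing VM comes from this ticket. Reporting the other states in the same `STATE=<letter>` form is an assumption.

```python
def poll(vm_name, domain_states):
    """Return the monitor result line for `vm_name` (hypothetical helper).

    `domain_states` maps the names of the domains known to the hypervisor
    to their one-letter monitor state (a, p or e).
    """
    # A VM that the hypervisor does not know about is reported as 'd'.
    state = domain_states.get(vm_name, "d")
    return f"STATE={state}"
```

For example, `poll("one-1", {"one-0": "a"})` returns `STATE=d` because `one-1` is not among the known domains.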
#6 Updated by Ruben S. Montero about 13 years ago
- Resolution set to fixed
All changes committed and some preliminary tests performed. I am closing this ticket.