Feature #4809
Simplify HA management in OpenNebula
| Status: | Closed | Start date: | 09/21/2016 |
|---|---|---|---|
| Priority: | Normal | Due date: | |
| Assignee: | Vlastimil Holer | % Done: | 0% |
| Category: | Core & System | | |
| Target version: | Release 5.4 | | |
| Resolution: | fixed | Pull request: | |
Description
This issue is to simplify the process of setting up OpenNebula services in HA. Currently the system is designed to be compatible with classical Linux cluster suites. In practice, this imposes too many requirements (e.g. the need for reliable fencing) and leads to deployments that are not easy to maintain.
Related issues
Associated revisions
F #4809: Add a base class for extended template attributes
F #4809: Add zone server list to zone data
F #4809: Add Server to Zone API call
F #4809: AddServer zone API in federated setups
F #4809: Delete servers from zone
F #4809: Update Sql database interface to include read/write/bootstrap
operations
F #4809: Overwrite SqlDB db_exec_* methods in DB implementations
F #4809: Some notes on the implementation of the log replication for
zone servers
F #4809: Some implementation files. Fix compilation issues
F #4809: Template for the LogDBManager
F #4809: Work on log management and replication
F #4809: Update SqlDB method name. DO NOT replicate monitoring data
F #4809: Updated log structure
F #4809: Replication logic
F #4809: Added LogDBManager to Zone server
F #4809: Start replication on new exec_wr calls
F #4809: Notify clients on majority replication
feature #4809: Bootstrap LogDB tables in DB, initialize next_index
F #4809: Improve replication logic, new RaftManager to control server
states
F #4809: Restructure Raft implementation
F #4809: Solves some bugs
F #4809: replicate logic for leader
F #4809: ReplicateLog client logic
F #4809: Fix replication logic
F #4809: Make some Raft events synchronous with replication threads
F #4809: DO NOT monitor marketplaces, nor datastores in followers
F #4809: Log retention system
F #4809: ActionManager can set timers in ns. Adjust Raft manager to ms timers. Include Heartbeat control from leader and follower. XML-RPC calls are now async. RaftManager db calls are now decoupled from pool classes.
F #4809: start/stop replica threads when adding/deleting servers
F #4809: Configure raft timers through oned.conf
F #4809: Persist raft state. Implement Vote API call
F #4809: Adds vote requests from candidate, fix some bugs
F #4809: Fix several issues accessing the log records
F #4809: Debug election process
F #4809: Method to query raft status one servers. Updated CLI to make
use of it
F #4809: Remove cache from core pools
F #4809: Keep last_index counters internally. Commit log records on
heartbeats
F #4809: Update API internal name to match public xml-rpc names. Do not log
heartbeat/replicate log entries
F #4809: Store the leader_id for the term
F #4809: API methods leader_only attribute. List, Info and Raft methods
are not leader only.
F #4809: Forward request to leader
F #4809: Reload ACL rule cache when a server becomes zone leader
F #4809: Do not schedule VMs on non-leader servers
F #4809: Less verbose output
F #4809: Generic replication serves. Solve minor bugs in election
process
F #4809: Moved zone server list to ZonePool. Added FedLogDB class
F #4809: fix compilation
F #4809: Handle properly if only one server in the zone
F #4809: Use int for object IDs. Added base federation replica manager
F #4809: Implement Hooks on RAFT events
F #4809: Move the FT hooks to its own folder
F #4809: Add RAFT hooks
feature #4809: Start the federation replica manager with OpenNebula
F #4809: Do not start replica threads for own zone
F #4809: Fix timers in federation replica manager
F #4809: Update server list on election
F #4809: First version of replication thread for federated zones
F #4809: Refresh zone list on server add/remove to keep an updated list
of servers
F #4809: Fix some bugs when replicating log entries to slave zones
F #4809: Initialize last_index of federated log when starting replica
threads
F #4809: Raft servers / zone slave lists updated on server-add in the
right place (leaders/masters)
F #4809: Do not accept empty records. Increase default xml-rpc timeouts
F #4809: Do not double-unlock raft mutex in timer. Fix update of
heartbeat counter
F #4809: add migrator for logdb and fed_logdb tables
F #4809: Fix cache issues when updating PCI devices on VM
F #4809: Fix history update for non-cache vm objects
F #4809: Fully working vip hooks
F #4809: Pass arguments to the raft hooks
F #4809: Fix cache issue when the VM outdated set is modified
F #4809: Better query to purge log records
F #4809: Compress DB commands.
F #4809: Do not write log in solo mode, update instead of insert for
timestamps
F #4809: sql is a reserved keyword in MySQL
F #4809: Tool to backup/restore the database (with federated option)
F #4809: Check log records are properly loaded from DB
F #4809: Fix check
F #4809: Do not use log after unlocking objects
F #4809: Better timeouts for client xmlrpc-c
F #4809: Execute follower hook on startup
F #4809: Better heartbeat management. A separated thread is created for
each follower
f #4809: Do not include log info on raft.status when operating in solo
mode
F #4809: Do not bootstrap DB before upgrading
F #4809: rename sql to sqlcmd in database_schema.rb
'sql' is a reserved word in MariaDB
F #4809: Load Hook manager earlier to be able to execute raft hooks on
start
F #4809: Typo in the name of the hook
F #4809: Fix deadlock when stopping replica threads
F #4809: Fix error when deleting server from zone in standalone mode
F #4809: Start/Stop zone replication threads when adding/deleting zones
F #4809: sync zone update in server add when adding a HA follower in a
federation (for 2 server corner case). Check for empty records in
federated log replication
F #4809: Adding missing unlock()
F #4809: Add missing migrators
F #4809: Fix log rewrite. Disable curl timeouts.
F #4809: Better cancellation of replica threads
F #4809: Enable curl timeouts
F #4809: Fix memory leak after disabling the pool cache
F #4809. Fix index in PoolSQL. Update interface in all classes
F #4809: Fix replication on failed zones, adjust timeout for replication
F #4809: Update VM in DB after generating context
This is an issue now because the cache has been
removed.
F #4809: Do not start Raft timer in solo mode
F #4809: Fix ns retries
F #4809: Fix non-persistent state of Hosts
F #4809: Cleanup VIP if oned dies
F #4809: Cleanup script
F #4809: Add replicated log index information on server zones
F #4809: Send gratuitous ARP to update VIP
F #4809: Get fed index from the DB (needed by followers in HA). Use Zone
ENDPOINT to replicate log instead of server list. Fix bug when replicate
fails in a zone.
F #4809: Remove unneeded update_zone calls when adding a server to a
zone
F #4809: Fix replica log for federation
F #4809: Better management of last index of federated log
F #4809: Fix divergence problems when replicating the fed log in a HA
zone
F #4809: Compress federated log
F #4809: Safer purge function for RAFT log
F #4809: Re-design replicated log structure
F #4809: Enable federation of solo Zones
F #4809: Log information to debug federated zones with HA clusters. THIS
COMMIT IS MEANT TO BE REVERTED
F #4809: Update onedb backup federated backup utility
Revert "F #4809: Log information to debug federated zones with HA clusters. THIS"
This reverts commit fab2a07f74f55528631fa5b6159e80c1fa884637.
F #4809: Pre-allocate lastoid to prevent stale id's in the pool in case
of leader failure
F #4809: Update migrator. There is no longer need to add servers to a
zone to configure a federation if not using HA
F #4809: Fix race condition updating next index in federated log.
History
#1 Updated by Ruben S. Montero almost 5 years ago
- Related to Bug #4796: When failing over HA controllers, hypervisor collectd probes do not switch to new controller added
#2 Updated by Ruben S. Montero about 4 years ago
- Assignee set to Ruben S. Montero
#3 Updated by Ruben S. Montero about 4 years ago
- Assignee changed from Ruben S. Montero to Vlastimil Holer
#4 Updated by Ruben S. Montero almost 4 years ago
- Status changed from New to Closed
- Resolution set to fixed