Feature #4809

Simplify HA management in OpenNebula

Added by Ruben S. Montero almost 5 years ago. Updated almost 4 years ago.

Status: Closed
Start date: 09/21/2016
Priority: Normal
Due date:
Assignee: Vlastimil Holer
% Done:
Category: Core & System
Target version: Release 5.4
Resolution: fixed
Pull request:

Description

This issue aims to simplify the process of setting up OpenNebula services in HA. Currently the system is designed to be compatible with classical Linux cluster suites. In practice, this imposes too many requirements (e.g. the need for reliable fencing) and leads to deployments that are not easy to maintain.

Associated revisions

Revision e1cb2c92
Added by Ruben S. Montero about 4 years ago

F #4809: Add a base class for extended template attributes

Revision d03ed158
Added by Ruben S. Montero about 4 years ago

F #4809: Add zone server list to zone data

Revision ebc810f4
Added by Ruben S. Montero about 4 years ago

F #4809: Add Server to Zone API call

Revision c360b015
Added by Ruben S. Montero about 4 years ago

F #4809: AddServer zone API in federated setups

Revision cd580714
Added by Ruben S. Montero about 4 years ago

F #4809: Delete servers from zone

Revision a6d4ab3c
Added by Ruben S. Montero about 4 years ago

F #4809: Update Sql database interface to include read/write/bootstrap
operations

Revision e4848c55
Added by Ruben S. Montero about 4 years ago

F #4809: Overwrite SqlDB db_exec_* methods in DB implementations

Revision bae57600
Added by Ruben S. Montero about 4 years ago

F #4809: Some notes on the implementation of the log replication for
zone servers

Revision c8981e82
Added by Ruben S. Montero about 4 years ago

F #4809: Some implementation files. Fix compilation issues

Revision 116425fc
Added by Ruben S. Montero about 4 years ago

F #4809: Template for the LogDBManager

Revision dd0598aa
Added by Ruben S. Montero about 4 years ago

F #4809: Work on log management and replication

Revision bca17f4e
Added by Ruben S. Montero about 4 years ago

F #4809: Update SqlDB method name. DO NOT replicate monitoring data

Revision ed0d64a2
Added by Ruben S. Montero about 4 years ago

F #4809: Updated log structure

Revision b26e5a71
Added by Ruben S. Montero about 4 years ago

F #4809: Replication logic

Revision f2039e02
Added by Ruben S. Montero about 4 years ago

F #4809: Added LogDBManager to Zone server

Revision 034e2316
Added by Ruben S. Montero about 4 years ago

F #4809: Start replication on new exec_wr calls

Revision 0db5478b
Added by Ruben S. Montero about 4 years ago

F #4809: Notify clients on majority replication
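As an illustration of the "majority replication" condition mentioned in this revision, the following is a minimal sketch and not the OpenNebula implementation. It assumes a Raft-style rule in which a log record counts as replicated once more than half of the zone servers (leader included) hold it. The names `majority`, `is_replicated_on_majority` and `follower_match` are hypothetical.

```python
from typing import Dict


def majority(num_servers: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return num_servers // 2 + 1


def is_replicated_on_majority(
    index: int,
    leader_last_index: int,
    follower_match: Dict[int, int],
) -> bool:
    """Return True when the record at `index` is held by a majority.

    follower_match maps a (hypothetical) follower ID to the highest log
    index known to be stored on that follower.
    """
    # The leader counts itself if it already holds the record.
    acks = 1 if leader_last_index >= index else 0
    # Each follower whose log reaches `index` adds one acknowledgement.
    acks += sum(1 for match in follower_match.values() if match >= index)
    # Zone size is the followers plus the leader.
    return acks >= majority(len(follower_match) + 1)
```

With three servers, one follower acknowledgement plus the leader is enough. With a single server (solo mode) the leader alone forms the majority.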

Revision 0c8299f1
Added by Ruben S. Montero about 4 years ago

feature #4809: Bootstrap LogDB tables in DB, initialize next_index

Revision 4c577126
Added by Ruben S. Montero about 4 years ago

F #4809: Improve replication logic, new RaftManager to control server
states

Revision 7c479b8e
Added by Ruben S. Montero about 4 years ago

F #4809: Restructure Raft implementation

Revision 900c37fd
Added by Ruben S. Montero about 4 years ago

F #4809: Solves some bugs

Revision cc08fd98
Added by Ruben S. Montero about 4 years ago

F #4809: replicate logic for leader

Revision 88513088
Added by Ruben S. Montero about 4 years ago

F #4809: ReplicateLog client logic

Revision 613ec634
Added by Ruben S. Montero about 4 years ago

F #4809: Fix replication logic

Revision 6f976c1e
Added by Ruben S. Montero about 4 years ago

F #4809: Make some Raft events synchronous with replication threads

Revision dd51b583
Added by Ruben S. Montero about 4 years ago

F #4809: DO NOT monitor marketplaces, nor datastores in followers

Revision 5441a2f8
Added by Ruben S. Montero about 4 years ago

F #4809: Log retention system

Revision 3be9f38d
Added by Ruben S. Montero about 4 years ago

F #4809: ActionManager can set timers in ns. Adjust Raft manager to ms timers. Include heartbeat control from leader and follower. XML-RPC calls are now async. RaftManager DB calls are now decoupled from pool classes.

Revision 071d0d84
Added by Ruben S. Montero about 4 years ago

F #4809: start/stop replica threads when adding/deleting servers

Revision 3f9638be
Added by Ruben S. Montero about 4 years ago

F #4809: Configure raft timers through oned.conf

Revision 2a695bc8
Added by Ruben S. Montero about 4 years ago

F #4809: Persist raft state. Implement Vote API call

Revision c6a7500d
Added by Ruben S. Montero about 4 years ago

F #4809: Adds vote requests from candidate, fix some bugs

Revision 03c5698a
Added by Ruben S. Montero about 4 years ago

F #4809: Fix several issues accessing the log records

Revision 9e1fd142
Added by Ruben S. Montero about 4 years ago

F #4809: Debug election process

Revision c14d648e
Added by Ruben S. Montero about 4 years ago

F #4809: Method to query raft status one servers. Updated CLI to make
use of it

Revision c8f77ab8
Added by Ruben S. Montero about 4 years ago

F #4809: Remove cache from core pools

Revision 9d397346
Added by Ruben S. Montero about 4 years ago

F #4809: Keep last_index counters internally. Commit log records on
heartbeats

Revision 503b2835
Added by Ruben S. Montero about 4 years ago

F #4809: Update API internal name to match public xml-rpc names. Do not log
heartbeat/replicate log entries

Revision c1317ed6
Added by Ruben S. Montero about 4 years ago

F #4809: Store the leader_id for the term

Revision 59cf651d
Added by Ruben S. Montero about 4 years ago

F #4809: API methods leader_only attribute. List, Info and Raft methods
are not leader only.

Revision 590b3548
Added by Ruben S. Montero about 4 years ago

F #4809: Forward request to leader
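The two revisions above (the `leader_only` attribute and request forwarding) suggest a routing decision along the following lines. This is a hedged sketch only: the `ApiMethod` class, the `route` function and its return strings are hypothetical, and marking `one.vm.allocate` as leader-only is an assumption made for the example. The source states only that List, Info and Raft methods are not leader-only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ApiMethod:
    """Hypothetical descriptor for an XML-RPC method."""
    name: str
    leader_only: bool = True


def route(method: ApiMethod, is_leader: bool, leader_id: Optional[int]) -> str:
    """Decide where a request should be handled.

    Returns "local" to execute on this server, "forward:<id>" to send it
    to the current leader, or "reject" when no leader is known.
    """
    # Methods that are not leader-only can be served by any server,
    # and the leader serves everything itself.
    if not method.leader_only or is_leader:
        return "local"
    # A follower without a known leader (e.g. during an election)
    # cannot forward the request.
    if leader_id is None:
        return "reject"
    return f"forward:{leader_id}"
```

For example, a follower would answer an info call locally but pass a state-changing call to the server recorded as leader for the current term.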

Revision ac76ab45
Added by Ruben S. Montero about 4 years ago

F #4809: Reload ACL rule cache when a server becomes zone leader

Revision 5f6a627b
Added by Ruben S. Montero about 4 years ago

F #4809: Do not schedule VMs on non-leader servers

Revision 1bcfc2bd
Added by Ruben S. Montero about 4 years ago

F #4809: Less verbose output

Revision 741a55d1
Added by Ruben S. Montero about 4 years ago

F #4809: Generic replication serves. Solve minor bugs in election
process

Revision b38874a0
Added by Ruben S. Montero about 4 years ago

F #4809: Moved zone server list to ZonePool. Added FedLogDB class

Revision faa46d9c
Added by Ruben S. Montero about 4 years ago

F #4809: fix compilation

Revision cfe3b415
Added by Jaime Melis about 4 years ago

F #4809: Handle properly if only one server in the zone

Revision 5e92ec0a
Added by Ruben S. Montero about 4 years ago

F #4809: Use int for object IDs. Added base federation replica manager

Revision 85b6ed25
Added by Jaime Melis about 4 years ago

F #4809: Implement Hooks on RAFT events

Revision 94052286
Added by Jaime Melis about 4 years ago

F #4809: Move the FT hooks to its own folder

Revision 680d49a2
Added by Jaime Melis about 4 years ago

F #4809: Add RAFT hooks

Revision 2676d8b7
Added by Ruben S. Montero about 4 years ago

feature #4809: Start the federation replica manager with OpenNebula

Revision 4291e4a8
Added by Ruben S. Montero about 4 years ago

F #4809: Do not start replica threads for own zone

Revision 46d4e232
Added by Ruben S. Montero about 4 years ago

F #4809: Fix timers in federation replica manager

Revision c5a82aba
Added by Ruben S. Montero about 4 years ago

F #4809: Update server list on election

Revision c5f54f81
Added by Ruben S. Montero about 4 years ago

F #4809: First version of replication thread for federated zones

Revision 4a780941
Added by Ruben S. Montero about 4 years ago

F #4809: Refresh zone list on server add/remove to keep an updated list
of servers

Revision 31ba4d4d
Added by Ruben S. Montero about 4 years ago

F #4809: Fix some bugs when replicating log entries to slave zones

Revision afca14d7
Added by Ruben S. Montero about 4 years ago

F #4809: Initialize last_index of federated log when starting replica
threads

Revision a216bb9e
Added by Ruben S. Montero about 4 years ago

F #4809: Raft servers / zone slave lists updated on server-add in the
right place (leaders/masters)

Revision 98d26e9e
Added by Ruben S. Montero about 4 years ago

F #4809: Do not accept empty records. Increase default xml-rpc timeouts

Revision c8d88de0
Added by Ruben S. Montero about 4 years ago

F #4809: Do not double-unlock raft mutex in timer. Fix update of
heartbeat counter

Revision 8488320e
Added by Javi Fontan about 4 years ago

F #4809: add migrator for logdb and fed_logdb tables

Revision 02b229d7
Added by Ruben S. Montero about 4 years ago

F #4809: Fix cache issues when updating PCI devices on VM

Revision 3d9686e3
Added by Ruben S. Montero about 4 years ago

F #4809: Fix history update for non-cache vm objects

Revision 9d763ef6
Added by Jaime Melis about 4 years ago

F #4809: Fully working vip hooks

Revision 54c1fbaa
Added by Jaime Melis about 4 years ago

F #4809: Pass arguments to the raft hooks

Revision 75b8889e
Added by Ruben S. Montero about 4 years ago

F #4809: Fix cache issue when the VM outdated set is modified

Revision 234734a3
Added by Ruben S. Montero about 4 years ago

F #4809: Better query to purge log records

Revision f12fdfae
Added by Ruben S. Montero about 4 years ago

F #4809: Compress DB commands.

Revision 50880bb2
Added by Ruben S. Montero about 4 years ago

F #4809: Do not write log in solo mode, update instead of insert for
timestamps

Revision 50d0a6b4
Added by Jaime Melis about 4 years ago

F #4809: sql is a reserved keyword in MySQL

Revision ab63767e
Added by Jaime Melis about 4 years ago

F #4809: Tool to backup/restore the database (with federated option)

Revision d3c7a07a
Added by Ruben S. Montero about 4 years ago

F #4809: Check log records are properly loaded from DB

Revision 35da0ee1
Added by Ruben S. Montero about 4 years ago

F #4809: Fix check

Revision 654b384b
Added by Ruben S. Montero about 4 years ago

F #4809: Do not use log after unlocking objects

Revision e80386ef
Added by Ruben S. Montero about 4 years ago

F #4809: Better timeouts for client xmlrpc-c

Revision b768f896
Added by Ruben S. Montero about 4 years ago

F #4809: Execute follower hook on startup

Revision 0a2f18a7
Added by Ruben S. Montero about 4 years ago

F #4809: Better heartbeat management. A separated thread is created for
each follower

Revision 953a7f16
Added by Ruben S. Montero about 4 years ago

f #4809: Do not include log info on raft.status when operating in solo
mode

Revision bd4040d3
Added by Ruben S. Montero about 4 years ago

F #4809: Do not bootstrap DB before upgrading

Revision 486a4856
Added by Anton Todorov about 4 years ago

F #4809: rename sql to sqlcmd in database_schema.rb

'sql' is reserved word in MariaDB

Revision 169d153e
Added by Jaime Melis about 4 years ago

Merge pull request #324 from atodorov-storpool/patch-16

F #4809: rename sql to sqlcmd in database_schema.rb

Revision 978d1538
Added by Ruben S. Montero about 4 years ago

F #4809: Load Hook manager earlier to be able to execute raft hooks on
start

Revision 585cc130
Added by Jaime Melis about 4 years ago

F #4809: Typo in the name of the hook

Revision 6d61d510
Added by Ruben S. Montero about 4 years ago

F #4809: Fix deadlock when stopping replica threads

Revision f4ec2c07
Added by Ruben S. Montero about 4 years ago

F #4809: Fix error when deleting server from zone in standalone mode

Revision 2f5a3185
Added by Ruben S. Montero about 4 years ago

F #4809: Start/Stop zone replication threads when adding/deleting zones

Revision 97474282
Added by Ruben S. Montero about 4 years ago

F #4809: sync zone update in server add when adding a HA follower in a
federation (for 2 server corner case). Check for empty records in
federated log replication

Revision dff203fe
Added by Ruben S. Montero about 4 years ago

F #4809: Adding missing unlock()

Revision d00e3798
Added by Ruben S. Montero about 4 years ago

F #4809: Add missing migrators

Revision 0888ae48
Added by Ruben S. Montero about 4 years ago

F #4809: Fix log rewrite. Disable curl timeouts.

Revision c98457ae
Added by Ruben S. Montero about 4 years ago

F #4809: Better cancellation of replica threads

Revision 468a104b
Added by Ruben S. Montero about 4 years ago

F #4809: Enable curl timeouts

Revision 83d158d6
Added by Ruben S. Montero about 4 years ago

F #4809: Fix memory leak after disabling the pool cache

Revision a2c5a4cb
Added by Ruben S. Montero about 4 years ago

F #4809: Fix index in PoolSQL. Update interface in all classes

Revision e4280206
Added by Ruben S. Montero about 4 years ago

F #4809: Fix replication on failed zones, adjust timeout for replication

Revision a09c6d48
Added by Jaime Melis about 4 years ago

F #4809: Update VM in DB after generating context

This is an issue now because the cache has been
removed.

Revision 47050433
Added by Ruben S. Montero about 4 years ago

F #4809: Do not start Raft timer in solo mode

Revision 15001cb1
Added by Ruben S. Montero about 4 years ago

F #4809: Fix ns retries

Revision ca2a1a42
Added by Ruben S. Montero about 4 years ago

F #4809: Fix non-persistent state of Hosts

Revision 3224a504
Added by Jaime Melis about 4 years ago

F #4809: Cleanup VIP if oned dies

Revision 8e19b566
Added by Jaime Melis about 4 years ago

F #4809: Cleanup script

Revision cc6cc460
Added by Ruben S. Montero about 4 years ago

F #4809: Add replicated log index information on server zones

Revision 509d98df
Added by Jaime Melis about 4 years ago

F #4809: Send gratuitous ARP to update VIP

Revision d5d6cb96
Added by Ruben S. Montero about 4 years ago

F #4809: Get fed index from the DB (needed by followers in HA). Use Zone
ENDPOINT to replicate log instead of server list. Fix bug when replicate
fails in a zone.

Revision dbc47c98
Added by Ruben S. Montero about 4 years ago

F #4809: Remove unneeded update_zone calls when adding a server to a
zone

Revision 192ce9db
Added by Ruben S. Montero about 4 years ago

F #4809: Fix replica log for federation

Revision 4f8195d2
Added by Ruben S. Montero about 4 years ago

F #4809: Better management of last index of federated log

Revision 0c9272b6
Added by Ruben S. Montero about 4 years ago

F #4809: Fix divergence problems when replicating the fed log in an HA
zone

Revision 361409e7
Added by Ruben S. Montero about 4 years ago

F #4809: Compress federated log

Revision c6286c03
Added by Ruben S. Montero about 4 years ago

F #4809: Safer purge function for RAFT log