KVM UDP-push Monitoring not working on bigger hosts - message too long
Assignee: Javi Fontan
Category: Drivers - Monitor
Target version: Release 5.0
Affected Versions: OpenNebula 4.14
We have a setup with two different types of hosts:
- smaller ones that can host about 25 VMs (256 GB RAM, 32 cores)
- bigger ones that can host about 80 VMs (1.5 TB RAM, 120 cores)
We are using OpenNebula 4.14.0 with the KVM Driver.
We observed that the KVM UDP-push monitoring is not working on the bigger hosts. The oned log says "Monitoring host ..." instead of "Host ... successfully monitored".
We tried to debug this and found an error with the UDP packets. It says "message too long" - same as here:
See the attached oned.log. The bigger hosts are called pod1-blXX.
We also observed that the hosts stay in status "UPDATE" after some time (both small and big). The workaround is to restart OpenNebula with "service opennebula restart" as the root user.
feature #4296: Compress output of run_probes and uncompress it in oned
to better deal with the UDP message size limit.
#1 Updated by Ruben S. Montero over 3 years ago
- Category set to Drivers - Monitor
- Status changed from Pending to New
- Assignee set to Javi Fontan
- Target version set to Release 5.0
This is a problem, indeed... The solution is to split the VM messages into several chunks. This "pagination" can be done in the probes. I'll keep this issue to implement this...
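To illustrate the "pagination" idea, here is a minimal Python sketch that groups per-VM monitoring lines into pages that each fit in a single UDP datagram. It is not the OpenNebula implementation: the function name `paginate_vm_lines`, the line format, and the page layout are all hypothetical. The only hard fact used is that the largest UDP payload over IPv4 is 65507 bytes (65535 minus the 20-byte IPv4 header and the 8-byte UDP header).

```python
# Largest UDP payload over IPv4: 65535 - 20 (IPv4 header) - 8 (UDP header).
MAX_UDP_PAYLOAD = 65507


def paginate_vm_lines(vm_lines, max_bytes=MAX_UDP_PAYLOAD):
    """Group per-VM lines into pages of at most max_bytes encoded bytes.

    Hypothetical helper: each page is the newline-terminated concatenation
    of whole lines, so no VM entry is split across two datagrams.
    """
    pages, current, size = [], [], 0
    for line in vm_lines:
        encoded_len = len(line.encode("utf-8")) + 1  # +1 for the newline
        if encoded_len > max_bytes:
            raise ValueError("a single VM entry exceeds the page size")
        # Close the current page if this line would overflow it.
        if current and size + encoded_len > max_bytes:
            pages.append("\n".join(current) + "\n")
            current, size = [], 0
        current.append(line)
        size += encoded_len
    if current:
        pages.append("\n".join(current) + "\n")
    return pages


# Example with made-up data: 80 VMs, roughly 1 KB of poll data each.
example_lines = [f'VM=[ID={i},POLL="{"x" * 1000}"]' for i in range(80)]
example_pages = paginate_vm_lines(example_lines)
```

With such a scheme each page could be sent as its own datagram. The receiver would still need a way to tell which pages belong to the same monitoring cycle, which this sketch leaves out.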
#2 Updated by Ruben S. Montero about 3 years ago
Let's try a simpler solution that could be enough for the amount of data used in monitoring, considering also that we are moving disk info to the DS monitoring process. Information is compressed using zlib and uncompressed upon arrival at oned. We are getting an 80% reduction in message size, so this will give us space for a lot more info...
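The compress-then-uncompress round trip can be sketched in Python with the standard `zlib` module. This is only an illustration under stated assumptions: the function names and the sample monitoring text are made up, and the actual probes and oned are not written in Python. The achieved reduction depends entirely on how repetitive the data is, so the ratio computed here should not be read as the figure quoted above.

```python
import zlib


def pack_monitor_data(text):
    """Compress monitoring text before sending (hypothetical sender side)."""
    return zlib.compress(text.encode("utf-8"))


def unpack_monitor_data(payload):
    """Restore the original text on arrival (hypothetical receiver side)."""
    return zlib.decompress(payload).decode("utf-8")


# Made-up sample: 80 VM entries with similar-looking poll data.
sample = "".join(
    f'VM=[ID={i},POLL="STATE=a CPU=1.0 MEMORY=1048576 NETRX=0 NETTX=0"]\n'
    for i in range(80)
)
raw_size = len(sample.encode("utf-8"))
packed = pack_monitor_data(sample)
reduction = 1 - len(packed) / raw_size  # fraction of bytes saved

print(f"raw={raw_size} bytes, compressed={len(packed)} bytes, "
      f"reduction={reduction:.0%}")
```

Monitoring output tends to repeat the same attribute names for every VM, which is the kind of redundancy that deflate-style compression handles well.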