Bug #4296

KVM UDP-push Monitoring not working on bigger hosts - message too long

Added by Tobias Fischer over 5 years ago. Updated about 5 years ago.

Status: Closed
Start date: 01/19/2016
Priority: Normal
Due date:
Assignee: Javi Fontan
% Done: 0%
Category: Drivers - Monitor
Target version: Release 5.0
Resolution: fixed
Pull request:
Affected Versions: OpenNebula 4.14

Description

Hello,

We have a setup with two different types of hosts:
- smaller ones that can host about 25 VMs (256 GB RAM, 32 cores)
- bigger ones that can host about 80 VMs (1.5 TB RAM, 120 cores)

We are using OpenNebula 4.14.0 with the KVM driver.
We observed that the KVM UDP-push monitoring is not working on the bigger hosts. The oned log says "Monitoring host ..." instead of "Host ... successfully monitored".
We tried to debug this and found an error with the UDP packets. It says "message too long", the same as here:

http://stackoverflow.com/questions/9853099/how-to-solve-sending-udp-packet-using-sendto-got-message-too-long

See the attached oned.log. The bigger hosts are called pod1-blXX.

We also observed that the hosts stay in status "UPDATE" after some time (both small and big). The workaround is to restart OpenNebula as the root user with "service opennebula restart".
Thanks!

Best,
Tobi

oned.log (13.6 MB) Tobias Fischer, 01/19/2016 11:07 AM

Associated revisions

Revision dd0a3b61
Added by Ruben S. Montero about 5 years ago

feature #4296: Compress output of run_probes and uncompress it in oned
to better deal with the UDP message size limit.

Revision 0687cda3
Added by Javi Fontan about 5 years ago

bug #4296: only monitor vga pci devices by default

Revision 959435c4
Added by Javi Fontan about 5 years ago

bug #4296: disable pci monitoring by default

History

#1 Updated by Ruben S. Montero over 5 years ago

  • Category set to Drivers - Monitor
  • Status changed from Pending to New
  • Assignee set to Javi Fontan
  • Target version set to Release 5.0

This is a problem, indeed. The solution is to split the VM messages into several chunks. This "pagination" can be done in the probes. I'll keep this issue to implement this.
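The "pagination" idea can be sketched as follows. This is only an illustration, not the OpenNebula implementation: the `paginate` helper and the newline separator are hypothetical, and the default limit of 65,507 bytes assumes IPv4 (65,535 minus a 20-byte IP header and an 8-byte UDP header). A real protocol would also need to tag each chunk so that the receiver can reassemble them.

```ruby
# Hypothetical helper: group monitoring entries into chunks whose joined
# size never exceeds `limit` bytes (default: max UDP payload over IPv4).
def paginate(entries, limit = 65_507, sep = "\n")
  chunks = []
  current = []
  size = 0

  entries.each do |entry|
    # A single entry larger than the limit cannot fit in any datagram.
    raise ArgumentError, 'entry exceeds limit' if entry.bytesize > limit

    # The separator only costs bytes when the chunk already has content.
    extra = entry.bytesize + (current.empty? ? 0 : sep.bytesize)

    if size + extra > limit
      # Close the current chunk and start a new one with this entry.
      chunks << current.join(sep)
      current = [entry]
      size = entry.bytesize
    else
      current << entry
      size += extra
    end
  end

  chunks << current.join(sep) unless current.empty?
  chunks
end

# Made-up VM entries, split with an artificially small limit.
vms = (1..80).map { |i| "VM=[ID=#{i}]" }
paginate(vms, 200).each_with_index do |chunk, i|
  puts "chunk #{i}: #{chunk.bytesize} bytes"
end
```

Splitting on entry boundaries keeps each VM record intact, so every chunk can be parsed on its own.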

#2 Updated by Ruben S. Montero about 5 years ago

Let's try a simpler solution that could be enough for the amount of data used in monitoring, considering also that we are moving disk info to the DS monitoring process. The information is compressed using zlib and uncompressed upon arrival at oned. We are getting an 80% reduction in message size, so this will give us room for a lot more info.
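A minimal sketch of the compress and uncompress round trip, using Ruby's standard `zlib` library. The helper names and the sample data are hypothetical, and the Base64 step is an assumption added here to keep the payload printable; the actual change is in revision dd0a3b61.

```ruby
require 'zlib'

# Hypothetical helper (probe side): deflate the monitoring text, then
# Base64-encode it ('m0' = strict Base64, no line breaks). The Base64
# step is an assumption made for this sketch.
def pack_monitor_data(text)
  [Zlib::Deflate.deflate(text)].pack('m0')
end

# Hypothetical helper (oned side): reverse both steps.
def unpack_monitor_data(payload)
  Zlib::Inflate.inflate(payload.unpack1('m0'))
end

# Made-up, repetitive sample that loosely resembles per-VM monitoring lines.
sample = (1..80).map do |i|
  "VM=[ID=#{i},POLL=\"STATE=a CPU=1.0 MEMORY=1048576\"]"
end.join("\n")

packed = pack_monitor_data(sample)
puts "raw: #{sample.bytesize} bytes, packed: #{packed.bytesize} bytes"
puts "round trip ok: #{unpack_monitor_data(packed) == sample}"
```

The ratio achieved depends on how repetitive the monitoring data is, so the sizes printed for this sample say nothing about real hosts.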

#3 Updated by Tobias Fischer about 5 years ago

Hi Ruben,

thanks for the update :-)

Best,
Tobi

#4 Updated by Javi Fontan about 5 years ago

Also, remember to filter out unused PCI devices in `/var/lib/one/remotes/im/kvm-probes.d/pci.rb`. This information can be very large. If PCI passthrough is not used, PCI monitoring can be disabled with:

```
FILTER = "0000:0000"
```

#5 Updated by Javi Fontan about 5 years ago

The PCI monitor now only gets info about VGA devices by default.

#6 Updated by Ruben S. Montero about 5 years ago

  • Status changed from New to Closed
  • Resolution set to fixed
