Bug #5203

Empty list of Zombie VMs

Added by Rogier Dikkes 4 months ago. Updated about 1 month ago.

Status: Closed
Start date: 06/26/2017
Priority: Normal
Due date:
Assignee: Juan Jose Montiel Cano
% Done: 100%
Category: Core & System
Target version: Release 5.4.1
Resolution: fixed
Pull request:
Affected Versions: OpenNebula 5.2

Description

We currently have an issue where a host has more memory allocated than Reserved Memory should allow (8G is reserved). The web interface reports 3 zombie VMs for this host, but the tab listing them is empty apart from a footer that reads "Showing 1 to 3 of 3 entries". Retrieving the zombie VMs with onehost show <id> also returns an empty list:

onehost show 6
HOST 6 INFORMATION
ID : 6
NAME : node03
CLUSTER : HPC
STATE : MONITORED
IM_MAD : kvm
VM_MAD : kvm
LAST MONITORING TIME : 06/26 10:34:44

HOST SHARES
TOTAL MEM : 243.9G
USED MEM (REAL) : 225G
USED MEM (ALLOCATED) : 243G
TOTAL CPU : 6400
USED CPU (REAL) : 448
USED CPU (ALLOCATED) : 6400
RUNNING VMS : 17

LOCAL SYSTEM DATASTORE #103 CAPACITY
TOTAL: : 2.9T
USED: : 594G
FREE: : 2.1T

MONITORING INFORMATION
ARCH="x86_64"
CPUSPEED="3325"
HOSTNAME="node03"
HYPERVISOR="kvm"
IM_MAD="kvm"
MODELNAME="Intel(R) Xeon(R) CPU E5-2698 v3 @ 2.30GHz"
NETRX="23027003809633"
NETTX="11794078511505"
RESERVED_CPU=""
RESERVED_MEM="8388608"
TOTAL_ZOMBIES="3"
VERSION="5.2.1"
VM_MAD="kvm"
ZOMBIES=", , "

WILD VIRTUAL MACHINES

NAME IMPORT_ID CPU MEMORY
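
The over-allocation mentioned above can be checked with a little arithmetic. The sketch below uses only figures from the output, and rests on two assumptions: that RESERVED_MEM is expressed in KB (8388608 KB is the 8G quoted in the description), and that the reservation is meant to be subtracted from TOTAL MEM. Whether TOTAL MEM already accounts for the reservation is not confirmed in this report.

```python
# Figures copied from the `onehost show 6` output above.
KIB_PER_GIB = 1024 ** 2

total_mem_gib = 243.9        # TOTAL MEM
allocated_mem_gib = 243.0    # USED MEM (ALLOCATED)
reserved_mem_kib = 8388608   # RESERVED_MEM, assumed to be in KB

reserved_gib = reserved_mem_kib / KIB_PER_GIB

# Assumption: the reservation is subtracted from TOTAL MEM to get
# the amount that may be allocated to VMs.
allocatable_gib = total_mem_gib - reserved_gib
over_limit_gib = allocated_mem_gib - allocatable_gib

print(f"reserved:    {reserved_gib:.1f}G")
print(f"allocatable: {allocatable_gib:.1f}G")
print(f"over limit:  {over_limit_gib:.1f}G")
```

Under those assumptions the host is allocated roughly 7G beyond what the reservation should permit.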

We suspected data corruption in the database, so we ran onedb fsck, but the state did not change. Running onehost sync --force changed nothing either.
Multiple hosts are affected. Comparing the VMs running on the host (virsh list) with those in OpenNebula shows no differences. We also considered a lingering KVM process, but the number of KVM processes matches the number of domains in virsh. This is purely incorrect data in the database.
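
The comparison between virsh list and OpenNebula can be sketched as a classification over domain names. This is an illustration only, not OpenNebula's monitoring code. It assumes that domains deployed by OpenNebula are named one-<id> (as in DEPLOY_ID=one-30842), that a zombie is such a domain whose ID OpenNebula does not list for the host, and that any other domain counts as wild. The function name classify_domains and the sample names one-30001 and legacy-db are hypothetical.

```python
import re

# Assumption: OpenNebula-deployed domains are named "one-<vm id>".
ONE_DEPLOY_ID = re.compile(r"^one-(\d+)$")


def classify_domains(hypervisor_domains, opennebula_vm_ids):
    """Split hypervisor domain names into known, zombie and wild lists."""
    known, zombies, wilds = [], [], []
    for name in hypervisor_domains:
        match = ONE_DEPLOY_ID.match(name)
        if match is None:
            # Not named like an OpenNebula deployment.
            wilds.append(name)
        elif int(match.group(1)) in opennebula_vm_ids:
            # Running on the host and listed by OpenNebula.
            known.append(name)
        else:
            # Named like an OpenNebula VM, but not listed for this host.
            zombies.append(name)
    return known, zombies, wilds


# Hypothetical data: only one-30842 appears in this report.
domains = ["one-30842", "one-30001", "legacy-db"]
known, zombies, wilds = classify_domains(domains, {30842})
print(known, zombies, wilds)
```

When the two sides agree, as reported here, the zombie list from such a comparison is empty, which is why a TOTAL_ZOMBIES value of 3 points at stale data rather than at the hypervisor.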

We checked the 5.4 changelog and found no mention of this being fixed.


Related issues

Related to Bug #5003: VMs wrongly reported as ZOMBIES Closed 02/01/2017

History

#1 Updated by Javi Fontan 2 months ago

  • Related to Bug #5003: VMs wrongly reported as ZOMBIES added

#2 Updated by Javi Fontan 2 months ago

Can you send us the output of onehost show -x 6 from the frontend and of /var/tmp/one/vmm/kvm/poll -t on the host?

#3 Updated by Javi Fontan about 1 month ago

  • Category set to Core & System
  • Assignee set to Juan Jose Montiel Cano
  • Target version set to Release 5.4.1

#4 Updated by Rogier Dikkes about 1 month ago

<HOST>
<ID>6</ID>
<NAME>node03</NAME>
<STATE>2</STATE>
<IM_MAD><![CDATA[kvm]]></IM_MAD>
<VM_MAD><![CDATA[kvm]]></VM_MAD>
<LAST_MON_TIME>1504019201</LAST_MON_TIME>
<CLUSTER_ID>107</CLUSTER_ID>
<CLUSTER>Private</CLUSTER>
<HOST_SHARE>
<DISK_USAGE>0</DISK_USAGE>
<MEM_USAGE>67108864</MEM_USAGE>
<CPU_USAGE>1600</CPU_USAGE>
<MAX_DISK>3001996</MAX_DISK>
<MAX_MEM>255728672</MAX_MEM>
<MAX_CPU>6400</MAX_CPU>
<FREE_DISK>2780124</FREE_DISK>
<FREE_MEM>233522720</FREE_MEM>
<FREE_CPU>6400</FREE_CPU>
<USED_DISK>69358</USED_DISK>
<USED_MEM>30594564</USED_MEM>
<USED_CPU>0</USED_CPU>
<RUNNING_VMS>1</RUNNING_VMS>
<DATASTORES>
<DS>
<FREE_MB><![CDATA[2780124]]></FREE_MB>
<ID><![CDATA[103]]></ID>
<TOTAL_MB><![CDATA[3001996]]></TOTAL_MB>
<USED_MB><![CDATA[69358]]></USED_MB>
</DS>
<DS>
<FREE_MB><![CDATA[2780124]]></FREE_MB>
<ID><![CDATA[124]]></ID>
<TOTAL_MB><![CDATA[3001996]]></TOTAL_MB>
<USED_MB><![CDATA[69358]]></USED_MB>
</DS>
</DATASTORES>
<PCI_DEVICES/>
</HOST_SHARE>
<VMS>
<ID>30842</ID>
</VMS>
<TEMPLATE>
<ARCH><![CDATA[x86_64]]></ARCH>
<CPUSPEED><![CDATA[3263]]></CPUSPEED>
<HOSTNAME><![CDATA[node03]]></HOSTNAME>
<HYPERVISOR><![CDATA[kvm]]></HYPERVISOR>
<IM_MAD><![CDATA[kvm]]></IM_MAD>
<MODELNAME><![CDATA[Intel(R) Xeon(R) CPU E5-2698 v3 @ 2.30GHz]]></MODELNAME>
<NETRX><![CDATA[23342899821050]]></NETRX>
<NETTX><![CDATA[12436207005835]]></NETTX>
<RESERVED_CPU><![CDATA[]]></RESERVED_CPU>
<RESERVED_MEM><![CDATA[8388608]]></RESERVED_MEM>
<TOTAL_ZOMBIES><![CDATA[3]]></TOTAL_ZOMBIES>
<VERSION><![CDATA[5.2.1]]></VERSION>
<VM_MAD><![CDATA[kvm]]></VM_MAD>
<ZOMBIES><![CDATA[, , ]]></ZOMBIES>
</TEMPLATE>
</HOST>

VM_POLL=YES
VM=[
ID=30842,
DEPLOY_ID=one-30842,
POLL="STATE=a CPU=13.0 MEMORY=67108864 NETRX=56550691537 NETTX=143293908" ]

We have since moved the host to a different cluster, but the zombies remain. As a workaround we could always delete and recreate the host.
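
The inconsistency in the XML above (a zombie count of 3 with no zombie names) can be detected mechanically. The following is a minimal sketch using Python's standard library. The embedded sample is a trimmed copy of the TEMPLATE element from this comment, and the helper name zombie_report is hypothetical. It assumes ZOMBIES is a comma-separated list of names.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the `onehost show -x 6` output from this comment.
SAMPLE = """<HOST>
  <ID>6</ID>
  <TEMPLATE>
    <TOTAL_ZOMBIES><![CDATA[3]]></TOTAL_ZOMBIES>
    <ZOMBIES><![CDATA[, , ]]></ZOMBIES>
  </TEMPLATE>
</HOST>"""


def zombie_report(host_xml):
    """Return (reported zombie count, list of non-empty zombie names)."""
    root = ET.fromstring(host_xml)
    total = int(root.findtext("TEMPLATE/TOTAL_ZOMBIES", default="0") or 0)
    raw = root.findtext("TEMPLATE/ZOMBIES", default="")
    # Assumption: names are comma separated; blank entries are dropped.
    names = [part.strip() for part in raw.split(",") if part.strip()]
    return total, names


total, names = zombie_report(SAMPLE)
consistent = total == len(names)
print(total, names, consistent)
```

For this host the count and the name list disagree, which matches the empty table with a "1 to 3 of 3 entries" footer in the web interface.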

#5 Updated by Juan Jose Montiel Cano about 1 month ago

  • % Done changed from 0 to 100

#6 Updated by Ruben S. Montero about 1 month ago

  • Status changed from Pending to Closed
  • Resolution set to fixed
