Bug #5032
Datastores with TARGET = SELF (e.g. Ceph) need to properly account image DS usage (as done in disk-resize operations for persistent images)
Status: Closed
Start date: 02/18/2017
Priority: Normal
Due date: -
Assignee: -
% Done: 0%
Category: Drivers - Storage
Target version: Release 5.4
Resolution: fixed
Pull request: -
Affected Versions: OpenNebula 5.2
Description
Hello,
I'm using OpenNebula 5.2.1, CentOS 7.3, and a Ceph datastore. I found a bug: a user can create a VM with a disk that exceeds their quota (please check the video).
Associated revisions
B #5032: Add datastore capacity usage in quota calculations for storage
drivers that clone to SELF (e.g. Ceph)
B #5032: Further fixes for SELF DS (e.g. Ceph) for disks with resizes
and snapshots. Also updates delete-recreate quota computation
History
#1 Updated by Arnaud Abélard over 4 years ago
Same problem here.
Although my users have a quota assigned to the Ceph system datastore, they can create system disks larger than their quota.
The problem doesn't occur with volatile disks, which are properly accounted against the user's quota.
When I create a 2TB system disk from a Ceph-based image, no system disk usage is actually charged to my user:
~# onevm show 409
VIRTUAL MACHINE 409 INFORMATION
ID : 409
NAME : testaa3
USER : abelard-a
...
VM DISKS
ID DATASTORE TARGET IMAGE SIZE TYPE SAVE
0 ceph-image vda debian8.4-univnantes-v7 -/2T rbd NO
1 - hda CONTEXT -/- - -
~# oneuser show abelard-a
NUMBER OF VMS MEMORY CPU SYSTEM_DISK_SIZE
1 / 5 1024M / 9.8G 1.00 / 8.00 0M / 1.9T
DATASTORE ID IMAGES SIZE
100 1 / - 2.2G / -
I'm puzzled by the fact that datastore 100 is the Ceph image pool, and there's no mention of datastore 101, which is the Ceph system datastore:
~# onedatastore show 100
DATASTORE 100 INFORMATION
ID : 100
NAME : ceph-image
USER : oneadmin
GROUP : oneadmin
CLUSTERS : 0
TYPE : IMAGE
DS_MAD : ceph
TM_MAD : ceph
BASE PATH : /var/lib/one//datastores/100
DISK_TYPE : RBD
STATE : READY
DATASTORE CAPACITY
TOTAL: 57.5T
FREE: 56.8T
USED: 761.3G
LIMIT: -
PERMISSIONS
OWNER : um-
GROUP : u--
OTHER : u--
DATASTORE TEMPLATE
BRIDGE_LIST="..."
CEPH_HOST="..."
CEPH_SECRET="..."
CEPH_USER="opennebula"
CLONE_TARGET="SELF"
DATASTORE_CAPACITY_CHECK="YES"
DISK_TYPE="RBD"
DS_MAD="ceph"
LN_TARGET="NONE"
POOL_NAME="opennebula"
TM_MAD="ceph"
TYPE="IMAGE_DS"
IMAGES
17
18
19
42
46
49
58
59
62
63
65
67
69
71
~# onedatastore show 101
DATASTORE 101 INFORMATION
ID : 101
NAME : ceph-system
USER : oneadmin
GROUP : oneadmin
CLUSTERS : 0
TYPE : SYSTEM
DS_MAD : -
TM_MAD : ceph
BASE PATH : /var/lib/one//datastores/101
DISK_TYPE : RBD
STATE : READY
DATASTORE CAPACITY
TOTAL: 57.5T
FREE: 56.8T
USED: 760.5G
LIMIT: -
PERMISSIONS
OWNER : uma
GROUP : u--
OTHER : ---
DATASTORE TEMPLATE
BRIDGE_LIST="..."
CEPH_HOST="..."
CEPH_SECRET="..."
CEPH_USER="opennebula"
DATASTORE_CAPACITY_CHECK="YES"
DISK_TYPE="RBD"
DS_MIGRATE="NO"
POOL_NAME="opennebula"
RESTRICTED_DIRS="/"
SAFE_DIRS="/var/tmp"
SHARED="YES"
TM_MAD="ceph"
TYPE="SYSTEM_DS"
IMAGES
#2 Updated by Ruben S. Montero over 4 years ago
- Category set to Drivers - Storage
- Target version set to Release 5.4
#3 Updated by Ruben S. Montero over 4 years ago
- Status changed from Pending to Closed
- Resolution set to worksforme
OK, I've had the opportunity to look at the video. Note that the System Disk quota covers disks created in the system datastore. With Ceph, disks are created in the same Ceph pool as the images, so only volatile disks use storage from the system datastore. To limit the size, you need to add a quota to the Datastore.
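For example (a minimal sketch; the datastore ID and the 20G limit are illustrative, not taken from this thread), a per-user limit on the image datastore can be set with oneuser quota, which opens an editor where you add a DATASTORE section (sizes in MB):
~# oneuser quota abelard-a
DATASTORE = [
  ID   = 100,   # the Ceph image datastore (illustrative ID)
  SIZE = 20480  # maximum storage in this datastore (20G, illustrative)
]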
#4 Updated by Arnaud Abélard over 4 years ago
Actually, I do have a quota on all my datastores:
~# oneuser show abelard-a
RESOURCE USAGE & QUOTAS
NUMBER OF VMS MEMORY CPU SYSTEM_DISK_SIZE
2 / 5 2G / 9.8G 2.00 / 8.00 0M / 1.9T
DATASTORE ID IMAGES SIZE
0 0 / 0 0M / 1.9T
1 0 / - 0M / 1.9T
2 0 / - 0M / 1.9T
101 0 / - 0M / 1.9T
100 0 / - 0M / 1.9T
But as you can see, none of my VMs actually consume anything, even though I have a 2TB VM running:
~# onevm show 409
VIRTUAL MACHINE 409 INFORMATION
ID : 409
NAME : testaa3
USER : abelard-a
...
VM DISKS
ID DATASTORE TARGET IMAGE SIZE TYPE SAVE
0 ceph-image vda debian8.4-univnantes-v7 -/2T rbd NO
1 - hda CONTEXT -/- - -
Notice that the used size is not available... Anyway, I do use the full 2TB, so the volume should count against my quota:
root@testaa3:~# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/vda1 2,0T 2,0T 0 100% /
ceph-image being the datastore 100:
~# onedatastore show 100
DATASTORE 100 INFORMATION
ID : 100
NAME : ceph-image
USER : oneadmin
GROUP : oneadmin
CLUSTERS : 0
TYPE : IMAGE
DS_MAD : ceph
TM_MAD : ceph
BASE PATH : /var/lib/one//datastores/100
DISK_TYPE : RBD
STATE : READY
...
IMAGES
17
18
19
42
46
49
59
62
63
65
67
69
71
79
83
My VM is using the image with ID 42 on that datastore:
root@one-ctrl-1:~# oneimage show 42
IMAGE 42 INFORMATION
ID : 42
NAME : debian8.4-univnantes-v7
USER : oneadmin
GROUP : oneadmin
DATASTORE : ceph-image
TYPE : OS
REGISTER TIME : 11/10 16:28:31
PERSISTENT : No
SOURCE : opennebula/one-42
PATH : /var/tmp/debian8-5.0.1-univnantes-v7.qcow2
FSTYPE : raw
SIZE : 2G
STATE : used
RUNNING_VMS : 22
PERMISSIONS
OWNER : um-
GROUP : u--
OTHER : u--
IMAGE TEMPLATE
DEV_PREFIX="vd"
DRIVER="raw"
VIRTUAL MACHINES
ID USER GROUP NAME STAT UCPU UMEM HOST TIME
...
409 abelard- DSI-IRTS testaa3 runn 0.0 1.1G iaas-vm-3. 1d 01h34
...
As you can see, I have 2TB of data on my ceph-image datastore that isn't being accounted for at all. I might have missed something, but I really wonder why, since volatile disks are properly accounted.
Thanks
#5 Updated by Ruben S. Montero over 4 years ago
- Status changed from Closed to Pending
Let me double-check this. I remember we fixed something related to quotas for TM drivers using SELF (like Ceph), so maybe you are being hit by that issue.
#6 Updated by Ruben S. Montero over 4 years ago
- Subject changed from Can't limit system disk with Ceph datastore to Datastores with TARGET = SELF (e.g. Ceph) need to properly account image DS usage (as done in disk-resize operations for persistent images)
- Status changed from Pending to New
- Resolution deleted (worksforme)
TL;DR: There is a bug that only affects VM creation with a resize. Ceph (and other clone-to-SELF datastores) controls DS usage through the VM quota (max. number of virtual machines). As the scheduler is already using the right check, I'm reopening the issue to properly account for these datastores.
Sorry, my bad. I was confused about the bug I mentioned in my previous comment.
The idea is the following:
- The space used by VMs in the image datastore (as in the case of Ceph) is controlled by the VM quota. So you can "indirectly" limit the space consumed by users in the image datastore (by creating virtual machines) with "NUMBER OF VMS".
- The space used by Images is controlled by the SIZE quota of the DATASTORE.
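As a sketch of the two quota sections involved (all values illustrative, not from this thread; sizes in MB), both limits live in the user's quota template:
VM = [
  VMS              = 5,      # indirectly caps space used by VM disks in clone-to-SELF datastores
  SYSTEM_DISK_SIZE = 204800  # caps volatile disk usage in the system datastore
]
DATASTORE = [
  ID   = 100,   # image datastore (illustrative ID)
  SIZE = 20480  # caps image storage
]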
However, I found a bug in the allocation path for Ceph (any TARGET=SELF datastore, for that matter) with non-persistent images:
- You can create VMs and request any size for the disk (with non-persistent images), effectively bypassing the VM quota, as sketched below. (Note that persistent images cannot be resized.)
Note that this only occurs when allocating a VM; the resize operation on a running VM is properly controlled.
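A minimal sketch of the bypass, assuming the 2G non-persistent image with ID 42 from this thread (SIZE is the standard DISK resize attribute, in MB): a VM template disk section requesting a much larger disk at instantiation time:
DISK = [
  IMAGE_ID = 42,     # 2G non-persistent Ceph image
  SIZE     = 2097152 # request 2T at allocation; before the fix, this extra
                     # space was not charged against any quota
]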
As a side note, the usage of the system DS is aggregated in SYSTEM_DS_USAGE; that's the reason you do not see it under the Datastore list (which is intended for images).
#7 Updated by Ruben S. Montero over 4 years ago
This is now fixed in master. A migrator for the next release is still needed to recompute the quotas for TARGET=SELF datastores.
#8 Updated by Ruben S. Montero about 4 years ago
- Status changed from New to Closed
- Resolution set to fixed