Bug #3733
xmlrpc-c / scheduler problems with big XML-Sets
Status: | Closed | Start date: | 04/01/2015 |
---|---|---|---|
Priority: | High | Due date: | |
Assignee: | - | % Done: | 0% |
Category: | Scheduler | | |
Target version: | Release 4.12.1 | | |
Resolution: | fixed | Pull request: | |
Affected Versions: | OpenNebula 4.10 | | |
Description
We are experiencing strange behavior with the scheduler and the xmlrpc-c lib.
We have one OpenNebula 4.10 instance on CentOS 6 which manages 10 servers and 1200 VMs. We are expanding and want to run up to 2000 VMs.
At the moment we can't run more than 1300 VMs. If we create more, they end up in pending state and are not deployed.
We use XML-RPC to communicate with ONE. It seems we get XML sets larger than 10 MB, and with those the xmlrpc-c lib breaks.
The log then says something like this:
- /var/log/one/sched.log
Tue Mar 31 16:31:20 2015 [Z0][VM][E]: Exception raised: Response XML from server is not valid XML-RPC response. Unable to find XML-RPC response in what server sent back. Not valid XML. XML parsing failed
Tue Mar 31 16:31:20 2015 [Z0][POOL][E]: Could not retrieve pool info from ONE
- oned.conf
MAX_CONN = 240
MAX_CONN_BACKLOG = 480
KEEPALIVE_TIMEOUT = 150
KEEPALIVE_MAX_CONN = 300
TIMEOUT = 150
#temp enabled for debugging
RPC_LOG = YES
MESSAGE_SIZE = 1073741824
Associated revisions
Bug #3733: xmlrpc-c / scheduler problems with big XML-Sets - switched from xmlParseMemory to xmlReadMemory; xmlReadMemory allows parameter XML_PARSE_HUGE which adds support for files >10MB
(cherry picked from commit b98a2cdc67c5330b0dbc6ba63e67e90fb956996d)
History
#1 Updated by Ruben S. Montero over 6 years ago
Hi Robert,
We have also MESSAGE_SIZE, in sched.conf for the client side, could you double check that one?
#2 Updated by Anonymous over 6 years ago
Hi Ruben...
our sched.conf looks like this
MESSAGE_SIZE = 1073741824
ONED_PORT = 2633
SCHED_INTERVAL = 30
MAX_VM = 5000
MAX_DISPATCH = 30
MAX_HOST = 1
LIVE_RESCHEDS = 0
DEFAULT_SCHED = [ policy = 1 ]
DEFAULT_DS_SCHED = [ policy = 1 ]
LOG = [ system = "file", debug_level = 3 ]
#3 Updated by Ruben S. Montero over 6 years ago
OK, that should be more than enough for thousands of VMs. Do other client tools work, like onevm list and onehost list?
#4 Updated by Anonymous over 6 years ago
yes onevm and other commands work. onevm needs up to 8s to execute.
#5 Updated by Ruben S. Montero over 6 years ago
That's also too long; you are probably missing some Ruby gems. Try running install_gems again to install nokogiri etc.
About the scheduler problem we need to reproduce this...
#6 Updated by Ruben S. Montero over 6 years ago
Hi Robert,
I'm now able to reproduce this, will keep you posted.
Cheers
#7 Updated by Anonymous over 6 years ago
Hi Ruben,
oh, cool... we thought maybe we were using the XML-RPC API wrong...
but if you can reproduce the behavior, this seems to be a real bug or something
#8 Updated by Anonymous over 6 years ago
colleagues of mine have tested something...
maybe there is a problem with libxml2, which has had a hard limit of 10 MB for text nodes since version 2.7.3.
the following should address the problem:
diff --git a/src/xml/ObjectXML.cc b/src/xml/ObjectXML.cc
index 446a0a2..e57c999 100644
--- a/src/xml/ObjectXML.cc
+++ b/src/xml/ObjectXML.cc
@@ -564,7 +564,7 @@ int ObjectXML::validate_xml(const string &xml_doc)
 void ObjectXML::xml_parse(const string &xml_doc)
 {
-    xml = xmlParseMemory (xml_doc.c_str(),xml_doc.length());
+    xml = xmlReadMemory (xml_doc.c_str(),xml_doc.length(),0,0,XML_PARSE_HUGE);
     if (xml == 0)
     {
what do you think?
#9 Updated by Anonymous over 6 years ago
we tested this now... we will do a pull request...
#10 Updated by Ruben S. Montero over 6 years ago
I can confirm that the problem is with the libxml2 parsing. However, I think this error is happening in the libxmlrpc-c parsing, here:
http://sourceforge.net/p/xmlrpc-c/code/HEAD/tree/advanced/src/xmlrpc_libxml2.c#l452
(the "XML parsing failed" string is the last one output in the error message).
I was looking for a way to pass XML_PARSE_HUGE to libxmlrpc-c, or to set that option globally for parsers, but I could not find an API call for either :( If this is not an option, we can easily override the call method of the client XML_RPC class to set the option.
Thanks for the help
#11 Updated by Gerald Schmidt over 6 years ago
You are absolutely right. For our test we patched libxml2 with options |= XML_PARSE_HUGE.
We can confirm that this solves the problem, but as you say this fix requires a change in xmlrpc_libxml2 (perhaps adding a param options to xml_parse?) or a patched libxml2 (our rather ugly interim fix).
#12 Updated by Ruben S. Montero over 6 years ago
- Status changed from Pending to Closed
- Target version set to Release 4.12.1
- Resolution set to fixed
So the final solution is twofold:
1.- Apply the patch sent by Robert (already in master and 4.12)
2.- Use XML_PARSE_HUGE in the xmlrpc-c client. This patch is applied to the libxmlrpc used to build the OpenNebula packages. We already link statically against our own compiled xmlrpc-c library, because some important distributions ship a very old version or compile it against insecure XML parsers.
I'm closing this; the fix will be released in 4.12.1.
Thanks guys for the feedback :)