Stephan Peijnik
2010-Oct-08 10:52 UTC
[Xen-devel] [PATCH] Replace pyxml/xmlproc-based XML validator with lxml based one.
Pyxml/xmlproc is used in tools/python/xen/xm/xenapi_create.py but has been
unmaintained for several years now. xmlproc is used only for validating
XML documents against a DTD file.

This patch replaces the pyxml/xmlproc based XML validation with code
based on lxml, which is actively maintained.

Signed-off-by: Stephan Peijnik <spe@anexia.at>

diff -r 6e0ffcd2d9e0 -r 7082ce86e492 tools/python/xen/xm/xenapi_create.py
--- a/tools/python/xen/xm/xenapi_create.py  Fri Sep 17 17:06:57 2010 +0100
+++ b/tools/python/xen/xm/xenapi_create.py  Fri Oct 08 12:31:18 2010 +0200
@@ -14,13 +14,15 @@
 # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 #===========================================================================
 # Copyright (C) 2007 Tom Wilkie <tom.wilkie@gmail.com>
+# Copyright (C) 2010 ANEXIA Internetdienstleistungs GmbH
+# Author: Stephan Peijnik <spe@anexia.at>
 #===========================================================================
 """Domain creation using new XenAPI
 """
 
 from xen.xm.main import server, get_default_SR
 from xml.dom.minidom import parse, getDOMImplementation
-from xml.parsers.xmlproc import xmlproc, xmlval, xmldtd
+from lxml import etree
 from xen.xend import sxp
 from xen.xend.XendAPIConstants import XEN_API_ON_NORMAL_EXIT, \
      XEN_API_ON_CRASH_BEHAVIOUR
@@ -35,6 +37,7 @@
 from os.path import join
 import traceback
 import re
+import warnings # Used by lxml-based validator
 
 def log(_, msg):
     #print "> " + msg
@@ -118,62 +121,58 @@
         Use this if possible as it gives nice
         error messages
         """
-        dtd = xmldtd.load_dtd(self.dtd)
-        parser = xmlproc.XMLProcessor()
-        parser.set_application(xmlval.ValidatingApp(dtd, parser))
-        parser.dtd = dtd
-        parser.ent = dtd
-        parser.parse_resource(file)
-
+        try:
+            dtd = etree.DTD(open(self.dtd, 'r'))
+        except IOError:
+            # The old code did neither raise an exception here, nor
+            # did it report an error. For now we issue a warning.
+            # TODO: How to handle a missing dtd file?
+            # --sp
+            warnings.warn('DTD file %s not found.' % (self.dtd),
+                          UserWarning)
+            return
+
+        tree = etree.parse(file)
+        root = tree.getroot()
+        if not dtd.validate(root):
+            self.handle_dtd_errors(dtd)
+
     def check_dom_against_dtd(self, dom):
         """
         Check DOM again DTD.
         Doesn't give as nice error messages.
         (no location info)
         """
-        dtd = xmldtd.load_dtd(self.dtd)
-        app = xmlval.ValidatingApp(dtd, self)
-        app.set_locator(self)
-        self.dom2sax(dom, app)
+        try:
+            dtd = etree.DTD(open(self.dtd, 'r'))
+        except IOError:
+            # The old code did neither raise an exception here, nor
+            # did it report an error. For now we issue a warning.
+            # TODO: How to handle a missing dtd file?
+            # --sp
+            warnings.warn('DTD file %s not found.' % (self.dtd),
+                          UserWarning)
+            return
 
-    # Get errors back from ValidatingApp
-    def report_error(self, number, args=None):
-        self.errors = xmlproc.errors.english
-        try:
-            msg = self.errors[number]
-            if args != None:
-                msg = msg % args
-        except KeyError:
-            msg = self.errors[4002] % number # Unknown err msg :-)
-        print msg
+        # XXX: This may be a bit slow. Maybe we should use another way
+        # of getting an etree root element from the minidom DOM tree...
+        # -- sp
+        root = etree.XML(dom.toxml())
+        if not dtd.validate(root):
+            self.handle_dtd_errors(dtd)
+
+    # Do the same that was done in report_error before. This is directly
+    # called by check_dtd and check_dom_against_dtd.
+    # We are using sys.stderr instead of print though (python3k clean).
+    def handle_dtd_errors(self, dtd):
+        # XXX: Do we really want to bail out here?
+        # -- sp
+        for err in dtd.error_log:
+            err_str = 'ERROR: %s\n' % (str(err),)
+            sys.stderr.write(err_str)
+        sys.stderr.flush()
         sys.exit(-1)
 
-    # Here for compatibility with ValidatingApp
-    def get_line(self):
-        return -1
-
-    def get_column(self):
-        return -1
-
-    def dom2sax(self, dom, app):
-        """
-        Take a dom tree and tarverse it,
-        issuing SAX calls to app.
-        """
-        for child in dom.childNodes:
-            if child.nodeType == child.TEXT_NODE:
-                data = child.nodeValue
-                app.handle_data(data, 0, len(data))
-            else:
-                app.handle_start_tag(
-                    child.nodeName,
-                    self.attrs_to_dict(child.attributes))
-                self.dom2sax(child, app)
-                app.handle_end_tag(child.nodeName)
-
-    def attrs_to_dict(self, attrs):
-        return dict(attrs.items())
-
     #
     # Checks which cannot be done with dtd
     #

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
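For readers unfamiliar with lxml's DTD support, the validation approach used in the patch above can be sketched in isolation. This is only an illustration written for Python 3, not code from the patch: the DTD, the sample documents and the helper names (load_dtd, validation_errors, root_from_minidom) are assumptions made up for the example. It collects the error messages instead of exiting, and it serialises the minidom tree to bytes before handing it to lxml, which mirrors the round trip the patch performs in check_dom_against_dtd.

```python
import io
from xml.dom.minidom import parseString

from lxml import etree

# Hypothetical DTD for illustration only; it is NOT the DTD shipped with Xen.
DTD_TEXT = """
<!ELEMENT xm (vm+)>
<!ELEMENT vm (name)>
<!ATTLIST vm is_a_template (true|false) #REQUIRED>
<!ELEMENT name (#PCDATA)>
"""


def load_dtd(dtd_text):
    """Build an lxml DTD validator from DTD source text."""
    return etree.DTD(io.StringIO(dtd_text.strip()))


def validation_errors(dtd, root):
    """Return a list of error strings, empty if the element tree is valid."""
    if dtd.validate(root):
        return []
    # error_log holds the entries produced by the failed validate() call.
    return [str(entry) for entry in dtd.error_log]


def root_from_minidom(dom):
    """Convert a minidom document to an lxml element by re-parsing its XML."""
    return etree.XML(dom.toxml(encoding="utf-8"))


GOOD_DOC = '<xm><vm is_a_template="false"><name>demo</name></vm></xm>'
BAD_DOC = '<xm><vm><name>demo</name></vm></xm>'  # required attribute missing

dtd = load_dtd(DTD_TEXT)
good_errors = validation_errors(dtd, etree.XML(GOOD_DOC))
bad_errors = validation_errors(dtd, root_from_minidom(parseString(BAD_DOC)))

for message in bad_errors:
    print("ERROR: %s" % message)
```

Returning the messages rather than calling sys.exit() is a choice made for this sketch only; the patch keeps the old behaviour of printing the errors and exiting.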
Stefano Stabellini
2010-Oct-11 09:37 UTC
Re: [Xen-devel] [PATCH] Replace pyxml/xmlproc-based XML validator with lxml based one.
On Fri, 8 Oct 2010, Stephan Peijnik wrote:
> Pyxml/xmlproc is used in tools/python/xen/xm/xenapi_create.py but has been
> unmaintained for several years now. xmlproc is used only for validating
> XML documents against a DTD file.
>
> This patch replaces the pyxml/xmlproc based XML validation with code
> based on lxml, which is actively maintained.
>
> Signed-off-by: Stephan Peijnik <spe@anexia.at>
>

the patch has been mangled by your MUA, could you please send it again
with the correct line breaks?

A good doc on how to configure email clients to send inline patches is
available here:

http://kerneltrap.org/Linux/Email_Clients_and_Patches
Ian Campbell
2010-Oct-11 09:41 UTC
Re: [Xen-devel] [PATCH] Replace pyxml/xmlproc-based XML validator with lxml based one.
On Fri, 2010-10-08 at 11:52 +0100, Stephan Peijnik wrote:
> Pyxml/xmlproc is used in tools/python/xen/xm/xenapi_create.py but has been
> unmaintained for several years now. xmlproc is used only for validating
> XML documents against a DTD file.
>
> This patch replaces the pyxml/xmlproc based XML validation with code
> based on lxml, which is actively maintained.

I guess an update is also needed to README to direct people to the
correct new dependencies.

Ian.
Stephan Peijnik
2010-Oct-11 16:18 UTC
Re: [Xen-devel] [PATCH] Replace pyxml/xmlproc-based XML validator with lxml based one.
On Mon, 2010-10-11 at 10:41 +0100, Ian Campbell wrote:
> On Fri, 2010-10-08 at 11:52 +0100, Stephan Peijnik wrote:
> > This patch replaces the pyxml/xmlproc based XML validation with code
> > based on lxml, which is actively maintained.
>
> I guess an update is also needed to README to direct people to the
> correct new dependencies.
>

Will update README and send new patch.

Regards,

Stephan
Stephan Peijnik
2010-Oct-11 16:20 UTC
Re: [Xen-devel] [PATCH] Replace pyxml/xmlproc-based XML validator with lxml based one.
On Mon, 2010-10-11 at 10:37 +0100, Stefano Stabellini wrote:
> the patch has been mangled by your MUA, could you please send it again
> with the correct line breaks?
>
> A good doc on how to configure email clients to send inline patches is
> available here:
>
> http://kerneltrap.org/Linux/Email_Clients_and_Patches

Thanks for the hint, will re-send unmangled patch as soon as I've
updated the README to reflect the dependency changes.

Regards,

Stephan
Stephan Peijnik
2010-Oct-11 16:48 UTC
Re: [Xen-devel] [PATCH] Replace pyxml/xmlproc-based XML validator with lxml based one.
Below is a newly generated patch, now also updating README with
information on the new dependencies. Additionally, the patch should
hopefully not be mangled anymore. However, I have also attached the patch
this time in case there are problems again.

The wording in the README is obviously open to discussion. I felt
the need to change the "many distros" wording to "some", as more
recent releases of distributions like Debian and Ubuntu include
minidom in their core python packages. To be honest, I have
not checked how other distros handle this nowadays, hence the
"some".

I also wanted to point out that I have a xen-4.0-testing
hg repository from which these changes can be pulled directly,
at [0]. In my experience, this can also be merged into
xen-unstable painlessly.

Regards,

Stephan

[0] http://bitbucket.org/sp/xen-4.0-testing-sp

--
Pyxml/xmlproc is used in tools/python/xen/xm/xenapi_create.py but has been
unmaintained for several years now. xmlproc is used only for validating
XML documents against a DTD file.

This patch replaces the pyxml/xmlproc based XML validation with code
based on lxml, which is actively maintained.

Signed-off-by: Stephan Peijnik <spe@anexia.at>

diff -r 6e0ffcd2d9e0 -r 7082ce86e492 tools/python/xen/xm/xenapi_create.py
--- a/tools/python/xen/xm/xenapi_create.py  Fri Sep 17 17:06:57 2010 +0100
+++ b/tools/python/xen/xm/xenapi_create.py  Fri Oct 08 12:31:18 2010 +0200
@@ -14,13 +14,15 @@
 # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 #===========================================================================
 # Copyright (C) 2007 Tom Wilkie <tom.wilkie@gmail.com>
+# Copyright (C) 2010 ANEXIA Internetdienstleistungs GmbH
+# Author: Stephan Peijnik <spe@anexia.at>
 #===========================================================================
 """Domain creation using new XenAPI
 """
 
 from xen.xm.main import server, get_default_SR
 from xml.dom.minidom import parse, getDOMImplementation
-from xml.parsers.xmlproc import xmlproc, xmlval, xmldtd
+from lxml import etree
 from xen.xend import sxp
 from xen.xend.XendAPIConstants import XEN_API_ON_NORMAL_EXIT, \
      XEN_API_ON_CRASH_BEHAVIOUR
@@ -35,6 +37,7 @@
 from os.path import join
 import traceback
 import re
+import warnings # Used by lxml-based validator
 
 def log(_, msg):
     #print "> " + msg
@@ -118,62 +121,58 @@
         Use this if possible as it gives nice
         error messages
         """
-        dtd = xmldtd.load_dtd(self.dtd)
-        parser = xmlproc.XMLProcessor()
-        parser.set_application(xmlval.ValidatingApp(dtd, parser))
-        parser.dtd = dtd
-        parser.ent = dtd
-        parser.parse_resource(file)
-
+        try:
+            dtd = etree.DTD(open(self.dtd, 'r'))
+        except IOError:
+            # The old code did neither raise an exception here, nor
+            # did it report an error. For now we issue a warning.
+            # TODO: How to handle a missing dtd file?
+            # --sp
+            warnings.warn('DTD file %s not found.' % (self.dtd),
+                          UserWarning)
+            return
+
+        tree = etree.parse(file)
+        root = tree.getroot()
+        if not dtd.validate(root):
+            self.handle_dtd_errors(dtd)
+
     def check_dom_against_dtd(self, dom):
         """
         Check DOM again DTD.
         Doesn't give as nice error messages.
         (no location info)
         """
-        dtd = xmldtd.load_dtd(self.dtd)
-        app = xmlval.ValidatingApp(dtd, self)
-        app.set_locator(self)
-        self.dom2sax(dom, app)
+        try:
+            dtd = etree.DTD(open(self.dtd, 'r'))
+        except IOError:
+            # The old code did neither raise an exception here, nor
+            # did it report an error. For now we issue a warning.
+            # TODO: How to handle a missing dtd file?
+            # --sp
+            warnings.warn('DTD file %s not found.' % (self.dtd),
+                          UserWarning)
+            return
 
-    # Get errors back from ValidatingApp
-    def report_error(self, number, args=None):
-        self.errors = xmlproc.errors.english
-        try:
-            msg = self.errors[number]
-            if args != None:
-                msg = msg % args
-        except KeyError:
-            msg = self.errors[4002] % number # Unknown err msg :-)
-        print msg
+        # XXX: This may be a bit slow. Maybe we should use another way
+        # of getting an etree root element from the minidom DOM tree...
+        # -- sp
+        root = etree.XML(dom.toxml())
+        if not dtd.validate(root):
+            self.handle_dtd_errors(dtd)
+
+    # Do the same that was done in report_error before. This is directly
+    # called by check_dtd and check_dom_against_dtd.
+    # We are using sys.stderr instead of print though (python3k clean).
+    def handle_dtd_errors(self, dtd):
+        # XXX: Do we really want to bail out here?
+        # -- sp
+        for err in dtd.error_log:
+            err_str = 'ERROR: %s\n' % (str(err),)
+            sys.stderr.write(err_str)
+        sys.stderr.flush()
         sys.exit(-1)
 
-    # Here for compatibility with ValidatingApp
-    def get_line(self):
-        return -1
-
-    def get_column(self):
-        return -1
-
-    def dom2sax(self, dom, app):
-        """
-        Take a dom tree and tarverse it,
-        issuing SAX calls to app.
-        """
-        for child in dom.childNodes:
-            if child.nodeType == child.TEXT_NODE:
-                data = child.nodeValue
-                app.handle_data(data, 0, len(data))
-            else:
-                app.handle_start_tag(
-                    child.nodeName,
-                    self.attrs_to_dict(child.attributes))
-                self.dom2sax(child, app)
-                app.handle_end_tag(child.nodeName)
-
-    def attrs_to_dict(self, attrs):
-        return dict(attrs.items())
-
     #
     # Checks which cannot be done with dtd
     #
diff -r 3ce0d5dc606f -r 76fd774f7cd1 README
--- a/README  Sat Oct 09 22:19:41 2010 +0200
+++ b/README  Mon Oct 11 18:31:36 2010 +0200
@@ -137,12 +137,15 @@
 Xend (the Xen daemon) has the following runtime dependencies:
 
     * Python 2.3 or later.
-      In many distros, the XML-aspects to the standard library
+      In some distros, the XML-aspects to the standard library
       (xml.dom.minidom etc) are broken out into a separate python-xml package.
       This is also required.
+      In more recent versions of Debian and Ubuntu the XML-aspects are included
+      in the base python package however (python-xml has been removed
+      from Debian in squeeze and from Ubuntu in intrepid).
 
           URL:    http://www.python.org/
-          Debian: python, python-xml
+          Debian: python
 
     * For optional SSL support, pyOpenSSL:
           URL:    http://pyopenssl.sourceforge.net/
@@ -153,8 +156,9 @@
           Debian: python-pam
 
     * For optional XenAPI support in XM, PyXML:
-          URL:    http://pyxml.sourceforge.net
-          YUM:    PyXML
+          URL:    http://codespeak.net/lxml/
+          Debian: python-lxml
+          YUM:    python-lxml
 
 
 Intel(R) Trusted Execution Technology Support
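The README hunk above makes lxml the dependency for the optional XenAPI support in xm, while xml.dom.minidom is expected to come with Python itself or a distro package. As a rough sketch that is not part of the patch, a script could probe for such modules up front and report what is missing. The helper name missing_modules is an assumption made for this example, and importlib.util.find_spec needs a far newer Python than the 2.3 mentioned in the README.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be located."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except ImportError:
            # Raised when a parent package of a dotted name is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


# xml.dom.minidom is used by xenapi_create.py; lxml is the new optional one.
for module_name in missing_modules(["xml.dom.minidom", "lxml"]):
    print("missing Python module: %s" % module_name)
```

Probing with find_spec avoids importing the optional module just to learn whether it is installed.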
Stefano Stabellini
2010-Oct-11 16:55 UTC
Re: [Xen-devel] [PATCH] Replace pyxml/xmlproc-based XML validator with lxml based one.
On Mon, 11 Oct 2010, Stephan Peijnik wrote:
> Below is a newly generated patch, now also updating README with
> information on the new dependencies. Additionally, the patch should
> hopefully not be mangled anymore. However, I have also attached the patch
> this time in case there are problems again.
>
> The wording in the README is obviously open to discussion. I felt
> the need to change the "many distros" wording to "some", as more
> recent releases of distributions like Debian and Ubuntu include
> minidom in their core python packages. To be honest, I have
> not checked how other distros handle this nowadays, hence the
> "some".
>
> I also wanted to point out that I have a xen-4.0-testing
> hg repository from which these changes can be pulled directly,
> at [0]. In my experience, this can also be merged into
> xen-unstable painlessly.
>
> Regards,
>
> Stephan
>
> [0] http://bitbucket.org/sp/xen-4.0-testing-sp
>
> --
> Pyxml/xmlproc is used in tools/python/xen/xm/xenapi_create.py but has been
> unmaintained for several years now. xmlproc is used only for validating
> XML documents against a DTD file.
>
> This patch replaces the pyxml/xmlproc based XML validation with code
> based on lxml, which is actively maintained.
>
> Signed-off-by: Stephan Peijnik <spe@anexia.at>

applied, thanks
Stephan Peijnik
2010-Oct-12 07:39 UTC
Re: [Xen-devel] [PATCH] Replace pyxml/xmlproc-based XML validator with lxml based one.
On Mon, 2010-10-11 at 17:55 +0100, Stefano Stabellini wrote:
> > Pyxml/xmlproc is used in tools/python/xen/xm/xenapi_create.py but has been
> > unmaintained for several years now. xmlproc is used only for validating
> > XML documents against a DTD file.
> >
> > This patch replaces the pyxml/xmlproc based XML validation with code
> > based on lxml, which is actively maintained.
> >
> > Signed-off-by: Stephan Peijnik <spe@anexia.at>
>
> applied, thanks

Thanks for accepting the patch.

Regards,

Stephan