Stefan Berger
2007-Nov-15 03:25 UTC
[Xen-devel] [PATCH] [Xend] host.get_log() : clean log of non-printable characters
When retrieving the log via host.get_log() the python parser on the
receiving side gets upset about non-printable characters ("\b"). Those
stem from libxc/xc_domain_restore:xc_domain_restore().

Signed-off-by: Stefan Berger <stefanb@us.ibm.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
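The failure described in this report can be reproduced in a few lines. The sketch below is only an illustration: it uses Python 3's standard xmlrpc.client as a stand-in for the Xen-API client (xend at the time ran on Python 2), and the helper name survives_xmlrpc and the sample log text are hypothetical. It marshals a string containing backspaces into an XML-RPC response and then tries to parse it back.

```python
import xmlrpc.client
from xml.parsers.expat import ExpatError


def survives_xmlrpc(text):
    """Return True if `text` can be marshalled and parsed back unchanged.

    Hypothetical helper: it plays both the sender (dumps) and the
    receiving side's parser (loads) in a single process.
    """
    # dumps() escapes &, < and > but passes control characters through.
    payload = xmlrpc.client.dumps((text,), methodresponse=True)
    try:
        params, _method = xmlrpc.client.loads(payload)
    except ExpatError:
        # The XML parser rejects characters outside XML 1.0's Char set.
        return False
    return params[0] == text


# Made-up log line containing backspaces, the character named in the report.
sample = "progress:  25%\b\b\b\b 50%\n"

print(survives_xmlrpc(sample))                     # parser rejects the payload
print(survives_xmlrpc(sample.replace("\b", " ")))  # parses once \b is replaced
```

The point of the sketch is that the sender produces the payload without complaint; the error only surfaces in the receiving parser.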
Ian Jackson
2007-Nov-15 14:38 UTC
Re: [Xen-devel] [PATCH] [Xend] host.get_log() : clean log of non-printable characters
Stefan Berger writes ("[Xen-devel] [PATCH] [Xend] host.get_log() : clean log of non-printable characters"):
> When retrieving the log via host.get_log() the python parser on the
> receiving side gets upset about non-printable characters ("\b"). Those
> stem from libxc/xc_domain_restore:xc_domain_restore().

It is a shame that we are forced into writing a lossy log retrieval
method by braindamage in XMLRPC and XML 1.0. Perhaps we should in
future think about a get_log_lossless function which uses a binary
encoding. It'd have to be base64 :-/.

> -        return xen_api_success(log_buffer)
> +        i = 0
> +        res = ""
> +        while i < len(log_buffer):
> +            c = ord(log_buffer[i])
> +            if (c < 32 or c > 126) and (c < 10 or c > 13):
> +                res += " "
> +            else:
> +                res += log_buffer[i]
> +            i += 1
> +        return xen_api_success(res)

This is a strange way of doing things and will be quite slow.
It's also wrong in that it replaces tabs.

In theory it would be best to try to map away all Unicode characters
which are not in XML 1.0's Char. However, this would involve
explicitly interpreting the logfile as UTF-8 and it's not clear to me
that it always is. If it isn't, it's probably better to let the
caller get whatever the logfile byte string is and hope they don't
choke - at least until we know under what circumstances this arises.

So it's better just to map away the character we know is causing
problems. \r will be OK because it's allowed in XML as an encoding of
newline, which will do. I've chosen below to replace \f as well since
I suspect they may appear at some point.

Ian.

diff -r ba69fe2dce91 tools/python/xen/xend/XendAPI.py
--- a/tools/python/xen/xend/XendAPI.py Tue Nov 13 20:13:50 2007 +0000
+++ b/tools/python/xen/xend/XendAPI.py Thu Nov 15 14:32:33 2007 +0000
@@ -994,6 +994,8 @@ class XendAPI(object):
     def host_get_log(self, session, host_ref):
         log_file = open(XendLogging.getLogFilename())
         log_buffer = log_file.read()
+        log_buffer = log_buffer.replace('\b',' ')
+        log_buffer = log_buffer.replace('\f','\n')
         return xen_api_success(log_buffer)

     def host_send_debug_keys(self, _, host_ref, keys):
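The "in theory" option from this message, mapping away everything outside XML 1.0's Char production, might look roughly like the sketch below. It is an illustration only: it is written for Python 3 and assumes the log has already been decoded to text, which this message points out is not guaranteed for the real logfile. The names _NOT_XML_CHAR and strip_non_xml_chars are hypothetical and not part of xend; the patch above deliberately takes the narrower route of two replace() calls.

```python
import re

# XML 1.0 Char production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
# The negated class matches every character that is NOT a Char.
_NOT_XML_CHAR = re.compile(
    r"[^\x09\x0a\x0d\x20-\ud7ff\ue000-\ufffd\U00010000-\U0010ffff]"
)


def strip_non_xml_chars(text, replacement=" "):
    """Replace each character that XML 1.0 forbids with `replacement`."""
    return _NOT_XML_CHAR.sub(replacement, text)


# Backspace and form feed are replaced; tab and newline are kept.
print(repr(strip_non_xml_chars("a\bb\tc\x0cd\n")))  # 'a b\tc d\n'
```

Note that this maps \f to a space rather than to a newline as the patch does; which replacement is preferable is a judgment call.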
Stefan Berger
2007-Nov-16 03:04 UTC
Re: [Xen-devel] [PATCH] [Xend] host.get_log() : clean log of non-printable characters
xen-devel-bounces@lists.xensource.com wrote on 11/15/2007 09:38:37 AM:

> Stefan Berger writes ("[Xen-devel] [PATCH] [Xend] host.get_log() :
> clean log of non-printable characters"):
> > When retrieving the log via host.get_log() the python parser on the
> > receiving side gets upset about non-printable characters ("\b"). Those
> > stem from libxc/xc_domain_restore:xc_domain_restore().
>
> It is a shame that we are forced into writing a lossy log retrieval
> method by braindamage in XMLRPC and XML 1.0. Perhaps we should in
> future think about a get_log_lossless function which uses a binary
> encoding. It'd have to be base64 :-/.
>
> > -        return xen_api_success(log_buffer)
> > +        i = 0
> > +        res = ""
> > +        while i < len(log_buffer):
> > +            c = ord(log_buffer[i])
> > +            if (c < 32 or c > 126) and (c < 10 or c > 13):
> > +                res += " "
> > +            else:
> > +                res += log_buffer[i]
> > +            i += 1
> > +        return xen_api_success(res)
>
> This is a strange way of doing things and will be quite slow.

I agree. log_buffer[i] = ' ' unfortunately does not work with python.
The solution you show below is probably the right one.

   Stefan

> It's also wrong in that it replaces tabs.
>
> In theory it would be best to try to map away all Unicode characters
> which are not in XML 1.0's Char. However, this would involve
> explicitly interpreting the logfile as UTF-8 and it's not clear to me
> that it always is. If it isn't, it's probably better to let the
> caller get whatever the logfile byte string is and hope they don't
> choke - at least until we know under what circumstances this arises.
>
> So it's better just to map away the character we know is causing
> problems. \r will be OK because it's allowed in XML as an encoding of
> newline, which will do. I've chosen below to replace \f as well since
> I suspect they may appear at some point.
>
> Ian.
>
> diff -r ba69fe2dce91 tools/python/xen/xend/XendAPI.py
> --- a/tools/python/xen/xend/XendAPI.py Tue Nov 13 20:13:50 2007 +0000
> +++ b/tools/python/xen/xend/XendAPI.py Thu Nov 15 14:32:33 2007 +0000
> @@ -994,6 +994,8 @@ class XendAPI(object):
>      def host_get_log(self, session, host_ref):
>          log_file = open(XendLogging.getLogFilename())
>          log_buffer = log_file.read()
> +        log_buffer = log_buffer.replace('\b',' ')
> +        log_buffer = log_buffer.replace('\f','\n')
>          return xen_api_success(log_buffer)
>
>      def host_send_debug_keys(self, _, host_ref, keys):
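On the remark that log_buffer[i] = ' ' does not work: Python strings are immutable, so item assignment raises TypeError, and building the result with res += ... in a loop creates a new string at each step. As a rough sketch of a single-pass alternative to the loop in the first patch, str.translate can do the mapping in one call. This is written for Python 3 (xend at the time ran on Python 2, where the translate API differs), and the names _KEEP, _CONTROL_TO_SPACE, blank_control_chars and item_assignment_fails are hypothetical. It deliberately differs from the first patch: tabs are kept, \v and \f are blanked, and characters above 126 are left alone.

```python
# ASCII control characters that XML 1.0 allows: tab, newline, carriage return.
_KEEP = {0x09, 0x0A, 0x0D}

# Translation table: every other control character below 0x20 becomes a space.
_CONTROL_TO_SPACE = {c: " " for c in range(0x20) if c not in _KEEP}


def blank_control_chars(log_buffer):
    """Replace disallowed ASCII control characters in a single pass."""
    return log_buffer.translate(_CONTROL_TO_SPACE)


def item_assignment_fails(text):
    """Show that in-place modification of a str raises TypeError."""
    try:
        text[0] = " "
    except TypeError:
        return True
    return False


print(repr(blank_control_chars("a\bb\tc\x0cd\r\n")))  # 'a b\tc d\r\n'
print(item_assignment_fails("abc"))                   # True
```

For the single known offender, the two replace() calls in the quoted diff remain the simpler choice; a table like this only pays off if more characters turn out to need mapping.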