Ian Jackson
2009-Nov-16 12:10 UTC
[Xen-devel] Re: [PATCH] libxl: check for early failures of qemu-dm (v2)
This patch makes xl create check whether qemu-dm has started correctly, and causes it to fail immediately with appropriate errors if not. There are other bugfixes too. More specifically: * libxl_create_device_model forks twice rather than once so that the process which calls libxl does not end up being the actual parent of qemu. That avoids the need for the qemu-dm process to be reaped at some indefinite time in the future. * The first fork generates an intermediate process which is responsible for writing the qemu-dm pid to xenstore and then merely waits to collect and report on qemu-dm''s exit status during startup. New arguments to libxl_create_device_model allow the preservation of its pid so that a later call can check whether the startup is successful. * xl.c''s create_domain checks for errors in its libxl calls. Consequential changes: * libxl_wait_for_device_model now takes a callback function parameter which is called repeatedly in the loop iteration and allows the caller to abort the wait. * libxl_exec no longer calls fork; there is a new libxl_fork. * osdep.[ch] new use #define _GNU_SOURCE. The provided asprintf declarations are suppressed when not needed (currently, always). * There is a hook to override waitpid, which will be necessary for some callers. Remaining problems and other issues I noticed or we found: * The error handling is rather inconsistent still and lacking in places. * xl_logv is declared but not defined. * _GNU_SOURCE should be used throughout. The asprintf implementation should be disabled in favour of the system one. * XL_LOG_ERROR_ERRNO needs to actually print the errno value. * destroy_device_model can kill random dom0 processes (!) * struct libxl_ctx should be defined in libxl_internal.h. Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> (Changes since v1: * Remove new error log level; should be in a future patch * Properly fixed the asprintf problem * Indentation no uses literal tabs) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Vincent Hanquez
2009-Nov-16 14:52 UTC
Re: [Xen-devel] Re: [PATCH] libxl: check for early failures of qemu-dm (v2)
Ian Jackson wrote:> This patch makes xl create check whether qemu-dm has started > correctly, and causes it to fail immediately with appropriate errors > if not. There are other bugfixes too. > > More specifically: > > * libxl_create_device_model forks twice rather than once so that the > process which calls libxl does not end up being the actual parent > of qemu. That avoids the need for the qemu-dm process to be reaped > at some indefinite time in the future. > > * The first fork generates an intermediate process which is > responsible for writing the qemu-dm pid to xenstore and then merely > waits to collect and report on qemu-dm''s exit status during > startup. New arguments to libxl_create_device_model allow the > preservation of its pid so that a later call can check whether the > startup is successful. >Did you have the qemu-dm ready patch in qemu ?> * xl.c''s create_domain checks for errors in its libxl calls. > > Consequential changes: > > * libxl_wait_for_device_model now takes a callback function parameter > which is called repeatedly in the loop iteration and allows the > caller to abort the wait. > > * libxl_exec no longer calls fork; there is a new libxl_fork. >this is not libxl goal to provide wrapper for syscalls. the exec should do the double fork+exec itself, no need to have a fork function.> * osdep.[ch] new use #define _GNU_SOURCE. The provided asprintf > declarations are suppressed when not needed (currently, always). > > * There is a hook to override waitpid, which will be necessary for > some callers. >why ?> Remaining problems and other issues I noticed or we found: > > * The error handling is rather inconsistent still and lacking in > places. > > * xl_logv is declared but not defined. > * _GNU_SOURCE should be used throughout. The asprintf implementation > should be disabled in favour of the system one. >osdeps was just suppose to be available for netbsd. not sure why the contrary happens on christoph''s patch.> * XL_LOG_ERROR_ERRNO needs to actually print the errno value. >there shouldn''t be a LOG_ERROR_ERRNO value. what you want is the different log function that will embedd an errno automatically in the fmt, not a different logging level. not sure what you meant later by "remove new error level, should be in future patch".> * destroy_device_model can kill random dom0 processes (!) > > * struct libxl_ctx should be defined in libxl_internal.h. >no, otherwise the init ctx need to allocate itself the memory of the context, instead of having the caller "allocate it" itself on the fct stack.> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> > > (Changes since v1: > * Remove new error log level; should be in a future patch > * Properly fixed the asprintf problem > * Indentation no uses literal tabs-- Vincent _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2009-Nov-16 15:13 UTC
Re: [Xen-devel] Re: [PATCH] libxl: check for early failures of qemu-dm (v2)
Vincent Hanquez writes ("Re: [Xen-devel] Re: [PATCH] libxl: check for early failures of qemu-dm (v2)"):> Ian Jackson wrote: > > This patch makes xl create check whether qemu-dm has started > > correctly, and causes it to fail immediately with appropriate errors > > if not. There are other bugfixes too....> > * The first fork generates an intermediate process which is > > responsible for writing the qemu-dm pid to xenstore and then merely > > waits to collect and report on qemu-dm''s exit status during > > startup. New arguments to libxl_create_device_model allow the > > preservation of its pid so that a later call can check whether the > > startup is successful. > > Did you have the qemu-dm ready patch in qemu ?This is not relevant because my qemu currently dies before it gets to that point. Stefano tested v1 of my patch and it worked for him.> > * libxl_exec no longer calls fork; there is a new libxl_fork. > > this is not libxl goal to provide wrapper for syscalls. > the exec should do the double fork+exec itself, no need to have a fork > function.No, it can''t, because there is actual functionality in the intermediate process. The libxl_fork function is not provided for the benefit of "providing wrapper for syscalls". It is there to factor out the common error handling for the two instances of fork in libxl.c.> > * There is a hook to override waitpid, which will be necessary for > > some callers. > > why ?Why is it necessary ? Some applications have an event loop or SIGCHLD handler which automatically reaps all children. In such an application in order to collect a child process exit status we need to allow the application to look the process up in its own table of reaped processes.> > * _GNU_SOURCE should be used throughout. The asprintf implementation > > should be disabled in favour of the system one. > > > osdeps was just suppose to be available for netbsd. not sure why the > contrary happens on christoph''s patch.The osdeps arrangements were broken and backwards. We were compiling them in on Linux, which isn''t necessary. It''s not necessary on BSD either, but BSD presumably barfed on it because it declared the system asprintf without special handling.> there shouldn''t be a LOG_ERROR_ERRNO value. what you want is the > different log function > that will embedd an errno automatically in the fmt, not a different > logging level. not sure what you meant later by "remove new error level, > should be in future patch".I mean that I''m going to do exactly that in a followup patch. The comment about LOG_ERROR_ERRNO was left over in my changeset comment by mistake.> > * destroy_device_model can kill random dom0 processes (!) > > > > * struct libxl_ctx should be defined in libxl_internal.h. > > no, otherwise the init ctx need to allocate itself the memory of the > context, instead of having > the caller "allocate it" itself on the fct stack.If you do that then libxl can''t be linked dynamically. We should talk about this. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Vincent Hanquez
2009-Nov-16 15:59 UTC
Re: [Xen-devel] Re: [PATCH] libxl: check for early failures of qemu-dm (v2)
Ian Jackson wrote:>> >> Did you have the qemu-dm ready patch in qemu ? >> > > This is not relevant because my qemu currently dies before it gets to > that point. Stefano tested v1 of my patch and it worked for him. >it''s not irrelevant. as heavy discuss before, fixing the ready problem fix the qemu died in the meantime problem too (not ideally since you end-up waiting timeout, but it does). what you did is just an optimisation of the qemu died problem, which leave one problem open.> No, it can''t, because there is actual functionality in the > intermediate process. The libxl_fork function is not provided for the > benefit of "providing wrapper for syscalls". It is there to factor > out the common error handling for the two instances of fork in > libxl.c. >you can have a middle process callback without any problem within the libxl_exec call.> Why is it necessary ? Some applications have an event loop or SIGCHLD > handler which automatically reaps all children. In such an > application in order to collect a child process exit status we need to > allow the application to look the process up in its own table of > reaped processes. >sounds a bit premature, since we don''t even have such an application yet nor that we ever had in the past either.> The osdeps arrangements were broken and backwards. We were compiling > them in on Linux, which isn''t necessary. It''s not necessary on BSD > either, but BSD presumably barfed on it because it declared the system > asprintf without special handling. >yep>> no, otherwise the init ctx need to allocate itself the memory of the >> context, instead of having >> the caller "allocate it" itself on the fct stack. >> > > If you do that then libxl can''t be linked dynamically. We should talk > about this. >it *can* be linked dynamically. the only "shortcoming" is the context structure can''t grow dynamically. the same happens to all structures that we use for argument passing. -- Vincent _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel