there are four choices for using ao_how in libvirt. (a), ao without callback and set libxl_sigchld_owner_mainloop (b), ao without callback and set libxl_sigchld_owner_libxl_always (c), ao with callback and set libxl_sigchld_owner_mainloop (d), ao with callback and set libxl_sigchld_owner_libxl_always i got some trouble when i try (b) and (c). i still not decided which one is better choice for implement in libvirt. it would be easier if i could try above four choice. Bamvor Jian Zhang (2): expose child fd in order to handle child exit in libvirt call ao_how callback explicitly tools/libxl/libxl.h | 2 ++ tools/libxl/libxl_event.c | 18 ++++++++++++++++++ tools/libxl/libxl_fork.c | 5 +++++ tools/libxl/libxl_internal.h | 1 + 4 files changed, 26 insertions(+) -- 1.8.1.4
Bamvor Jian Zhang
2013-Nov-05 09:22 UTC
[PATCH 1/2] expose child fd in order to handle child exit in libvirt
libvirt could handle fd and timeout event through libxl_osevent_hooks. either of these will not inform the child exit if libvirt set the libxl_sigchld_owner_libxl_always. add a function for returning the sigchld_selfpipe in order to handle the child exit in libvirt. meanwhile, there is only one pipe in ctx, it seems that it is not worth to add this to libxl_osevent_hooks. Signed-off-by: Bamvor Jian Zhang <bjzhang@suse.com> --- tools/libxl/libxl.h | 2 ++ tools/libxl/libxl_fork.c | 5 +++++ 2 files changed, 7 insertions(+) diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h index 1c6675d..11e1bc3 100644 --- a/tools/libxl/libxl.h +++ b/tools/libxl/libxl.h @@ -1071,6 +1071,8 @@ int libxl_flask_loadpolicy(libxl_ctx *ctx, void *policy, uint32_t size); int libxl_fd_set_cloexec(libxl_ctx *ctx, int fd, int cloexec); int libxl_fd_set_nonblock(libxl_ctx *ctx, int fd, int nonblock); +int libxl_get_pipe_handle(libxl_ctx *ctx, int num); + #include <libxl_event.h> #endif /* LIBXL_H */ diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c index 044ddad..1989258 100644 --- a/tools/libxl/libxl_fork.c +++ b/tools/libxl/libxl_fork.c @@ -418,6 +418,11 @@ int libxl__ev_child_xenstore_reopen(libxl__gc *gc, const char *what) { return rc; } +int libxl_get_pipe_handle(libxl_ctx *ctx, int num) +{ + return ctx->sigchld_selfpipe[num]; +} + /* * Local variables: * mode: C -- 1.8.1.4
ao_how callback is not called after inserted into list. in my test, i directly call it in libxl__ao_complete_check_progress_reports. i know it should not work like this. Could you give some suggestion about it? Signed-off-by: Bamvor Jian Zhang <bjzhang@suse.com> --- tools/libxl/libxl_event.c | 18 ++++++++++++++++++ tools/libxl/libxl_internal.h | 1 + 2 files changed, 19 insertions(+) diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c index 0fe4428..8320547 100644 --- a/tools/libxl/libxl_event.c +++ b/tools/libxl/libxl_event.c @@ -1578,6 +1578,23 @@ void libxl__ao_complete(libxl__egc *egc, libxl__ao *ao, int rc) libxl__ao_complete_check_progress_reports(egc, ao); } +void libxl__ao_occurred(libxl__egc *egc) +{ + EGC_GC; + libxl__ao *ao, *ao_tmp; + LIBXL_TAILQ_FOREACH_SAFE(ao, &egc->aos_for_callback, + entry_for_callback, ao_tmp) { + LIBXL_TAILQ_REMOVE(&egc->aos_for_callback, ao, entry_for_callback); + LOG(DEBUG,"ao %p: completion callback", ao); + ao->how.callback(CTX, ao->rc, ao->how.u.for_callback); + CTX_LOCK; + ao->notified = 1; + if (!ao->in_initiator) + libxl__ao__destroy(CTX, ao); + CTX_UNLOCK; + } +} + void libxl__ao_complete_check_progress_reports(libxl__egc *egc, libxl__ao *ao) { /* @@ -1601,6 +1618,7 @@ void libxl__ao_complete_check_progress_reports(libxl__egc *egc, libxl__ao *ao) } else if (ao->how.callback) { LIBXL__LOG(ctx, XTL_DEBUG, "ao %p: complete for callback",ao); LIBXL_TAILQ_INSERT_TAIL(&egc->aos_for_callback, ao, entry_for_callback); + libxl__ao_occurred(egc); } else { libxl_event *ev; ev = NEW_EVENT(egc, OPERATION_COMPLETE, ao->domid, ao->how.u.for_event); diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h index 4f92522..bed775d 100644 --- a/tools/libxl/libxl_internal.h +++ b/tools/libxl/libxl_internal.h @@ -1760,6 +1760,7 @@ _hidden int libxl__ao_inprogress(libxl__ao *ao, const char *file, int line, const char *func); /* temporarily unlocks */ _hidden void libxl__ao_abort(libxl__ao *ao); _hidden void libxl__ao_complete(libxl__egc *egc, libxl__ao *ao, int rc); +_hidden void libxl__ao_occurred(libxl__egc *egc); _hidden libxl__gc *libxl__ao_inprogress_gc(libxl__ao *ao); /* Can be called at any time. Use is essential for any aop user. */ -- 1.8.1.4
Bamvor Jian Zhang writes ("[PATCH 2/2] call ao_how callback explicitly"):> ao_how callback is not called after inserted into list. > in my test, i directly call it in > libxl__ao_complete_check_progress_reports. i know it should not > work like this. Could you give some suggestion about it?I''m not sure what you mean. Under what conditions do you not see the callback happen ? The existing machinery is supposed to take care of that. Specifically, in the patch you''re changing, libxl__ao_complete_check_progress_reports puts the ao onto egc->aos_for_callback. On the return path from the libxl event function back to the application, we are supposed to call libxl__egc_cleanup, which in turn calls egc_run_callbacks, which should pick up the aos on aos_for_callback. libxl__egc_cleanup is called from the event loops in libxl_event_wait and libxl__ao_inprogress and from the macro EGC_FREE. It should be impossible for a path to be missed because inside libxl one needs an egc to call libxl__ao_complete. What kind of an ao_how is your application passing to libxl ? Is your application multithreaded ? Ian.
Ian Jackson
2013-Nov-05 16:05 UTC
Re: [PATCH 0/2] question on libxl ao [and 1 more messages]
Bamvor Jian Zhang writes ("[PATCH 0/2] question on libxl ao"):> there are four choices for using ao_how in libvirt. > (a), ao without callback and set libxl_sigchld_owner_mainloop > (b), ao without callback and set libxl_sigchld_owner_libxl_always > (c), ao with callback and set libxl_sigchld_owner_mainloop > (d), ao with callback and set libxl_sigchld_owner_libxl_always > > i got some trouble when i try (b) and (c). > i still not decided which one is better choice for implement in > libvirt. it would be easier if i could try above four choice.I''m not sure I follow. I think the right answer for libvirt is probably aos specifying callback, and libxl_sigchld_owner_mainloop. I''m assuming that libvirt has something for handling children already. But I haven''t looked. If it doesn''t then libxl_sigchld_owner_always would probably be best. Bamvor Jian Zhang writes ("[PATCH 1/2] expose child fd in order to handle child exit in libvirt"):> libvirt could handle fd and timeout event through > libxl_osevent_hooks. either of these will not inform the child > exit if libvirt set the libxl_sigchld_owner_libxl_always.I don''t exactly follow, but I think you have found a bug. If libvirt specifies libxl_sigchld_owner_libxl_always then libxl will install a SIGCHLD handler and to the self-pipe trick. The self-pipe fd should be registered with the fd event machinery in the normal way. However, this doesn''t actually appear to happen. Instead the beforepoll/afterpoll functions handle that fd ad-hoc. The result is that, probably, nothing will notice the selfpipe becoming writable. I will see about writing a patch to fix this.> add a function for returning the sigchld_selfpipe in order to > handle the child exit in libvirt. > meanwhile, there is only one pipe in ctx, it seems that it is > not worth to add this to libxl_osevent_hooks.This approach is not right, though. There is no need to add additional interfaces to libxl. The bug just needs to be fixed. Thanks, Ian.
libxl''s SIGCHLD handler is broken if libxl_sigchld_owner_libxl_always and does not always work correctly with the fd registration system. 1/3 libxl: event system: pass gc, not just ctx, to internal 2/3 libxl: event system: Make 3/3 libxl: event system: properly register the SIGCHLD Reported-by: Bamvor Jian Zhang <bjzhang@suse.com> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Cc: Bamvor Jian Zhang <bjzhang@suse.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Jim Fehlig <jfehlig@suse.com>
Ian Jackson
2013-Nov-05 19:19 UTC
[PATCH 1/3] libxl: event system: pass gc, not just ctx, to internal sigchld manipulators
We are going to want the gc for libxl__ev_fd_register. No functional change in this patch. Simply change the argument types, and the actual arguments from ctx to gc. Inside these functions, use CTX (the macro which uses gc) rather than the old formal parameter ctx. Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> --- tools/libxl/libxl.c | 2 +- tools/libxl/libxl_fork.c | 36 ++++++++++++++++++------------------ tools/libxl/libxl_internal.h | 4 ++-- 3 files changed, 21 insertions(+), 21 deletions(-) diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c index 0f0f56c..d0a87d2 100644 --- a/tools/libxl/libxl.c +++ b/tools/libxl/libxl.c @@ -171,7 +171,7 @@ int libxl_ctx_free(libxl_ctx *ctx) /* If we have outstanding children, then the application inherits * them; we wish the application good luck with understanding * this if and when it reaps them. */ - libxl__sigchld_removehandler(ctx); + libxl__sigchld_removehandler(gc); if (ctx->sigchld_selfpipe[0] >= 0) { close(ctx->sigchld_selfpipe[0]); diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c index 044ddad..f392d67 100644 --- a/tools/libxl/libxl_fork.c +++ b/tools/libxl/libxl_fork.c @@ -173,23 +173,23 @@ static void sigchld_removehandler_core(void) sigchld_owner = 0; } -void libxl__sigchld_removehandler(libxl_ctx *ctx) /* non-reentrant */ +void libxl__sigchld_removehandler(libxl__gc *gc) /* non-reentrant */ { atfork_lock(); - if (sigchld_owner == ctx) + if (sigchld_owner == CTX) sigchld_removehandler_core(); atfork_unlock(); } -int libxl__sigchld_installhandler(libxl_ctx *ctx) /* non-reentrant */ +int libxl__sigchld_installhandler(libxl__gc *gc) /* non-reentrant */ { int r, rc; - if (ctx->sigchld_selfpipe[0] < 0) { - r = pipe(ctx->sigchld_selfpipe); + if (CTX->sigchld_selfpipe[0] < 0) { + r = pipe(CTX->sigchld_selfpipe); if (r) { - ctx->sigchld_selfpipe[0] = -1; - LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, + CTX->sigchld_selfpipe[0] = -1; + LIBXL__LOG_ERRNO(CTX, LIBXL__LOG_ERROR, "failed to create sigchld pipe"); rc = ERROR_FAIL; goto out; @@ -197,11 +197,11 @@ int libxl__sigchld_installhandler(libxl_ctx *ctx) /* non-reentrant */ } atfork_lock(); - if (sigchld_owner != ctx) { + if (sigchld_owner != CTX) { struct sigaction ours; assert(!sigchld_owner); - sigchld_owner = ctx; + sigchld_owner = CTX; memset(&ours,0,sizeof(ours)); ours.sa_handler = sigchld_handler; @@ -239,11 +239,11 @@ int libxl__fork_selfpipe_active(libxl_ctx *ctx) return ctx->sigchld_selfpipe[0]; } -static void perhaps_removehandler(libxl_ctx *ctx) +static void perhaps_removehandler(libxl__gc *gc) { - if (LIBXL_LIST_EMPTY(&ctx->children) && - ctx->childproc_hooks->chldowner != libxl_sigchld_owner_libxl_always) - libxl__sigchld_removehandler(ctx); + if (LIBXL_LIST_EMPTY(&CTX->children) && + CTX->childproc_hooks->chldowner != libxl_sigchld_owner_libxl_always) + libxl__sigchld_removehandler(gc); } static int childproc_reaped(libxl__egc *egc, pid_t pid, int status) @@ -263,7 +263,7 @@ static int childproc_reaped(libxl__egc *egc, pid_t pid, int status) ch->pid = -1; ch->callback(egc, ch, pid, status); - perhaps_removehandler(CTX); + perhaps_removehandler(gc); return 0; } @@ -331,7 +331,7 @@ pid_t libxl__ev_child_fork(libxl__gc *gc, libxl__ev_child *ch, int rc; if (chldmode_ours(CTX)) { - rc = libxl__sigchld_installhandler(CTX); + rc = libxl__sigchld_installhandler(gc); if (rc) goto out; } @@ -361,7 +361,7 @@ pid_t libxl__ev_child_fork(libxl__gc *gc, libxl__ev_child *ch, rc = pid; out: - perhaps_removehandler(CTX); + perhaps_removehandler(gc); CTX_UNLOCK; return rc; } @@ -383,10 +383,10 @@ void libxl_childproc_setmode(libxl_ctx *ctx, const libxl_childproc_hooks *hooks, switch (ctx->childproc_hooks->chldowner) { case libxl_sigchld_owner_mainloop: case libxl_sigchld_owner_libxl: - libxl__sigchld_removehandler(ctx); + libxl__sigchld_removehandler(gc); break; case libxl_sigchld_owner_libxl_always: - libxl__sigchld_installhandler(ctx); + libxl__sigchld_installhandler(gc); break; default: abort(); diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h index 4729566..bd2eeed 100644 --- a/tools/libxl/libxl_internal.h +++ b/tools/libxl/libxl_internal.h @@ -838,8 +838,8 @@ _hidden void libxl__poller_wakeup(libxl__egc *egc, libxl__poller *p); /* Internal to fork and child reaping machinery */ extern const libxl_childproc_hooks libxl__childproc_default_hooks; -int libxl__sigchld_installhandler(libxl_ctx *ctx); /* non-reentrant;logs errs */ -void libxl__sigchld_removehandler(libxl_ctx *ctx); /* non-reentrant */ +int libxl__sigchld_installhandler(libxl__gc*); /* non-reentrant; logs errs */ +void libxl__sigchld_removehandler(libxl__gc*); /* non-reentrant */ int libxl__fork_selfpipe_active(libxl_ctx *ctx); /* returns read fd or -1 */ void libxl__fork_selfpipe_woken(libxl__egc *egc); int libxl__self_pipe_wakeup(int fd); /* returns 0 or -1 setting errno */ -- 1.7.10.4
Ian Jackson
2013-Nov-05 19:19 UTC
[PATCH 2/3] libxl: event system: Make libxl_sigchld_owner_libxl_always work
Previously, libxl_sigchld_owner_libxl_always was mishandled. It would result in libxl paying no attention to the sigchld self pipe. Fix this by fixing chldmode_ours so that it returns true iff we are supposed to be handling SIGCHLD. Additionally, we arrange to use chldmode_ours everywhere where we are installing/removing signal handlers and/or deciding whether to check the self pipe, etc. This means it needs a new "creating" flag argument for the benefit of libxl__ev_child_fork, which needs to install the signal handler in libxl_sigchld_owner_libxl even if there are not currently any children. ctx->childproc_hooks->chldowner is now interpreted only by chldmode_ours. Reported-by: Bamvor Jian Zhang <bjzhang@suse.com> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Cc: Bamvor Jian Zhang <bjzhang@suse.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Jim Fehlig <jfehlig@suse.com> --- tools/libxl/libxl_fork.c | 51 ++++++++++++++++++++++++---------------------- 1 file changed, 27 insertions(+), 24 deletions(-) diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c index f392d67..335ca40 100644 --- a/tools/libxl/libxl_fork.c +++ b/tools/libxl/libxl_fork.c @@ -222,18 +222,23 @@ int libxl__sigchld_installhandler(libxl__gc *gc) /* non-reentrant */ return rc; } -static int chldmode_ours(libxl_ctx *ctx) +static bool chldmode_ours(libxl_ctx *ctx, bool creating) { - return ctx->childproc_hooks->chldowner == libxl_sigchld_owner_libxl; + switch (ctx->childproc_hooks->chldowner) { + case libxl_sigchld_owner_libxl: + return creating || !LIBXL_LIST_EMPTY(&ctx->children); + case libxl_sigchld_owner_mainloop: + return 0; + case libxl_sigchld_owner_libxl_always: + return 1; + } + abort(); } int libxl__fork_selfpipe_active(libxl_ctx *ctx) { /* Returns the fd to read, or -1 */ - if (!chldmode_ours(ctx)) - return -1; - - if (LIBXL_LIST_EMPTY(&ctx->children)) + if (!chldmode_ours(ctx, 0)) return -1; return ctx->sigchld_selfpipe[0]; @@ -241,11 +246,21 @@ int libxl__fork_selfpipe_active(libxl_ctx *ctx) static void perhaps_removehandler(libxl__gc *gc) { - if (LIBXL_LIST_EMPTY(&CTX->children) && - CTX->childproc_hooks->chldowner != libxl_sigchld_owner_libxl_always) + if (chldmode_ours(CTX, 0)) libxl__sigchld_removehandler(gc); } +static int perhaps_installhandler(libxl__gc *gc, bool creating) +{ + int rc; + + if (chldmode_ours(CTX, creating)) { + rc = libxl__sigchld_installhandler(gc); + if (rc) return rc; + } + return 0; +} + static int childproc_reaped(libxl__egc *egc, pid_t pid, int status) { EGC_GC; @@ -284,7 +299,7 @@ void libxl__fork_selfpipe_woken(libxl__egc *egc) * ctx must be locked EXACTLY ONCE */ EGC_GC; - while (chldmode_ours(CTX) /* in case the app changes the mode */) { + while (chldmode_ours(CTX, 0) /* in case the app changes the mode */) { int status; pid_t pid = waitpid(-1, &status, WNOHANG); @@ -330,10 +345,7 @@ pid_t libxl__ev_child_fork(libxl__gc *gc, libxl__ev_child *ch, CTX_LOCK; int rc; - if (chldmode_ours(CTX)) { - rc = libxl__sigchld_installhandler(gc); - if (rc) goto out; - } + perhaps_installhandler(gc, 1); pid_t pid CTX->childproc_hooks->fork_replacement @@ -380,17 +392,8 @@ void libxl_childproc_setmode(libxl_ctx *ctx, const libxl_childproc_hooks *hooks, ctx->childproc_hooks = hooks; ctx->childproc_user = user; - switch (ctx->childproc_hooks->chldowner) { - case libxl_sigchld_owner_mainloop: - case libxl_sigchld_owner_libxl: - libxl__sigchld_removehandler(gc); - break; - case libxl_sigchld_owner_libxl_always: - libxl__sigchld_installhandler(gc); - break; - default: - abort(); - } + perhaps_removehandler(gc); + perhaps_installhandler(gc, 0); /* idempotent, ok to ignore errors for now */ CTX_UNLOCK; GC_FREE; -- 1.7.10.4
Ian Jackson
2013-Nov-05 19:19 UTC
[PATCH 3/3] libxl: event system: properly register the SIGCHLD self-pipe
An application which uses libxl_osevent_register_hooks, and doesn''t use libxl_sigchld_owner_mainloop, would never properly experience the deaths of its (libxl) children. This is because the SIGCHLD self pipe would be handled using ad-hoc code in beforepoll_internal and afterpoll_internal. However, this is no good if application is using the fd registration system instead; in that case these functions would not be called and nothing would deal with readability of the self pipe. Fix this as follows: The SIGCHLD self pipe now is now dealt with by a standard libxl__ev_fd handler, which is registered and deregistered along with the SIGCHLD handler itself. The handler function subsumes the ad-hoc response code removed from beforepoll_internal, and the functionality of libxl__fork_selfpipe_woken. This is also tidier as the SIGCHLD self pipe is no longer touched outside libxl_fork.c other than in ctx initialisation and teardown. (The ad-hoc arrangements for poller->wakeup_pipe in beforepoll_internal and afterpoll_internal are OK because the libxl__poller mechanism exists to wake up threads which are sitting inside libxl''s poll loop, so is not applicable to the application''s event loop.) Reported-by: Bamvor Jian Zhang <bjzhang@suse.com> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Cc: Bamvor Jian Zhang <bjzhang@suse.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Jim Fehlig <jfehlig@suse.com> --- tools/libxl/libxl.c | 2 ++ tools/libxl/libxl_event.c | 12 ---------- tools/libxl/libxl_fork.c | 52 ++++++++++++++++++++++++++++++++---------- tools/libxl/libxl_internal.h | 3 +-- 4 files changed, 43 insertions(+), 26 deletions(-) diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c index d0a87d2..23e5a40 100644 --- a/tools/libxl/libxl.c +++ b/tools/libxl/libxl.c @@ -65,6 +65,7 @@ int libxl_ctx_alloc(libxl_ctx **pctx, int version, ctx->childproc_user = 0; ctx->sigchld_selfpipe[0] = -1; + libxl__ev_fd_init(&ctx->sigchld_selfpipe_efd); /* The mutex is special because we can''t idempotently destroy it */ @@ -146,6 +147,7 @@ int libxl_ctx_free(libxl_ctx *ctx) for (i = 0; i < ctx->watch_nslots; i++) assert(!libxl__watch_slot_contents(gc, i)); libxl__ev_fd_deregister(gc, &ctx->watch_efd); + libxl__ev_fd_deregister(gc, &ctx->sigchld_selfpipe_efd); /* Now there should be no more events requested from the application: */ diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c index 0fe4428..b732816 100644 --- a/tools/libxl/libxl_event.c +++ b/tools/libxl/libxl_event.c @@ -774,10 +774,6 @@ static int beforepoll_internal(libxl__gc *gc, libxl__poller *poller, \ REQUIRE_FD(poller->wakeup_pipe[0], POLLIN, BODY); \ \ - int selfpipe = libxl__fork_selfpipe_active(CTX); \ - if (selfpipe >= 0) \ - REQUIRE_FD(selfpipe, POLLIN, BODY); \ - \ }while(0) #define REQUIRE_FD(req_fd_, req_events_, BODY) do{ \ @@ -999,14 +995,6 @@ static void afterpoll_internal(libxl__egc *egc, libxl__poller *poller, if (e) LIBXL__EVENT_DISASTER(egc, "read wakeup", e, 0); } - int selfpipe = libxl__fork_selfpipe_active(CTX); - if (selfpipe >= 0 && - afterpoll_check_fd(poller,fds,nfds, selfpipe, POLLIN)) { - int e = libxl__self_pipe_eatall(selfpipe); - if (e) LIBXL__EVENT_DISASTER(egc, "read sigchld pipe", e, 0); - libxl__fork_selfpipe_woken(egc); - } - for (;;) { libxl__ev_time *etime = LIBXL_TAILQ_FIRST(&CTX->etimes); if (!etime) diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c index 335ca40..aad1652 100644 --- a/tools/libxl/libxl_fork.c +++ b/tools/libxl/libxl_fork.c @@ -155,6 +155,9 @@ int libxl__carefd_fd(const libxl__carefd *cf) * Actual child process handling */ +static void sigchld_selfpipe_handler(libxl__egc *egc, libxl__ev_fd *ev, + int fd, short events, short revents); + static void sigchld_handler(int signo) { int e = libxl__self_pipe_wakeup(sigchld_owner->sigchld_selfpipe[1]); @@ -175,10 +178,18 @@ static void sigchld_removehandler_core(void) void libxl__sigchld_removehandler(libxl__gc *gc) /* non-reentrant */ { + int rc; + atfork_lock(); if (sigchld_owner == CTX) sigchld_removehandler_core(); atfork_unlock(); + + if (libxl__ev_fd_isregistered(&CTX->sigchld_selfpipe_efd)) { + rc = libxl__ev_fd_modify(gc, &CTX->sigchld_selfpipe_efd, 0); + if (rc) + libxl__ev_fd_deregister(gc, &CTX->sigchld_selfpipe_efd); + } } int libxl__sigchld_installhandler(libxl__gc *gc) /* non-reentrant */ @@ -195,6 +206,15 @@ int libxl__sigchld_installhandler(libxl__gc *gc) /* non-reentrant */ goto out; } } + if (!libxl__ev_fd_isregistered(&CTX->sigchld_selfpipe_efd)) { + rc = libxl__ev_fd_register(gc, &CTX->sigchld_selfpipe_efd, + sigchld_selfpipe_handler, + CTX->sigchld_selfpipe[0], POLLIN); + if (rc) goto out; + } else { + rc = libxl__ev_fd_modify(gc, &CTX->sigchld_selfpipe_efd, POLLIN); + if (rc) goto out; + } atfork_lock(); if (sigchld_owner != CTX) { @@ -235,18 +255,9 @@ static bool chldmode_ours(libxl_ctx *ctx, bool creating) abort(); } -int libxl__fork_selfpipe_active(libxl_ctx *ctx) -{ - /* Returns the fd to read, or -1 */ - if (!chldmode_ours(ctx, 0)) - return -1; - - return ctx->sigchld_selfpipe[0]; -} - static void perhaps_removehandler(libxl__gc *gc) { - if (chldmode_ours(CTX, 0)) + if (!chldmode_ours(CTX, 0)) libxl__sigchld_removehandler(gc); } @@ -293,12 +304,29 @@ int libxl_childproc_reaped(libxl_ctx *ctx, pid_t pid, int status) return rc; } -void libxl__fork_selfpipe_woken(libxl__egc *egc) +static void sigchld_selfpipe_handler(libxl__egc *egc, libxl__ev_fd *ev, + int fd, short events, short revents) { /* May make callbacks into the application for child processes. - * ctx must be locked EXACTLY ONCE */ + * So, this function may unlock and relock the CTX. This is OK + * because event callback functions are always called with the CTX + * locked exactly once, and from code which copes with reentrancy. + * (See also the comment in afterpoll_internal.) */ EGC_GC; + int selfpipe = CTX->sigchld_selfpipe[0]; + + if (revents & ~POLLIN) { + LOG(ERROR, "unexpected poll event 0x%x on SIGCHLD self pipe", revents); + LIBXL__EVENT_DISASTER(egc, + "unexpected poll event on SIGCHLD self pipe", + 0, 0); + } + assert(revents & POLLIN); + + int e = libxl__self_pipe_eatall(selfpipe); + if (e) LIBXL__EVENT_DISASTER(egc, "read sigchld pipe", e, 0); + while (chldmode_ours(CTX, 0) /* in case the app changes the mode */) { int status; pid_t pid = waitpid(-1, &status, WNOHANG); diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h index bd2eeed..1e60333 100644 --- a/tools/libxl/libxl_internal.h +++ b/tools/libxl/libxl_internal.h @@ -353,6 +353,7 @@ struct libxl__ctx { const libxl_childproc_hooks *childproc_hooks; void *childproc_user; int sigchld_selfpipe[2]; /* [0]==-1 means handler not installed */ + libxl__ev_fd sigchld_selfpipe_efd; LIBXL_LIST_HEAD(, libxl__ev_child) children; libxl_version_info version_info; @@ -840,8 +841,6 @@ _hidden void libxl__poller_wakeup(libxl__egc *egc, libxl__poller *p); extern const libxl_childproc_hooks libxl__childproc_default_hooks; int libxl__sigchld_installhandler(libxl__gc*); /* non-reentrant; logs errs */ void libxl__sigchld_removehandler(libxl__gc*); /* non-reentrant */ -int libxl__fork_selfpipe_active(libxl_ctx *ctx); /* returns read fd or -1 */ -void libxl__fork_selfpipe_woken(libxl__egc *egc); int libxl__self_pipe_wakeup(int fd); /* returns 0 or -1 setting errno */ int libxl__self_pipe_eatall(int fd); /* returns 0 or -1 setting errno */ -- 1.7.10.4
Ian Jackson
2013-Nov-05 19:23 UTC
Re: [PATCH 0/2] question on libxl ao [and 1 more messages] [and 1 more messages]
Ian Jackson writes ("Re: [PATCH 0/2] question on libxl ao [and 1 more messages]"):> Bamvor Jian Zhang writes ("[PATCH 1/2] expose child fd in order to handle child exit in libvirt"): > > libvirt could handle fd and timeout event through > > libxl_osevent_hooks. either of these will not inform the child > > exit if libvirt set the libxl_sigchld_owner_libxl_always. > > I don''t exactly follow, but I think you have found a bug.... Can you try these three patches: Ian Jackson writes ("[PATCH 0/3] libxl: event system: SIGCHLD related fixes"):> libxl''s SIGCHLD handler is broken if libxl_sigchld_owner_libxl_always > and does not always work correctly with the fd registration system. > > 1/3 libxl: event system: pass gc, not just ctx, to internal > 2/3 libxl: event system: Make > 3/3 libxl: event system: properly register the SIGCHLD > > Reported-by: Bamvor Jian Zhang <bjzhang@suse.com> > Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> > Cc: Bamvor Jian Zhang <bjzhang@suse.com> > Cc: Ian Campbell <ian.campbell@citrix.com> > Cc: Jim Fehlig <jfehlig@suse.com>I''ve pushed them to git://xenbits.xen.org/people/iwj/xen.git#bamvor-sigchld-fixes-v1 http://xenbits.xen.org/gitweb/?p=people/iwj/xen.git;a=shortlog;h=refs/heads/bamvor-sigchld-fixes-v1 Thanks, Ian.
Ian Campbell
2013-Nov-06 10:30 UTC
Re: [PATCH 1/3] libxl: event system: pass gc, not just ctx, to internal sigchld manipulators
On Tue, 2013-11-05 at 19:19 +0000, Ian Jackson wrote:> We are going to want the gc for libxl__ev_fd_register. > > No functional change in this patch. Simply change the argument types, > and the actual arguments from ctx to gc. Inside these functions, use > CTX (the macro which uses gc) rather than the old formal parameter > ctx. > > Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>Acked-by: Ian Campbell <ian.campbell@citrix.com> (this series makes me wonder if your ''i'' key was sticky when you wrote all this stuff, chld indeed ;-)) Ian.
Ian Campbell
2013-Nov-06 10:34 UTC
Re: [PATCH 2/3] libxl: event system: Make libxl_sigchld_owner_libxl_always work
On Tue, 2013-11-05 at 19:19 +0000, Ian Jackson wrote:> Previously, libxl_sigchld_owner_libxl_always was mishandled. It would > result in libxl paying no attention to the sigchld self pipe. > > Fix this by fixing chldmode_ours so that it returns true iff we are > supposed to be handling SIGCHLD. > > Additionally, we arrange to use chldmode_ours everywhere where we are > installing/removing signal handlers and/or deciding whether to check > the self pipe, etc. This means it needs a new "creating" flag > argument for the benefit of libxl__ev_child_fork, which needs to > install the signal handler in libxl_sigchld_owner_libxl even if there > are not currently any children. > > ctx->childproc_hooks->chldowner is now interpreted only by > chldmode_ours. > > Reported-by: Bamvor Jian Zhang <bjzhang@suse.com> > Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> > Cc: Bamvor Jian Zhang <bjzhang@suse.com>Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Campbell
2013-Nov-06 10:36 UTC
Re: [PATCH 3/3] libxl: event system: properly register the SIGCHLD self-pipe
On Tue, 2013-11-05 at 19:19 +0000, Ian Jackson wrote:> An application which uses libxl_osevent_register_hooks, and doesn''t > use libxl_sigchld_owner_mainloop, would never properly experience the > deaths of its (libxl) children. > > This is because the SIGCHLD self pipe would be handled using ad-hoc > code in beforepoll_internal and afterpoll_internal. However, this is > no good if application is using the fd registration system instead; in > that case these functions would not be called and nothing would deal > with readability of the self pipe. > > Fix this as follows: > > The SIGCHLD self pipe now is now dealt with by a standard libxl__ev_fd > handler, which is registered and deregistered along with the SIGCHLD > handler itself. The handler function subsumes the ad-hoc response > code removed from beforepoll_internal, and the functionality of > libxl__fork_selfpipe_woken. > > This is also tidier as the SIGCHLD self pipe is no longer touched > outside libxl_fork.c other than in ctx initialisation and teardown. > > (The ad-hoc arrangements for poller->wakeup_pipe in > beforepoll_internal and afterpoll_internal are OK because the > libxl__poller mechanism exists to wake up threads which are sitting > inside libxl''s poll loop, so is not applicable to the application''s > event loop.) > > Reported-by: Bamvor Jian Zhang <bjzhang@suse.com> > Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> > Cc: Bamvor Jian Zhang <bjzhang@suse.com>Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Campbell
2013-Nov-06 10:38 UTC
Re: [PATCH 0/3] libxl: event system: SIGCHLD related fixes
On Tue, 2013-11-05 at 19:19 +0000, Ian Jackson wrote:> libxl''s SIGCHLD handler is broken if libxl_sigchld_owner_libxl_always > and does not always work correctly with the fd registration system. > > 1/3 libxl: event system: pass gc, not just ctx, to internal > 2/3 libxl: event system: Make > 3/3 libxl: event system: properly register the SIGCHLD > > Reported-by: Bamvor Jian Zhang <bjzhang@suse.com> > Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> > Cc: Bamvor Jian Zhang <bjzhang@suse.com> > Cc: Ian Campbell <ian.campbell@citrix.com> > Cc: Jim Fehlig <jfehlig@suse.com>I acked all three of these, but I think it probably wants an (Acked| Tested|Something)-by from Jim and/or Bamvor before it gets committed. Ian.
Ian Jackson
2013-Nov-06 12:08 UTC
Re: [PATCH 0/3] libxl: event system: SIGCHLD related fixes
Ian Campbell writes ("Re: [PATCH 0/3] libxl: event system: SIGCHLD related fixes"):> On Tue, 2013-11-05 at 19:19 +0000, Ian Jackson wrote: > > libxl''s SIGCHLD handler is broken if libxl_sigchld_owner_libxl_always > > and does not always work correctly with the fd registration system. > > > > 1/3 libxl: event system: pass gc, not just ctx, to internal > > 2/3 libxl: event system: Make > > 3/3 libxl: event system: properly register the SIGCHLD...> I acked all three of these, but I think it probably wants an (Acked| > Tested|Something)-by from Jim and/or Bamvor before it gets committed.Thanks. Yes, indeed. I have tested that xl still works but I don''t have a test harness for the fd registration machinery. Ian.
Bamvor Jian Zhang
2013-Nov-07 03:00 UTC
Re: [PATCH 0/2] question on libxl ao [and 1 more messages]
>>>Ian Jackson <Ian.Jackson@eu.citrix.com> wrote: > Bamvor Jian Zhang writes ("[PATCH 0/2] question on libxl ao"): >> there are four choices for using ao_how in libvirt. >> (a), ao without callback and set libxl_sigchld_owner_mainloop >> (b), ao without callback and set libxl_sigchld_owner_libxl_always >> (c), ao with callback and set libxl_sigchld_owner_mainloop >> (d), ao with callback and set libxl_sigchld_owner_libxl_always >> >> i got some trouble when i try (b) and (c). >> i still not decided which one is better choice for implement in >> libvirt. it would be easier if i could try above four choice. > > I''m not sure I follow. I think the right answer for libvirt is > probably aos specifying callback, and libxl_sigchld_owner_mainloop. > I''m assuming that libvirt has something for handling children > already. But I haven''t looked. If it doesn''t then > libxl_sigchld_owner_always would probably be best.AFAICS, no child handle in libvirt. i will try your patches. thanks. bamvor> > Bamvor Jian Zhang writes ("[PATCH 1/2] expose child fd in order to handle > child exit in libvirt"): >> libvirt could handle fd and timeout event through >> libxl_osevent_hooks. either of these will not inform the child >> exit if libvirt set the libxl_sigchld_owner_libxl_always. > > I don''t exactly follow, but I think you have found a bug. > > If libvirt specifies libxl_sigchld_owner_libxl_always then libxl will > install a SIGCHLD handler and to the self-pipe trick. The self-pipe > fd should be registered with the fd event machinery in the normal way. > However, this doesn''t actually appear to happen. Instead the > beforepoll/afterpoll functions handle that fd ad-hoc. The result is > that, probably, nothing will notice the selfpipe becoming writable. > > I will see about writing a patch to fix this. > >> add a function for returning the sigchld_selfpipe in order to >> handle the child exit in libvirt. >> meanwhile, there is only one pipe in ctx, it seems that it is >> not worth to add this to libxl_osevent_hooks. > > This approach is not right, though. There is no need to add > additional interfaces to libxl. The bug just needs to be fixed. > > Thanks, > Ian. > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel
> Bamvor Jian Zhang writes ("[PATCH 2/2] call ao_how callback explicitly"): > > ao_how callback is not called after inserted into list. > > in my test, i directly call it in > > libxl__ao_complete_check_progress_reports. i know it should not > > work like this. Could you give some suggestion about it? > > I''m not sure what you mean. Under what conditions do you not see the > callback happen ? The existing machinery is supposed to take care of > that. Specifically, in the patch you''re changing, > libxl__ao_complete_check_progress_reports puts the ao onto > egc->aos_for_callback. On the return path from the libxl event > function back to the application, we are supposed to call > libxl__egc_cleanup, which in turn calls egc_run_callbacks, which > should pick up the aos on aos_for_callback. libxl__egc_cleanup is > called from the event loops in libxl_event_wait and > libxl__ao_inprogress and from the macro EGC_FREE. It should be > impossible for a path to be missed because inside libxl one needs an > egc to call libxl__ao_complete.the libvirt libxl driver register the event handler through libxl_event_register_callbacks, so the libxl_event_wait could not get the event. and if the ao_how is used, the libxl__egc_cleanup in "if ( poller )" statement will not be called either. even if i could expose the libxl_egc_cleanup to libvirt, i still do not know when should i call it? i do not know if there is a event triggered before the ao_how callback should be called. (is there a fd event when the async operation complete?). diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c index b732816..c216651 100644 --- a/tools/libxl/libxl_event.c +++ b/tools/libxl/libxl_event.c @@ -1743,6 +1743,13 @@ void libxl__ao_progress_report(libxl__egc *egc, libxl__ao *ao, } } +void libxl_egc_cleanup(libxl_ctx *ctx) +{ + EGC_INIT(ctx); + libxl__egc_cleanup(egc); + EGC_FREE; + return rc; +} /* * Local variables: diff --git a/tools/libxl/libxl_event.h b/tools/libxl/libxl_event.h index 27a65dc..f0f7376 100644 --- a/tools/libxl/libxl_event.h +++ b/tools/libxl/libxl_event.h @@ -555,6 +555,7 @@ int libxl_childproc_reaped(libxl_ctx *ctx, pid_t, int status) */ void libxl_postfork_child_noexec(libxl_ctx *ctx); +void libxl_egc_cleanup(libxl_ctx *ctx); #endif> > What kind of an ao_how is your application passing to libxl ?ao_how with callback set.> Is your > application multithreaded ?yes.> > Ian. > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel
Bamvor Jian Zhang writes ("Re: [Xen-devel] [PATCH 2/2] call ao_how callback explicitly"): ...> > I''m not sure what you mean. Under what conditions do you not see the > > callback happen ? The existing machinery is supposed to take care of > > that. Specifically, in the patch you''re changing, > > libxl__ao_complete_check_progress_reports puts the ao onto > > egc->aos_for_callback. On the return path from the libxl event > > function back to the application, we are supposed to call > > libxl__egc_cleanup, which in turn calls egc_run_callbacks, which > > should pick up the aos on aos_for_callback. libxl__egc_cleanup is > > called from the event loops in libxl_event_wait and > > libxl__ao_inprogress and from the macro EGC_FREE. It should be > > impossible for a path to be missed because inside libxl one needs an > > egc to call libxl__ao_complete....> the libvirt libxl driver register the event handler through > libxl_event_register_callbacks,Right.> so the libxl_event_wait could not get the event.I understand from this that you''re calling libxl_event_wait ? (Rather than that you''re saying that you''re not calling libxl_event_wait, but doing so wouldn''t help.) But, as I understand it, you''re passing an ao_how which specifies a callback, rather than the generation of an event. So I think libxl_event_wait is irrelevant.> and if the ao_how is used, the libxl__egc_cleanup in > "if ( poller )" statement will not be called either. > even if i could expose the libxl_egc_cleanup to libvirt, i still do not > know when should i call it?No, you shouldn''t need to call it explicitly and doing so won''t help.> i do not know if there is a event triggered > before the ao_how callback should be called. (is there a fd event when > the async operation complete?).It''s the other way around. The async operation completes, probably, because of an fd event. So I think the call flow should be: 1. Your application notices the fd is readable (say) 2. Your application calls libxl_osevent_occurred_fd. 3. libxl_osevent_occurred_fd allocates an egc on entry. 4. In libxl, libxl_osevent_occurred_fd calls whatever internal functions are supposed to respond to the fd. If appropriate, this will cause some ao to complete. All of these calls will receive the same egc as created in step 3. I.e.: 5. In libxl, something calls libxl__ao_complete. 6. libxl__ao_complete puts the ao on the egc''s aos_for_callback list. 7. The call stack unwinds inside libxl, eventually returning to libxl_osevent_occurred_fd. 8. libxl_osevent_occurred_fd uses EGC_FREE on exit, and EGC_FREE contains a call to libxl__egc_cleanup. 9. libxl__egc_cleanup calls egc_run_callbacks which finds the ao on the aos_for_callback list and makes your application callback. The purpose of the "poller" machinery is so that if at step 5 we decide to return an event structure, rather than making a callback, a concurrent call to libxl_event_wait (on a different thread) can be woken up. Can you turn on libxl''s debugging and send me a trace of it losing the event ? Ian.
>>>Ian Jackson <Ian.Jackson@eu.citrix.com> wrote: > Bamvor Jian Zhang writes ("Re: [Xen-devel] [PATCH 2/2] call ao_how callback > explicitly"): > ... > > > I''m not sure what you mean. Under what conditions do you not see the > > > callback happen ? The existing machinery is supposed to take care of > > > that. Specifically, in the patch you''re changing, > > > libxl__ao_complete_check_progress_reports puts the ao onto > > > egc->aos_for_callback. On the return path from the libxl event > > > function back to the application, we are supposed to call > > > libxl__egc_cleanup, which in turn calls egc_run_callbacks, which > > > should pick up the aos on aos_for_callback. libxl__egc_cleanup is > > > called from the event loops in libxl_event_wait and > > > libxl__ao_inprogress and from the macro EGC_FREE. It should be > > > impossible for a path to be missed because inside libxl one needs an > > > egc to call libxl__ao_complete. > ... > > the libvirt libxl driver register the event handler through > > libxl_event_register_callbacks, > > Right. > > > so the libxl_event_wait could not get the event. > > I understand from this that you''re calling libxl_event_wait ? (Rather > than that you''re saying that you''re not calling libxl_event_wait, but > doing so wouldn''t help.)i am talking about why i could not call libxl_event_wait.> > But, as I understand it, you''re passing an ao_how which specifies a > callback, rather than the generation of an event. So I think > libxl_event_wait is irrelevant.agree.> > > and if the ao_how is used, the libxl__egc_cleanup in > > "if ( poller )" statement will not be called either. > > even if i could expose the libxl_egc_cleanup to libvirt, i still do not > > know when should i call it? > > No, you shouldn''t need to call it explicitly and doing so won''t help. > > > i do not know if there is a event triggered > > before the ao_how callback should be called. (is there a fd event when > > the async operation complete?). > > It''s the other way around. The async operation completes, probably, > because of an fd event. So I think the call flow should be: > > 1. Your application notices the fd is readable (say) > 2. Your application calls libxl_osevent_occurred_fd. > 3. libxl_osevent_occurred_fd allocates an egc on entry. > 4. In libxl, libxl_osevent_occurred_fd calls whatever internal > functions are supposed to respond to the fd. If appropriate, > this will cause some ao to complete. All of these calls > will receive the same egc as created in step 3. I.e.: > 5. In libxl, something calls libxl__ao_complete. > 6. libxl__ao_complete puts the ao on the egc''s aos_for_callback > list. > 7. The call stack unwinds inside libxl, eventually returning > to libxl_osevent_occurred_fd. > 8. libxl_osevent_occurred_fd uses EGC_FREE on exit, and EGC_FREE > contains a call to libxl__egc_cleanup. > 9. libxl__egc_cleanup calls egc_run_callbacks which finds the ao > on the aos_for_callback list and makes your application > callback.with log and gdb, i could see above flow except the ao in egc->aops_for_callback is empty. and from gdb i could know that the egc is different between libxl__ao_complete_check_progress_reports and fd event(here is watchfd_callback). libxl__ao_complete_check_progress_reports (gdb) p egc $1 = (libxl__egc *) 0x7fb1fd03f500 watchfd_callback (gdb) p egc $4 = (libxl__egc *) 0x7ffffb291960 after check the code in libxl_osevent_occurred_fd, the egc is created in watchfd_callback. so, EGC_FREE could not handle the ao added by libxl__ao_complete_check_progress_reports. the latter egc is created by AO_CREATE in do_domain_create. is it correct? if so, could we use the same egc all the time? here is the full libxl.log: libxl: debug: libxl_create.c:1230:do_domain_create: ao 0x7fb1d0001e50: create: how=0x7fb1f6448e80 callback=0x7fb1f6210fb0 poller=(nil) libxl: debug: libxl_device.c:257:libxl__device_disk_set_backend: Disk vdev=hda spec.backend=tap libxl: debug: libxl_create.c:675:initiate_domain_create: running bootloader libxl: debug: libxl_bootloader.c:321:libxl__bootloader_run: not a PV domain, skipping bootloader libxl: debug: libxl_event.c:608:libxl__ev_xswatch_deregister: watch w=0x7fb1d0003078: deregister unregistered libxl: debug: libxl_numa.c:475:libxl__get_numa_candidate: New best NUMA placement candidate found: nr_nodes=1, nr_cpus=8, nr_vcpus=9, free_memkb=908 libxl: detail: libxl_dom.c:195:numa_place_domain: NUMA placement candidate with 1 nodes, 8 cpus and 908 KB free selected xc: detail: elf_parse_binary: phdr: paddr=0x100000 memsz=0x9f0c8 xc: detail: elf_parse_binary: memory: 0x100000 -> 0x19f0c8 xc: info: VIRTUAL MEMORY ARRANGEMENT: Loader: 0000000000100000->000000000019f0c8 Modules: 0000000000000000->0000000000000000 TOTAL: 0000000000000000->000000001f800000 ENTRY ADDRESS: 0000000000100000 xc: info: PHYSICAL MEMORY ALLOCATION: 4KB PAGES: 0x0000000000000200 2MB PAGES: 0x00000000000000fb 1GB PAGES: 0x0000000000000000 xc: detail: elf_load_binary: phdr 0 at 0x7fb1f2f48000 -> 0x7fb1f2fddf4d libxl: debug: libxl_device.c:257:libxl__device_disk_set_backend: Disk vdev=hda spec.backend=tap libxl: debug: libxl_event.c:559:libxl__ev_xswatch_register: watch w=0x7fb1d0002558 wpath=/local/domain/0/backend/vbd/2/768/state token=3/0: register slotnum=3 libxl: debug: libxl_create.c:1243:do_domain_create: ao 0x7fb1d0001e50: inprogress: poller=(nil), flags=i libxl: debug: libxl_event.c:503:watchfd_callback: watch w=0x7fb1d0002558 wpath=/local/domain/0/backend/vbd/2/768/state token=3/0: event epath=/local/domain/0/backend/vbd/2/768/state libxl: debug: libxl_event.c:643:devstate_watch_callback: backend /local/domain/0/backend/vbd/2/768/state wanted state 2 ok libxl: debug: libxl_event.c:596:libxl__ev_xswatch_deregister: watch w=0x7fb1d0002558 wpath=/local/domain/0/backend/vbd/2/768/state token=3/0: deregister slotnum=3 libxl: debug: libxl_event.c:608:libxl__ev_xswatch_deregister: watch w=0x7fb1d0002558: deregister unregistered libxl: debug: libxl_device.c:959:device_hotplug: calling hotplug script: /etc/xen/scripts/block add libxl: debug: libxl_dm.c:1218:libxl__spawn_local_dm: Spawning device-model /usr/lib/xen/bin/qemu-system-i386 with arguments: libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm: /usr/lib/xen/bin/qemu-system-i386 libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm: -xen-domid libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm: 2 libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm: -chardev libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm: socket,id=libxl-cmd,path=/var/run/xen/qmp-libxl-2,server,nowait libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm: -mon libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm: chardev=libxl-cmd,mode=control libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm: -name libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm: bjz_04_sles11_sp2 libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm: -vnc libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm: 127.0.0.1:0,to=99 libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm: -global libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm: isa-fdc.driveAlibxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm: -vga libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm: cirrus libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm: -global libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm: vga.vram_size_mb=8 libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm: -boot libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm: order=c libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm: -net libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm: none libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm: -M libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm: xenfv libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm: -m libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm: 504 libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm: -drive libxl: debug: libxl_dm.c:1220:libxl__spawn_local_dm: file=/var/lib/xen/images_2/bjz_04_sles11_sp2/disk0.raw,if=ide,index=0,media=disk,format=raw,cache=writeback libxl: debug: libxl_event.c:559:libxl__ev_xswatch_register: watch w=0x7fb1d00032b0 wpath=/local/domain/0/device-model/2/state token=3/1: register slotnum=3 libxl: debug: libxl_event.c:503:watchfd_callback: watch w=0x7fb1d00032b0 wpath=/local/domain/0/device-model/2/state token=3/1: event epath=/local/domain/0/device-model/2/state libxl: debug: libxl_event.c:596:libxl__ev_xswatch_deregister: watch w=0x7fb1d00032b0 wpath=/local/domain/0/device-model/2/state token=3/1: deregister slotnum=3 libxl: debug: libxl_event.c:472:watchfd_callback: watch epath=/local/domain/0/device-model/2/state token=3/1: empty slot libxl: debug: libxl_event.c:608:libxl__ev_xswatch_deregister: watch w=0x7fb1d00032b0: deregister unregistered libxl: debug: libxl_qmp.c:707:libxl__qmp_initialize: connected to /var/run/xen/qmp-libxl-2 libxl: debug: libxl_qmp.c:299:qmp_handle_response: message type: qmp libxl: debug: libxl_qmp.c:555:qmp_send_prepare: next qmp command: ''{ "execute": "qmp_capabilities", "id": 1 } '' libxl: debug: libxl_qmp.c:299:qmp_handle_response: message type: return libxl: debug: libxl_qmp.c:555:qmp_send_prepare: next qmp command: ''{ "execute": "query-chardev", "id": 2 } '' libxl: debug: libxl_qmp.c:299:qmp_handle_response: message type: return libxl: debug: libxl_qmp.c:555:qmp_send_prepare: next qmp command: ''{ "execute": "query-vnc", "id": 3 } '' libxl: debug: libxl_qmp.c:299:qmp_handle_response: message type: return libxl: debug: libxl_event.c:559:libxl__ev_xswatch_register: watch w=0x7fb1d0007b58 wpath=/local/domain/0/backend/vif/2/0/state token=3/2: register slotnum=3 libxl: debug: libxl_event.c:503:watchfd_callback: watch w=0x7fb1d0007b58 wpath=/local/domain/0/backend/vif/2/0/state token=3/2: event epath=/local/domain/0/backend/vif/2/0/state libxl: debug: libxl_event.c:643:devstate_watch_callback: backend /local/domain/0/backend/vif/2/0/state wanted state 2 ok libxl: debug: libxl_event.c:596:libxl__ev_xswatch_deregister: watch w=0x7fb1d0007b58 wpath=/local/domain/0/backend/vif/2/0/state token=3/2: deregister slotnum=3 libxl: debug: libxl_event.c:608:libxl__ev_xswatch_deregister: watch w=0x7fb1d0007b58: deregister unregistered libxl: debug: libxl_device.c:959:device_hotplug: calling hotplug script: /etc/xen/scripts/vif-bridge online libxl: debug: libxl_event.c:472:watchfd_callback: watch epath=/local/domain/0/backend/vif/2/0/state token=3/2: empty slot libxl: debug: libxl_event.c:1737:libxl__ao_progress_report: ao 0x7fb1d0001e50: progress report: ignored libxl: debug: libxl_event.c:1569:libxl__ao_complete: ao 0x7fb1d0001e50: complete, rc=0 libxl: debug: libxl_event.c:1599:libxl__ao_complete_check_progress_reports: ao 0x7fb1d0001e50: complete for callback libxl: debug: libxl_event.c:1173:egc_run_callbacks: ao 0x7fb1d0001e50: completion callback libxl: debug: libxl_event.c:1541:libxl__ao__destroy: ao 0x7fb1d0001e50: destroy libxl: debug: libxl_event.c:559:libxl__ev_xswatch_register: watch w=0x7fb1d00029b0 wpath=@releaseDomain token=3/3: register slotnum=3 libxl: debug: libxl_event.c:503:watchfd_callback: watch w=0x7fb1d00029b0 wpath=@releaseDomain token=3/3: event epath=@releaseDomain libxl: debug: libxl.c:1000:domain_death_xswatch_callback: [evg=0x7fb1d0006330:2] from domid=2 nentries=1 rc=1 libxl: debug: libxl.c:1011:domain_death_xswatch_callback: [evg=0x7fb1d0006330:2] got=domaininfos[0] got->domain=2 libxl: debug: libxl.c:1038:domain_death_xswatch_callback: exists shutdown_reported=0 dominf.flags=ffff0022 libxl: debug: libxl.c:1004:domain_death_xswatch_callback: [evg=0] all reported libxl: debug: libxl.c:1068:domain_death_xswatch_callback: domain death search done xc: debug: hypercall buffer: total allocations:0 total releases:0 xc: debug: hypercall buffer: current allocations:0 maximum allocations:0 xc: debug: hypercall buffer: cache current size:0 xc: debug: hypercall buffer: cache hits:0 misses:0 toobig:0 best regards bamvor> The purpose of the "poller" machinery is so that if at step 5 we > decide to return an event structure, rather than making a callback, a > concurrent call to libxl_event_wait (on a different thread) can be > woken up. > > Can you turn on libxl''s debugging and send me a trace of it losing the > event ? > > Ian. > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel > >
Bamvor Jian Zhang writes ("Re: [Xen-devel] [PATCH 2/2] call ao_how callback explicitly"):> Ian Jackson <Ian.Jackson@eu.citrix.com> wrote: > > It''s the other way around. The async operation completes, probably, > > because of an fd event. So I think the call flow should be: > > > > 1. Your application notices the fd is readable (say) > > 2. Your application calls libxl_osevent_occurred_fd. > > 3. libxl_osevent_occurred_fd allocates an egc on entry. > > 4. In libxl, libxl_osevent_occurred_fd calls whatever internal > > functions are supposed to respond to the fd. If appropriate, > > this will cause some ao to complete. All of these calls > > will receive the same egc as created in step 3. I.e.: > > 5. In libxl, something calls libxl__ao_complete. > > 6. libxl__ao_complete puts the ao on the egc''s aos_for_callback > > list. > > 7. The call stack unwinds inside libxl, eventually returning > > to libxl_osevent_occurred_fd. > > 8. libxl_osevent_occurred_fd uses EGC_FREE on exit, and EGC_FREE > > contains a call to libxl__egc_cleanup. > > 9. libxl__egc_cleanup calls egc_run_callbacks which finds the ao > > on the aos_for_callback list and makes your application > > callback. > > with log and gdb, i could see above flow except the ao in egc->aops_for_callback is empty.The ao is "empty" ? I don''t follow.> and from gdb i could know that the egc is different between libxl__ao_complete_check_progress_reports and fd event(here is watchfd_callback). > libxl__ao_complete_check_progress_reports > (gdb) p egc > $1 = (libxl__egc *) 0x7fb1fd03f500 > > watchfd_callback > (gdb) p egc > $4 = (libxl__egc *) 0x7ffffb291960Every entry into libxl''s event system has its own egc. So if you are entering two of these functions at once on different threads, each will have its own egc.> after check the code in libxl_osevent_occurred_fd, the egc is > created in watchfd_callback.No, watchfd_callback doesn''t create the egc. The EGC_GC macro doesn''t create the egc. It only extracts the gc from it for the benefit of code inside that function. The egc will have been passed to watchfd_callback from its caller. The caller might be afterpoll_internal called from libxl_osevent_afterpoll (but you imply you''re not using that) or from eventloop_iteration, which might be called from libxl_event_wait (which you''re not using) or libxl__ao_inprogress (but only in the synchronous case which you''re not using). Or it might be libxl_osevent_occurred_fd, which does create the egc but runs the callbacks on exit.> so, EGC_FREE could not handle the ao added by > libxl__ao_complete_check_progress_reports. the latter egc is created > by AO_CREATE in do_domain_create. is it correct? if so, could we > use the same egc all the time?No, each egc corresponds to one entry into libxl. That''s what they''re for.> here is the full libxl.log:This shows only one ao. Is that right ? It starts here:> libxl: debug: libxl_create.c:1230:do_domain_create: ao 0x7fb1d0001e50: create: how=0x7fb1f6448e80 callback=0x7fb1f6210fb0 poller=(nil)And then, later, it completes:> libxl: debug: libxl_event.c:1569:libxl__ao_complete: ao 0x7fb1d0001e50: complete, rc=0 > libxl: debug: libxl_event.c:1599:libxl__ao_complete_check_progress_reports: ao 0x7fb1d0001e50: complete for callback > libxl: debug: libxl_event.c:1173:egc_run_callbacks: ao 0x7fb1d0001e50: completion callback > libxl: debug: libxl_event.c:1541:libxl__ao__destroy: ao 0x7fb1d0001e50: destroyHere we see the callback being made. That message "completion callback" comes from this code in egc_run_callbacks: LOG(DEBUG,"ao %p: completion callback", ao); ao->how.callback(CTX, ao->rc, ao->how.u.for_callback); So this debug trace appears to show the callback actually happening. What makes you think it isn''t ? Thanks, Ian.
Ian Jackson
2013-Nov-12 18:27 UTC
[PATCH v2 0/3] libxl: event system: SIGCHLD related fixes
libxl''s SIGCHLD handler is broken if libxl_sigchld_owner_libxl_always and does not always work correctly with the fd registration system. 1/3 libxl: event system: pass gc, not just ctx, to internal 2/3 libxl: event system: Make 3/3 libxl: event system: properly register the SIGCHLD This is v2 of this series because when I reviewed it myself I found that there was a critical missing ! in perhaps_removehandler, which would remove the handler iff it was needed. IMO this shows the weakness of my testing arrangements. I''d really appreciate some feedback on this from Bamvor or Jim. Thanks, Ian.
Ian Jackson
2013-Nov-12 18:27 UTC
[PATCH 1/3] libxl: event system: pass gc, not just ctx, to internal sigchld manipulators
We are going to want the gc for libxl__ev_fd_register. No functional change in this patch. Simply change the argument types, and the actual arguments from ctx to gc. Inside these functions, use CTX (the macro which uses gc) rather than the old formal parameter ctx. Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> --- tools/libxl/libxl.c | 2 +- tools/libxl/libxl_fork.c | 36 ++++++++++++++++++------------------ tools/libxl/libxl_internal.h | 4 ++-- 3 files changed, 21 insertions(+), 21 deletions(-) diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c index 0de1112..2ebba98 100644 --- a/tools/libxl/libxl.c +++ b/tools/libxl/libxl.c @@ -171,7 +171,7 @@ int libxl_ctx_free(libxl_ctx *ctx) /* If we have outstanding children, then the application inherits * them; we wish the application good luck with understanding * this if and when it reaps them. */ - libxl__sigchld_removehandler(ctx); + libxl__sigchld_removehandler(gc); if (ctx->sigchld_selfpipe[0] >= 0) { close(ctx->sigchld_selfpipe[0]); diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c index b6f0b2d..ae6184c 100644 --- a/tools/libxl/libxl_fork.c +++ b/tools/libxl/libxl_fork.c @@ -175,23 +175,23 @@ static void sigchld_removehandler_core(void) sigchld_owner = 0; } -void libxl__sigchld_removehandler(libxl_ctx *ctx) /* non-reentrant */ +void libxl__sigchld_removehandler(libxl__gc *gc) /* non-reentrant */ { atfork_lock(); - if (sigchld_owner == ctx) + if (sigchld_owner == CTX) sigchld_removehandler_core(); atfork_unlock(); } -int libxl__sigchld_installhandler(libxl_ctx *ctx) /* non-reentrant */ +int libxl__sigchld_installhandler(libxl__gc *gc) /* non-reentrant */ { int r, rc; - if (ctx->sigchld_selfpipe[0] < 0) { - r = pipe(ctx->sigchld_selfpipe); + if (CTX->sigchld_selfpipe[0] < 0) { + r = pipe(CTX->sigchld_selfpipe); if (r) { - ctx->sigchld_selfpipe[0] = -1; - LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, + CTX->sigchld_selfpipe[0] = -1; + LIBXL__LOG_ERRNO(CTX, LIBXL__LOG_ERROR, "failed to create sigchld pipe"); rc = ERROR_FAIL; goto out; @@ -199,11 +199,11 @@ int libxl__sigchld_installhandler(libxl_ctx *ctx) /* non-reentrant */ } atfork_lock(); - if (sigchld_owner != ctx) { + if (sigchld_owner != CTX) { struct sigaction ours; assert(!sigchld_owner); - sigchld_owner = ctx; + sigchld_owner = CTX; memset(&ours,0,sizeof(ours)); ours.sa_handler = sigchld_handler; @@ -241,11 +241,11 @@ int libxl__fork_selfpipe_active(libxl_ctx *ctx) return ctx->sigchld_selfpipe[0]; } -static void perhaps_removehandler(libxl_ctx *ctx) +static void perhaps_removehandler(libxl__gc *gc) { - if (LIBXL_LIST_EMPTY(&ctx->children) && - ctx->childproc_hooks->chldowner != libxl_sigchld_owner_libxl_always) - libxl__sigchld_removehandler(ctx); + if (LIBXL_LIST_EMPTY(&CTX->children) && + CTX->childproc_hooks->chldowner != libxl_sigchld_owner_libxl_always) + libxl__sigchld_removehandler(gc); } static int childproc_reaped(libxl__egc *egc, pid_t pid, int status) @@ -265,7 +265,7 @@ static int childproc_reaped(libxl__egc *egc, pid_t pid, int status) ch->pid = -1; ch->callback(egc, ch, pid, status); - perhaps_removehandler(CTX); + perhaps_removehandler(gc); return 0; } @@ -333,7 +333,7 @@ pid_t libxl__ev_child_fork(libxl__gc *gc, libxl__ev_child *ch, int rc; if (chldmode_ours(CTX)) { - rc = libxl__sigchld_installhandler(CTX); + rc = libxl__sigchld_installhandler(gc); if (rc) goto out; } @@ -363,7 +363,7 @@ pid_t libxl__ev_child_fork(libxl__gc *gc, libxl__ev_child *ch, rc = pid; out: - perhaps_removehandler(CTX); + perhaps_removehandler(gc); CTX_UNLOCK; return rc; } @@ -385,10 +385,10 @@ void libxl_childproc_setmode(libxl_ctx *ctx, const libxl_childproc_hooks *hooks, switch (ctx->childproc_hooks->chldowner) { case libxl_sigchld_owner_mainloop: case libxl_sigchld_owner_libxl: - libxl__sigchld_removehandler(ctx); + libxl__sigchld_removehandler(gc); break; case libxl_sigchld_owner_libxl_always: - libxl__sigchld_installhandler(ctx); + libxl__sigchld_installhandler(gc); break; default: abort(); diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h index 4ad6ad3..d567a10 100644 --- a/tools/libxl/libxl_internal.h +++ b/tools/libxl/libxl_internal.h @@ -838,8 +838,8 @@ _hidden void libxl__poller_wakeup(libxl__egc *egc, libxl__poller *p); /* Internal to fork and child reaping machinery */ extern const libxl_childproc_hooks libxl__childproc_default_hooks; -int libxl__sigchld_installhandler(libxl_ctx *ctx); /* non-reentrant;logs errs */ -void libxl__sigchld_removehandler(libxl_ctx *ctx); /* non-reentrant */ +int libxl__sigchld_installhandler(libxl__gc*); /* non-reentrant; logs errs */ +void libxl__sigchld_removehandler(libxl__gc*); /* non-reentrant */ int libxl__fork_selfpipe_active(libxl_ctx *ctx); /* returns read fd or -1 */ void libxl__fork_selfpipe_woken(libxl__egc *egc); int libxl__self_pipe_wakeup(int fd); /* returns 0 or -1 setting errno */ -- 1.7.10.4
Ian Jackson
2013-Nov-12 18:27 UTC
[PATCH 2/3] libxl: event system: Make libxl_sigchld_owner_libxl_always work
Previously, libxl_sigchld_owner_libxl_always was mishandled. It would result in libxl paying no attention to the sigchld self pipe. Fix this by fixing chldmode_ours so that it returns true iff we are supposed to be handling SIGCHLD. Additionally, we arrange to use chldmode_ours everywhere where we are installing/removing signal handlers and/or deciding whether to check the self pipe, etc. This means it needs a new "creating" flag argument for the benefit of libxl__ev_child_fork, which needs to install the signal handler in libxl_sigchld_owner_libxl even if there are not currently any children. ctx->childproc_hooks->chldowner is now interpreted only by chldmode_ours. Reported-by: Bamvor Jian Zhang <bjzhang@suse.com> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Cc: Bamvor Jian Zhang <bjzhang@suse.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Jim Fehlig <jfehlig@suse.com> --- v2: Get sense of chldmode test right in perhaps_removehandler (!) --- tools/libxl/libxl_fork.c | 51 ++++++++++++++++++++++++---------------------- 1 file changed, 27 insertions(+), 24 deletions(-) diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c index ae6184c..a581763 100644 --- a/tools/libxl/libxl_fork.c +++ b/tools/libxl/libxl_fork.c @@ -224,18 +224,23 @@ int libxl__sigchld_installhandler(libxl__gc *gc) /* non-reentrant */ return rc; } -static int chldmode_ours(libxl_ctx *ctx) +static bool chldmode_ours(libxl_ctx *ctx, bool creating) { - return ctx->childproc_hooks->chldowner == libxl_sigchld_owner_libxl; + switch (ctx->childproc_hooks->chldowner) { + case libxl_sigchld_owner_libxl: + return creating || !LIBXL_LIST_EMPTY(&ctx->children); + case libxl_sigchld_owner_mainloop: + return 0; + case libxl_sigchld_owner_libxl_always: + return 1; + } + abort(); } int libxl__fork_selfpipe_active(libxl_ctx *ctx) { /* Returns the fd to read, or -1 */ - if (!chldmode_ours(ctx)) - return -1; - - if (LIBXL_LIST_EMPTY(&ctx->children)) + if (!chldmode_ours(ctx, 0)) return -1; return ctx->sigchld_selfpipe[0]; @@ -243,11 +248,21 @@ int libxl__fork_selfpipe_active(libxl_ctx *ctx) static void perhaps_removehandler(libxl__gc *gc) { - if (LIBXL_LIST_EMPTY(&CTX->children) && - CTX->childproc_hooks->chldowner != libxl_sigchld_owner_libxl_always) + if (!chldmode_ours(CTX, 0)) libxl__sigchld_removehandler(gc); } +static int perhaps_installhandler(libxl__gc *gc, bool creating) +{ + int rc; + + if (chldmode_ours(CTX, creating)) { + rc = libxl__sigchld_installhandler(gc); + if (rc) return rc; + } + return 0; +} + static int childproc_reaped(libxl__egc *egc, pid_t pid, int status) { EGC_GC; @@ -286,7 +301,7 @@ void libxl__fork_selfpipe_woken(libxl__egc *egc) * ctx must be locked EXACTLY ONCE */ EGC_GC; - while (chldmode_ours(CTX) /* in case the app changes the mode */) { + while (chldmode_ours(CTX, 0) /* in case the app changes the mode */) { int status; pid_t pid = waitpid(-1, &status, WNOHANG); @@ -332,10 +347,7 @@ pid_t libxl__ev_child_fork(libxl__gc *gc, libxl__ev_child *ch, CTX_LOCK; int rc; - if (chldmode_ours(CTX)) { - rc = libxl__sigchld_installhandler(gc); - if (rc) goto out; - } + perhaps_installhandler(gc, 1); pid_t pid CTX->childproc_hooks->fork_replacement @@ -382,17 +394,8 @@ void libxl_childproc_setmode(libxl_ctx *ctx, const libxl_childproc_hooks *hooks, ctx->childproc_hooks = hooks; ctx->childproc_user = user; - switch (ctx->childproc_hooks->chldowner) { - case libxl_sigchld_owner_mainloop: - case libxl_sigchld_owner_libxl: - libxl__sigchld_removehandler(gc); - break; - case libxl_sigchld_owner_libxl_always: - libxl__sigchld_installhandler(gc); - break; - default: - abort(); - } + perhaps_removehandler(gc); + perhaps_installhandler(gc, 0); /* idempotent, ok to ignore errors for now */ CTX_UNLOCK; GC_FREE; -- 1.7.10.4
Ian Jackson
2013-Nov-12 18:27 UTC
[PATCH 3/3] libxl: event system: properly register the SIGCHLD self-pipe
An application which uses libxl_osevent_register_hooks, and doesn''t use libxl_sigchld_owner_mainloop, would never properly experience the deaths of its (libxl) children. This is because the SIGCHLD self pipe would be handled using ad-hoc code in beforepoll_internal and afterpoll_internal. However, this is no good if application is using the fd registration system instead; in that case these functions would not be called and nothing would deal with readability of the self pipe. Fix this as follows: The SIGCHLD self pipe now is now dealt with by a standard libxl__ev_fd handler, which is registered and deregistered along with the SIGCHLD handler itself. The handler function subsumes the ad-hoc response code removed from beforepoll_internal, and the functionality of libxl__fork_selfpipe_woken. This is also tidier as the SIGCHLD self pipe is no longer touched outside libxl_fork.c other than in ctx initialisation and teardown. (The ad-hoc arrangements for poller->wakeup_pipe in beforepoll_internal and afterpoll_internal are OK because the libxl__poller mechanism exists to wake up threads which are sitting inside libxl''s poll loop, so is not applicable to the application''s event loop.) Reported-by: Bamvor Jian Zhang <bjzhang@suse.com> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Cc: Bamvor Jian Zhang <bjzhang@suse.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Jim Fehlig <jfehlig@suse.com> --- tools/libxl/libxl.c | 2 ++ tools/libxl/libxl_event.c | 12 ---------- tools/libxl/libxl_fork.c | 50 ++++++++++++++++++++++++++++++++---------- tools/libxl/libxl_internal.h | 3 +-- 4 files changed, 42 insertions(+), 25 deletions(-) diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c index 2ebba98..3efc564 100644 --- a/tools/libxl/libxl.c +++ b/tools/libxl/libxl.c @@ -65,6 +65,7 @@ int libxl_ctx_alloc(libxl_ctx **pctx, int version, ctx->childproc_user = 0; ctx->sigchld_selfpipe[0] = -1; + libxl__ev_fd_init(&ctx->sigchld_selfpipe_efd); /* The mutex is special because we can''t idempotently destroy it */ @@ -146,6 +147,7 @@ int libxl_ctx_free(libxl_ctx *ctx) for (i = 0; i < ctx->watch_nslots; i++) assert(!libxl__watch_slot_contents(gc, i)); libxl__ev_fd_deregister(gc, &ctx->watch_efd); + libxl__ev_fd_deregister(gc, &ctx->sigchld_selfpipe_efd); /* Now there should be no more events requested from the application: */ diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c index 54d15db..bdef7ac 100644 --- a/tools/libxl/libxl_event.c +++ b/tools/libxl/libxl_event.c @@ -774,10 +774,6 @@ static int beforepoll_internal(libxl__gc *gc, libxl__poller *poller, \ REQUIRE_FD(poller->wakeup_pipe[0], POLLIN, BODY); \ \ - int selfpipe = libxl__fork_selfpipe_active(CTX); \ - if (selfpipe >= 0) \ - REQUIRE_FD(selfpipe, POLLIN, BODY); \ - \ }while(0) #define REQUIRE_FD(req_fd_, req_events_, BODY) do{ \ @@ -999,14 +995,6 @@ static void afterpoll_internal(libxl__egc *egc, libxl__poller *poller, if (e) LIBXL__EVENT_DISASTER(egc, "read wakeup", e, 0); } - int selfpipe = libxl__fork_selfpipe_active(CTX); - if (selfpipe >= 0 && - afterpoll_check_fd(poller,fds,nfds, selfpipe, POLLIN)) { - int e = libxl__self_pipe_eatall(selfpipe); - if (e) LIBXL__EVENT_DISASTER(egc, "read sigchld pipe", e, 0); - libxl__fork_selfpipe_woken(egc); - } - for (;;) { libxl__ev_time *etime = LIBXL_TAILQ_FIRST(&CTX->etimes); if (!etime) diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c index a581763..4ae9f94 100644 --- a/tools/libxl/libxl_fork.c +++ b/tools/libxl/libxl_fork.c @@ -155,6 +155,9 @@ int libxl__carefd_fd(const libxl__carefd *cf) * Actual child process handling */ +static void sigchld_selfpipe_handler(libxl__egc *egc, libxl__ev_fd *ev, + int fd, short events, short revents); + static void sigchld_handler(int signo) { int esave = errno; @@ -177,10 +180,18 @@ static void sigchld_removehandler_core(void) void libxl__sigchld_removehandler(libxl__gc *gc) /* non-reentrant */ { + int rc; + atfork_lock(); if (sigchld_owner == CTX) sigchld_removehandler_core(); atfork_unlock(); + + if (libxl__ev_fd_isregistered(&CTX->sigchld_selfpipe_efd)) { + rc = libxl__ev_fd_modify(gc, &CTX->sigchld_selfpipe_efd, 0); + if (rc) + libxl__ev_fd_deregister(gc, &CTX->sigchld_selfpipe_efd); + } } int libxl__sigchld_installhandler(libxl__gc *gc) /* non-reentrant */ @@ -197,6 +208,15 @@ int libxl__sigchld_installhandler(libxl__gc *gc) /* non-reentrant */ goto out; } } + if (!libxl__ev_fd_isregistered(&CTX->sigchld_selfpipe_efd)) { + rc = libxl__ev_fd_register(gc, &CTX->sigchld_selfpipe_efd, + sigchld_selfpipe_handler, + CTX->sigchld_selfpipe[0], POLLIN); + if (rc) goto out; + } else { + rc = libxl__ev_fd_modify(gc, &CTX->sigchld_selfpipe_efd, POLLIN); + if (rc) goto out; + } atfork_lock(); if (sigchld_owner != CTX) { @@ -237,15 +257,6 @@ static bool chldmode_ours(libxl_ctx *ctx, bool creating) abort(); } -int libxl__fork_selfpipe_active(libxl_ctx *ctx) -{ - /* Returns the fd to read, or -1 */ - if (!chldmode_ours(ctx, 0)) - return -1; - - return ctx->sigchld_selfpipe[0]; -} - static void perhaps_removehandler(libxl__gc *gc) { if (!chldmode_ours(CTX, 0)) @@ -295,12 +306,29 @@ int libxl_childproc_reaped(libxl_ctx *ctx, pid_t pid, int status) return rc; } -void libxl__fork_selfpipe_woken(libxl__egc *egc) +static void sigchld_selfpipe_handler(libxl__egc *egc, libxl__ev_fd *ev, + int fd, short events, short revents) { /* May make callbacks into the application for child processes. - * ctx must be locked EXACTLY ONCE */ + * So, this function may unlock and relock the CTX. This is OK + * because event callback functions are always called with the CTX + * locked exactly once, and from code which copes with reentrancy. + * (See also the comment in afterpoll_internal.) */ EGC_GC; + int selfpipe = CTX->sigchld_selfpipe[0]; + + if (revents & ~POLLIN) { + LOG(ERROR, "unexpected poll event 0x%x on SIGCHLD self pipe", revents); + LIBXL__EVENT_DISASTER(egc, + "unexpected poll event on SIGCHLD self pipe", + 0, 0); + } + assert(revents & POLLIN); + + int e = libxl__self_pipe_eatall(selfpipe); + if (e) LIBXL__EVENT_DISASTER(egc, "read sigchld pipe", e, 0); + while (chldmode_ours(CTX, 0) /* in case the app changes the mode */) { int status; pid_t pid = waitpid(-1, &status, WNOHANG); diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h index d567a10..b1f5f81 100644 --- a/tools/libxl/libxl_internal.h +++ b/tools/libxl/libxl_internal.h @@ -353,6 +353,7 @@ struct libxl__ctx { const libxl_childproc_hooks *childproc_hooks; void *childproc_user; int sigchld_selfpipe[2]; /* [0]==-1 means handler not installed */ + libxl__ev_fd sigchld_selfpipe_efd; LIBXL_LIST_HEAD(, libxl__ev_child) children; libxl_version_info version_info; @@ -840,8 +841,6 @@ _hidden void libxl__poller_wakeup(libxl__egc *egc, libxl__poller *p); extern const libxl_childproc_hooks libxl__childproc_default_hooks; int libxl__sigchld_installhandler(libxl__gc*); /* non-reentrant; logs errs */ void libxl__sigchld_removehandler(libxl__gc*); /* non-reentrant */ -int libxl__fork_selfpipe_active(libxl_ctx *ctx); /* returns read fd or -1 */ -void libxl__fork_selfpipe_woken(libxl__egc *egc); int libxl__self_pipe_wakeup(int fd); /* returns 0 or -1 setting errno */ int libxl__self_pipe_eatall(int fd); /* returns 0 or -1 setting errno */ -- 1.7.10.4
Ian Campbell
2013-Nov-13 10:02 UTC
Re: [PATCH v2 0/3] libxl: event system: SIGCHLD related fixes
On Tue, 2013-11-12 at 18:27 +0000, Ian Jackson wrote:> libxl''s SIGCHLD handler is broken if libxl_sigchld_owner_libxl_always > and does not always work correctly with the fd registration system. > > 1/3 libxl: event system: pass gc, not just ctx, to internal > 2/3 libxl: event system: Make > 3/3 libxl: event system: properly register the SIGCHLD > > This is v2 of this series because when I reviewed it myself I found > that there was a critical missing ! in perhaps_removehandler, which > would remove the handler iff it was needed. IMO this shows the > weakness of my testing arrangements.Sorry for also missing this. FWIW my ack from last time can still apply. Ian.
Jim Fehlig
2013-Nov-20 04:28 UTC
Re: [PATCH v2 0/3] libxl: event system: SIGCHLD related fixes
Ian Jackson wrote:> libxl''s SIGCHLD handler is broken if libxl_sigchld_owner_libxl_always > and does not always work correctly with the fd registration system. > > 1/3 libxl: event system: pass gc, not just ctx, to internal > 2/3 libxl: event system: Make > 3/3 libxl: event system: properly register the SIGCHLD > > This is v2 of this series because when I reviewed it myself I found > that there was a critical missing ! in perhaps_removehandler, which > would remove the handler iff it was needed. IMO this shows the > weakness of my testing arrangements. > > I''d really appreciate some feedback on this from Bamvor or Jim. >Hrm, I haven''t followed this thread and am not familiar with the details... Bamvor, have you reviewed/tested Ian''s patches to see if they resolve the issue you originally raised in this thread? Thanks, Jim
Bamvor Jian Zhang
2013-Nov-20 15:22 UTC
答复: Re: [PATCH v2 0/3] libxl: event system: SIGCHLD related fixes
hi,>>> Jim Fehlig 2013-11-20 下午 12:29 >>> > Ian Jackson wrote: > > libxl's SIGCHLD handler is broken if libxl_sigchld_owner_libxl_always > > and does not always work correctly with the fd registration system. > > > > 1/3 libxl: event system: pass gc, not just ctx, to internal > > 2/3 libxl: event system: Make > > 3/3 libxl: event system: properly register the SIGCHLD > > > > This is v2 of this series because when I reviewed it myself I found > > that there was a critical missing ! in perhaps_removehandler, which > > would remove the handler iff it was needed. IMO this shows the > > weakness of my testing arrangements. > > > > I'd really appreciate some feedback on this from Bamvor or Jim. > >> Hrm, I haven't followed this thread and am not familiar with the details... > > Bamvor, have you reviewed/tested Ian's patches to see if they resolve > the issue you originally raised in this thread?sorry for late response. i write a patch in libvirt in which use ao_how without callback and let libxl manage the child. so fat i test create, save, restore successful. there is a dead-lock in libvirt when i issue shutdown, i will deal with it tomorrow. bamvor> Thanks, > Jim_______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Ian Jackson
2013-Nov-20 16:40 UTC
Re: 答复: Re: [PAT CH v2 0/3] libxl: event system: SIGCHLD related fixes
Bamvor Jian Zhang writes ("答复: Re: [Xen-devel] [PAT CH v2 0/3] libxl: event system: SIGCHLD related fixes"):> Jim Fehlig 2013-11-20 下午 12:29 >>> > > Bamvor, have you reviewed/tested Ian's patches to see if they resolve > > the issue you originally raised in this thread? > > sorry for late response. > i write a patch in libvirt in which use ao_how without callback and let libxl > manage the child.Err, OK. I think, though, that at least some of what you reported was a genuine libxl bug. So perhaps you didn't need to change libvirt. Thanks, Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Ian Jackson
2013-Nov-21 18:49 UTC
Re: [PATCH v2 0/3] libxl: event system: SIGCHLD related fixes
Ian Campbell writes ("Re: [PATCH v2 0/3] libxl: event system: SIGCHLD related fixes"):> On Tue, 2013-11-12 at 18:27 +0000, Ian Jackson wrote: > > libxl''s SIGCHLD handler is broken if libxl_sigchld_owner_libxl_always > > and does not always work correctly with the fd registration system. > > > > 1/3 libxl: event system: pass gc, not just ctx, to internal > > 2/3 libxl: event system: Make > > 3/3 libxl: event system: properly register the SIGCHLD > > > > This is v2 of this series because when I reviewed it myself I found > > that there was a critical missing ! in perhaps_removehandler, which > > would remove the handler iff it was needed. IMO this shows the > > weakness of my testing arrangements. > > Sorry for also missing this. FWIW my ack from last time can still apply.Well, in the absence of more information, and after in-person confirmation that this (attempted) bugfix would be better exposed for testing in the tree rather than floating about in emails, I have pushed these three patches. Thanks, ian.