On Fri, Jul 6, 2012 at 3:50 PM, Ian Jackson <Ian.Jackson@eu.citrix.com>
wrote:> Roger reported to me (pers.comm.) this crash on NetBSD:
>
> Program terminated with signal 11, Segmentation fault.
> #0 0x00007f7ff743131b in afterpoll_check_fd (poller=<optimized
out>, fds=0x7f7ff7b241c0, nfds=7, fd=-1, events=1)
> at libxl_event.c:856
> 856 if (fds[slot].fd != fd)
>
> More information:
> http://dpaste.org/E5sw2/raw/
> http://dpaste.org/7zzRT/raw/ (with #define DEBUG in libxl_event.c)
> http://dpaste.org/jeCRG/raw/ (strace, seems a bit broken)
This strace trace should not be used (see below).
> There are, I think, at least three things going wrong here:
>
> - Firstly, the bootloader is failing for some unknown reason. I
> think this is (first) manifesting on NetBSD as POLLHUP on the pty
> master fd. I don''t know why this might be but a good way to
find
> out would be to run the bootloader by hand. The debug log lists
> the arguments it''s been given and you can just run it by hand
and
> see what it does.
The bootloader itself seems to run fine, at least I''ve been able to
run it successfully using the command line spitted by xl.
> - Secondly, this causes a crash in the libxl event handling
> machinery. This is due to a mishandled reentrancy hazard. Patch
> below.
>
> - Thirdly, for some unknown reason libxl is not able to kill and
> reap the bootloader child process:
> libxl: warning: libxl_bootloader.c:292:bootloader_abort:
> after failure, failed to kill bootloader [16432]: No such process
>
> This is a mystery to me. I asked for a system call trace to try to
> see what libxl was actually doing but unfortunately strace on
> Roger''s NetBSD install doesn''t seem to work. ISTR
NetBSD having
> some other system instead which might produce better information.
Yes, I should have used ktrace/ktruss, and I will do so :). Once I''m
able to obtain a proper trace I will comment on the options below,
right now this is also a mystery to me.
> It seems to me that libxl ought to be able to kill() an unreaped
> child so I looked to see whether the child might have been reaped.
> There are only three calls to wait* functions in xl/libxl. The one
> in xl.c can be ruled out because we are constantly within libxl
> rather than libxl, and all the callbacks into xl are accompanied by
> debugging output which we do not see in the trace. The one in
> libxl_exec.c is in the middle process of libxl__spawn_spawn and can
> be disregarded.
>
> That leaves the one in libxl__fork_selfpipe_woken. This is not
> invariably accompanied by debugging output so it is possible that
> something has gone wrong with the handling of the bootloader''s
> death. However looking at the flow after waitpid, in
> childproc_reaped, there seem to be these possibilities:
>
> - the child was not found, in which case childproc_reaped
> returns ERROR_UNKNOWN_CHILD, and we end up calling
> xl_reaped_callback which will (we hope!) also give
> ERROR_UNKNOWN_CHILD, so that we will call
> libxl_report_child_exitstatus complaining about the unknown
> child. We don''t see this in the trace.
>
> - the child was found and is still associated with the bootloader
> callback, in which case we call bootloader_finished which
> also always logs.
>
> - somehow we call some other function which throws the
> information awaw. I think this is unlikely.
>
> I therefore conclude that the child has died but been auto-reaped
> somehow. Perhaps SIGCHLD is set to SIG_IGN ? That would be Evil,
> Bad and Wrong.
>
> I think a proper system call trace will be necessary to resolve this.
>
> Naturally it is important not to fix the first of these problems
> before debugging the error handling :-).
>
> Thanks,
> Ian.
>
>
> Here is a patch to reproduce the problem on Linux. touch the file
> "boom" while the bootloader is running. You need to run the
whole
> thing under valgrind to spot the illegal access.
>
> diff --git a/tools/libxl/libxl_aoutils.c b/tools/libxl/libxl_aoutils.c
> index 99972a2..f7df0b1 100644
> --- a/tools/libxl/libxl_aoutils.c
> +++ b/tools/libxl/libxl_aoutils.c
> @@ -102,6 +102,9 @@ static void datacopier_readable(libxl__egc *egc,
libxl__ev_fd *ev,
> libxl__datacopier_state *dc = CONTAINER_OF(ev, *dc, toread);
> STATE_AO_GC(dc->ao);
>
> +if (!access("boom",R_OK))
> + revents = 0x10;
> +
> if (revents & ~POLLIN) {
> LOG(ERROR, "unexpected poll event 0x%x (should be
POLLIN)"
> " on %s during copy of %s", revents,
dc->readwhat, dc->copywhat);
>
> Doing this does not (for me, at least), seem to lose track of the
> bootloader process. Instead, I see (when combined with the patch
> below):
>
> libxl: error: libxl_aoutils.c:110:datacopier_readable: unexpected poll
event 0x10 (should be POLLIN) on bootloader pty during copy o bootloader output
for domain 22
> libxl: debug: libxl_event.c:120:libxl__ev_fd_deregister: ev_fd=0x42b13a8
deregister fd=13
> libxl: debug: libxl_event.c:116:libxl__ev_fd_deregister: ev_fd=0x42b13c0
deregister unregistered
> libxl: debug: libxl_event.c:120:libxl__ev_fd_deregister: ev_fd=0x42b1348
deregister fd=17
> libxl: debug: libxl_event.c:116:libxl__ev_fd_deregister: ev_fd=0x42b1360
deregister unregistered
> libxl: debug: libxl_event.c:116:libxl__ev_fd_deregister: ev_fd=0x42b13a8
deregister unregistered
> libxl: debug: libxl_event.c:116:libxl__ev_fd_deregister: ev_fd=0x42b13c0
deregister unregistered
> libxl: debug: libxl_event.c:1610:libxl__ao_inprogress: ao 0x42b1038: not
ready, waiting
> libxl: debug: libxl_event.c:116:libxl__ev_fd_deregister: ev_fd=0x42b1348
deregister unregistered
> libxl: debug: libxl_event.c:116:libxl__ev_fd_deregister: ev_fd=0x42b1360
deregister unregistered
> libxl: debug: libxl_event.c:116:libxl__ev_fd_deregister: ev_fd=0x42b13a8
deregister unregistered
> libxl: debug: libxl_event.c:116:libxl__ev_fd_deregister: ev_fd=0x42b13c0
deregister unregistered
> libxl: error: libxl_bootloader.c:557:bootloader_finished: bootloader failed
- consult logfile /var/log/xen/bootloader.22.log
> libxl: error: libxl_exec.c:129:libxl_report_child_exitstatus: bootloader
[3615] died due to fatal signal Terminated
>
>
> The fix I have come up with is below.
>
> Ian.
>
>
> Subject: [PATCH] libxl: Deal properly with reentrant changes to fd events
>
> In afterpoll_internal, the callback functions may register and
> deregister events arbitrarily. This means that we need to consider
> the reentrancy-safety of the event machinery state variables.
>
> Most of the code is safe but the fd handling is not. Fix this by
> arranging to restart the fd scan loop every time we call one of these
> callback functions.
>
> Another possible solution would be simply to return from
> afterpoll_internal after calling efd->func. That would be a small and
> more obviously correct change but would prevent the process from
> handling more than one fd event with a single call to poll.
>
> Reported-by: Roger Pau Monne <roger.pau@citrix.com>
> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Thanks for the patch, I will try it as soon as I can and report back.
>
> diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
> index 1957505..991261e 100644
> --- a/tools/libxl/libxl_event.c
> +++ b/tools/libxl/libxl_event.c
> @@ -838,15 +838,19 @@ int libxl_osevent_beforepoll(libxl_ctx *ctx, int
*nfds_io,
>
> static int afterpoll_check_fd(libxl__poller *poller,
> const struct pollfd *fds, int nfds,
> - int fd, int events)
> + int fd, int events,
> + int *(rindices[3]))
> /* returns mask of events which were requested and occurred */
> {
> if (fd >= poller->fd_rindices_allocd)
> /* added after we went into poll, have to try again */
> return 0;
>
> + events |= POLLERR | POLLHUP;
> +
> int i, revents = 0;
> for (i=0; i<3; i++) {
> + if (rindices) rindices[i] = 0;
> int slot = poller->fd_rindices[fd][i];
>
> if (slot >= nfds)
> @@ -858,11 +862,16 @@ static int afterpoll_check_fd(libxl__poller *poller,
> continue;
>
> assert(!(fds[slot].revents & POLLNVAL));
> - revents |= fds[slot].revents;
> - }
>
> - /* we mask in case requested events have changed */
> - revents &= (events | POLLERR | POLLHUP);
> + /* we mask in case requested events have changed */
> + int slot_revents = fds[slot].revents & events;
> + if (!slot_revents)
> + /* this slot is for a different set of events */
> + continue;
> +
> + revents |= slot_revents;
> + if (rindices) rindices[i] = &poller->fd_rindices[fd][i];
> + }
>
> return revents;
> }
> @@ -876,26 +885,78 @@ static void afterpoll_internal(libxl__egc *egc,
libxl__poller *poller,
> EGC_GC;
> libxl__ev_fd *efd;
>
> - LIBXL_LIST_FOREACH(efd, &CTX->efds, entry) {
> - if (!efd->events)
> - continue;
> + /*
> + * Warning! Reentrancy hazards!
> + *
> + * Many parts of this function eventually call arbitrary callback
> + * functions which may modify the event handling data structures.
> + *
> + * Of the data structures used here:
> + *
> + * egc, poller, now
> + * are allocated by our caller and relate to the
> + * current thread and its call stack down into the
> + * event machinery; it is not freed until we return.
> + * So it is safe.
> + *
> + * fds is either what application passed into
> + * libxl_osevent_afterpoll (which, although this
> + * isn''t explicitly stated, clearly must
remain
> + * valid until libxl_osevent_afterpoll returns) or
> + * it''s poller->fd_polls which is modified
only by
> + * our (non-recursive) caller eventloop_iteration.
> + *
> + * CTX comes from our caller, and applications are
> + * forbidden from destroying it while we are running.
> + * So the ctx pointer itself is safe to use; now
> + * for its contents:
> + *
> + * CTX->etimes is used in a simple reentrancy-safe manner.
> + *
> + * CTX->efds is more complicated; see below.
> + */
>
> - int revents = afterpoll_check_fd(poller,fds,nfds,
efd->fd,efd->events);
> - if (revents) {
> - DBG("ev_fd=%p occurs fd=%d events=%x revents=%x",
> - efd, efd->fd, efd->events, revents);
> - efd->func(egc, efd, efd->fd, efd->events, revents);
> + for (;;) {
> + /* We restart our scan of fd events whenever we call a
> + * callback function. This is necessary because such
> + * a callback might make arbitrary changes to CTX->efds.
> + * We invalidate the fd_rindices[] entries which were used
> + * so that we don''t call the same function again. */
> + int revents, i, *(rindices[3]);
> +
> + LIBXL_LIST_FOREACH(efd, &CTX->efds, entry) {
> +
> + if (!efd->events)
> + continue;
> +
> + revents = afterpoll_check_fd(poller,fds,nfds,
> + efd->fd,efd->events,
> + rindices);
> + if (revents)
> + goto found_fd_event;
> }
> + /* no ordinary fd events, then */
> + break;
> +
> + found_fd_event:
> + DBG("ev_fd=%p occurs fd=%d events=%x revents=%x",
> + efd, efd->fd, efd->events, revents);
> +
> + efd->func(egc, efd, efd->fd, efd->events, revents);
> +
> + for (i=0; i<3; i++)
> + if (rindices[i])
> + *(rindices[i]) = INT_MAX;
> }
>
> - if (afterpoll_check_fd(poller,fds,nfds,
poller->wakeup_pipe[0],POLLIN)) {
> + if (afterpoll_check_fd(poller,fds,nfds,
poller->wakeup_pipe[0],POLLIN, 0)) {
> int e = libxl__self_pipe_eatall(poller->wakeup_pipe[0]);
> if (e) LIBXL__EVENT_DISASTER(egc, "read wakeup", e, 0);
> }
>
> int selfpipe = libxl__fork_selfpipe_active(CTX);
> if (selfpipe >= 0 &&
> - afterpoll_check_fd(poller,fds,nfds, selfpipe, POLLIN)) {
> + afterpoll_check_fd(poller,fds,nfds, selfpipe, POLLIN, 0)) {
> int e = libxl__self_pipe_eatall(selfpipe);
> if (e) LIBXL__EVENT_DISASTER(egc, "read sigchld pipe",
e, 0);
> libxl__fork_selfpipe_woken(egc);
> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index 7a655b3..647cc7c 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -278,7 +278,7 @@ struct libxl__poller {
> int fd_polls_allocd;
>
> int fd_rindices_allocd;
> - int (*fd_rindices)[3]; /* see libxl_osevent_beforepoll */
> + int (*fd_rindices)[3]; /* see libxl_event.c:beforepoll_internal */
>
> int wakeup_pipe[2]; /* 0 means no fd allocated */
> };
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel