Richard W.M. Jones
2019-Aug-27 11:47 UTC
[Libguestfs] [PATCH nbdkit] sh: Remove assert and replace with smarter file descriptor duplication. (was: Re: [nbdkit PATCH v2 14/17] sh: Use pipe2 with CLOEXEC when possible)
On Fri, Aug 02, 2019 at 02:26:15PM -0500, Eric Blake wrote:> + /* Ensure that stdin/out/err of the current process were not empty > + * before we started creating pipes (otherwise, the close and dup2 > + * calls below become more complex to juggle fds around correctly). > + */ > + assert (in_fd[0] > STDERR_FILENO && in_fd[1] > STDERR_FILENO && > + out_fd[0] > STDERR_FILENO && out_fd[1] > STDERR_FILENO && > + err_fd[0] > STDERR_FILENO && err_fd[1] > STDERR_FILENO);This assert is now being thrown whenever we use nbdkit + sh plugin + nbdkit -s option. It's easy to demonstrate using a libnbd one-liner: $ nbdsh -c 'h.connect_command (["nbdkit", "sh", "/tmp/min.sh", "-s", "--exit-with-parent"])' \ -c 'print ("%r" % h.get_size())' 1048576 nbdkit: call.c:155: call3: Assertion `in_fd[0] > STDERR_FILENO && in_fd[1] > STDERR_FILENO && out_fd[0] > STDERR_FILENO && out_fd[1] > STDERR_FILENO && err_fd[0] > STDERR_FILENO && err_fd[1] > STDERR_FILENO' failed. (This is triggered frequently by tests in the libnbd test suite which is where I first observed it.) It happens during the shutdown path where there is one final call to sh_close in the plugin: #0 0x00007fc5d929c615 in raise () from /lib64/libc.so.6 #1 0x00007fc5d92858d9 in abort () from /lib64/libc.so.6 #2 0x00007fc5d92857a9 in __assert_fail_base.cold () from /lib64/libc.so.6 #3 0x00007fc5d9294a56 in __assert_fail () from /lib64/libc.so.6 #4 0x00007fc5d969738c in call3.constprop () from /usr/lib64/nbdkit/plugins/nbdkit-sh-plugin.so #5 0x00007fc5d9697493 in call () from /usr/lib64/nbdkit/plugins/nbdkit-sh-plugin.so #6 0x00007fc5d969833c in sh_close () from /usr/lib64/nbdkit/plugins/nbdkit-sh-plugin.so #7 0x0000561694bcf8e8 in plugin_close (b=<optimized out>, conn=0x56169df9afe0) at plugins.c:305 #8 0x0000561694bca649 in free_connection (conn=0x56169df9afe0) at connections.c:377 #9 _handle_single_connection (sockout=<optimized out>, sockin=<optimized out>) at connections.c:267 #10 handle_single_connection (sockin=<optimized out>, sockout=<optimized out>) at connections.c:277 #11 0x0000561694bc9583 in start_serving () at main.c:856 #12 main (argc=<optimized out>, argv=0x7ffd09440178) at main.c:651 This is after conn->close has been called and stdin/stdout (which were connected to the client socket) have actually been closed. We can easily paper over this by ignoring the assert on the close path, but I foresee two problems: (1) I don't believe the assert is generally correct. While it's not ideal for nbdkit to be run with stdin/out/err closed (they obviously should be connected to /dev/null) it's also not impossible for that to happen. We don't cope well with this situation. (2) The assert is protecting us from some minimal checks in this code, but I think the right answer is to just add those checks. https://github.com/libguestfs/nbdkit/blob/79b26ee95034d9c90913da3b5db596503c5d03d9/plugins/sh/call.c#L166 Anyway, a possible patch for this is attached, but I'm not very confident it is correct. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com libguestfs lets you edit virtual machines. Supports shell scripting, bindings from many languages. http://libguestfs.org
Eric Blake
2019-Aug-27 12:55 UTC
Re: [Libguestfs] [PATCH nbdkit] sh: Remove assert and replace with smarter file descriptor duplication. (was: Re: [nbdkit PATCH v2 14/17] sh: Use pipe2 with CLOEXEC when possible)
On 8/27/19 6:47 AM, Richard W.M. Jones wrote:> On Fri, Aug 02, 2019 at 02:26:15PM -0500, Eric Blake wrote: >> + /* Ensure that stdin/out/err of the current process were not empty >> + * before we started creating pipes (otherwise, the close and dup2 >> + * calls below become more complex to juggle fds around correctly). >> + */ >> + assert (in_fd[0] > STDERR_FILENO && in_fd[1] > STDERR_FILENO && >> + out_fd[0] > STDERR_FILENO && out_fd[1] > STDERR_FILENO && >> + err_fd[0] > STDERR_FILENO && err_fd[1] > STDERR_FILENO); > > This assert is now being thrown whenever we use nbdkit + sh plugin + > nbdkit -s option. It's easy to demonstrate using a libnbd one-liner: > > $ nbdsh -c 'h.connect_command (["nbdkit", "sh", "/tmp/min.sh", "-s", "--exit-with-parent"])' \ > -c 'print ("%r" % h.get_size())' > 1048576 > nbdkit: call.c:155: call3: Assertion `in_fd[0] > STDERR_FILENO && in_fd[1] > STDERR_FILENO && out_fd[0] > STDERR_FILENO && out_fd[1] > STDERR_FILENO && err_fd[0] > STDERR_FILENO && err_fd[1] > STDERR_FILENO' failed. > > (This is triggered frequently by tests in the libnbd test suite which > is where I first observed it.) > > It happens during the shutdown path where there is one final call to > sh_close in the plugin:Hmm, so we've closed stdin/out because the client connection is no longer around, but still give the shell script one more callback. Yeah, the failure is definitely explainable, and worth fixing.> We can easily paper over this by ignoring the assert on the close > path, but I foresee two problems: > > (1) I don't believe the assert is generally correct. While it's not > ideal for nbdkit to be run with stdin/out/err closed (they obviously > should be connected to /dev/null) it's also not impossible for that to > happen. We don't cope well with this situation. > > (2) The assert is protecting us from some minimal checks in this code, > but I think the right answer is to just add those checks.It's also possible to open /dev/null in a loop until getting one larger than STDERR_FILENO (so you'd have up to four open() calls), then do pipe() with the assurance of no collisions, then close the scratch fds. But that's probably more syscall overhead and not worth doing, so your attempt at doing the fd shuffling correctly is better.> > Anyway, a possible patch for this is attached, but I'm not very > confident it is correct.Not quite. I don't think POSIX guarantees that fd[0] < fd[1] when pipe() succeeds. But in the situation we're hitting, fd 0 and 1 are closed when we start pipe()ing, so our first pipe will claim 0 and 1 in either order, then the first check will either be a no-op (because fd[0] is 0 and doesn't need moving) or else fd[0] will be 1 and needs to overwrite stdin. Overwriting is fine; but if fd[0] is 0, we HAVE to clear FD_CLOEXEC, otherwise stdin will be closed by exec(). But it's close; so I'll use it as the starting point and push the corrected version soon. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Eric Blake
2019-Aug-27 13:31 UTC
Re: [Libguestfs] [PATCH nbdkit] sh: Remove assert and replace with smarter file descriptor duplication. (was: Re: [nbdkit PATCH v2 14/17] sh: Use pipe2 with CLOEXEC when possible)
On 8/27/19 7:55 AM, Eric Blake wrote:> On 8/27/19 6:47 AM, Richard W.M. Jones wrote: >> On Fri, Aug 02, 2019 at 02:26:15PM -0500, Eric Blake wrote: >>> + /* Ensure that stdin/out/err of the current process were not empty >>> + * before we started creating pipes (otherwise, the close and dup2 >>> + * calls below become more complex to juggle fds around correctly). >>> + */ >>> + assert (in_fd[0] > STDERR_FILENO && in_fd[1] > STDERR_FILENO && >>> + out_fd[0] > STDERR_FILENO && out_fd[1] > STDERR_FILENO && >>> + err_fd[0] > STDERR_FILENO && err_fd[1] > STDERR_FILENO); >> >> This assert is now being thrown whenever we use nbdkit + sh plugin + >> nbdkit -s option. It's easy to demonstrate using a libnbd one-liner: >>> Hmm, so we've closed stdin/out because the client connection is no > longer around, but still give the shell script one more callback. Yeah, > the failure is definitely explainable, and worth fixing. > >> We can easily paper over this by ignoring the assert on the close >> path, but I foresee two problems: >> >> (1) I don't believe the assert is generally correct. While it's not >> ideal for nbdkit to be run with stdin/out/err closed (they obviously >> should be connected to /dev/null) it's also not impossible for that to >> happen. We don't cope well with this situation.Alternative fix: instead of closing stdin/out for -s, open /dev/null and dup2() it over stdin/out. That has the same effect of ending the client connection, but leaves the fds allocated so that this assert() still works as-is, then we don't have to do any fd shuffling. That's certainly a smaller patch to write.> But it's close; so I'll use it as the starting point and push the > corrected version soon. >-- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Seemingly Similar Threads
- [PATCH nbdkit] sh: Remove assert and replace with smarter file descriptor duplication. (was: Re: [nbdkit PATCH v2 14/17] sh: Use pipe2 with CLOEXEC when possible)
- Re: [PATCH nbdkit] sh: Remove assert and replace with smarter file descriptor duplication. (was: Re: [nbdkit PATCH v2 14/17] sh: Use pipe2 with CLOEXEC when possible)
- [nbdkit PATCH v2 14/17] sh: Use pipe2 with CLOEXEC when possible
- [nbdkit PATCH] server: Enforce sane stdin/out/err
- [nbdkit PATCH 9/8] sh: Document CLOEXEC considerations