Eric Blake
2023-Sep-22 21:03 UTC
[Libguestfs] [libnbd PATCH] info: Try harder for graceful disconnect from server
On Fri, Sep 22, 2023 at 09:47:55PM +0100, Richard W.M. Jones wrote:> On Fri, Sep 22, 2023 at 12:33:36PM -0500, Eric Blake wrote: > > There are a number of ways in which gathering information can fail. > > But when possible, it makes sense to let the server know that we are > > done talking, as it minimizes the likelihood that nbdinfo's error > > message will be obscured by an accompanying error message by the > > server complaining about an unclean disconnect. > > > > For example, with a one-off qemu temporarily patched as mentioned in > > commit 0f8ee8c6 to advertise sizes larger than 2^63, kicking off > > 'qemu-nbd -f raw -r file &' produces: > > > > pre-patch: > > > > $ ./run nbdinfo --size nbd://localhost > > /home/eblake/libnbd/info/.libs/nbdinfo: nbd_get_size: server claims size 9223372036854781440 which does not fit in signed result: Value too large for defined data type > > qemu-nbd: option negotiation failed: Failed to read opts magic: Unexpected end-of-file before all data were read > > This doesn't necessarily seem like a bug?It's a quality of service thing - qemu is just being noisy that the client closed the socket abruptly without giving any reason why. It doesn't change libnbd's behavior (nbdinfo has already reported its error message), but may confuse people reading qemu-nbd logs. It may also be a case where we could patch qemu-nbd to not output a complaint if the client exited at a clean point in the back-and-forth transactions. We still want to be noisy if the socket closes after the first byte has been read, but if there are zero bytes available, announcing that the client no longer cares does not add much value.> > This feels like quite a significant change in behaviour to me. In > particular I'm worried if it interacts in some subtle way with the > forking done by the "[ ... ]" syntax for running servers on the > command line (or any of the other ways that libnbd might fork/exec).Interesting observation. atexit() handlers are not preserved across exec, and I think all our fork() paths end in either exec or _exit (also where atexit handlers are ignored), so I don't think we are risking calling the handler twice.> > Can we hold this patch until after 1.18 is released and then put it > in? Should only be a week or two.Sure, being conservative on this one is fine by me. I'll delay it until after 1.18.> > Provisionally ACKed for post-1.18 > > Rich. > > > post-patch: > > $ ./run nbdinfo --size nbd://localhost > > /home/eblake/libnbd/info/.libs/nbdinfo: nbd_get_size: server claims size 9223372036854781440 which does not fit in signed result: Value too large for defined data type > > > > if nbdinfo is run in the same terminal as qemu-nbd. > > > > Signed-off-by: Eric Blake <eblake at redhat.com> > > --- > > info/main.c | 13 +++++++++++++ > > 1 file changed, 13 insertions(+) > > > > diff --git a/info/main.c b/info/main.c > > index c8248c61..05f19f43 100644 > > --- a/info/main.c > > +++ b/info/main.c > > @@ -88,6 +88,17 @@ usage (FILE *fp, int exitcode) > > exit (exitcode); > > } > > > > +void > > +clean_shutdown (void) > > +{ > > + /* If we are connected but detect an error, try to give the server > > + * notice that we are done talking. Ignore failures, as this is > > + * only a courtesy measure. > > + */ > > + if (nbd) > > + nbd_shutdown (nbd, 0); > > +} > > + > > int > > main (int argc, char *argv[]) > > { > > @@ -277,6 +288,7 @@ main (int argc, char *argv[]) > > fprintf (stderr, "%s: %s\n", progname, nbd_get_error ()); > > exit (EXIT_FAILURE); > > } > > + atexit (clean_shutdown); > > nbd_set_uri_allow_local_file (nbd, true); /* Allow ?tls-psk-file. */ > > > > /* Set optional modes in the handle. */ > > @@ -353,6 +365,7 @@ main (int argc, char *argv[]) > > free_exports (); > > nbd_shutdown (nbd, 0); > > nbd_close (nbd); > > + nbd = NULL; > > > > /* Close the output stream and copy it to the real stdout. */ > > if (fclose (fp) == EOF) { > > -- > > 2.41.0 > > > > > > -- > Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones > Read my programming and virtualization blog: http://rwmj.wordpress.com > virt-top is 'top' for virtual machines. Tiny program with many > powerful monitoring features, net stats, disk stats, logging, etc. > http://people.redhat.com/~rjones/virt-top >-- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
Eric Blake
2023-Sep-27 21:22 UTC
[Libguestfs] [libnbd PATCH] info: Try harder for graceful disconnect from server
On Fri, Sep 22, 2023 at 04:03:19PM -0500, Eric Blake wrote:> On Fri, Sep 22, 2023 at 09:47:55PM +0100, Richard W.M. Jones wrote: > > On Fri, Sep 22, 2023 at 12:33:36PM -0500, Eric Blake wrote: > > > There are a number of ways in which gathering information can fail. > > > But when possible, it makes sense to let the server know that we are > > > done talking, as it minimizes the likelihood that nbdinfo's error > > > message will be obscured by an accompanying error message by the > > > server complaining about an unclean disconnect. > > > > > > For example, with a one-off qemu temporarily patched as mentioned in > > > commit 0f8ee8c6 to advertise sizes larger than 2^63, kicking off > > > 'qemu-nbd -f raw -r file &' produces: > > > > > > pre-patch: > > > > > > $ ./run nbdinfo --size nbd://localhost > > > /home/eblake/libnbd/info/.libs/nbdinfo: nbd_get_size: server claims size 9223372036854781440 which does not fit in signed result: Value too large for defined data type > > > qemu-nbd: option negotiation failed: Failed to read opts magic: Unexpected end-of-file before all data were read > > > > This doesn't necessarily seem like a bug? > > It's a quality of service thing - qemu is just being noisy that the > client closed the socket abruptly without giving any reason why. It > doesn't change libnbd's behavior (nbdinfo has already reported its > error message), but may confuse people reading qemu-nbd logs. > > It may also be a case where we could patch qemu-nbd to not output a > complaint if the client exited at a clean point in the back-and-forth > transactions. We still want to be noisy if the socket closes after > the first byte has been read, but if there are zero bytes available, > announcing that the client no longer cares does not add much value. > > > > > This feels like quite a significant change in behaviour to me. In > > particular I'm worried if it interacts in some subtle way with the > > forking done by the "[ ... ]" syntax for running servers on the > > command line (or any of the other ways that libnbd might fork/exec). > > Interesting observation. atexit() handlers are not preserved across > exec, and I think all our fork() paths end in either exec or _exit > (also where atexit handlers are ignored), so I don't think we are > risking calling the handler twice. > > > > > Can we hold this patch until after 1.18 is released and then put it > > in? Should only be a week or two. > > Sure, being conservative on this one is fine by me. I'll delay it > until after 1.18. > > > > > Provisionally ACKed for post-1.18Now in as commit fd4f3fea, so we can start getting some soak time under CI. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org