Eric Blake
2022-Feb-01 19:44 UTC
[Libguestfs] [libnbd PATCH 4/4] copy: Ignore extents if we encountered earlier error
In practice, the nbd extents callback will only be called once per nbd_aio_block_status, because we have only requested exactly one metacontext id, and because compliant servers send only one reply chunk per id. In theory, a server could reply first with an error chunk (perhaps meaning "I'm unable to get status beyond offset X") followed by an extent chunk ("but here's what I was able to get") while still being compliant. And a non-compliant server can send unexpected chunks, just to try and confuse us. Right now, it would take a custom NBD server to actually trigger any scenario where *error would ever be non-zero on entry to next_extent() (nbdkit does not have that much power). So while *error ever being non-zero is unlikely, our easiest course is to handle it the same way we would for a server sending us an unexpected id: ignore the rest of the extent callback. The later completion callback will handle the overall command failing. --- copy/nbd-ops.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/copy/nbd-ops.c b/copy/nbd-ops.c index dfc62f20..ab30badd 100644 --- a/copy/nbd-ops.c +++ b/copy/nbd-ops.c @@ -1,5 +1,5 @@ /* NBD client library in userspace. - * Copyright (C) 2020 Red Hat Inc. + * Copyright (C) 2020-2022 Red Hat Inc. * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public @@ -337,7 +337,7 @@ add_extent (void *vp, const char *metacontext, extent_list *ret = vp; size_t i; - if (strcmp (metacontext, "base:allocation") != 0) + if (strcmp (metacontext, "base:allocation") != 0 || *error) return 0; for (i = 0; i < nr_entries; i += 2) { -- 2.34.1
Laszlo Ersek
2022-Feb-02 08:42 UTC
[Libguestfs] [libnbd PATCH 4/4] copy: Ignore extents if we encountered earlier error
On 02/01/22 20:44, Eric Blake wrote:> In practice, the nbd extents callback will only be called once per > nbd_aio_block_status, because we have only requested exactly one > metacontext id, and because compliant servers send only one reply > chunk per id. In theory, a server could reply first with an error > chunk (perhaps meaning "I'm unable to get status beyond offset X") > followed by an extent chunk ("but here's what I was able to get") > while still being compliant. And a non-compliant server can send > unexpected chunks, just to try and confuse us. Right now, it would > take a custom NBD server to actually trigger any scenario where *error > would ever be non-zero on entry to next_extent() (nbdkit does not have > that much power). So while *error ever being non-zero is unlikely, > our easiest course is to handle it the same way we would for a server > sending us an unexpected id: ignore the rest of the extent callback. > The later completion callback will handle the overall command failing.This commit message mostly answers my question under the cover letter, and indeed it's quite the opposite of what I expected. I'd have expected the command (the exchange with the server) completing, upon which time the completion callback would have run; then the results (a series of extent callbacks) would be passed to the extent callback. So, that's precisely *not* the case in reality. :/> --- > copy/nbd-ops.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/copy/nbd-ops.c b/copy/nbd-ops.c > index dfc62f20..ab30badd 100644 > --- a/copy/nbd-ops.c > +++ b/copy/nbd-ops.c > @@ -1,5 +1,5 @@ > /* NBD client library in userspace. > - * Copyright (C) 2020 Red Hat Inc. > + * Copyright (C) 2020-2022 Red Hat Inc. > * > * This library is free software; you can redistribute it and/or > * modify it under the terms of the GNU Lesser General Public > @@ -337,7 +337,7 @@ add_extent (void *vp, const char *metacontext, > extent_list *ret = vp; > size_t i; > > - if (strcmp (metacontext, "base:allocation") != 0) > + if (strcmp (metacontext, "base:allocation") != 0 || *error) > return 0; > > for (i = 0; i < nr_entries; i += 2) { >So, I don't understand this. First, here we have a non-aio (i.e., synchronous) nbd_block_status call. in case add_extent (the extents callback) is entered with *error !=0, that means (per docs) "the error parameter contains the errno value of any previously detected error". In case we responded by returning -1, then the following (per docs) would apply: "If the callback returns -1, and no earlier error has been detected, then the overall block status command will fail with any non-zero value stored into the callback's error parameter" But we don't do that, we say "everything is OK, I'll just ignore this", like we ignore metadata contexts that we don't need. So how is the "later completion callback" going to fail the entire nbd_block_status() call? First, because this is not an AIO nbd_block_status() call, we don't provide the completion callback ourselves -- I guess it is some default function somewhere in libnbd. Or is it the following: upon getting (*error != 0) on entry, we just decline looking at the (likely nonsensical) "entries" array, and permit the error we got via *error to "survive" through all further add_extent() calls, and to finally fail the outermost nbd_block_status() call? Next question: even if that's the case, should we *first* check *error, and only check "metacontext" if *error is zero? What says that "metacontext" is defined if *error is nonzero? Thanks, Laszlo
Richard W.M. Jones
2022-Feb-03 10:36 UTC
[Libguestfs] [libnbd PATCH 4/4] copy: Ignore extents if we encountered earlier error
The patch series looks fine, with or without Laszlo's small suggested change. ACK Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-p2v converts physical machines to virtual machines. Boot with a live CD or over the network (PXE) and turn machines into KVM guests. http://libguestfs.org/virt-v2v