Laszlo Ersek
2022-Feb-02 08:42 UTC
[Libguestfs] [libnbd PATCH 4/4] copy: Ignore extents if we encountered earlier error
On 02/01/22 20:44, Eric Blake wrote:> In practice, the nbd extents callback will only be called once per > nbd_aio_block_status, because we have only requested exactly one > metacontext id, and because compliant servers send only one reply > chunk per id. In theory, a server could reply first with an error > chunk (perhaps meaning "I'm unable to get status beyond offset X") > followed by an extent chunk ("but here's what I was able to get") > while still being compliant. And a non-compliant server can send > unexpected chunks, just to try and confuse us. Right now, it would > take a custom NBD server to actually trigger any scenario where *error > would ever be non-zero on entry to next_extent() (nbdkit does not have > that much power). So while *error ever being non-zero is unlikely, > our easiest course is to handle it the same way we would for a server > sending us an unexpected id: ignore the rest of the extent callback. > The later completion callback will handle the overall command failing.This commit message mostly answers my question under the cover letter, and indeed it's quite the opposite of what I expected. I'd have expected the command (the exchange with the server) completing, upon which time the completion callback would have run; then the results (a series of extent callbacks) would be passed to the extent callback. So, that's precisely *not* the case in reality. :/> --- > copy/nbd-ops.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/copy/nbd-ops.c b/copy/nbd-ops.c > index dfc62f20..ab30badd 100644 > --- a/copy/nbd-ops.c > +++ b/copy/nbd-ops.c > @@ -1,5 +1,5 @@ > /* NBD client library in userspace. > - * Copyright (C) 2020 Red Hat Inc. > + * Copyright (C) 2020-2022 Red Hat Inc. > * > * This library is free software; you can redistribute it and/or > * modify it under the terms of the GNU Lesser General Public > @@ -337,7 +337,7 @@ add_extent (void *vp, const char *metacontext, > extent_list *ret = vp; > size_t i; > > - if (strcmp (metacontext, "base:allocation") != 0) > + if (strcmp (metacontext, "base:allocation") != 0 || *error) > return 0; > > for (i = 0; i < nr_entries; i += 2) { >So, I don't understand this. First, here we have a non-aio (i.e., synchronous) nbd_block_status call. in case add_extent (the extents callback) is entered with *error !=0, that means (per docs) "the error parameter contains the errno value of any previously detected error". In case we responded by returning -1, then the following (per docs) would apply: "If the callback returns -1, and no earlier error has been detected, then the overall block status command will fail with any non-zero value stored into the callback's error parameter" But we don't do that, we say "everything is OK, I'll just ignore this", like we ignore metadata contexts that we don't need. So how is the "later completion callback" going to fail the entire nbd_block_status() call? First, because this is not an AIO nbd_block_status() call, we don't provide the completion callback ourselves -- I guess it is some default function somewhere in libnbd. Or is it the following: upon getting (*error != 0) on entry, we just decline looking at the (likely nonsensical) "entries" array, and permit the error we got via *error to "survive" through all further add_extent() calls, and to finally fail the outermost nbd_block_status() call? Next question: even if that's the case, should we *first* check *error, and only check "metacontext" if *error is zero? What says that "metacontext" is defined if *error is nonzero? Thanks, Laszlo
Eric Blake
2022-Feb-02 14:54 UTC
[Libguestfs] [libnbd PATCH 4/4] copy: Ignore extents if we encountered earlier error
On Wed, Feb 02, 2022 at 09:42:46AM +0100, Laszlo Ersek wrote:> On 02/01/22 20:44, Eric Blake wrote: > > In practice, the nbd extents callback will only be called once per > > nbd_aio_block_status, because we have only requested exactly one > > metacontext id, and because compliant servers send only one reply > > chunk per id. In theory, a server could reply first with an error > > chunk (perhaps meaning "I'm unable to get status beyond offset X") > > followed by an extent chunk ("but here's what I was able to get") > > while still being compliant. And a non-compliant server can send > > unexpected chunks, just to try and confuse us. Right now, it would > > take a custom NBD server to actually trigger any scenario where *error > > would ever be non-zero on entry to next_extent() (nbdkit does not have > > that much power). So while *error ever being non-zero is unlikely, > > our easiest course is to handle it the same way we would for a server > > sending us an unexpected id: ignore the rest of the extent callback. > > The later completion callback will handle the overall command failing. > > This commit message mostly answers my question under the cover letter, > and indeed it's quite the opposite of what I expected. I'd have expected > the command (the exchange with the server) completing, upon which time > the completion callback would have run; then the results (a series of > extent callbacks) would be passed to the extent callback. > > So, that's precisely *not* the case in reality. :/Yeah, understanding the control flow is helpful. The first thing to note: every client command has a sticky error field (see struct command->error); this field starts life 0, and while the command is live, it is set to non-zero in the following scenarios: if the client never sends the command to the server (for example, settings from nbd_set_strict_mode() determined it would be an out-of-bounds request); when the server replies with NBD_REPLY_TYPE_ERROR in a structured reply (although the command may still be live if it wasn't the final reply chunk); when the server replies with an error in a simple reply (with simple replies, the command is necessarily complete); or when a client mid-command callback returns -1 (only pread_structured and block_status have these, and the command may still be live if it wasn't the final reply chunk). Once the command is complete, then we call the completion callback (if there is one) with *error set to the value of command->error, giving one more chance to change the value stored in command->error; and then nbd_aio_command_completed returns the value of command->error as the error status of the overall command (unless the completion callback returned 1 to auto-retire the command first). Then we have nbd_aio_block_status which kicks off a transaction that will eventually send NBD_CMD_BLOCK_STATUS, and is associated with TWO callbacks: the mandatory mid-command nbd_extents callback to read block status information as it comes over the wire from the server, and the optional nbd_completion callback to mark when the server has nothing more to send. The documentation states that the nbd_extents callback can be called 0 or more times (0 if the server replies with an error, or even if nbd_set_strict_mode() determines that the command violates the NBD protocol and is never even sent to the server). If you only have one metacontext id (as in nbdcopy only caring about base:allocation), one time is all the more we expect from a compliant server (a noncompliant server can send more than one reply, but we'd have to code up such a custom server, as nbdkit does not behave that way). The *error parameter passed to nbd_extents is non-zero only if there is a running error detected so far (generally unlikely; a server would have to send an error chunk before a status chunk, or send more than one status chunk where the libnbd client set *error during the first time the callback was reached). Then the nbd_completion callback is always called exactly once, after there is nothing further to be done with the command. The *error parameter passed to the nbd_completion callback is the same value that will be returned from nbd_aio_command_comppleted (unless the callback returns 1 for auto-retiring). And finally we have nbd_block_status: a synchronous thin wrapper that calls nbd_aio_block_status, nbd_aio_poll, nbd_aio_command_completed under the hood. The nbd_extents callback is still mandatory, and is called mid-command prior to the function call completing, in response to server information as it comes over the wire. The optional nbd_completion callback of nbd_aio_block_status is not exposed (the command completes synchronously, instead; whether we used the completion callback under the hood is an implementation detail that doesn't matter to the public interface). I'm open to a documentation patch that gives more clarity on this sequence of callbacks.> > > --- > > copy/nbd-ops.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/copy/nbd-ops.c b/copy/nbd-ops.c > > index dfc62f20..ab30badd 100644 > > --- a/copy/nbd-ops.c > > +++ b/copy/nbd-ops.c > > @@ -1,5 +1,5 @@ > > /* NBD client library in userspace. > > - * Copyright (C) 2020 Red Hat Inc. > > + * Copyright (C) 2020-2022 Red Hat Inc. > > * > > * This library is free software; you can redistribute it and/or > > * modify it under the terms of the GNU Lesser General Public > > @@ -337,7 +337,7 @@ add_extent (void *vp, const char *metacontext, > > extent_list *ret = vp; > > size_t i; > > > > - if (strcmp (metacontext, "base:allocation") != 0) > > + if (strcmp (metacontext, "base:allocation") != 0 || *error) > > return 0; > > > > for (i = 0; i < nr_entries; i += 2) { > > > > So, I don't understand this. First, here we have a non-aio (i.e., > synchronous) nbd_block_status call. in case add_extent (the extents > callback) is entered with *error !=0, that means (per docs) "the error > parameter contains the errno value of any previously detected error".I guess what the docs are missing is the disclaimer that if the mid-command callback is reached, there is usually no previously detected error. The only two situations where there is a previous error is if the server replies with NBD_REPLY_TYPE_ERROR first and then NBD_REPLY_TYPE_BLOCK_STATUS second (a custom server), or if the client negotiated more than one context and the first use of the callback assigns an error (by returning -1) before the second callback is reached. The other cases where an error could be detected never reach the mid-command callback (if nbd_set_strict_mode() prevented the command from ever going to the server, or if the server replied with an error as a final chunk).> > In case we responded by returning -1, then the following (per docs) > would apply: > > "If the callback returns -1, and no earlier error has been detected, > then the overall block status command will fail with any non-zero value > stored into the callback's error parameter" > > But we don't do that, we say "everything is OK, I'll just ignore this", > like we ignore metadata contexts that we don't need.Yes. But even though the mid-command callback ignores the error, the synchronous wrapper will still see the same error via nbd_aio_command_completed under the hood, so the overall nbd_blocK_status will fail with the same value.> > So how is the "later completion callback" going to fail the entire > nbd_block_status() call? First, because this is not an AIO > nbd_block_status() call, we don't provide the completion callback > ourselves -- I guess it is some default function somewhere in libnbd.Right - the synchronous wrappers return the same value as would be present in *error given to the completion callback of an asynchronous API (whether the synchronous wrappers actually use the completion callback, or get at the value directly, is an implementation detail). And remember that the completion callback to the asynchronous APIs is optional; if it is not provided, that error value is visible through nbd_aio_command_completed.> > Or is it the following: upon getting (*error != 0) on entry, we just > decline looking at the (likely nonsensical) "entries" array, and permit > the error we got via *error to "survive" through all further > add_extent() calls, and to finally fail the outermost nbd_block_status() > call?Yes, that is the best way of looking at it. In fact, the "entries" array is most likely valid, rather than non-sensical, but since we know that our response to a failing nbd_block_status() is to die, there's no point wasting time processing the valid extents to prolong the time before we die. In other words, this patch is optional; it does not add safety (as the error will resurface as the result of nbd_block_status, where we will fail), but merely speed (since we are going to fail anyways, why waste the time getting to that failure).> > Next question: even if that's the case, should we *first* check *error, > and only check "metacontext" if *error is zero? What says that > "metacontext" is defined if *error is nonzero?The nbd_extents callback is only reached any time the server replies with NBD_REPLY_TYPE_BLOCK_STATUS with a metacontext id that we recognized. If we did not recognize the id, or if the server never replies with that chunk, then we don't call the mid-command callback in the first place. So the "metacontext" and "extents" parameters to nbd_extents are always valid, even if *error is nonzero. And now that I've written this email, I wish I could incorporate some of it into the commit message, even though it is already pushed. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org