Eric Blake
2022-Aug-24 21:20 UTC
[Libguestfs] [libnbd PATCH 1/2] api: Better use of nbd_internal_reset_size_and_flags
On Wed, Aug 24, 2022 at 01:56:03PM +0200, Laszlo Ersek wrote:> On 08/24/22 04:53, Eric Blake wrote: > > The documentation for nbd_internal_reset_size_and_flags() claims we > > should wipe all knowledge of previously-negotiated meta contexts when > > swapping export names. However, we weren't actually doing that, which > > could lead to a user calling nbd_opt_info() for one export, then > > switching to another export name, then having nbd_get_size() still > > report the size of the old export. At present, this is only mildly > > confusing, but once the next patch exposes nbd_opt_set_meta_context() > > for user control, > > (1) The "next patch" does not expose nbd_opt_set_meta_context(). In > fact, I can't find nbd_opt_set_meta_context() -- the function -- in the > codebase anywhere (@ 9e5bf05a2196). > > (I prefer writing "a subsequent patch in this series" rather than "next > patch", because that allows me to reshuffle patches without having to > rewrite commit messages, as long as inter-patch dependencies are still > satisfied.)Yep, exactly what happened; where the subsequent patch is still not quite polished enough to push to the list (but hopefully later today). Will fix before committing.> > (2) I see the NBD_OPT_SET_META_CONTEXT opcode, but I'm not sure what it > is used for. There seems to be documentation for > <https://libguestfs.org/nbd_opt_list_meta_context.3.html>, but not for > nbd_opt_set_meta_context(). So perhaps "introduces" is a better verb > than "exposes"? > > Is NBD_OPT_SET_META_CONTEXT internally used by nbd_add_meta_context() > and nbd_clear_meta_contexts()?Right now, the code base does: nbd_set_export_name() => update h->export_name; no server interaction nbd_add_meta_context() => append to h->request_meta_contexts; no server interaction nbd_clear_meta_contexts() => wipe all of h->request_meta_contexts; no server interaction nbd_opt_list_meta_context() => Issue NBD_OPT_LIST_META_CONTEXT to the server using h->request_meta_contexts as its query list and h->export_name, but collect the results only in the user's callback - not stateful on the server, and only stateful in the client if the callback does something to preserve the server's answers nbd_opt_info()/nbd_opt_go() => Issue NBD_OPT_SET_META_CONTEXT using h->request_meta_contexts as its query list and h->export_name, collect the results into h->meta_contexts, then issue NBD_OPT_INFO/GO to the server. Stateful, because the server's replies map the users context names (strings) to the servers ids (uint32_t); and later NBD_CMD_BLOCK_STATUS refers only to the server's 32-bit id, even though the libnbd API does not directly expose those ids. nbd_can_meta_context() => Query whether h->meta_contexts contains a server reply for an earlier call that triggered NBD_OPT_SET_META_CONTEXT and which has not been wiped by changing state via nbd_internal_reset_size_and_flags(); no server interaction. with the additional caveat that all the nbd_opt_* functions are unreachable unless you do nbd_set_opt_mode(true) to enable manual option negotiation (the default is that nbd_connect_*() behaves as if it does nbd_opt_go() on your behalf). The promised upcoming patch will add: nbd_set_request_meta_context(false) => Skip the NBD_OPT_SET_META_CONTEXT portion in nbd_opt_info()/nbd_opt_go() nbd_opt_set_meta_context() => Issue NBD_OPT_SET_META_CONTEXT to the server using h->request_meta_contexts as its query list and h->export_name, and collect the results into both the user's callback and h->meta_contexts at the same time. Stateful - wipes out any ids returned by an earlier SET_META_CONTEXT, but the ids returned by the server can survive a number of other NBD_OPT_ commands prior to NBD_OPT_GO, provided that NBD_OPT_GO uses the same export name as NBD_OPT_SET_META_CONTEXT. In that manner, it will become possible for manual option haggling to set the meta contexts at a different point in the negotiation phase, useful when trying to trigger various server corner cases. The server wipes meta context state any time it gets NBD_OPT_SET_META_CONTEXT or NBD_OPT_GO with a different export name than before. But since we don't pass in export name as a direct parameter to either nbd_opt_go() or the upcoming nbd_opt_set_meta_context(), we instead need to wipe the state when we change h->export_name (that is, at nbd_set_export_name()). This does mean that if a client calls nbd_set_export_name(nbd, "a"), nbd_set_meta_context(...), nbd_set_export_name(nbd, "b"), nbd_set_export_name(nbd, "a"), nbd_opt_go(nbd), that the server will not observe the name change, and will actually support the previously-negotiated contexts, but libnbd will have wiped access to those, and trigger client-side failure on attempts to call nbd_block_status() on the grounds that it doesn't remember any negotiated contexts. But a client actually doing this is unlikely.> > > it would become actively dangerous: negotiating meta > > contexts for one export, then switching to a different export name > > before nbd_opt_go, can lead us into calling nbd_cmd_block_status() when > > the server is not expecting it. > > (3) What is nbd_cmd_block_status()? I can only find nbd_block_status(). > Do you mean the NBD extent callback that is called once by metacontext?Typo on my part. The API nbd_block_status() issues NBD_CMD_BLOCK_STATUS to the server.> > (4) The word "server" seems like a typo; libnbd is used by NBD clients.Here, it is intended. The NBD spec says a client should not issue NBD_OPT_BLOCK_STATUS to the server unless it successfully negotiated NBD_OPT_SET_META_CONTEXT with the same export name later passed to NBD_OPT_GO, without intervening commands that might wipe that state. The client code in nbd_block_status() tries to enforce that it won't confuse the server by checking whether h->meta_contexts is non-NULL before it will issue NBD_CMD_BLOCK_STATUS. But once we allow manual control over setting meta contexts, rather than tightly coupling it with NBD_OPT_GO, we need to make sure we wipe h->meta_contexts in at least all places the server wipes state (even if we also happen to wipe it in more places, as in my "a"/set/"b"/"a"/go example).> > > > > While fixing this, code things to check whether we are actually > > swapping export names; an idempotent nbd_set_export_name() is not > > observable by the server, and therefore does not need to invalidate > > what has already been negotiated. > > > > OK. > > > Conversely, calling nbd_opt_list_meta_context() does not need to wipe > > the list - it does not touch h->meta_contexts, but relies on the > > user's callback instead. > > (5) I kind of wish this were a separate patch, but I see the point of > keeping it unified, too.Maybe I can separate it; the test would be calling nbd_opt_info() to set h->meta_contexts, nbd_can_meta_context() to verify that "base:allocation" is supported by the server, nbd_opt_list_meta_context() to do a query, then nbd_can_meta_context() again to see whether "base:allocation" is still remembered or was accidentally wiped.> > (6) I don't understand the "but". I see no conflict (or even: relevance) > with the user's callback. Listing the meta contexts does not touch > "h->meta_contexts", so we should not clear that list, period. I don't > understand what specifically we rely on the user's callback *for*, for > this particular discussion. If it's important, please elaborate; > otherwise I'm just getting confused! :)Fair enough. Basically, nbd_opt_list_meta_context() hands the server's answers to the user's callback instead of by modifications to h->meta_contexts, so it does not impact future nbd_can_meta_context(). I'll see if I can reword that a bit.> > The code looks OK to my untrained eye, so I'm ready to R-b that. > > Thanks! > Laszlo > > > --- > > generator/states-newstyle-opt-meta-context.c | 7 +++---- > > lib/handle.c | 4 ++++ > > 2 files changed, 7 insertions(+), 4 deletions(-) > > > > diff --git a/generator/states-newstyle-opt-meta-context.c b/generator/states-newstyle-opt-meta-context.c > > index 30b9617..76f1032 100644 > > --- a/generator/states-newstyle-opt-meta-context.c > > +++ b/generator/states-newstyle-opt-meta-context.c > > @@ -1,5 +1,5 @@ > > /* nbd client library in userspace: state machine > > - * Copyright (C) 2013-2020 Red Hat Inc. > > + * Copyright (C) 2013-2022 Red Hat Inc. > > * > > * This library is free software; you can redistribute it and/or > > * modify it under the terms of the GNU Lesser General Public > > @@ -28,7 +28,6 @@ STATE_MACHINE { > > * contexts, when doing SET (but an empty LIST is okay). > > */ > > assert (h->gflags & LIBNBD_HANDSHAKE_FLAG_FIXED_NEWSTYLE); > > - nbd_internal_reset_size_and_flags (h); > > if (h->opt_current == NBD_OPT_LIST_META_CONTEXT) { > > assert (h->opt_mode); > > assert (h->structured_replies); > > @@ -37,6 +36,8 @@ STATE_MACHINE { > > } > > else { > > assert (CALLBACK_IS_NULL (h->opt_cb.fn.context)); > > + nbd_internal_reset_size_and_flags (h); > > + assert (h->meta_contexts == NULL); > > opt = NBD_OPT_SET_META_CONTEXT; > > if (!h->structured_replies || h->request_meta_contexts.len == 0) { > > SET_NEXT_STATE (%^OPT_GO.START); > > @@ -44,8 +45,6 @@ STATE_MACHINE { > > } > > } > > > > - assert (h->meta_contexts == NULL); > > -This hunk fixes the nbd_opt_list_meta_context() accidentally wiping context when it doesn't have to,> > /* Calculate the length of the option request data. */ > > len = 4 /* exportname len */ + strlen (h->export_name) + 4 /* nr queries */; > > for (i = 0; i < h->request_meta_contexts.len; ++i) > > diff --git a/lib/handle.c b/lib/handle.c > > index 8713e18..2aa89b9 100644 > > --- a/lib/handle.c > > +++ b/lib/handle.c > > @@ -219,6 +219,10 @@ nbd_unlocked_set_export_name (struct nbd_handle *h, const char *export_name) > > return -1; > > } > > > > + if (strcmp (export_name, h->export_name) == 0) > > + return 0; > > + > > + nbd_internal_reset_size_and_flags (h);and this hunk fixes the missing wipe (the testsuite coverage so far only covered this case).> > new_name = strdup (export_name); > > if (!new_name) { > > set_error (errno, "strdup"); > > >-- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Laszlo Ersek
2022-Aug-25 10:13 UTC
[Libguestfs] [libnbd PATCH 1/2] api: Better use of nbd_internal_reset_size_and_flags
On 08/24/22 23:20, Eric Blake wrote:> On Wed, Aug 24, 2022 at 01:56:03PM +0200, Laszlo Ersek wrote: >> On 08/24/22 04:53, Eric Blake wrote: >>> The documentation for nbd_internal_reset_size_and_flags() claims we >>> should wipe all knowledge of previously-negotiated meta contexts when >>> swapping export names. However, we weren't actually doing that, which >>> could lead to a user calling nbd_opt_info() for one export, then >>> switching to another export name, then having nbd_get_size() still >>> report the size of the old export. At present, this is only mildly >>> confusing, but once the next patch exposes nbd_opt_set_meta_context() >>> for user control, >> >> (1) The "next patch" does not expose nbd_opt_set_meta_context(). In >> fact, I can't find nbd_opt_set_meta_context() -- the function -- in the >> codebase anywhere (@ 9e5bf05a2196). >> >> (I prefer writing "a subsequent patch in this series" rather than "next >> patch", because that allows me to reshuffle patches without having to >> rewrite commit messages, as long as inter-patch dependencies are still >> satisfied.) > > Yep, exactly what happened; where the subsequent patch is still not > quite polished enough to push to the list (but hopefully later today). > Will fix before committing. > >> >> (2) I see the NBD_OPT_SET_META_CONTEXT opcode, but I'm not sure what it >> is used for. There seems to be documentation for >> <https://libguestfs.org/nbd_opt_list_meta_context.3.html>, but not for >> nbd_opt_set_meta_context(). So perhaps "introduces" is a better verb >> than "exposes"? >> >> Is NBD_OPT_SET_META_CONTEXT internally used by nbd_add_meta_context() >> and nbd_clear_meta_contexts()? > > Right now, the code base does: > > nbd_set_export_name() => update h->export_name; no server interaction > > nbd_add_meta_context() => append to h->request_meta_contexts; no > server interaction > > nbd_clear_meta_contexts() => wipe all of h->request_meta_contexts; no > server interaction > > nbd_opt_list_meta_context() => Issue NBD_OPT_LIST_META_CONTEXT to the > server using h->request_meta_contexts as its query list and > h->export_name, but collect the results only in the user's callback - > not stateful on the server, and only stateful in the client if the > callback does something to preserve the server's answers > > nbd_opt_info()/nbd_opt_go() => Issue NBD_OPT_SET_META_CONTEXT using > h->request_meta_contexts as its query list and h->export_name, collect > the results into h->meta_contexts, then issue NBD_OPT_INFO/GO to the > server. Stateful, because the server's replies map the users context > names (strings) to the servers ids (uint32_t); and later > NBD_CMD_BLOCK_STATUS refers only to the server's 32-bit id, even > though the libnbd API does not directly expose those ids. > > nbd_can_meta_context() => Query whether h->meta_contexts contains a > server reply for an earlier call that triggered > NBD_OPT_SET_META_CONTEXT and which has not been wiped by changing > state via nbd_internal_reset_size_and_flags(); no server interaction. > > with the additional caveat that all the nbd_opt_* functions are > unreachable unless you do nbd_set_opt_mode(true) to enable manual > option negotiation (the default is that nbd_connect_*() behaves as if > it does nbd_opt_go() on your behalf).Thanks for writing this up; it's a lot to digest. I think I understand it now. There's a whole lot of state here, which is something I don't really like. (I don't know the design background however.) FWIW I've found this stuff easier to understand by thinking about it in terms of requests/responses, and by considering the client-side state only a "lightweight state", solely existing for implementing the request/response patterns. In other words, I kind of consider the client-side state "ad hoc" (whereas the server-side state is heavy-weight state). The overlap between nbd_opt_list_meta_context() and nbd_opt_info() is especially confusing. For the following reasons: - both functions perform the same listing at the base level (build an intersection between the client's desired meta context list and the server's supported meta context list) - the intersection is consumed differently on the client (the first function reports results through a callback, the second one requires subsequent, repeated calls to nbd_can_meta_context()). - the first function is stateless (considering only the server-side state "state" here), the second is stateful. While this is well reflected by the opcode names LIST vs. SET, those opcodes are hidden from the function names. In particular, the second function says "info" in the name, which I would not expect to change state on the server. - NBD_OPT_INFO in particular looks like a useless opcode. NBD_OPT_SET_META_CONTEXT handles the filtering (intersection) and mapping to numeric IDs. NBD_OPT_GO transitions the server to the "ready" state (completes negotiation). So those two look useful. I don't understand what the precise responsibility is that *only* NBD_OPT_INFO can cover. The manual at <https://libguestfs.org/nbd_opt_info.3.html> says that nbd_opt_info() enables export size querying (for example), but that kind of querying is just as possible without nbd_opt_info(); only nbd_set_opt_mode() is needed -- see the example code at <https://libguestfs.org/nbd_set_opt_mode.3.html>.> > The promised upcoming patch will add: > > nbd_set_request_meta_context(false) => Skip the > NBD_OPT_SET_META_CONTEXT portion in nbd_opt_info()/nbd_opt_go()This makes sense for nbd_opt_go(), as it will then retain a single responsibility: transitioning the server to the ready state. However, with SET_META_CONTEXT skipped, nbd_opt_go() appears to be reduced to sending the NBD_OPT_INFO opcode -- and that one looks entirely useless to me (see above).> > nbd_opt_set_meta_context() => Issue NBD_OPT_SET_META_CONTEXT to the > server using h->request_meta_contexts as its query list and > h->export_name, and collect the results into both the user's callback > and h->meta_contexts at the same time. Stateful - wipes out any ids > returned by an earlier SET_META_CONTEXT, but the ids returned by the > server can survive a number of other NBD_OPT_ commands prior to > NBD_OPT_GO, provided that NBD_OPT_GO uses the same export name as > NBD_OPT_SET_META_CONTEXT.Right, so nbd_opt_set_meta_context() is the "first half" extracted from the current behavior of nbd_opt_info()/nbd_opt_go(). (Again with my personal confusion that the "second half" of nbd_opt_info(), i.e., the sending of the NB_OPT_INFO opcode, looks useless.)> > In that manner, it will become possible for manual option haggling to > set the meta contexts at a different point in the negotiation phase, > useful when trying to trigger various server corner cases. > > The server wipes meta context state any time it gets > NBD_OPT_SET_META_CONTEXT or NBD_OPT_GO with a different export name > than before. But since we don't pass in export name as a direct > parameter to either nbd_opt_go() or the upcoming > nbd_opt_set_meta_context(), we instead need to wipe the state when we > change h->export_name (that is, at nbd_set_export_name()). This does > mean that if a client calls nbd_set_export_name(nbd, "a"), > nbd_set_meta_context(...), nbd_set_export_name(nbd, "b"), > nbd_set_export_name(nbd, "a"), nbd_opt_go(nbd), that the server will > not observe the name change, and will actually support the > previously-negotiated contexts, but libnbd will have wiped access to > those, and trigger client-side failure on attempts to call > nbd_block_status() on the grounds that it doesn't remember any > negotiated contexts. But a client actually doing this is unlikely.So this is the problem with stateful protocols; we need to remember and recognize the points at which the state goes stale and must be cleared. I wonder why the NBD protocol was not designed to take as many parameters as possible per-request (minimizing the state in both client and server), and only turn the kind of information into actual "state" that is required by the *bulk* operations (basically the block-level operations on the device, such as read, write, zero/trim, etc) where the parameter repetitions would have significant bandwidth and computation (formatting/parsing) cost. The cost of the negotiation phase looks trivial in comparison to the cost of the "ready state" work (IIRC the "ready state"); all the cached information does not seem to buy us much.> >> >>> it would become actively dangerous: negotiating meta >>> contexts for one export, then switching to a different export name >>> before nbd_opt_go, can lead us into calling nbd_cmd_block_status() when >>> the server is not expecting it. >> >> (3) What is nbd_cmd_block_status()? I can only find nbd_block_status(). >> Do you mean the NBD extent callback that is called once by metacontext? > > Typo on my part. The API nbd_block_status() issues > NBD_CMD_BLOCK_STATUS to the server. > >> >> (4) The word "server" seems like a typo; libnbd is used by NBD clients. > > Here, it is intended. The NBD spec says a client should not issue > NBD_OPT_BLOCK_STATUS to the server unless it successfully negotiated > NBD_OPT_SET_META_CONTEXT with the same export name later passed to > NBD_OPT_GO, without intervening commands that might wipe that state. > The client code in nbd_block_status() tries to enforce that it won't > confuse the server by checking whether h->meta_contexts is non-NULL > before it will issue NBD_CMD_BLOCK_STATUS. But once we allow manual > control over setting meta contexts, rather than tightly coupling it > with NBD_OPT_GO, we need to make sure we wipe h->meta_contexts in at > least all places the server wipes state (even if we also happen to > wipe it in more places, as in my "a"/set/"b"/"a"/go example).OK. The wording of your commit message ("actively dangerous") confused me, as I didn't expect the client to be able to break the server (for any definition of "break"). However, if the spec uses the "SHOULD NOT" term, then preventing that sequence of operations in the library makes sense.> >> >>> >>> While fixing this, code things to check whether we are actually >>> swapping export names; an idempotent nbd_set_export_name() is not >>> observable by the server, and therefore does not need to invalidate >>> what has already been negotiated. >>> >> >> OK. >> >>> Conversely, calling nbd_opt_list_meta_context() does not need to wipe >>> the list - it does not touch h->meta_contexts, but relies on the >>> user's callback instead. >> >> (5) I kind of wish this were a separate patch, but I see the point of >> keeping it unified, too. > > Maybe I can separate it; the test would be calling nbd_opt_info() to > set h->meta_contexts, nbd_can_meta_context() to verify that > "base:allocation" is supported by the server, > nbd_opt_list_meta_context() to do a query, then nbd_can_meta_context() > again to see whether "base:allocation" is still remembered or was > accidentally wiped..... Again, anything around nbd_opt_info() keeps confusing me. What we actually need here is the NBD_OPT_SET_META_CONTEXT. Currently, we can do that in two ways, with nbd_opt_info() and nbd_opt_go(). The latter is no good, as it kicks us out of the negotiation phase. The former I *dislike*, because it does something *in addition* to NBD_OPT_SET_META_CONTEXT that I don't understand -- namely the NBD_OPT_INFO opcode. Therefore, the new function nbd_opt_set_meta_context() would be *just right* for this step, as it exposes NBD_OPT_SET_META_CONTEXT (and the population of h->meta_contexts) in isilation. Perhaps introduce the new function before updating the test case?> >> >> (6) I don't understand the "but". I see no conflict (or even: relevance) >> with the user's callback. Listing the meta contexts does not touch >> "h->meta_contexts", so we should not clear that list, period. I don't >> understand what specifically we rely on the user's callback *for*, for >> this particular discussion. If it's important, please elaborate; >> otherwise I'm just getting confused! :) > > Fair enough. Basically, nbd_opt_list_meta_context() hands the > server's answers to the user's callback instead of by modifications to > h->meta_contexts, so it does not impact future nbd_can_meta_context(). > I'll see if I can reword that a bit.Upon re-reading the commit message, after your explanation, I think the "but" is fine, but benefits from a bit of expanding: Conversely, calling nbd_opt_list_meta_context() does not need to wipe the list - it does not touch h->meta_contexts, but relies on the user's callback instead, *for collecting the list of those metacontexts that the server acknowledges*. How does it sound? Laszlo> >> >> The code looks OK to my untrained eye, so I'm ready to R-b that. >> >> Thanks! >> Laszlo >> >>> --- >>> generator/states-newstyle-opt-meta-context.c | 7 +++---- >>> lib/handle.c | 4 ++++ >>> 2 files changed, 7 insertions(+), 4 deletions(-) >>> >>> diff --git a/generator/states-newstyle-opt-meta-context.c b/generator/states-newstyle-opt-meta-context.c >>> index 30b9617..76f1032 100644 >>> --- a/generator/states-newstyle-opt-meta-context.c >>> +++ b/generator/states-newstyle-opt-meta-context.c >>> @@ -1,5 +1,5 @@ >>> /* nbd client library in userspace: state machine >>> - * Copyright (C) 2013-2020 Red Hat Inc. >>> + * Copyright (C) 2013-2022 Red Hat Inc. >>> * >>> * This library is free software; you can redistribute it and/or >>> * modify it under the terms of the GNU Lesser General Public >>> @@ -28,7 +28,6 @@ STATE_MACHINE { >>> * contexts, when doing SET (but an empty LIST is okay). >>> */ >>> assert (h->gflags & LIBNBD_HANDSHAKE_FLAG_FIXED_NEWSTYLE); >>> - nbd_internal_reset_size_and_flags (h); >>> if (h->opt_current == NBD_OPT_LIST_META_CONTEXT) { >>> assert (h->opt_mode); >>> assert (h->structured_replies); >>> @@ -37,6 +36,8 @@ STATE_MACHINE { >>> } >>> else { >>> assert (CALLBACK_IS_NULL (h->opt_cb.fn.context)); >>> + nbd_internal_reset_size_and_flags (h); >>> + assert (h->meta_contexts == NULL); >>> opt = NBD_OPT_SET_META_CONTEXT; >>> if (!h->structured_replies || h->request_meta_contexts.len == 0) { >>> SET_NEXT_STATE (%^OPT_GO.START); >>> @@ -44,8 +45,6 @@ STATE_MACHINE { >>> } >>> } >>> >>> - assert (h->meta_contexts == NULL); >>> - > > This hunk fixes the nbd_opt_list_meta_context() accidentally wiping > context when it doesn't have to, > >>> /* Calculate the length of the option request data. */ >>> len = 4 /* exportname len */ + strlen (h->export_name) + 4 /* nr queries */; >>> for (i = 0; i < h->request_meta_contexts.len; ++i) >>> diff --git a/lib/handle.c b/lib/handle.c >>> index 8713e18..2aa89b9 100644 >>> --- a/lib/handle.c >>> +++ b/lib/handle.c >>> @@ -219,6 +219,10 @@ nbd_unlocked_set_export_name (struct nbd_handle *h, const char *export_name) >>> return -1; >>> } >>> >>> + if (strcmp (export_name, h->export_name) == 0) >>> + return 0; >>> + >>> + nbd_internal_reset_size_and_flags (h); > > and this hunk fixes the missing wipe (the testsuite coverage so far > only covered this case). > >>> new_name = strdup (export_name); >>> if (!new_name) { >>> set_error (errno, "strdup"); >>> >> >