Richard W.M. Jones
2020-Jul-21 12:46 UTC
[Libguestfs] Extending nbdkit to support listing exports
Hi Eric, Nir. $SUBJECT - it's complicated! Imagine a scenario where we have extended the file plugin so you can point it at a directory and it will use the client-supplied export name to select a file to serve. Also we have extended the tar filter so that you can map tar file components into export names. You may be able to see where this is going ... Now we point the file plugin at a directory of tar files, and layer the tar filter on top, and then what happens? The client can only supply a single export name, so how can it specify component X of tar file Y? tar filter ---> file plugin tar-exportname=on exportname=on directory=/my-files ---> directory of tar files It gets worse! At the moment filters and plugins can read the export name sent by the client by calling the global function nbdkit_export_name(). I'm not so worried about filters, but plugins must be able to continue to use this global function. In the scenario above both the tar filter and the file plugin would see the same string (if we did nothing), and it would only make sense to one or the other but never to both. Listing exports is also complicated: Should we list only the nearest filter's exports? Or the plugin's exports? Cartesian product of both somehow? That's the background, but how do we solve it? ---------------------------------------------- I'd like to say first of all that I believe export names are already a dumpster fire of potential insecurity, and that we shouldn't have the core server attempt to parse them, so solutions involving splitting the export names into components (like tar-component/file-name) should not be considered because they are too risky, and run into arbitrary limits (what escape character to use? what is the max length?) Also changing the NBD protocol is not under consideration. I think we need to consider these aspects separately: (a) How filters and plugins get the client exportname. (b) How filters and plugins respond when the client issues NBD_OPT_LIST. (c) How the server, filters and plugins respond to NBD_OPT_INFO. (a) Client exportname --------------------- The client sends the export name of the export it wants to access with NBD_OPT_EXPORT_NAME or the newer NBD_OPT_GO. This is an opaque string (not a filename, pathname etc). Currently filters and plugins can fetch this using nbdkit_export_name(). However it cannot be modified by filters. My proposal is that we extend the .open() callback with an extra export_name parameter: void *filter_open (int readonly, const char *export_name); For plugins I have already proposed this for inclusion in API V3. We already do this in nbdkit-sh-plugin. For backwards compatibility with existing plugins we'd make nbdkit_export_name() return the export name passed down by the last filter in the chain, and deprecate this function in V3. For filters the next_open field would take the export_name and this would allow filters to modify the export name. nbdkit-tar-filter would take an explicit tar-export-name=<new> parameter to pass a different export name down to the underlying plugin. If the tar filter was implementing its own export name processing then this parameter would either be required or would default to "". (If the tar filter was not implementing export name functionality it would pass it down unchanged). (b) Listing exports with NBD_OPT_LIST ------------------------------------- I believe we need a new .list_exports() callback for filters and plugins to handle this case. I'm not completely clear on the API, but it could be something as simple as: const char **list_exports (void); returning a NULL-terminated list of strings. Is it conceivable we will need additional fields in future? Plugins which do not implement this would be assumed to return a single export "" (or perhaps the -e parameter??) Filters could ignore or replace exports by optionally implementing this callback. I wouldn't recommend that filters modify export names however. (c) Handling NBD_OPT_INFO ------------------------- I believe our current implementation of this option (which is definitely not being robustly tested!) is something like this: - NBD_OPT_INFO received from client (containing the export name) - open a backend handle - query backend for flags - reply to client - close backend handle - process the next option Assuming this is actually correct, probably we can make sure that it passes the export name to the backend_open, and everything should just work. In fact this would happen as a side effect of fixing (a) above so I don't believe there's anything to do here except to have some test coverage. ---------------------------------------------------------------------- Thoughts welcomed as always, Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-p2v converts physical machines to virtual machines. Boot with a live CD or over the network (PXE) and turn machines into KVM guests. http://libguestfs.org/virt-v2v
Eric Blake
2020-Jul-21 14:04 UTC
Re: [Libguestfs] Extending nbdkit to support listing exports
On 7/21/20 7:46 AM, Richard W.M. Jones wrote:> Hi Eric, Nir. > > $SUBJECT - it's complicated! Imagine a scenario where we have > extended the file plugin so you can point it at a directory and it > will use the client-supplied export name to select a file to serve. > Also we have extended the tar filter so that you can map tar file > components into export names. You may be able to see where this is > going ... > > Now we point the file plugin at a directory of tar files, and layer > the tar filter on top, and then what happens? The client can only > supply a single export name, so how can it specify component X of tar > file Y? > > tar filter ---> file plugin > tar-exportname=on exportname=on > directory=/my-files ---> directory of tar files > > It gets worse! At the moment filters and plugins can read the export > name sent by the client by calling the global function > nbdkit_export_name(). I'm not so worried about filters, but plugins > must be able to continue to use this global function. In the scenario > above both the tar filter and the file plugin would see the same > string (if we did nothing), and it would only make sense to one or the > other but never to both. > > Listing exports is also complicated: Should we list only the nearest > filter's exports? Or the plugin's exports? Cartesian product of both > somehow?Definitely something for filters to decide, and not the core engine. But it does sound like we can come up with quite a bit of flexibility, and the idea of a Cartesian product (a single export name can tell both the tar filter which component to extract and the file plugin which tar file to serve) is indeed tempting.> > > That's the background, but how do we solve it? > ---------------------------------------------- > > I'd like to say first of all that I believe export names are already a > dumpster fire of potential insecurity, and that we shouldn't have the > core server attempt to parse them, so solutions involving splitting > the export names into components (like tar-component/file-name) should > not be considered because they are too risky, and run into arbitrary > limits (what escape character to use? what is the max length?) Also > changing the NBD protocol is not under consideration.Not directly, but maybe it is worth adding optional support to the NBD protocol to make it possible for NBD_OPT_LIST to support a hierarchical return. That is, pointing the file plugin to serve directory=base containing the following layout: base + f1 + dir1 + f2 + f3 + dir2 + f4 the current semantics of NBD_OPT_LIST only lend themselves to returning 4 NBD_REP_SERVER replies in a row: "f1", "dir1/f2", "dir1/f3", "dir2/f4". But what if the protocol added NBD_OPT_LIST_HIER which took a starting point argument, and new replies NBD_REP_SERVER_DIR and NBD_REP_ERR_NOTDIR, such that we could then have: NBD_OPT_LIST_HIER("") => SERVER("f1"), SERVER_DIR("dir1/"), SERVER_DIR("dir2/") NBD_OPT_LIST_HIER("dir1/") => SERVER("dir1/f2"), SERVER("dir1/f3") NBD_OPT_LIST_HIER("dir2/") => SERVER("dir2/f4") NBD_OPT_LIST_HIER("f1") => NBD_REP_ERR_NOTDIR NBD_OPT_LIST_HIER("nosuch") => NBD_REP_ERR_UNKNOWN True, this hierarchy is only available to new client/server pairs (if either side lacks the new code, you're back to flat listings), but it is food for thought. I'll propose something on the upstream NBD list if we think it has potential.> > I think we need to consider these aspects separately: > > (a) How filters and plugins get the client exportname. > > (b) How filters and plugins respond when the client issues NBD_OPT_LIST. > > (c) How the server, filters and plugins respond to NBD_OPT_INFO.I think (c) is already handled - they respond the same as they do to NBD_OPT_GO, but rather than starting to serve traffic, they go back to waiting for the next NBD_OPT_*.> > > (a) Client exportname > --------------------- > > The client sends the export name of the export it wants to access with > NBD_OPT_EXPORT_NAME or the newer NBD_OPT_GO. This is an opaque string > (not a filename, pathname etc). Currently filters and plugins can > fetch this using nbdkit_export_name(). However it cannot be modified > by filters. > > My proposal is that we extend the .open() callback with an extra > export_name parameter: > > void *filter_open (int readonly, const char *export_name); > > For plugins I have already proposed this for inclusion in API V3. We > already do this in nbdkit-sh-plugin. For backwards compatibility with > existing plugins we'd make nbdkit_export_name() return the export name > passed down by the last filter in the chain, and deprecate this > function in V3.Yeah, for plugins, we can't do anything easy without moving to a v3 protocol; but making the existing nbdkit_export_name() smarter about grabbing the name from the last filter rather than directly from the client makes sense for v2 filters.> > For filters the next_open field would take the export_name and this > would allow filters to modify the export name.For filters, we have total liberty to change our .open signature right now, even without introducing plugin v3.> > nbdkit-tar-filter would take an explicit tar-export-name=<new> > parameter to pass a different export name down to the underlying > plugin. If the tar filter was implementing its own export name > processing then this parameter would either be required or would > default to "". (If the tar filter was not implementing export name > functionality it would pass it down unchanged).Makes sense.> > > (b) Listing exports with NBD_OPT_LIST > ------------------------------------- > > I believe we need a new .list_exports() callback for filters and > plugins to handle this case. I'm not completely clear on the API, but > it could be something as simple as: > > const char **list_exports (void); > > returning a NULL-terminated list of strings.But what is the lifetime on those strings? Can the list change over time (that is, if I'm running 'nbdkit file dir=base', does it rerun opendir()/stat()/closedir() to populate a fresh list for each client, or does it only populate the list once, regardless of later underlying changes in the directory hierarchy it is wrapping)? Returning const char ** means the plugin has to manage the memory (the server will not touch anything), so whether the plugin computes the list up-front (compute the list during .load, free it during .unload) or per-client (compute the list during .preconnect or .list_exports, free during .close) should be reasonable. Thus, I'm assuming the sequence would be: client connects .preconnect if client calls NBD_OPT_LIST .list_exports if client calls NBD_OPT_INFO or NBD_OPT_GO .open .get_size .can_FOO... client disconnects .close that is, .preconnect always corresponds to the initial client connection before NBD_OPT_ handling, and .open corresponds to the point when the client requests a specific export, but .close can gracefully clean up whatever per-client results it generated for any client that actually asked for a listing. It also gives us the flexibility that a single client can both obtain a listing and choose which export to then request, rather than having to do two separate connections (one for the listing, the second specifying a specific export). I was wondering if we should instead have the core server handle memory, similar to how we handle .exports (that is, the server allocates a container, passes it to the plugin, and the plugin calls nbdkit_export_list_add("...") as many times as it wants, where the server now frees the list instead of throwing the burden on the plugin). Offhand, I'm thinking either approach could be made to work, so it becomes a decision on which is easier to maintain (both from the perspective of the core server code, and for an arbitrary plugin writer).> > Is it conceivable we will need additional fields in future?For now, only one field - a description string. That's all the more that NBD_REP_SERVER supports, but 'qemu-nbd -D' is an example of setting it. If my idea of adding NBD_OPT_INFO_HIER and NBD_REP_SERVER_DIR goes anywhere, that may be an opportunity to pass other information as well.> > Plugins which do not implement this would be assumed to return a > single export "" (or perhaps the -e parameter??)Both sound fine. In turn, what happens if -e is in use when a plugin does have .list_exports? Do we just ignore -e in that case?> > Filters could ignore or replace exports by optionally implementing > this callback. I wouldn't recommend that filters modify export names > however. > > > (c) Handling NBD_OPT_INFO > ------------------------- > > I believe our current implementation of this option (which is > definitely not being robustly tested!) is something like this: > > - NBD_OPT_INFO received from client (containing the export name) > - open a backend handle > - query backend for flags > - reply to client > - close backend handle > - process the next option > > Assuming this is actually correct, probably we can make sure that it > passes the export name to the backend_open, and everything should just > work. In fact this would happen as a side effect of fixing (a) above > so I don't believe there's anything to do here except to have some > test coverage.Yep, which is why I stated above that I think we already do (c).> > > ---------------------------------------------------------------------- > > Thoughts welcomed as always, > > Rich. >-- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Eric Blake
2020-Jul-21 14:30 UTC
Re: [Libguestfs] Extending nbdkit to support listing exports
On 7/21/20 9:04 AM, Eric Blake wrote:> > I was wondering if we should instead have the core server handle memory, > similar to how we handle .exports (that is, the server allocates aI meant .extents there.> container, passes it to the plugin, and the plugin calls > nbdkit_export_list_add("...") as many times as it wants, where the > server now frees the list instead of throwing the burden on the plugin). > Â Offhand, I'm thinking either approach could be made to work, so it > becomes a decision on which is easier to maintain (both from the > perspective of the core server code, and for an arbitrary plugin writer). >-- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Richard W.M. Jones
2020-Jul-21 14:39 UTC
Re: [Libguestfs] Extending nbdkit to support listing exports
On Tue, Jul 21, 2020 at 09:04:04AM -0500, Eric Blake wrote:> On 7/21/20 7:46 AM, Richard W.M. Jones wrote: > >(b) Listing exports with NBD_OPT_LIST > >------------------------------------- > > > >I believe we need a new .list_exports() callback for filters and > >plugins to handle this case. I'm not completely clear on the API, but > >it could be something as simple as: > > > > const char **list_exports (void); > > > >returning a NULL-terminated list of strings. > > But what is the lifetime on those strings? Can the list change over > time (that is, if I'm running 'nbdkit file dir=base', does it rerun > opendir()/stat()/closedir() to populate a fresh list for each > client, or does it only populate the list once, regardless of later > underlying changes in the directory hierarchy it is wrapping)? > Returning const char ** means the plugin has to manage the memory > (the server will not touch anything), so whether the plugin computes > the list up-front (compute the list during .load, free it during > .unload) or per-client (compute the list during .preconnect or > .list_exports, free during .close) should be reasonable.Yeah this wasn't very well specified. As a general principle we should do whatever is easiest for the plugin to implement, even if that means extra complexity / copying in the server.> Thus, I'm assuming the sequence would be: > > client connects > .preconnect > if client calls NBD_OPT_LIST > .list_exports > if client calls NBD_OPT_INFO or NBD_OPT_GO > .open > .get_size > .can_FOO... > client disconnects > .close > > that is, .preconnect always corresponds to the initial client > connection before NBD_OPT_ handling, and .open corresponds to the > point when the client requests a specific export, but .close can > gracefully clean up whatever per-client results it generated for any > client that actually asked for a listing. It also gives us the > flexibility that a single client can both obtain a listing and > choose which export to then request, rather than having to do two > separate connections (one for the listing, the second specifying a > specific export).We may need to consider this case for libnbd too. Unfortunately it's rather difficult to implement.> I was wondering if we should instead have the core server handle > memory, similar to how we handle .exports (that is, the server[I think you meant "extents"]> allocates a container, passes it to the plugin, and the plugin calls > nbdkit_export_list_add("...") as many times as it wants, where the > server now frees the list instead of throwing the burden on the > plugin). Offhand, I'm thinking either approach could be made to > work, so it becomes a decision on which is easier to maintain (both > from the perspective of the core server code, and for an arbitrary > plugin writer).Yes this sounds better. It was probably the reason why extents are implemented this way.> >Is it conceivable we will need additional fields in future? > > For now, only one field - a description string. That's all the more > that NBD_REP_SERVER supports, but 'qemu-nbd -D' is an example of > setting it.Ah I didn't realise there existed a server which supported this. We might need to consider implementing this for libnbd too.> If my idea of adding NBD_OPT_INFO_HIER and NBD_REP_SERVER_DIR goes > anywhere, that may be an opportunity to pass other information as > well. > > > > >Plugins which do not implement this would be assumed to return a > >single export "" (or perhaps the -e parameter??) > > Both sound fine. In turn, what happens if -e is in use when a > plugin does have .list_exports? Do we just ignore -e in that case?The -e option was a bit misguided and has never done anything important. I believe it would be a good move to deprecate this option right away. By commenting it out I can see the only things it does at the moment are: - Act as the default reply in our current implementation of NBD_OPT_LIST. We would replace this implementation under the current proposal anyway. - Passed through as $exportname to the --run script, where it is effectively useless. Export names are per-connection, not per-server, and the true export name for plugins which support them is not and cannot ever be passed to this script. - Be incompatible with the -o option for no particularly strong reason, but nominally because the oldstyle protocol doesn't support export names. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-df lists disk usage of guests without needing to install any software inside the virtual machine. Supports Linux and Windows. http://people.redhat.com/~rjones/virt-df/