Martin Kletzander
2019-Mar-12 08:04 UTC
Re: [Libguestfs] Supporting sparse disks in nbdkit
On Mon, Mar 11, 2019 at 02:53:55PM -0500, Eric Blake wrote:>On 3/11/19 2:44 PM, Richard W.M. Jones wrote: >> On Fri, Mar 08, 2019 at 10:50:35AM -0600, Eric Blake wrote: >>>> int extents (void *handle, uint32_t count, uint64_t offset, >>>> uint32_t flags /* always 0? */, >>> >>> Unless we support the REQ_ONE flag (where the client specifically wants >>> only one extent returned). >> ... >>>> nbdkit would need to do some massaging on this to get it into the >>>> right format for NBD_CMD_BLOCK_STATUS. (I'm very confused about what >>>> NBD_CMD_FLAG_REQ_ONE is supposed to do.) >>> >>> REQ_ONE says that a successful answer has to be exactly one extent, >>> covering only the head of the region. (That is, force the server to >>> behave as if obtaining information beyond the first extent is >>> expensive). It also tells the server that it must not report extra >>> information beyond the initial request (because at least qemu 3.0 would >>> assert that the server supplied too much information). >>> >>> You're also right that nbdkit can do REQ_ONE massaging, so whether we >>> pass REQ_ONE on to the client or not makes it sound like the client >>> needs a tri-state opt-in (no block status support, block status but let >>> nbdkit handle REQ_ONE, block status and the plugin can handle REQ_ONE >>> efficiently itself) >> >> I had a think about the REQ_ONE flag a little. What do you think >> about this plan? >> >> If NBD_CMD_FLAG_REQ_ONE is passed by the nbd client, nbdkit does not >> need to pass it to the plugin. Instead nbdkit would turn the request >> into: >> >> extents (count = 1, offset = <offset>) >> >> The plugin could still respond with multiple extents, but nbdkit would >> pick only the one covering the <offset> to send back. >> >> In this case can_extents would return just a boolean since there is no >> special supported needed by the plugin AFAICT. >Is that needed then? You would know whether the plugin can do extents() based on the function pointer being set or not. I mean can_write makes sense as that is per file (volume/anything), but can_extents would be set per plugin and it would be set to ".extents != NULL".>Seems like it would work. Still, I worry that some plugin might be doing >some expensive work to compute a list of extents, only for nbdkit to >throw that work away, vs. being able to accept a flag to know that >nothing more than one extent will be used so the plugin might as well >quit work early if it wants. Maybe we just document that we pass the >flag through when the client requested it, but also permit the plugin to >ignore the flag (plugins that care have the flag to optimize by, and >plugins that don't care will not violate client expectations because >nbdkit clamps the answer appropriately). >That would be visible by the `@count` parameter. But if that count will only be 0 (give me all you've got) and 1 (basically translated REQ_ONE), maybe instead of passing the `count`, it could be just the flag, i.e.: extents(offset, flags). When I think about it, what seems more usable might be a `length` parameter so that the plugin does not need to check the whole file if someone is asking for a part of the disk only. Of course that thinking might be flawed as I don't know what the request looks like in the NBD protocol.>-- >Eric Blake, Principal Software Engineer >Red Hat, Inc. +1-919-301-3226 >Virtualization: qemu.org | libvirt.org >
Richard W.M. Jones
2019-Mar-12 08:26 UTC
Re: [Libguestfs] Supporting sparse disks in nbdkit
On Tue, Mar 12, 2019 at 09:04:59AM +0100, Martin Kletzander wrote:> On Mon, Mar 11, 2019 at 02:53:55PM -0500, Eric Blake wrote: > >On 3/11/19 2:44 PM, Richard W.M. Jones wrote: > >>On Fri, Mar 08, 2019 at 10:50:35AM -0600, Eric Blake wrote: > >>>> int extents (void *handle, uint32_t count, uint64_t offset, > >>>> uint32_t flags /* always 0? */, > >>> > >>>Unless we support the REQ_ONE flag (where the client specifically wants > >>>only one extent returned). > >>... > >>>>nbdkit would need to do some massaging on this to get it into the > >>>>right format for NBD_CMD_BLOCK_STATUS. (I'm very confused about what > >>>>NBD_CMD_FLAG_REQ_ONE is supposed to do.) > >>> > >>>REQ_ONE says that a successful answer has to be exactly one extent, > >>>covering only the head of the region. (That is, force the server to > >>>behave as if obtaining information beyond the first extent is > >>>expensive). It also tells the server that it must not report extra > >>>information beyond the initial request (because at least qemu 3.0 would > >>>assert that the server supplied too much information). > >>> > >>>You're also right that nbdkit can do REQ_ONE massaging, so whether we > >>>pass REQ_ONE on to the client or not makes it sound like the client > >>>needs a tri-state opt-in (no block status support, block status but let > >>>nbdkit handle REQ_ONE, block status and the plugin can handle REQ_ONE > >>>efficiently itself) > >> > >>I had a think about the REQ_ONE flag a little. What do you think > >>about this plan? > >> > >>If NBD_CMD_FLAG_REQ_ONE is passed by the nbd client, nbdkit does not > >>need to pass it to the plugin. Instead nbdkit would turn the request > >>into: > >> > >>extents (count = 1, offset = <offset>) > >> > >>The plugin could still respond with multiple extents, but nbdkit would > >>pick only the one covering the <offset> to send back. > >> > >>In this case can_extents would return just a boolean since there is no > >>special supported needed by the plugin AFAICT. > > > > Is that needed then? You would know whether the plugin can do > extents() based on the function pointer being set or not. I mean > can_write makes sense as that is per file (volume/anything), but > can_extents would be set per plugin and it would be set to ".extents > != NULL".We have can_* functions for most optional plugin features for a couple of reasons: (1) In non-C languages you can't just check the function pointer. (2) It's possible (in fact, quite likely) that a plugin could support extents only for certain disk images, but would always have an ‘extents’ function. As an example of (2), on Linux different filesystems have different levels of support for sparseness and sparse detection and so the nbdkit-file-plugin might implement ‘extents’ conditionally. As another example, a generalized "archive" plugin would support sparseness only for (say) tarballs but not for ZIP files.> >Seems like it would work. Still, I worry that some plugin might be doing > >some expensive work to compute a list of extents, only for nbdkit to > >throw that work away, vs. being able to accept a flag to know that > >nothing more than one extent will be used so the plugin might as well > >quit work early if it wants. Maybe we just document that we pass the > >flag through when the client requested it, but also permit the plugin to > >ignore the flag (plugins that care have the flag to optimize by, and > >plugins that don't care will not violate client expectations because > >nbdkit clamps the answer appropriately). > > That would be visible by the `@count` parameter. But if that count > will only be 0 (give me all you've got) and 1 (basically translated > REQ_ONE), maybe instead of passing the `count`, it could be just the > flag, i.e.: extents(offset, flags). When I think about it, what > seems more usable might be a `length` parameter so that the plugin > does not need to check the whole file if someone is asking for a > part of the disk only. Of course that thinking might be flawed as I > don't know what the request looks like in the NBD protocol.I think I'm going to pass the REQ_ONE flag to the plugin, but the plugin is free to either ignore it or implement it as an optimization. If the flag is present and the plugin returns more than one extent then nbdkit will determine the single extent to return. As a general approach we want plugins to be easy to write and for nbdkit to do the hard work when necessary. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-builder quickly builds VMs from scratch http://libguestfs.org/virt-builder.1.html
Martin Kletzander
2019-Mar-12 09:41 UTC
Re: [Libguestfs] Supporting sparse disks in nbdkit
On Tue, Mar 12, 2019 at 08:26:46AM +0000, Richard W.M. Jones wrote:>On Tue, Mar 12, 2019 at 09:04:59AM +0100, Martin Kletzander wrote: >> On Mon, Mar 11, 2019 at 02:53:55PM -0500, Eric Blake wrote: >> >On 3/11/19 2:44 PM, Richard W.M. Jones wrote: >> >>On Fri, Mar 08, 2019 at 10:50:35AM -0600, Eric Blake wrote: >> >>>> int extents (void *handle, uint32_t count, uint64_t offset, >> >>>> uint32_t flags /* always 0? */, >> >>> >> >>>Unless we support the REQ_ONE flag (where the client specifically wants >> >>>only one extent returned). >> >>... >> >>>>nbdkit would need to do some massaging on this to get it into the >> >>>>right format for NBD_CMD_BLOCK_STATUS. (I'm very confused about what >> >>>>NBD_CMD_FLAG_REQ_ONE is supposed to do.) >> >>> >> >>>REQ_ONE says that a successful answer has to be exactly one extent, >> >>>covering only the head of the region. (That is, force the server to >> >>>behave as if obtaining information beyond the first extent is >> >>>expensive). It also tells the server that it must not report extra >> >>>information beyond the initial request (because at least qemu 3.0 would >> >>>assert that the server supplied too much information). >> >>> >> >>>You're also right that nbdkit can do REQ_ONE massaging, so whether we >> >>>pass REQ_ONE on to the client or not makes it sound like the client >> >>>needs a tri-state opt-in (no block status support, block status but let >> >>>nbdkit handle REQ_ONE, block status and the plugin can handle REQ_ONE >> >>>efficiently itself) >> >> >> >>I had a think about the REQ_ONE flag a little. What do you think >> >>about this plan? >> >> >> >>If NBD_CMD_FLAG_REQ_ONE is passed by the nbd client, nbdkit does not >> >>need to pass it to the plugin. Instead nbdkit would turn the request >> >>into: >> >> >> >>extents (count = 1, offset = <offset>) >> >> >> >>The plugin could still respond with multiple extents, but nbdkit would >> >>pick only the one covering the <offset> to send back. >> >> >> >>In this case can_extents would return just a boolean since there is no >> >>special supported needed by the plugin AFAICT. >> > >> >> Is that needed then? You would know whether the plugin can do >> extents() based on the function pointer being set or not. I mean >> can_write makes sense as that is per file (volume/anything), but >> can_extents would be set per plugin and it would be set to ".extents >> != NULL". > >We have can_* functions for most optional plugin features for a couple >of reasons: > >(1) In non-C languages you can't just check the function pointer. >This makes sense, that's what I was missing.>(2) It's possible (in fact, quite likely) that a plugin could support >extents only for certain disk images, but would always have an >‘extents’ function. > >As an example of (2), on Linux different filesystems have different >levels of support for sparseness and sparse detection and so the >nbdkit-file-plugin might implement ‘extents’ conditionally. As >another example, a generalized "archive" plugin would support >sparseness only for (say) tarballs but not for ZIP files. >This does too, but I thought this would be "avoided" by just reporting the whole zip as dense.>> >Seems like it would work. Still, I worry that some plugin might be doing >> >some expensive work to compute a list of extents, only for nbdkit to >> >throw that work away, vs. being able to accept a flag to know that >> >nothing more than one extent will be used so the plugin might as well >> >quit work early if it wants. Maybe we just document that we pass the >> >flag through when the client requested it, but also permit the plugin to >> >ignore the flag (plugins that care have the flag to optimize by, and >> >plugins that don't care will not violate client expectations because >> >nbdkit clamps the answer appropriately). >> >> That would be visible by the `@count` parameter. But if that count >> will only be 0 (give me all you've got) and 1 (basically translated >> REQ_ONE), maybe instead of passing the `count`, it could be just the >> flag, i.e.: extents(offset, flags). When I think about it, what >> seems more usable might be a `length` parameter so that the plugin >> does not need to check the whole file if someone is asking for a >> part of the disk only. Of course that thinking might be flawed as I >> don't know what the request looks like in the NBD protocol. > >I think I'm going to pass the REQ_ONE flag to the plugin, but the >plugin is free to either ignore it or implement it as an optimization. >If the flag is present and the plugin returns more than one extent >then nbdkit will determine the single extent to return. >I think that's the nicest approach.>As a general approach we want plugins to be easy to write and for >nbdkit to do the hard work when necessary. > >Rich. > >-- >Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones >Read my programming and virtualization blog: http://rwmj.wordpress.com >virt-builder quickly builds VMs from scratch >http://libguestfs.org/virt-builder.1.html