Eric Blake
2018-Jan-19 16:22 UTC
Re: [Libguestfs] [PATCH nbdkit filters-v2 2/5] Introduce filters.
On 01/19/2018 09:23 AM, Richard W.M. Jones wrote:> Filters can be placed in front of plugins to modify their behaviour. > > This commit adds the <nbdkit-filter.h> header file, the manual page, > the ‘filterdir’ directory (like ‘plugindir’), the ‘filters/’ source > directory which will contain the actual filters, the ‘--filters’ > parameter, and the filters backend logic. > ---> +++ b/TODO > @@ -34,10 +34,8 @@ nbdkit there is no compelling reason unless the result is better than > qemu-nbd. For the majority of users it would be better if they were > directed to qemu-nbd for these use cases. > > -Filters > -------- > - > -It should be possible to layer filters over plugins to do things like: > +Suggestions for filters > +----------------------- > > * adding artificial delays (see wdelay/rdelay options in the file > plugin)How about a filter that intentionally disables WRITE_ZEROES and/or FUA support, if for no other reason than for testing sane fallbacks and comparing speed differences?> +++ b/docs/nbdkit-filter.pod > @@ -0,0 +1,501 @@ > +=encoding utf8 > + > +=head1 NAME > + > +nbdkit-filter - How to write nbdkit filters > + > +=head1 SYNOPSIS> +To see example filters, take a look at the source of nbdkit, in the > +C<filters> directory. > + > +Filters must be written in C and must be fully thread safe.Should we mention that the rules on filter semantics are stricter than for plugins, and that unlike plugins, we don't guarantee stable cross-version API?> +=head1 CALLBACKS> +=head2 C<.get_size> > + > + int64_t (*get_size) (struct nbdkit_next_ops *next_ops, void *nxdata, > + void *handle); > + > +This intercepts the plugin C<.get_size> method and can be used to read > +or modify the apparent size of the block device that the NBD client > +will see. > + > +The returned size must be E<ge> 0. If there is an error, C<.get_size> > +should call C<nbdkit_error> with an error message and return C<-1>.The rule on non-zero size matches plugins, but I'm not sure if there is a technical reason why we have that restriction. Down the road, when I get the upstream NBD resize proposal finalized, I can see the ability to create an image with size 0, then use resize to update it, as something that would be nice to support.> + > +=head2 C<.pread> > + > + int (*pread) (struct nbdkit_next_ops *next_ops, void *nxdata, > + void *handle, void *buf, uint32_t count, uint64_t offset);Wrong signature, missing the uint32_t flags that the backend interface has, and that I'm adding for plugins API_VERSION 2 for FUA support.> +=head2 C<.zero> > + > + int (*zero) (struct nbdkit_next_ops *next_ops, void *nxdata, > + void *handle, uint32_t count, uint64_t offset, int may_trim); > +And so on (uint32_t flags instead of int may_trim).> +=head1 THREADS > + > +Because filters can be mixed and used with any plugin and thus any > +threading model supported by L<nbdkit-plugin(3)>, filters must be > +thread safe. They must be able to handle concurrent requests even on > +the same handle. > + > +Filters may have to use pthread primitives like mutexes to achieve > +this.Would it ever make sense to allow a filter to intentionally request a lower concurrency level than the underlying plugin? At connection time, you'd compute the threading level as the lowest level requested by any element of the chain; it might make it easier to write plugins that aren't fully parallel (such filters may penalize the speed that can otherwise be achieved by using the plugin in isolation - but again, it may be useful for testing). That would imply adding a backend.thread_model() callback, which needs to be exposed to filters but not to plugins (plugins continue to use the #define at compile time); if the filter does not define the .thread_model callback, it defaults to fully-parallel; but if it does define the callback, it can result in something less concurrent.> +++ b/docs/nbdkit.pod > @@ -7,7 +7,7 @@ nbdkit - A toolkit for creating NBD servers > =head1 SYNOPSIS > > nbdkit [-e EXPORTNAME] [--exit-with-parent] [-f] > - [-g GROUP] [-i IPADDR] > + [--filter=FILTER ...] [-g GROUP] [-i IPADDR] > [--newstyle] [--oldstyle] [-P PIDFILE] [-p PORT] [-r] > [--run CMD] [-s] [--selinux-label LABEL] [-t THREADS] > [--tls=off|on|require] [--tls-certificates /path/to/certificates] > @@ -119,6 +119,13 @@ not allowed with the oldstyle protocol. > > I<Don't> fork into the background. > > +=item B<--filter> FILTER > + > +Add a filter before the plugin. This option may be given one or more > +times to stack filters in front of the plugin. They are processed in > +the order they appear on the command line. See L</FILTERS> and > +L<nbdkit-filter(3)>. > +Does it ever make sense to provide the same filter more than once on the command line? Or are we stating that a given filter can only be used once in the chain?> +struct nbdkit_next_ops { > + int64_t (*get_size) (void *nxdata); > + > + int (*can_write) (void *nxdata); > + int (*can_flush) (void *nxdata); > + int (*is_rotational) (void *nxdata); > + int (*can_trim) (void *nxdata); > + > + int (*pread) (void *nxdata, void *buf, uint32_t count, uint64_t offset); > + int (*pwrite) (void *nxdata, > + const void *buf, uint32_t count, uint64_t offset); > + int (*flush) (void *nxdata); > + int (*trim) (void *nxdata, uint32_t count, uint64_t offset); > + int (*zero) (void *nxdata, uint32_t count, uint64_t offset, int may_trim); > +};Your documentation matches your code, and did not pick up my potential changes to add 'uint32_t flags' everywhere. I'll reply shortly with what my rebase looks like on top of yours... -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Richard W.M. Jones
2018-Jan-19 17:12 UTC
Re: [Libguestfs] [PATCH nbdkit filters-v2 2/5] Introduce filters.
On Fri, Jan 19, 2018 at 10:22:27AM -0600, Eric Blake wrote:> > +Suggestions for filters > > +----------------------- > > > > * adding artificial delays (see wdelay/rdelay options in the file > > plugin) > > How about a filter that intentionally disables WRITE_ZEROES and/or FUA > support, if for no other reason than for testing sane fallbacks and > comparing speed differences?If you have ideas for the TODO file just go ahead and commit them, no need for review.> > +=head1 SYNOPSIS > > > +To see example filters, take a look at the source of nbdkit, in the > > +C<filters> directory. > > + > > +Filters must be written in C and must be fully thread safe. > > Should we mention that the rules on filter semantics are stricter than > for plugins, and that unlike plugins, we don't guarantee stable > cross-version API?Yes, we'll need to tighten up the language here. I noted your other patch which modified this text.> > +=head1 CALLBACKS > > > +=head2 C<.get_size> > > + > > + int64_t (*get_size) (struct nbdkit_next_ops *next_ops, void *nxdata, > > + void *handle); > > + > > +This intercepts the plugin C<.get_size> method and can be used to read > > +or modify the apparent size of the block device that the NBD client > > +will see. > > + > > +The returned size must be E<ge> 0. If there is an error, C<.get_size> > > +should call C<nbdkit_error> with an error message and return C<-1>. > > The rule on non-zero size matches plugins, but I'm not sure if there is > a technical reason why we have that restriction. Down the road, when I > get the upstream NBD resize proposal finalized, I can see the ability to > create an image with size 0, then use resize to update it, as something > that would be nice to support.OK that's interesting. BTW qemu has historically had lots of bugs in the block layer related to zero (or even < 4096 byte) devices. We added a workaround in libguestfs: https://github.com/libguestfs/libguestfs/blob/a30b51747fc6605f50a9d5d8095fd2b6d1473b1c/lib/drives.c#L395> > + > > +=head2 C<.pread> > > + > > + int (*pread) (struct nbdkit_next_ops *next_ops, void *nxdata, > > + void *handle, void *buf, uint32_t count, uint64_t offset); > > Wrong signature, missing the uint32_t flags that the backend interface > has, and that I'm adding for plugins API_VERSION 2 for FUA support.Right, but this patch is against the version 1 API. We will need to add flags at some point.> > +=head1 THREADS > > + > > +Because filters can be mixed and used with any plugin and thus any > > +threading model supported by L<nbdkit-plugin(3)>, filters must be > > +thread safe. They must be able to handle concurrent requests even on > > +the same handle. > > + > > +Filters may have to use pthread primitives like mutexes to achieve > > +this. > > Would it ever make sense to allow a filter to intentionally request a > lower concurrency level than the underlying plugin? At connection time, > you'd compute the threading level as the lowest level requested by any > element of the chain; it might make it easier to write plugins that > aren't fully parallel (such filters may penalize the speed that can > otherwise be achieved by using the plugin in isolation - but again, it > may be useful for testing). That would imply adding a > backend.thread_model() callback, which needs to be exposed to filters > but not to plugins (plugins continue to use the #define at compile > time); if the filter does not define the .thread_model callback, it > defaults to fully-parallel; but if it does define the callback, it can > result in something less concurrent.Yes, this is a good idea actually.> > +=item B<--filter> FILTER > > + > > +Add a filter before the plugin. This option may be given one or more > > +times to stack filters in front of the plugin. They are processed in > > +the order they appear on the command line. See L</FILTERS> and > > +L<nbdkit-filter(3)>. > > + > > Does it ever make sense to provide the same filter more than once on the > command line? Or are we stating that a given filter can only be used > once in the chain?I couldn't think of a case where we'd need the same filter multiple times. It could be made to work but we'd have to ban global variables and modify the load() function. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-df lists disk usage of guests without needing to install any software inside the virtual machine. Supports Linux and Windows. http://people.redhat.com/~rjones/virt-df/
Eric Blake
2018-Jan-19 19:38 UTC
Re: [Libguestfs] [PATCH nbdkit filters-v2 2/5] Introduce filters.
On 01/19/2018 11:12 AM, Richard W.M. Jones wrote:>>> + >>> +=head2 C<.pread> >>> + >>> + int (*pread) (struct nbdkit_next_ops *next_ops, void *nxdata, >>> + void *handle, void *buf, uint32_t count, uint64_t offset); >> >> Wrong signature, missing the uint32_t flags that the backend interface >> has, and that I'm adding for plugins API_VERSION 2 for FUA support. > > Right, but this patch is against the version 1 API. We will need to > add flags at some point.So, do we ever want filters to be usable via version 1 instead of version 2 backend API, or should we be aiming that all filters are automatically version 2 from the point where we release this? I'm fine if we have some churn in .git by taking your patches with version 1 signatures then my FUA patches on top that converts to version 2; I'm more worried that if we allow filters with version 1 semantics that we may run into some interesting difficulties (not the least of which is that a version 2 plugin that can support efficient FUA is inherently penalized if it is used with a filter talking only version 1).>>> +Filters may have to use pthread primitives like mutexes to achieve >>> +this. >> >> Would it ever make sense to allow a filter to intentionally request a >> lower concurrency level than the underlying plugin? At connection time, >> you'd compute the threading level as the lowest level requested by any >> element of the chain; it might make it easier to write plugins that >> aren't fully parallel (such filters may penalize the speed that can >> otherwise be achieved by using the plugin in isolation - but again, it >> may be useful for testing). That would imply adding a >> backend.thread_model() callback, which needs to be exposed to filters >> but not to plugins (plugins continue to use the #define at compile >> time); if the filter does not define the .thread_model callback, it >> defaults to fully-parallel; but if it does define the callback, it can >> result in something less concurrent. > > Yes, this is a good idea actually.Are you planning on writing that, or do you want me to take a shot at it?> >>> +=item B<--filter> FILTER >>> + >>> +Add a filter before the plugin. This option may be given one or more >>> +times to stack filters in front of the plugin. They are processed in >>> +the order they appear on the command line. See L</FILTERS> and >>> +L<nbdkit-filter(3)>. >>> + >> >> Does it ever make sense to provide the same filter more than once on the >> command line? Or are we stating that a given filter can only be used >> once in the chain? > > I couldn't think of a case where we'd need the same filter multiple > times. It could be made to work but we'd have to ban global > variables and modify the load() function.I can (sort of) see it if we support composition; but again, if composition is done right, you still have just a single use of each plugin per nbdkit process. Let's say I want to access [0-100M) and [200M-300M) of a given file via composition. Here's a possible way to write it: nbdkit --filter=compose compose='--filter=offset offset=0 range=100M file file=FILE' --filter=offset offset=200M range=100M file file=FILE Is there a way to rewrite it so I don't have to nest quotes when doing more than one composition? If there is, then --filter=offset appears twice, as does 'file file=FILE'. But either way, under the hood, we'd implement it as two separate nbdkit processes if composition is a filter: nbdkit --> compose --> offset 0/100 --> file \-> nbd ==> nbdkit --> offset 200/100 --> file where -> is the backend chain, and ==> is an NBD socket. Or maybe composition is a plugin, in which case it might look more like: nbdkit compose sub='--filter=offset offset=0 range=100M file file=FILE' sub='--filter=offset offset=200M range=100M file file=FILE' (Can config even accept the same key= with different values more than once on the command line?) nbdkit --> compose ==> nbdkit --> offset 0/100 --> file \=> nbdkit --> offset 200/100 --> file Certainly some tradeoffs to revisit depending on whether composition fits better as a filter or a plugin -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Apparently Analagous Threads
- Re: [PATCH nbdkit filters-v2 2/5] Introduce filters.
- [PATCH nbdkit filters-v3 3/7] Introduce filters.
- [PATCH nbdkit filters-v2 2/5] Introduce filters.
- [nbdkit PATCH 1/2] filters: Add .export_description wrappers
- Re: [nbdkit PATCH 1/2] filters: Add .export_description wrappers