Eric Blake
2018-Jan-17 22:38 UTC
Re: [Libguestfs] [PATCH 9/9] filters: Move rdelay/wdelay from file plugin to new delay filter.
On 01/17/2018 02:53 PM, Richard W.M. Jones wrote:> Previously the file plugin supported ‘rdelay’ and ‘wdelay’ parameters > for injecting delays (for testing) into read and write requests. This > moves the functionality to a new delay filter so that it can be used > with any plugin. > ---> +/* Write data. */ > +static int > +delay_pwrite (struct nbdkit_next *next, void *nxdata, > + void *handle, > + const void *buf, uint32_t count, uint64_t offset) > +{ > + write_delay (); > + return next->pwrite (nxdata, buf, count, offset); > +} > + > +/* Zero data. */ > +static int > +delay_zero (struct nbdkit_next *next, void *nxdata, > + void *handle, uint32_t count, uint64_t offset, int may_trim) > +{ > + write_delay (); > + return next->zero (nxdata, count, offset, may_trim);If next->zero() fails with EOPNOTSUPP, that means we will delay once in trying the underlying command, and again for each iteration of the fallback loop as it calls delay_pwrite(). Is that okay, or do we want to reproduce some fallback logic here and directly call next->pwrite on EOPNOTSUPP so as to only have a single write delay in that case?> +++ b/plugins/file/nbdkit-file-plugin.pod > @@ -31,21 +31,13 @@ This parameter is required. > > =item B<rdelay=E<lt>NNE<gt>ms> > > -Delay reads for C<SECS> seconds or C<NN> milliseconds. > -This is used to simulate a slow or remote server, or to > -test certain kinds of race conditions in Linux. > - > -The default is no delay. > - > =item B<wdelay=SECS> > > =item B<wdelay=E<lt>NNE<gt>ms> > > -Delay writes for C<SECS> seconds or C<NN> milliseconds. > -This is used to simulate a slow or remote server, or to > -test certain kinds of race conditions in Linux. > - > -The default is no delay. > +These plugin parameters have been moved to the > +L<nbdkit-delay-filter(1)> filter. Modify the command line to add > +I<--filter=delay> in order to use these parameters.I guess we don't promise back-comptibility on plugin parameters, but at least you've documented how to fix things (and the testsuite had to follow that advice). -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Eric Blake
2018-Jan-18 02:01 UTC
Re: [Libguestfs] [PATCH 9/9] filters: Move rdelay/wdelay from file plugin to new delay filter.
On 01/17/2018 04:38 PM, Eric Blake wrote:> On 01/17/2018 02:53 PM, Richard W.M. Jones wrote: >> Previously the file plugin supported ‘rdelay’ and ‘wdelay’ parameters >> for injecting delays (for testing) into read and write requests. This >> moves the functionality to a new delay filter so that it can be used >> with any plugin. >> --- > >> +/* Write data. */ >> +static int >> +delay_pwrite (struct nbdkit_next *next, void *nxdata, >> + void *handle, >> + const void *buf, uint32_t count, uint64_t offset) >> +{ >> + write_delay (); >> + return next->pwrite (nxdata, buf, count, offset); >> +} >> + >> +/* Zero data. */ >> +static int >> +delay_zero (struct nbdkit_next *next, void *nxdata, >> + void *handle, uint32_t count, uint64_t offset, int may_trim) >> +{ >> + write_delay (); >> + return next->zero (nxdata, count, offset, may_trim); > > If next->zero() fails with EOPNOTSUPP, that means we will delay once in > trying the underlying command, and again for each iteration of the > fallback loop as it calls delay_pwrite(). Is that okay, or do we want > to reproduce some fallback logic here and directly call next->pwrite on > EOPNOTSUPP so as to only have a single write delay in that case?Actually, thinking about this more: When I added zero support, I documented in commit ac3f294a that for the file plugin, wdelay is indeed doubled on systems lacking efficient zero support. But the fallback for handling EOPNOTSUPP is currently done at the plugins.c level (ie. it is part of next->zero()) - so a filter should never see a plugin returning EOPNOTSUPP. Or put another way, your movement of wdelay from being part of the file plugin to now being a separate filter actually FIXES the double delay. On the other hand, a filter can itself return EOPNOTSUPP (rather than calling next->zero), in which case we still need to support the fallback to .pwrite higher up in the call stack Along the same lines, with my FUA patches, if we keep the fallback of calling .flush in connections.c, then the flush will pass through the entire filter chain; but I had proposed moving the fallback flush into plugins.c, at which point the fallback flush is done at the layer that lacks FUA support. And maybe we want a flush delay to mirror the existing rdelay and wdelay (simulating a long flush time can indeed be useful in tests); where it definitely matters whether the flush delay is triggered as part of FUA fallback, or only triggered on an actual NBD_CMD_FLUSH from the client. I'm also thinking ahead to expansions - for example, there are proposals to add resize and block status commands to NBD. In your implementation, if a filter does not define .can_flush or .flush, then its .can_flush implementation merely returns whatever the underlying backend's version would report. But as we add new callbacks, such as a new .can_resize, we need to make sure we have sane defaults. For example, the offset filter should probably not allow resizes (even if the underlying plugin does). If the offset filter compiled in a version of nbdkit that lacks the .can_resize hook is run by nbdkit that has added the hook, we therefore want to default .can_resize to false as part of loading the smaller filter struct into memory (and NOT default to falling through to the plugin's .can_resize); a default based on fall-through should only happen if the filter had a size large enough to admit that the user explicitly omitted the callback in the filter, rather than it being NULL because of version mismatch. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Richard W.M. Jones
2018-Jan-18 08:47 UTC
Re: [Libguestfs] [PATCH 9/9] filters: Move rdelay/wdelay from file plugin to new delay filter.
On Wed, Jan 17, 2018 at 04:38:04PM -0600, Eric Blake wrote:> > +/* Zero data. */ > > +static int > > +delay_zero (struct nbdkit_next *next, void *nxdata, > > + void *handle, uint32_t count, uint64_t offset, int may_trim) > > +{ > > + write_delay (); > > + return next->zero (nxdata, count, offset, may_trim); > > If next->zero() fails with EOPNOTSUPP, that means we will delay once in > trying the underlying command, and again for each iteration of the > fallback loop as it calls delay_pwrite(). Is that okay, or do we want > to reproduce some fallback logic here and directly call next->pwrite on > EOPNOTSUPP so as to only have a single write delay in that case?Good point. How about this, it seems simpler: static int delay_zero (struct nbdkit_next *next, void *nxdata, void *handle, uint32_t count, uint64_t offset, int may_trim) { int r = next->zero (nxdata, count, offset, may_trim); if (r != -1) write_delay (); return r; } And the same for the other methods in that filter. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com libguestfs lets you edit virtual machines. Supports shell scripting, bindings from many languages. http://libguestfs.org
Richard W.M. Jones
2018-Jan-18 08:55 UTC
Re: [Libguestfs] [PATCH 9/9] filters: Move rdelay/wdelay from file plugin to new delay filter.
On Wed, Jan 17, 2018 at 08:01:35PM -0600, Eric Blake wrote:> Actually, thinking about this more: > > When I added zero support, I documented in commit ac3f294a that for the > file plugin, wdelay is indeed doubled on systems lacking efficient zero > support. But the fallback for handling EOPNOTSUPP is currently done at > the plugins.c level (ie. it is part of next->zero()) - so a filter > should never see a plugin returning EOPNOTSUPP. Or put another way, > your movement of wdelay from being part of the file plugin to now being > a separate filter actually FIXES the double delay. On the other hand, a > filter can itself return EOPNOTSUPP (rather than calling next->zero), in > which case we still need to support the fallback to .pwrite higher up > in the call stackOK, so ignore my previous email. However, I'm wary of doing fallbacks in general in the filter layer, for a couple of reasons: (1) If you run some filters twice (eg. offset filter) then you end up doubling offsets which will cause data corruption. This doesn't apply in the EOPNOTSUPP case, but it does apply in the .zero fallback case (where it falls back to .write). (2) Filters are a new thing and we can just declare that they cannot use EOPNOTSUPP.> Along the same lines, with my FUA patches, if we keep the fallback of > calling .flush in connections.c, then the flush will pass through the > entire filter chain; but I had proposed moving the fallback flush into > plugins.c, at which point the fallback flush is done at the layer that > lacks FUA support. And maybe we want a flush delay to mirror the > existing rdelay and wdelay (simulating a long flush time can indeed be > useful in tests); where it definitely matters whether the flush delay is > triggered as part of FUA fallback, or only triggered on an actual > NBD_CMD_FLUSH from the client. > > I'm also thinking ahead to expansions - for example, there are proposals > to add resize and block status commands to NBD. In your implementation, > if a filter does not define .can_flush or .flush, then its .can_flush > implementation merely returns whatever the underlying backend's version > would report. But as we add new callbacks, such as a new .can_resize, > we need to make sure we have sane defaults. For example, the offset > filter should probably not allow resizes (even if the underlying plugin > does). If the offset filter compiled in a version of nbdkit that lacks > the .can_resize hook is run by nbdkit that has added the hook, we > therefore want to default .can_resize to false as part of loading the > smaller filter struct into memory (and NOT default to falling through to > the plugin's .can_resize); a default based on fall-through should only > happen if the filter had a size large enough to admit that the user > explicitly omitted the callback in the filter, rather than it being NULL > because of version mismatch.We may have to say that the ABI guarantee only applies to plugins, which might not be a bad thing since filters are much harder to write than plugins and are going to be quite closely tied to both the version and precise implementation of nbdkit. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-df lists disk usage of guests without needing to install any software inside the virtual machine. Supports Linux and Windows. http://people.redhat.com/~rjones/virt-df/
Eric Blake
2018-Jan-18 14:19 UTC
Re: [Libguestfs] [PATCH 9/9] filters: Move rdelay/wdelay from file plugin to new delay filter.
On 01/18/2018 02:47 AM, Richard W.M. Jones wrote:> On Wed, Jan 17, 2018 at 04:38:04PM -0600, Eric Blake wrote: >>> +/* Zero data. */ >>> +static int >>> +delay_zero (struct nbdkit_next *next, void *nxdata, >>> + void *handle, uint32_t count, uint64_t offset, int may_trim) >>> +{ >>> + write_delay (); >>> + return next->zero (nxdata, count, offset, may_trim); >> >> If next->zero() fails with EOPNOTSUPP, that means we will delay once in >> trying the underlying command, and again for each iteration of the >> fallback loop as it calls delay_pwrite(). Is that okay, or do we want >> to reproduce some fallback logic here and directly call next->pwrite on >> EOPNOTSUPP so as to only have a single write delay in that case? > > Good point. How about this, it seems simpler: > > static int > delay_zero (struct nbdkit_next *next, void *nxdata, > void *handle, uint32_t count, uint64_t offset, int may_trim) > { > int r = next->zero (nxdata, count, offset, may_trim); > > if (r != -1) > write_delay ();It changes the semantics from an up-front delay to a delay only on success (failures are reported immediately). In fact, there may be reasons to make it configurable on whether you want front- or back-loaded delays, as a workflow where errors are reported instantly is different from a workflow where everything is delayed. But if we document that the plugin backend will never let EOPNOTSUPP escape, and that filters must not fail with EOPNOTSUPP (ie. the fallback to write can only be done by the plugin), then keeping the unconditional up-front delay should be fine without doubling up when a fallback to write is needed. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Reasonably Related Threads
- Re: [PATCH 9/9] filters: Move rdelay/wdelay from file plugin to new delay filter.
- Re: [PATCH 9/9] filters: Move rdelay/wdelay from file plugin to new delay filter.
- [PATCH 9/9] filters: Move rdelay/wdelay from file plugin to new delay filter.
- [PATCH nbdkit filters-v2 5/5] INCOMPLETE filters: Add nbdkit-partition-filter.
- [PATCH nbdkit v4 06/15] delay: Allow block status (extents) requests to be separately delayed.