thr3ads.net - Libguestfs - Re: [Libguestfs] [PATCH 9/9] filters: Move rdelay/wdelay from file plugin to new delay filter. [Jan 2018]

If this information is useful, please help other people find it:
Share via:

Eric Blake

2018-Jan-17 22:38 UTC

Re: [Libguestfs] [PATCH 9/9] filters: Move rdelay/wdelay from file plugin to new delay filter.

On 01/17/2018 02:53 PM, Richard W.M. Jones wrote:> Previously the file plugin supported ‘rdelay’ and ‘wdelay’ parameters
> for injecting delays (for testing) into read and write requests.  This
> moves the functionality to a new delay filter so that it can be used
> with any plugin.
> ---
> +/* Write data. */
> +static int
> +delay_pwrite (struct nbdkit_next *next, void *nxdata,
> +              void *handle,
> +              const void *buf, uint32_t count, uint64_t offset)
> +{
> +  write_delay ();
> +  return next->pwrite (nxdata, buf, count, offset);
> +}
> +
> +/* Zero data. */
> +static int
> +delay_zero (struct nbdkit_next *next, void *nxdata,
> +            void *handle, uint32_t count, uint64_t offset, int may_trim)
> +{
> +  write_delay ();
> +  return next->zero (nxdata, count, offset, may_trim);
If next->zero() fails with EOPNOTSUPP, that means we will delay once in
trying the underlying command, and again for each iteration of the
fallback loop as it calls delay_pwrite().  Is that okay, or do we want
to reproduce some fallback logic here and directly call next->pwrite on
EOPNOTSUPP so as to only have a single write delay in that case?

> +++ b/plugins/file/nbdkit-file-plugin.pod
> @@ -31,21 +31,13 @@ This parameter is required.
>  
>  =item B<rdelay=E<lt>NNE<gt>ms>
>  
> -Delay reads for C<SECS> seconds or C<NN> milliseconds.
> -This is used to simulate a slow or remote server, or to
> -test certain kinds of race conditions in Linux.
> -
> -The default is no delay.
> -
>  =item B<wdelay=SECS>
>  
>  =item B<wdelay=E<lt>NNE<gt>ms>
>  
> -Delay writes for C<SECS> seconds or C<NN> milliseconds.
> -This is used to simulate a slow or remote server, or to
> -test certain kinds of race conditions in Linux.
> -
> -The default is no delay.
> +These plugin parameters have been moved to the
> +L<nbdkit-delay-filter(1)> filter.  Modify the command line to add
> +I<--filter=delay> in order to use these parameters.
I guess we don't promise back-comptibility on plugin parameters, but at
least you've documented how to fix things (and the testsuite had to
follow that advice).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Eric Blake

2018-Jan-18 02:01 UTC

head link

Re: [Libguestfs] [PATCH 9/9] filters: Move rdelay/wdelay from file plugin to new delay filter.

On 01/17/2018 04:38 PM, Eric Blake wrote:> On 01/17/2018 02:53 PM, Richard W.M. Jones wrote:
>> Previously the file plugin supported ‘rdelay’ and ‘wdelay’ parameters
>> for injecting delays (for testing) into read and write requests.  This
>> moves the functionality to a new delay filter so that it can be used
>> with any plugin.
>> ---
> 
>> +/* Write data. */
>> +static int
>> +delay_pwrite (struct nbdkit_next *next, void *nxdata,
>> +              void *handle,
>> +              const void *buf, uint32_t count, uint64_t offset)
>> +{
>> +  write_delay ();
>> +  return next->pwrite (nxdata, buf, count, offset);
>> +}
>> +
>> +/* Zero data. */
>> +static int
>> +delay_zero (struct nbdkit_next *next, void *nxdata,
>> +            void *handle, uint32_t count, uint64_t offset, int
may_trim)
>> +{
>> +  write_delay ();
>> +  return next->zero (nxdata, count, offset, may_trim);
> 
> If next->zero() fails with EOPNOTSUPP, that means we will delay once in
> trying the underlying command, and again for each iteration of the
> fallback loop as it calls delay_pwrite().  Is that okay, or do we want
> to reproduce some fallback logic here and directly call next->pwrite on
> EOPNOTSUPP so as to only have a single write delay in that case?
Actually, thinking about this more:

When I added zero support, I documented in commit ac3f294a that for the
file plugin, wdelay is indeed doubled on systems lacking efficient zero
support.  But the fallback for handling EOPNOTSUPP is currently done at
the plugins.c level (ie. it is part of next->zero()) - so a filter
should never see a plugin returning EOPNOTSUPP.  Or put another way,
your movement of wdelay from being part of the file plugin to now being
a separate filter actually FIXES the double delay.  On the other hand, a
filter can itself return EOPNOTSUPP (rather than calling next->zero), in
which case we still need to support the fallback to .pwrite  higher up
in the call stack

Along the same lines, with my FUA patches, if we keep the fallback of
calling .flush in connections.c, then the flush will pass through the
entire filter chain; but I had proposed moving the fallback flush into
plugins.c, at which point the fallback flush is done at the layer that
lacks FUA support.  And maybe we want a flush delay to mirror the
existing rdelay and wdelay (simulating a long flush time can indeed be
useful in tests); where it definitely matters whether the flush delay is
triggered as part of FUA fallback, or only triggered on an actual
NBD_CMD_FLUSH from the client.

I'm also thinking ahead to expansions - for example, there are proposals
to add resize and block status commands to NBD.  In your implementation,
if a filter does not define .can_flush or .flush, then its .can_flush
implementation merely returns whatever the underlying backend's version
would report.  But as we add new callbacks, such as a new .can_resize,
we need to make sure we have sane defaults.  For example, the offset
filter should probably not allow resizes (even if the underlying plugin
does).  If the offset filter compiled in a version of nbdkit that lacks
the .can_resize hook is run by nbdkit that has added the hook, we
therefore want to default .can_resize to false as part of loading the
smaller filter struct into memory (and NOT default to falling through to
the plugin's .can_resize); a default based on fall-through should only
happen if the filter had a size large enough to admit that the user
explicitly omitted the callback in the filter, rather than it being NULL
because of version mismatch.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Richard W.M. Jones

2018-Jan-18 08:47 UTC

head link

Re: [Libguestfs] [PATCH 9/9] filters: Move rdelay/wdelay from file plugin to new delay filter.

On Wed, Jan 17, 2018 at 04:38:04PM -0600, Eric Blake
wrote:> > +/* Zero data. */
> > +static int
> > +delay_zero (struct nbdkit_next *next, void *nxdata,
> > +            void *handle, uint32_t count, uint64_t offset, int
may_trim)
> > +{
> > +  write_delay ();
> > +  return next->zero (nxdata, count, offset, may_trim);
> 
> If next->zero() fails with EOPNOTSUPP, that means we will delay once in
> trying the underlying command, and again for each iteration of the
> fallback loop as it calls delay_pwrite().  Is that okay, or do we want
> to reproduce some fallback logic here and directly call next->pwrite on
> EOPNOTSUPP so as to only have a single write delay in that case?
Good point.  How about this, it seems simpler:

  static int
  delay_zero (struct nbdkit_next *next, void *nxdata,
              void *handle, uint32_t count, uint64_t offset, int may_trim)
  {
    int r = next->zero (nxdata, count, offset, may_trim);

    if (r != -1)
      write_delay ();
    return r;
  }

And the same for the other methods in that filter.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org

Richard W.M. Jones

2018-Jan-18 08:55 UTC

head link

Re: [Libguestfs] [PATCH 9/9] filters: Move rdelay/wdelay from file plugin to new delay filter.

On Wed, Jan 17, 2018 at 08:01:35PM -0600, Eric Blake
wrote:> Actually, thinking about this more:
> 
> When I added zero support, I documented in commit ac3f294a that for the
> file plugin, wdelay is indeed doubled on systems lacking efficient zero
> support.  But the fallback for handling EOPNOTSUPP is currently done at
> the plugins.c level (ie. it is part of next->zero()) - so a filter
> should never see a plugin returning EOPNOTSUPP.  Or put another way,
> your movement of wdelay from being part of the file plugin to now being
> a separate filter actually FIXES the double delay.  On the other hand, a
> filter can itself return EOPNOTSUPP (rather than calling next->zero), in
> which case we still need to support the fallback to .pwrite  higher up
> in the call stack
OK, so ignore my previous email.

However, I'm wary of doing fallbacks in general in the filter layer,
for a couple of reasons:

(1) If you run some filters twice (eg. offset filter) then you end up
doubling offsets which will cause data corruption.  This doesn't apply
in the EOPNOTSUPP case, but it does apply in the .zero fallback case
(where it falls back to .write).

(2) Filters are a new thing and we can just declare that they cannot
use EOPNOTSUPP.
> Along the same lines, with my FUA patches, if we keep the fallback of
> calling .flush in connections.c, then the flush will pass through the
> entire filter chain; but I had proposed moving the fallback flush into
> plugins.c, at which point the fallback flush is done at the layer that
> lacks FUA support.  And maybe we want a flush delay to mirror the
> existing rdelay and wdelay (simulating a long flush time can indeed be
> useful in tests); where it definitely matters whether the flush delay is
> triggered as part of FUA fallback, or only triggered on an actual
> NBD_CMD_FLUSH from the client.
> 
> I'm also thinking ahead to expansions - for example, there are
proposals
> to add resize and block status commands to NBD.  In your implementation,
> if a filter does not define .can_flush or .flush, then its .can_flush
> implementation merely returns whatever the underlying backend's version
> would report.  But as we add new callbacks, such as a new .can_resize,
> we need to make sure we have sane defaults.  For example, the offset
> filter should probably not allow resizes (even if the underlying plugin
> does).  If the offset filter compiled in a version of nbdkit that lacks
> the .can_resize hook is run by nbdkit that has added the hook, we
> therefore want to default .can_resize to false as part of loading the
> smaller filter struct into memory (and NOT default to falling through to
> the plugin's .can_resize); a default based on fall-through should only
> happen if the filter had a size large enough to admit that the user
> explicitly omitted the callback in the filter, rather than it being NULL
> because of version mismatch.
We may have to say that the ABI guarantee only applies to plugins,
which might not be a bad thing since filters are much harder to write
than plugins and are going to be quite closely tied to both the
version and precise implementation of nbdkit.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/

Eric Blake

2018-Jan-18 14:19 UTC

head link

Re: [Libguestfs] [PATCH 9/9] filters: Move rdelay/wdelay from file plugin to new delay filter.

On 01/18/2018 02:47 AM, Richard W.M. Jones wrote:> On Wed, Jan 17, 2018 at 04:38:04PM -0600, Eric Blake wrote:
>>> +/* Zero data. */
>>> +static int
>>> +delay_zero (struct nbdkit_next *next, void *nxdata,
>>> +            void *handle, uint32_t count, uint64_t offset, int
may_trim)
>>> +{
>>> +  write_delay ();
>>> +  return next->zero (nxdata, count, offset, may_trim);
>>
>> If next->zero() fails with EOPNOTSUPP, that means we will delay once
in
>> trying the underlying command, and again for each iteration of the
>> fallback loop as it calls delay_pwrite().  Is that okay, or do we want
>> to reproduce some fallback logic here and directly call next->pwrite
on
>> EOPNOTSUPP so as to only have a single write delay in that case?
> 
> Good point.  How about this, it seems simpler:
> 
>   static int
>   delay_zero (struct nbdkit_next *next, void *nxdata,
>               void *handle, uint32_t count, uint64_t offset, int may_trim)
>   {
>     int r = next->zero (nxdata, count, offset, may_trim);
> 
>     if (r != -1)
>       write_delay ();
It changes the semantics from an up-front delay to a delay only on
success (failures are reported immediately).  In fact, there may be
reasons to make it configurable on whether you want front- or
back-loaded delays, as a workflow where errors are reported instantly is
different from a workflow where everything is delayed.

But if we document that the plugin backend will never let EOPNOTSUPP
escape, and that filters must not fail with EOPNOTSUPP (ie. the fallback
to write can only be done by the plugin), then keeping the unconditional
up-front delay should be fine without doubling up when a fallback to
write is needed.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Possibly Parallel Threads

Search for more maybe matching threads

Libguestfs - Jan 2018 - Re: [PATCH 9/9] filters: Move rdelay/wdelay from file plugin to new delay filter.

Re: [Libguestfs] [PATCH 9/9] filters: Move rdelay/wdelay from file plugin to new delay filter.

Re: [Libguestfs] [PATCH 9/9] filters: Move rdelay/wdelay from file plugin to new delay filter.

Re: [Libguestfs] [PATCH 9/9] filters: Move rdelay/wdelay from file plugin to new delay filter.

Re: [Libguestfs] [PATCH 9/9] filters: Move rdelay/wdelay from file plugin to new delay filter.

Re: [Libguestfs] [PATCH 9/9] filters: Move rdelay/wdelay from file plugin to new delay filter.

Possibly Parallel Threads