thr3ads.net - Libguestfs - Re: [Libguestfs] [nbdkit PATCH 0/2] bind .zero to more languages [Jan 2017]

If this information is useful, please help other people find it:
Share via:

Eric Blake

2017-Jan-24 14:51 UTC

Re: [Libguestfs] [nbdkit PATCH 0/2] bind .zero to more languages

On 01/24/2017 05:38 AM, Richard W.M. Jones wrote:> On Mon, Jan 23, 2017 at 09:13:23PM -0600, Eric Blake wrote:
>> Begin the language binding followups to my new .zero callback, since
>> Rich was indeed correct that we want them.
>>
>> I'm more familiar with python and perl (at least to the point that
>> I was able to modify the appropriate example files and prove to
>> myself that the bindings worked), so I've started with those.
>> I'm less familiar with ruby and ocaml, so I've left those for
>> tomorrow (it will rely heavily on copy-and-paste); it is also a
>> chance to review if my decision of requiring a return value in
>> order to trigger the graceful fallback is the best approach.
> 
> The other methods (eg. trim) deal with this by having a separate
> method called "can_*" (eg. can_trim).  Can we have a can_zero
callback
> instead?
I thought about that, but there's several problems:

1) the can_trim and similar functions are required in order to alter
what the server advertises to the guest during the handshake phase; at
which point the guest is out-of-spec if it tries to use the command that
the plugin did not advertise. But while those other commands have no
real impact to the amount of wire traffic, we want to always advertise
WRITE_ZEROES support to the client (because it is so much more efficient
to pass a WRITE_ZEROES command over the wire than a regular WRITE with a
payload of zeroes), even when the plugin doesn't support it.  Since
having a can_zero callback does not affect what we advertise, the only
other reason to implement it would be if can affect what we ask the
plugin to do.

2) the fallocate() system call does not have any introspectible way to
tell if it will work on a given file system, other than to try it and
see if it fails with EOPNOTSUPP.  On the C side, I thought it was
elegant to let the errno value be a deciding factor on whether the
attempt was fatal or whether to gracefully fall back to calling the
pwrite callback, while still centralizing the calloc() logic in the
plugin.c file rather than making every single C implementation of .zero
repeat the logic.  Since it is not easy to introspect whether
fallocate() will fail on a given file, it's much harder to write a
can_zero() function than it is to just try fallocate() and have a
graceful fallback.

3) the NBD_CMD_WRITE_ZEROES command has a flag value that affects
operation.  The existing pread and pwrite callbacks do NOT have a flag
argument (arguably, a hole in those specifications if we ever come up
with a reason that we need to pass flags to those commands, but not
necessarily something to worry about today).  But the write zeroes
command, per spec, has a way to state whether the server may use trim
(provided that the hole reads back as zero) or must leave things
allocated, giving the client some rudimentary control over whether the
server will create a sparse file.  The spec says that the server is not
required to trim (so falling back to pwrite is always appropriate), but
the way the spec is worded, using fallocate(FALLOC_FL_PUNCH_HOLE) is
only appropriate when the may_trim flag is set in the existing C
interface (which in turn corresponds to a lack of the
NBD_CMD_FLAG_NO_HOLE flag on the wire).  Having a boolean can_zero()
callback is insufficient to tell whether the plugin would behave
differently for may_trim=0 vs. may_trim=1

So here's some ideas for alternative implementations in the language
bindings:

1) give the binding a way to request graceful fallback to write (what I
implemented in this version)

2) provide a completely different interface to the language bindings,
but keeping the C interface as already committed.  The language binding
would only implement a single new entry point, perhaps named
trim_reads_as_zero().  On the C side of the binding, when plugin.c calls
lang_zero(, may_trim=1), C checks if trim_reads_as_zero() is defined and
returns true, and if so calls the existing trim() binding; if
trim_reads_as-zero() is not defined, or returns false, or trim() is not
defined, then C calls the existing pwrite() binding.  Thus, languages do
NOT need a zero() binding.  But doing this may lose opportunities for
optimizing an all-zero write even where fallocate(FALLOC_FL_PUNCH_HOLE)
is inappropriate, as newer Linux has added
fallocate(FALLOC_FL_ZERO_RANGE) [hmm, maybe I should do a followup patch
to file.c to add the use of that option, but that's a story for another
patch]

3) adjust the C interface to match the bindings interface proposed in 2,
by making trim_reads_as_zero() part of the overall callback interface.
We haven't released yet, so now's the time to tweak the interfaces, but
like option 2, it makes it harder for an implementation to take
advantage of faster ways to write large blocks of zeroes while still
avoiding holes.

4) document that the language bindings can't use automatic fallback to
write - they must implement it themselves.  But unlike C having to do a
verbose reimplementation of a call to calloc() and free(), the languages
we bind to tend to have more compact representations for quickly
creating a buffer of all zeros (perl's "\0" x count, python's
bytearray(count), etc).

I'm probably leaning towards number 4, since you expressed concern about
number 1, but would love further opinions before I spend any more time
churning out patches that aren't needed.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Richard W.M. Jones

2017-Jan-24 15:16 UTC

head link

Re: [Libguestfs] [nbdkit PATCH 0/2] bind .zero to more languages

On Tue, Jan 24, 2017 at 08:51:05AM -0600, Eric Blake
wrote:> On 01/24/2017 05:38 AM, Richard W.M. Jones wrote:
> > On Mon, Jan 23, 2017 at 09:13:23PM -0600, Eric Blake wrote:
> >> Begin the language binding followups to my new .zero callback,
since
> >> Rich was indeed correct that we want them.
> >>
> >> I'm more familiar with python and perl (at least to the point
that
> >> I was able to modify the appropriate example files and prove to
> >> myself that the bindings worked), so I've started with those.
> >> I'm less familiar with ruby and ocaml, so I've left those
for
> >> tomorrow (it will rely heavily on copy-and-paste); it is also a
> >> chance to review if my decision of requiring a return value in
> >> order to trigger the graceful fallback is the best approach.
> > 
> > The other methods (eg. trim) deal with this by having a separate
> > method called "can_*" (eg. can_trim).  Can we have a
can_zero callback
> > instead?
> 
> I thought about that, but there's several problems:
> 
> 1) the can_trim and similar functions are required in order to alter
> what the server advertises to the guest during the handshake phase; at
> which point the guest is out-of-spec if it tries to use the command that
> the plugin did not advertise. But while those other commands have no
> real impact to the amount of wire traffic, we want to always advertise
> WRITE_ZEROES support to the client (because it is so much more efficient
> to pass a WRITE_ZEROES command over the wire than a regular WRITE with a
> payload of zeroes), even when the plugin doesn't support it.  Since
> having a can_zero callback does not affect what we advertise, the only
> other reason to implement it would be if can affect what we ask the
> plugin to do.
> 
> 2) the fallocate() system call does not have any introspectible way to
> tell if it will work on a given file system, other than to try it and
> see if it fails with EOPNOTSUPP.  On the C side, I thought it was
> elegant to let the errno value be a deciding factor on whether the
> attempt was fatal or whether to gracefully fall back to calling the
> pwrite callback, while still centralizing the calloc() logic in the
> plugin.c file rather than making every single C implementation of .zero
> repeat the logic.  Since it is not easy to introspect whether
> fallocate() will fail on a given file, it's much harder to write a
> can_zero() function than it is to just try fallocate() and have a
> graceful fallback.
> 
> 3) the NBD_CMD_WRITE_ZEROES command has a flag value that affects
> operation.  The existing pread and pwrite callbacks do NOT have a flag
> argument (arguably, a hole in those specifications if we ever come up
> with a reason that we need to pass flags to those commands, but not
> necessarily something to worry about today).  But the write zeroes
> command, per spec, has a way to state whether the server may use trim
> (provided that the hole reads back as zero) or must leave things
> allocated, giving the client some rudimentary control over whether the
> server will create a sparse file.  The spec says that the server is not
> required to trim (so falling back to pwrite is always appropriate), but
> the way the spec is worded, using fallocate(FALLOC_FL_PUNCH_HOLE) is
> only appropriate when the may_trim flag is set in the existing C
> interface (which in turn corresponds to a lack of the
> NBD_CMD_FLAG_NO_HOLE flag on the wire).  Having a boolean can_zero()
> callback is insufficient to tell whether the plugin would behave
> differently for may_trim=0 vs. may_trim=1
> 
> So here's some ideas for alternative implementations in the language
> bindings:
> 
> 1) give the binding a way to request graceful fallback to write (what I
> implemented in this version)
> 
> 2) provide a completely different interface to the language bindings,
> but keeping the C interface as already committed.  The language binding
> would only implement a single new entry point, perhaps named
> trim_reads_as_zero().  On the C side of the binding, when plugin.c calls
> lang_zero(, may_trim=1), C checks if trim_reads_as_zero() is defined and
> returns true, and if so calls the existing trim() binding; if
> trim_reads_as-zero() is not defined, or returns false, or trim() is not
> defined, then C calls the existing pwrite() binding.  Thus, languages do
> NOT need a zero() binding.  But doing this may lose opportunities for
> optimizing an all-zero write even where fallocate(FALLOC_FL_PUNCH_HOLE)
> is inappropriate, as newer Linux has added
> fallocate(FALLOC_FL_ZERO_RANGE) [hmm, maybe I should do a followup patch
> to file.c to add the use of that option, but that's a story for another
> patch]
> 
> 3) adjust the C interface to match the bindings interface proposed in 2,
> by making trim_reads_as_zero() part of the overall callback interface.
> We haven't released yet, so now's the time to tweak the interfaces,
but
> like option 2, it makes it harder for an implementation to take
> advantage of faster ways to write large blocks of zeroes while still
> avoiding holes.
> 
> 4) document that the language bindings can't use automatic fallback to
> write - they must implement it themselves.  But unlike C having to do a
> verbose reimplementation of a call to calloc() and free(), the languages
> we bind to tend to have more compact representations for quickly
> creating a buffer of all zeros (perl's "\0" x count,
python's
> bytearray(count), etc).
> 
> I'm probably leaning towards number 4, since you expressed concern
about
> number 1, but would love further opinions before I spend any more time
> churning out patches that aren't needed.
Another option -- simpler, I think -- is to review the way that
plugins set errors.

Currently there is only one call (nbdkit_error) which takes a string.
Thus if the plugin error comes with an errno, it is effectively lost
(or at best, turned into a string, but not in a way that callers can
use).

However we could add another API:

  nbdkit_set_errno (int errno);

This allows the plugin to send the errno back to the core code
explicitly.  It's effectively stored as a global in the core code.

Note that the correct calling sequence from C is:

  nbdkit_set_error (errno);  // this call is optional and may be omitted
  nbdkit_error ("oops: %s", strerror (errno));
  return -1;

Mapping these to the language bindings should be easy.  We just need
to add a binding to nbdkit_set_error, but the exception mechanism for
indicating an error and sending the string can stay the same.

In Perl, old code to raise an error would look like:

  sub pread
  {
     my $h = shift;
     my $count = shift;
     my $offset = shift;
     my $ret;
     read ($FH, $ret, $count, $offset) || die "read: $!"
     return $ret;
  }

(Note that die does not exit the program, it raises an exception which
is caught and turned into nbdkit_error in the Perl nbdkit plugin).

This would continue to work fine, but new code which cared about the
errno could do this instead:

  sub pread
  {
     my $h = shift;
     my $count = shift;
     my $offset = shift;
     my $ret;
     read ($FH, $ret, $count, $offset) || {
       nbdkit_set_errno (POSIX::errno ());
       die "read: $!"
     }
     return $ret;
  }

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org

Richard W.M. Jones

2017-Jan-24 15:41 UTC

head link

Re: [Libguestfs] [nbdkit PATCH 0/2] bind .zero to more languages

On Tue, Jan 24, 2017 at 03:16:45PM +0000, Richard W.M. Jones
wrote:> However we could add another API:
> 
>   nbdkit_set_errno (int errno);
> 
> This allows the plugin to send the errno back to the core code
> explicitly.  It's effectively stored as a global in the core code.
What I'm really saying here is that at the moment (after your previous
patch series) we have an implicit channel where errno is passed from
the plugin to the core code.  We should simply make that channel
explicit and controlled by the plugin.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top

Eric Blake

2017-Jan-26 02:48 UTC

head link

Re: [Libguestfs] [nbdkit PATCH 0/2] bind .zero to more languages

On 01/24/2017 09:16 AM, Richard W.M. Jones wrote:
> In Perl, old code to raise an error would look like:
> 
>   sub pread
>   {
>      my $h = shift;
>      my $count = shift;
>      my $offset = shift;
>      my $ret;
>      read ($FH, $ret, $count, $offset) || die "read: $!"
>      return $ret;
>   }
> 
> (Note that die does not exit the program, it raises an exception which
> is caught and turned into nbdkit_error in the Perl nbdkit plugin).
> 
> This would continue to work fine, but new code which cared about the
> errno could do this instead:
> 
>   sub pread
>   {
>      my $h = shift;
>      my $count = shift;
>      my $offset = shift;
>      my $ret;
>      read ($FH, $ret, $count, $offset) || {
>        nbdkit_set_errno (POSIX::errno ());
>        die "read: $!"
>      }
>      return $ret;
>   }
Except I can't even figure out how to expose nbdkit_set_error (hmm, I
named it set_error instead of set_errno in my v2 series) to the perl
code.  In fact, with just this change to example.pl:

diff --git i/plugins/perl/example.pl w/plugins/perl/example.pl
index fcdac33..64f6de3 100644
--- i/plugins/perl/example.pl
+++ w/plugins/perl/example.pl
@@ -1,4 +1,5 @@
 use strict;
+use POSIX ();

 # Example Perl plugin.
 #
@@ -80,6 +81,8 @@ sub pwrite
     my $buf = shift;
     my $count = length ($buf);
     my $offset = shift;
+    my $err = POSIX::EPERM;

     substr ($disk, $offset, $count) = $buf;
+    die "forced write failure";
 }

I'm getting this failure:

$ ./src/nbdkit -e foo -fv plugins/perl/.libs/nbdkit-perl-plugin.so \
    script=plugins/perl/example.pl
nbdkit: debug: registering plugins/perl/.libs/nbdkit-perl-plugin.so
nbdkit: debug: registered plugins/perl/.libs/nbdkit-perl-plugin.so (name
perl)
nbdkit: debug: plugins/perl/.libs/nbdkit-perl-plugin.so: load
nbdkit: debug: plugins/perl/.libs/nbdkit-perl-plugin.so: config
key=script, value=plugins/perl/example.pl
Can't load module Fcntl, dynamic loading not available in this perl.
  (You may need to build a new perl executable which either supports
  dynamic loading or has the Fcntl module statically linked into it.)
 at /usr/lib64/perl5/POSIX.pm line 17.
Compilation failed in require at /usr/lib64/perl5/POSIX.pm line 17.
BEGIN failed--compilation aborted at /usr/lib64/perl5/POSIX.pm line 17.
Compilation failed in require at plugins/perl/example.pl line 2.
BEGIN failed--compilation aborted at plugins/perl/example.pl line 2.
nbdkit: error: plugins/perl/example.pl: one of the required callbacks
'open', 'get_size' or 'pread' is not defined by this
Perl script.
nbdkit requires these callbacks.

So I'm not sure how anyone else is doing anything fancy in an NBD perl
script.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Seemingly Similar Threads

Search for more possibly parallel threads

Libguestfs - Jan 2017 - Re: [nbdkit PATCH 0/2] bind .zero to more languages

Re: [Libguestfs] [nbdkit PATCH 0/2] bind .zero to more languages

Re: [Libguestfs] [nbdkit PATCH 0/2] bind .zero to more languages

Re: [Libguestfs] [nbdkit PATCH 0/2] bind .zero to more languages

Re: [Libguestfs] [nbdkit PATCH 0/2] bind .zero to more languages

Seemingly Similar Threads