thr3ads.net - Libguestfs - [Libguestfs] [nbdkit PATCH] sh: Allow pwrite to not consume all data [Aug 2023]

If this information is useful, please help other people find it:
Share via:

Laszlo Ersek

2023-Aug-31 09:18 UTC

[Libguestfs] [nbdkit PATCH] sh: Allow pwrite to not consume all data

On 8/31/23 10:55, Richard W.M. Jones wrote:> On Thu, Aug 31, 2023 at 10:40:53AM +0200, Laszlo Ersek wrote:
>> On 8/31/23 00:21, Eric Blake wrote:
>>> I hit another transient failure in libnbd CI when a poorly-written
>>> eval script did not consume all of stdin during .pwrite.  As
behaving
>>> as a data sink can be a somewhat reasonable feature of a
>>> quickly-written sh or eval plugin, we should not be so insistent as
>>> treating an EPIPE failure as an immediate return of EIO to the
client.
>>>
>>> Signed-off-by: Eric Blake <eblake at redhat.com>
>>> ---
>>>
>>> I probably need to add unit test coverage of this before committing
>>> (although proving that I win the data race on a client process
exiting
>>> faster than the parent can write enough data to hit EPIPE is hard).
>>>
>>>  plugins/sh/nbdkit-sh-plugin.pod |  8 +++++++
>>>  plugins/sh/call.c               | 38
++++++++++++++++++++++-----------
>>>  2 files changed, 34 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/plugins/sh/nbdkit-sh-plugin.pod
b/plugins/sh/nbdkit-sh-plugin.pod
>>> index b2c946a0..8b83a5b3 100644
>>> --- a/plugins/sh/nbdkit-sh-plugin.pod
>>> +++ b/plugins/sh/nbdkit-sh-plugin.pod
>>> @@ -505,6 +505,14 @@ Unlike in other languages, if you provide a
C<pwrite> method you
>>>  B<must> also provide a C<can_write> method which exits
with code C<0>
>>>  (true).
>>>
>>> +With nbdkit E<ge> 1.36, this method may return C<0>
without consuming
>>> +any data from stdin, and without producing any output, in order to
>>> +behave as an intentional data sink.  But in older versions, nbdkit
>>> +would treat any C<EPIPE> failure in writing to your script
as an error
>>> +condition even if your script returns success; to avoid unintended
>>> +failures, you may want to include C<"cat
>/dev/null"> in a script
>>> +intending to ignore the client's write requests.
>>> +
>>>  =item C<flush>
>>>
>>>   /path/to/script flush <handle>
>>> diff --git a/plugins/sh/call.c b/plugins/sh/call.c
>>> index 888c6459..79c67a04 100644
>>> --- a/plugins/sh/call.c
>>> +++ b/plugins/sh/call.c
>>> @@ -34,6 +34,7 @@
>>>
>>>  #include <assert.h>
>>>  #include <fcntl.h>
>>> +#include <stdbool.h>
>>>  #include <stdio.h>
>>>  #include <stdlib.h>
>>>  #include <inttypes.h>
>>> @@ -130,6 +131,7 @@ debug_call (const char **argv)
>>>   */
>>>  static int
>>>  call3 (const char *wbuf, size_t wbuflen, /* sent to stdin (can be
NULL) */
>>> +       bool *pipe_full,                  /* set if wbuf not fully
written */
>>>         string *rbuf,                     /* read from stdout */
>>>         string *ebuf,                     /* read from stderr */
>>>         const char **argv)                /* script + parameters */
>>> @@ -275,15 +277,8 @@ call3 (const char *wbuf, size_t wbuflen, /*
sent to stdin (can be NULL) */
>>>        r = write (pfds[0].fd, wbuf, wbuflen);
>>>        if (r == -1) {
>>>          if (errno == EPIPE) {
>>> -          /* We tried to write to the script but it didn't
consume
>>> -           * the data.  Probably the script exited without reading
>>> -           * from stdin.  This is an error in the script.
>>> -           */
>>> -          nbdkit_error ("%s: write to script failed because
of a broken pipe: "
>>> -                        "this can happen if the script exits
without "
>>> -                        "consuming stdin, which usually
indicates a bug "
>>> -                        "in the script",
>>> -                        argv0);
>>> +          *pipe_full = true;
>>> +          r = wbuflen;
>>>          }
>>>          else
>>>            nbdkit_error ("%s: write: %m", argv0);
>>> @@ -555,7 +550,7 @@ call (const char **argv)
>>>    CLEANUP_FREE_STRING string rbuf = empty_vector;
>>>    CLEANUP_FREE_STRING string ebuf = empty_vector;
>>>
>>> -  r = call3 (NULL, 0, &rbuf, &ebuf, argv);
>>> +  r = call3 (NULL, 0, NULL, &rbuf, &ebuf, argv);
>>>    return handle_script_error (argv[0], &ebuf, r);
>>>  }
>>>
>>> @@ -568,7 +563,7 @@ call_read (string *rbuf, const char **argv)
>>>    int r;
>>>    CLEANUP_FREE_STRING string ebuf = empty_vector;
>>>
>>> -  r = call3 (NULL, 0, rbuf, &ebuf, argv);
>>> +  r = call3 (NULL, 0, NULL, rbuf, &ebuf, argv);
>>>    r = handle_script_error (argv[0], &ebuf, r);
>>>    if (r == ERROR)
>>>      string_reset (rbuf);
>>> @@ -584,7 +579,26 @@ call_write (const char *wbuf, size_t wbuflen,
const char **argv)
>>>    int r;
>>>    CLEANUP_FREE_STRING string rbuf = empty_vector;
>>>    CLEANUP_FREE_STRING string ebuf = empty_vector;
>>> +  bool pipe_full = false;
>>>
>>> -  r = call3 (wbuf, wbuflen, &rbuf, &ebuf, argv);
>>> +  r = call3 (wbuf, wbuflen, &pipe_full, &rbuf, &ebuf,
argv);
>>> +  if (pipe_full && r == OK) {
>>> +    /* We allow scripts to intentionally ignore data, but they
must
>>> +     * have no output when doing so.
>>> +     */
>>> +    if (rbuf.len > 0 || ebuf.len > 0) {
>>> +      nbdkit_error ("%s: write to script failed because of a
broken pipe: "
>>> +                    "this can happen if the script exits
without "
>>> +                    "consuming stdin, which usually indicates
a bug "
>>> +                    "in the script",
>>> +                    argv[0]);
>>> +      r = ERROR;
>>> +    }
>>> +    else
>>> +      nbdkit_debug ("%s: write to script failed because of a
broken pipe; "
>>> +                    "assuming this was an intentional data
sink, although it "
>>> +                    "may indicate a bug in the script",
>>> +                    argv[0]);
>>> +  }
>>>    return handle_script_error (argv[0], &ebuf, r);
>>>  }
>>
>> The patch appears to do what it says on the tin, but I'm
unconvinced we
>> should relax the requirement. The "may indicate a bug" debug
message is
>> easy to miss. A hard failure upon EPIPE is noticeable, and as you write
>> the script can just drain unwanted input with "cat
>/dev/null".
>>
>> I think I may not be appreciating this patch due to not understanding
>> how difficult it could be to update such a script. Can you elaborate on
>> that? Can you provide an example where adding "cat
>/dev/null" to the
>> pwrite script is a disproportionate chore?
> 
> This is related to the following commit which Eric pushed yesterday:
> 
>
https://gitlab.com/nbdkit/libnbd/-/commit/4df70870420d1be9348ac45f4aa467501eca5089
> 
> The commit is correct.  But it's an ongoing problem that we forget to
> discard all the input in pwrite.  Note you have to do this on every
> path out of the pwrite method, including errors or (like this case)
> where the test just isn't interested in the data.  The error is also
> intermittent because it is a race between two possible paths through
> the poll(2) loop in plugins/sh/call.c:call3() Also this case only
> applies to the sh plugin.  Other plugins (eg written in C) ignore or
> only partially consume the buffer in pwrite all the time, and that's
> never a problem.
I don't understand.

https://libguestfs.org/nbdkit-plugin.3.html#pwrite

"""
The callback must write the whole count bytes if it can. The NBD
protocol doesn't allow partial writes (instead, these would be errors).
If the whole count bytes was written successfully, the callback should
return 0 to indicate there was no error.

If there is an error (including a short write which couldn't be
recovered from), .pwrite should call nbdkit_error with an error message,
and nbdkit_set_error to record an appropriate error (unless errno is
sufficient), then return -1.
"""

How is it safe for a plugin (written in C) to *silently* throw away part
of the data that the client wants written?

I agree consistency between sh and other plugins is good, but the
"baseline" seems unfathomable to me. I probably don't understand
it.

Laszlo
> 
> In my counter-proposal, just posted but not showing up in the mailing
> list archives yet, I argue that we should simply ignore EPIPE.
> 
> Rich.
>

Richard W.M. Jones

2023-Aug-31 09:52 UTC

head link

[Libguestfs] [nbdkit PATCH] sh: Allow pwrite to not consume all data

On Thu, Aug 31, 2023 at 11:18:27AM +0200, Laszlo Ersek
wrote:> On 8/31/23 10:55, Richard W.M. Jones wrote:
> > On Thu, Aug 31, 2023 at 10:40:53AM +0200, Laszlo Ersek wrote:
> >> On 8/31/23 00:21, Eric Blake wrote:
> >>> I hit another transient failure in libnbd CI when a
poorly-written
> >>> eval script did not consume all of stdin during .pwrite.  As
behaving
> >>> as a data sink can be a somewhat reasonable feature of a
> >>> quickly-written sh or eval plugin, we should not be so
insistent as
> >>> treating an EPIPE failure as an immediate return of EIO to the
client.
> >>>
> >>> Signed-off-by: Eric Blake <eblake at redhat.com>
> >>> ---
> >>>
> >>> I probably need to add unit test coverage of this before
committing
> >>> (although proving that I win the data race on a client process
exiting
> >>> faster than the parent can write enough data to hit EPIPE is
hard).
> >>>
> >>>  plugins/sh/nbdkit-sh-plugin.pod |  8 +++++++
> >>>  plugins/sh/call.c               | 38
++++++++++++++++++++++-----------
> >>>  2 files changed, 34 insertions(+), 12 deletions(-)
> >>>
> >>> diff --git a/plugins/sh/nbdkit-sh-plugin.pod
b/plugins/sh/nbdkit-sh-plugin.pod
> >>> index b2c946a0..8b83a5b3 100644
> >>> --- a/plugins/sh/nbdkit-sh-plugin.pod
> >>> +++ b/plugins/sh/nbdkit-sh-plugin.pod
> >>> @@ -505,6 +505,14 @@ Unlike in other languages, if you provide
a C<pwrite> method you
> >>>  B<must> also provide a C<can_write> method which
exits with code C<0>
> >>>  (true).
> >>>
> >>> +With nbdkit E<ge> 1.36, this method may return
C<0> without consuming
> >>> +any data from stdin, and without producing any output, in
order to
> >>> +behave as an intentional data sink.  But in older versions,
nbdkit
> >>> +would treat any C<EPIPE> failure in writing to your
script as an error
> >>> +condition even if your script returns success; to avoid
unintended
> >>> +failures, you may want to include C<"cat
>/dev/null"> in a script
> >>> +intending to ignore the client's write requests.
> >>> +
> >>>  =item C<flush>
> >>>
> >>>   /path/to/script flush <handle>
> >>> diff --git a/plugins/sh/call.c b/plugins/sh/call.c
> >>> index 888c6459..79c67a04 100644
> >>> --- a/plugins/sh/call.c
> >>> +++ b/plugins/sh/call.c
> >>> @@ -34,6 +34,7 @@
> >>>
> >>>  #include <assert.h>
> >>>  #include <fcntl.h>
> >>> +#include <stdbool.h>
> >>>  #include <stdio.h>
> >>>  #include <stdlib.h>
> >>>  #include <inttypes.h>
> >>> @@ -130,6 +131,7 @@ debug_call (const char **argv)
> >>>   */
> >>>  static int
> >>>  call3 (const char *wbuf, size_t wbuflen, /* sent to stdin
(can be NULL) */
> >>> +       bool *pipe_full,                  /* set if wbuf not
fully written */
> >>>         string *rbuf,                     /* read from stdout
*/
> >>>         string *ebuf,                     /* read from stderr
*/
> >>>         const char **argv)                /* script +
parameters */
> >>> @@ -275,15 +277,8 @@ call3 (const char *wbuf, size_t wbuflen,
/* sent to stdin (can be NULL) */
> >>>        r = write (pfds[0].fd, wbuf, wbuflen);
> >>>        if (r == -1) {
> >>>          if (errno == EPIPE) {
> >>> -          /* We tried to write to the script but it
didn't consume
> >>> -           * the data.  Probably the script exited without
reading
> >>> -           * from stdin.  This is an error in the script.
> >>> -           */
> >>> -          nbdkit_error ("%s: write to script failed
because of a broken pipe: "
> >>> -                        "this can happen if the script
exits without "
> >>> -                        "consuming stdin, which usually
indicates a bug "
> >>> -                        "in the script",
> >>> -                        argv0);
> >>> +          *pipe_full = true;
> >>> +          r = wbuflen;
> >>>          }
> >>>          else
> >>>            nbdkit_error ("%s: write: %m", argv0);
> >>> @@ -555,7 +550,7 @@ call (const char **argv)
> >>>    CLEANUP_FREE_STRING string rbuf = empty_vector;
> >>>    CLEANUP_FREE_STRING string ebuf = empty_vector;
> >>>
> >>> -  r = call3 (NULL, 0, &rbuf, &ebuf, argv);
> >>> +  r = call3 (NULL, 0, NULL, &rbuf, &ebuf, argv);
> >>>    return handle_script_error (argv[0], &ebuf, r);
> >>>  }
> >>>
> >>> @@ -568,7 +563,7 @@ call_read (string *rbuf, const char
**argv)
> >>>    int r;
> >>>    CLEANUP_FREE_STRING string ebuf = empty_vector;
> >>>
> >>> -  r = call3 (NULL, 0, rbuf, &ebuf, argv);
> >>> +  r = call3 (NULL, 0, NULL, rbuf, &ebuf, argv);
> >>>    r = handle_script_error (argv[0], &ebuf, r);
> >>>    if (r == ERROR)
> >>>      string_reset (rbuf);
> >>> @@ -584,7 +579,26 @@ call_write (const char *wbuf, size_t
wbuflen, const char **argv)
> >>>    int r;
> >>>    CLEANUP_FREE_STRING string rbuf = empty_vector;
> >>>    CLEANUP_FREE_STRING string ebuf = empty_vector;
> >>> +  bool pipe_full = false;
> >>>
> >>> -  r = call3 (wbuf, wbuflen, &rbuf, &ebuf, argv);
> >>> +  r = call3 (wbuf, wbuflen, &pipe_full, &rbuf,
&ebuf, argv);
> >>> +  if (pipe_full && r == OK) {
> >>> +    /* We allow scripts to intentionally ignore data, but
they must
> >>> +     * have no output when doing so.
> >>> +     */
> >>> +    if (rbuf.len > 0 || ebuf.len > 0) {
> >>> +      nbdkit_error ("%s: write to script failed because
of a broken pipe: "
> >>> +                    "this can happen if the script exits
without "
> >>> +                    "consuming stdin, which usually
indicates a bug "
> >>> +                    "in the script",
> >>> +                    argv[0]);
> >>> +      r = ERROR;
> >>> +    }
> >>> +    else
> >>> +      nbdkit_debug ("%s: write to script failed because
of a broken pipe; "
> >>> +                    "assuming this was an intentional
data sink, although it "
> >>> +                    "may indicate a bug in the
script",
> >>> +                    argv[0]);
> >>> +  }
> >>>    return handle_script_error (argv[0], &ebuf, r);
> >>>  }
> >>
> >> The patch appears to do what it says on the tin, but I'm
unconvinced we
> >> should relax the requirement. The "may indicate a bug"
debug message is
> >> easy to miss. A hard failure upon EPIPE is noticeable, and as you
write
> >> the script can just drain unwanted input with "cat
>/dev/null".
> >>
> >> I think I may not be appreciating this patch due to not
understanding
> >> how difficult it could be to update such a script. Can you
elaborate on
> >> that? Can you provide an example where adding "cat
>/dev/null" to the
> >> pwrite script is a disproportionate chore?
> > 
> > This is related to the following commit which Eric pushed yesterday:
> > 
> >
https://gitlab.com/nbdkit/libnbd/-/commit/4df70870420d1be9348ac45f4aa467501eca5089
> > 
> > The commit is correct.  But it's an ongoing problem that we forget
to
> > discard all the input in pwrite.  Note you have to do this on every
> > path out of the pwrite method, including errors or (like this case)
> > where the test just isn't interested in the data.  The error is
also
> > intermittent because it is a race between two possible paths through
> > the poll(2) loop in plugins/sh/call.c:call3() Also this case only
> > applies to the sh plugin.  Other plugins (eg written in C) ignore or
> > only partially consume the buffer in pwrite all the time, and
that's
> > never a problem.
> 
> I don't understand.
> 
> https://libguestfs.org/nbdkit-plugin.3.html#pwrite
> 
> """
> The callback must write the whole count bytes if it can. The NBD
> protocol doesn't allow partial writes (instead, these would be errors).
> If the whole count bytes was written successfully, the callback should
> return 0 to indicate there was no error.
> 
> If there is an error (including a short write which couldn't be
> recovered from), .pwrite should call nbdkit_error with an error message,
> and nbdkit_set_error to record an appropriate error (unless errno is
> sufficient), then return -1.
> """
> 
> How is it safe for a plugin (written in C) to *silently* throw away part
> of the data that the client wants written?
Plugins can do what they want and we have plugins which do discard
data, silently or otherwise:

https://gitlab.com/nbdkit/nbdkit/-/blob/c662932f2bc71b5879b6eb83c93478191e6efef2/plugins/full/full.c#L128
https://gitlab.com/nbdkit/nbdkit/-/blob/c662932f2bc71b5879b6eb83c93478191e6efef2/plugins/null/null.c#L125
https://gitlab.com/nbdkit/nbdkit/-/blob/c662932f2bc71b5879b6eb83c93478191e6efef2/plugins/ones/ones.c#L133
> I agree consistency between sh and other plugins is good, but the
> "baseline" seems unfathomable to me. I probably don't
understand it.
sh plugin is often used for testing where the data written doesn't
matter but we're interested in other metadata like what write calls
were issued, eg:

https://gitlab.com/nbdkit/libnbd/-/blob/c713529e9fd0641b2d73f764517b5f9c21a767fd/copy/copy-sparse-no-extents.sh#L49

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW

Apparently Analagous Threads

Search for more reasonably related threads

Libguestfs - Aug 2023 - [nbdkit PATCH] sh: Allow pwrite to not consume all data

[Libguestfs] [nbdkit PATCH] sh: Allow pwrite to not consume all data

[Libguestfs] [nbdkit PATCH] sh: Allow pwrite to not consume all data

Apparently Analagous Threads