thr3ads.net - Libguestfs - [Libguestfs] [libnbd PATCH] copy: Fail nbdcopy if NBD read or write fails [Feb 2022]

If this information is useful, please help other people find it:
Share via:

Eric Blake

2022-Feb-03 13:49 UTC

[Libguestfs] [libnbd PATCH] copy: Fail nbdcopy if NBD read or write fails

On Thu, Feb 03, 2022 at 11:52:43AM +0100, Laszlo Ersek
wrote:> On 02/03/22 02:50, Eric Blake wrote:
> > FIXME: This is CVE-2022-XXXXX (still awaiting assignment of the CVE
number).
> > 
> > nbdcopy has a nasty bug when performing multi-threaded copies using
> > asynchronous nbd calls - it was blindly treating the completion of an
> > asynchronous command as successful, rather than checking the *error
> > parameter.  This can result in the silent creation of a corrupted
> > image in two different ways: when a read fails, we blindly wrote
> > garbage to the destination; when a write fails, we did not flag that
> > the destination was not written.
> > 
> > Since nbdcopy already calls exit() on a synchronous read or write
> > failure to a file, doing the same for an asynchronous op to an NBD
> > server is the simplest solution.  A nicer solution, but more invasive
> > to write and thus not done here, might be to allow up to N retries of
> > the transaction (in case the read or write failure was transient), or
> > even having a mode where as much data is copied as possible (portions
> > of the copy that failed would be logged on stderr, and nbdcopy would
> > still fail with a non-zero exit status, but this would copy more than
> > just stopping at the first error).
> 
> That seems more like a libnbd-backed rsync :)
I was thinking more like ddrescue, but the idea is the same ;)
> > 
> > I'm not sure if I like passing &errno in the manual calls to
> > cb.callback(); better might be 'int dummy = 0; cb.callback(...,
> > &dummy)', but changing that could be a separate patch.
> 
> I agree, a patch should likely be prepended for updating that; the
> callback may not expect that *error alias errno (IIUC).
Okay, I will do that as a preliminary patch.
> > +++ b/copy/Makefile.am
> > @@ -1,5 +1,5 @@
> >  # nbd client library in userspace
> > -# Copyright (C) 2020 Red Hat Inc.
> > +# Copyright (C) 2020-2022 Red Hat Inc.
> >  #
> >  # This library is free software; you can redistribute it and/or
> >  # modify it under the terms of the GNU Lesser General Public
> > @@ -33,6 +33,7 @@ EXTRA_DIST = \
> >  	copy-nbd-to-small-nbd-error.sh \
> >  	copy-nbd-to-sparse-file.sh \
> >  	copy-nbd-to-stdout.sh \
> > +  copy-nbd-error.sh \
> 
> TABs vs spaces
Huh; I wonder if this is yet another fallout from Nir's addition of
.editorconfig.  Makefile.am is odd - there are places where tabs are
important, but not the entire file; but I definitely agree that local
consistency is important even in the places where tab vs. space
doesn't affect automake/make semantics.
> > +# Run an nbdkit server that randomly fails.
> > +nbdkit --exit-with-parent -f -v -P $pidfile -U $sock \
> > +       --filter=error --filter=noextents \
> > +       memory size=5M error-pread-rate=0.5 error-pwrite-rate=0.5
&
> > +# Wait for the pidfile to appear.
> > +for i in {1..60}; do
> > +    if test -f $pidfile; then
> > +        break
> > +    fi
> > +    sleep 1
> > +done
> > +if ! test -f $pidfile; then
> > +    echo "$0: nbdkit did not start up"
> 
> redirect to stderr perhaps?
Copy-paste from other existing tests.  If I clean it up, it's worth
doing globally as a separate patch.  Ultimately, as long as it ends up
in the .log for debugging test failures, I don't care if it got there
through stdout or stderr.
> 
> > +    exit 1
> > +fi
> > +
> > +fail=0
> > +
> > +# Failure to read should be fatal
> > +$VG nbdcopy -- "nbd+unix:///?socket=$sock" null: &&
fail=1
> > +
> > +# Failure to write should be fatal
> > +$VG nbdcopy -- [ nbdkit --exit-with-parent -v pattern 5M ]
"nbd+unix:///?socket=$sock" && fail=1
Hmm, I should probably use \-newline to split this long line.
> 
> What's the reason to test writing if the read test fails first?
When running the testsuite, it's nice to uncover as many cases as
possible in a single run, rather than patching one early failure just
to encounter another failure on the next test run that was previously
skipped because of the first failure.  One way to do this is to write
two separate tests (one that tests failed reads from NBD as source, a
second that tests failed writes to NBD as destination), but writing
two nearly-identical .sh files is more boilerplate overhead than just
jamming related tests into the same .sh script.
> 
> Looks OK to me, but should be definitely reviewed by another reviewer.
Looks like Rich has given ACK as well.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Laszlo Ersek

2022-Feb-04 08:18 UTC

head link

[Libguestfs] [libnbd PATCH] copy: Fail nbdcopy if NBD read or write fails

On 02/03/22 14:49, Eric Blake wrote:> On Thu, Feb 03, 2022 at 11:52:43AM +0100, Laszlo Ersek wrote:
>> What's the reason to test writing if the read test fails first?
> 
> When running the testsuite, it's nice to uncover as many cases as
> possible in a single run, rather than patching one early failure just
> to encounter another failure on the next test run that was previously
> skipped because of the first failure.
This is a great idea, and I wish it worked consistently *between*
separate test cases too. I distinctly remember connect-tcp6 failing for
me (earlier), and myself dismissing it, and assuming everything else was
OK. Then, in the RHEL9 Brew environment, where connect-tcp6 was
specifically disabled, I was hit by the actual test failures :/

(And there was no easy way to skip connect-tcp6.)
> One way to do this is to write
> two separate tests (one that tests failed reads from NBD as source, a
> second that tests failed writes to NBD as destination), but writing
> two nearly-identical .sh files is more boilerplate overhead than just
> jamming related tests into the same .sh script.
Right, but this is exactly what I wish worked -- but IME doesn't always
work.
> 
>>
>> Looks OK to me, but should be definitely reviewed by another reviewer.
> 
> Looks like Rich has given ACK as well.
> 

Thanks!
Laszlo

Libguestfs - Feb 2022 - [libnbd PATCH] copy: Fail nbdcopy if NBD read or write fails

[Libguestfs] [libnbd PATCH] copy: Fail nbdcopy if NBD read or write fails

[Libguestfs] [libnbd PATCH] copy: Fail nbdcopy if NBD read or write fails