thr3ads.net - Libguestfs - [Libguestfs] [libnbd PATCH v2 5/5] copy: CVE-2022-0485: Fail nbdcopy if NBD read or write fails [Feb 2022]

If this information is useful, please help other people find it:
Share via:

Laszlo Ersek

2022-Feb-04 09:09 UTC

[Libguestfs] [libnbd PATCH v2 5/5] copy: CVE-2022-0485: Fail nbdcopy if NBD read or write fails

On 02/03/22 21:25, Eric Blake wrote:> nbdcopy has a nasty bug when performing multi-threaded copies using
> asynchronous nbd calls - it was blindly treating the completion of an
> asynchronous command as successful, rather than checking the *error
> parameter.  This can result in the silent creation of a corrupted
> image in two different ways: when a read fails, we blindly wrote
> garbage to the destination; when a write fails, we did not flag that
> the destination was not written.
> 
> Since nbdcopy already calls exit() on a synchronous read or write
> failure to a file, doing the same for an asynchronous op to an NBD
> server is the simplest solution.  A nicer solution, but more invasive
> to code and thus not done here, might be to allow up to N retries of
> the transaction (in case the read or write failure was transient), or
> even having a mode where as much data is copied as possible (portions
> of the copy that failed would be logged on stderr, and nbdcopy would
> still fail with a non-zero exit status, but this would copy more than
> just stopping at the first error, as can be done with rsync or
> ddrescue).
> 
> Note that since we rely on auto-retiring and do NOT call
> nbd_aio_command_completed, our completion callbacks must always return
> 1 (if they do not exit() first), even when acting on *error, so as not
> leave the command allocated until nbd_close.  As such, there is no
> sane way to return an error to a manual caller of the callback, and
> therefore we can drop dead code that calls perror() and exit() if the
> callback "failed".  It is also worth documenting the contract on
when
> we must manually call the callback during the asynch_zero callback, so
> that we do not leak or double-free the command; thankfully, all the
> existing code paths were correct.
> 
> The added testsuite script demonstrates several scenarios, some of
> which fail without the rest of this patch in place, and others which
> showcase ways in which sparse images can bypass errors.
> 
> Once backports are complete, a followup patch on the main branch will
> edit docs/libnbd-security.pod with the mailing list announcement of
> the stable branch commit ids and release versions that incorporate
> this fix.
> 
> Reported-by: Nir Soffer <nsoffer at redhat.com>
> Fixes: bc896eec4d ("copy: Implement multi-conn, multiple threads,
multiple requests in flight.", v1.5.6)
> Fixes: https://bugzilla.redhat.com/2046194
> ---
>  TODO                        |  1 +
>  copy/Makefile.am            |  4 +-
>  copy/copy-nbd-error.sh      | 78 +++++++++++++++++++++++++++++++++++++
>  copy/file-ops.c             | 17 +++-----
>  copy/multi-thread-copying.c | 15 +++++++
>  copy/nbdcopy.h              |  7 ++--
>  copy/null-ops.c             | 10 +----
>  7 files changed, 108 insertions(+), 24 deletions(-)
>  create mode 100755 copy/copy-nbd-error.sh
> 
> diff --git a/TODO b/TODO
> index da157942..7c9c15e2 100644
> --- a/TODO
> +++ b/TODO
> @@ -33,6 +33,7 @@ nbdcopy:
>   - Better page cache usage, see nbdkit-file-plugin options
>     fadvise=sequential cache=none.
>   - Consider io_uring if there are performance bottlenecks.
> + - Configurable retries in response to read or write failures.
> 
>  nbdfuse:
>   - If you write beyond the end of the virtual file, it returns EIO.
> diff --git a/copy/Makefile.am b/copy/Makefile.am
> index f2100853..e729f86a 100644
> --- a/copy/Makefile.am
> +++ b/copy/Makefile.am
> @@ -1,5 +1,5 @@
>  # nbd client library in userspace
> -# Copyright (C) 2020 Red Hat Inc.
> +# Copyright (C) 2020-2022 Red Hat Inc.
>  #
>  # This library is free software; you can redistribute it and/or
>  # modify it under the terms of the GNU Lesser General Public
> @@ -33,6 +33,7 @@ EXTRA_DIST = \
>  	copy-nbd-to-small-nbd-error.sh \
>  	copy-nbd-to-sparse-file.sh \
>  	copy-nbd-to-stdout.sh \
> +	copy-nbd-error.sh \
>  	copy-progress-bar.sh \
>  	copy-sparse.sh \
>  	copy-sparse-allocated.sh \
> @@ -124,6 +125,7 @@ TESTS += \
>  	copy-stdin-to-nbd.sh \
>  	copy-stdin-to-null.sh \
>  	copy-nbd-to-stdout.sh \
> +	copy-nbd-error.sh \
>  	copy-progress-bar.sh \
>  	copy-sparse.sh \
>  	copy-sparse-allocated.sh \
> diff --git a/copy/copy-nbd-error.sh b/copy/copy-nbd-error.sh
> new file mode 100755
> index 00000000..89f0a2f1
> --- /dev/null
> +++ b/copy/copy-nbd-error.sh
> @@ -0,0 +1,78 @@
> +#!/usr/bin/env bash
> +# nbd client library in userspace
> +# Copyright (C) 2022 Red Hat Inc.
> +#
> +# This library is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU Lesser General Public
> +# License as published by the Free Software Foundation; either
> +# version 2 of the License, or (at your option) any later version.
> +#
> +# This library is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +# Lesser General Public License for more details.
> +#
> +# You should have received a copy of the GNU Lesser General Public
> +# License along with this library; if not, write to the Free Software
> +# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301
USA
> +
> +# Tests several scenarios of handling NBD server errors
> +# Serves as a regression test for the CVE-2022-0485 fix.
> +
> +. ../tests/functions.sh
> +
> +set -e
> +set -x
> +
> +requires nbdkit --exit-with-parent --version
> +
> +fail=0
> +
> +# Failure to get block status should not be fatal, but merely downgrade to
> +# reading the entire image as if data
> +echo "Testing extents failures on source"
> +$VG nbdcopy -- [ nbdkit --exit-with-parent -v --filter=error pattern 5M \
> +    error-extents-rate=1 ] null: || fail=1
> +
> +# Failure to read should be fatal
> +echo "Testing read failures on non-sparse source"
> +$VG nbdcopy -- [ nbdkit --exit-with-parent -v --filter=error pattern 5M \
> +    error-pread-rate=0.5 ] null: && fail=1
> +
> +# However, reliable block status on a sparse image can avoid the need to
read
> +echo "Testing read failures on sparse source"
> +$VG nbdcopy -- [ nbdkit --exit-with-parent -v --filter=error null 5M \
> +    error-pread-rate=1 ] null: || fail=1
> +
> +# Failure to write data should be fatal
> +echo "Testing write data failures on arbitrary destination"
> +$VG nbdcopy -- [ nbdkit --exit-with-parent -v pattern 5M ] \
> +    [ nbdkit --exit-with-parent -v --filter=error --filter=noextents \
> +        memory 5M error-pwrite-rate=0.5 ] && fail=1
> +
> +# However, writing zeroes can bypass the need for normal writes
> +echo "Testing write data failures from sparse source"
> +$VG nbdcopy -- [ nbdkit --exit-with-parent -v null 5M ] \
> +    [ nbdkit --exit-with-parent -v --filter=error --filter=noextents \
> +        memory 5M error-pwrite-rate=1 ] || fail=1
> +
> +# Failure to write zeroes should be fatal
> +echo "Testing write zero failures on arbitrary destination"
> +$VG nbdcopy -- [ nbdkit --exit-with-parent -v null 5M ] \
> +    [ nbdkit --exit-with-parent -v --filter=error memory 5M \
> +        error-zero-rate=1 ] && fail=1
> +
> +# However, assuming/learning destination is zero can skip need to write
> +echo "Testing write failures on pre-zeroed destination"
> +$VG nbdcopy --destination-is-zero -- \
> +    [ nbdkit --exit-with-parent -v null 5M ] \
> +    [ nbdkit --exit-with-parent -v --filter=error memory 5M \
> +        error-pwrite-rate=1 error-zero-rate=1 ] || fail=1
> +
> +# Likewise, when write zero is not advertised, fallback to normal write
works
> +echo "Testing write zeroes to destination without zero support"
> +$VG nbdcopy -- [ nbdkit --exit-with-parent -v null 5M ] \
> +    [ nbdkit --exit-with-parent -v --filter=nozero --filter=error memory
5M \
> +        error-zero-rate=1 ] || fail=1
> +
> +exit $fail
> diff --git a/copy/file-ops.c b/copy/file-ops.c
> index e37a5014..aaf04ade 100644
> --- a/copy/file-ops.c
> +++ b/copy/file-ops.c
> @@ -591,10 +591,8 @@ file_asynch_read (struct rw *rw,
> 
>    file_synch_read (rw, slice_ptr (command->slice),
>                     command->slice.len, command->offset);
> -  if (cb.callback (cb.user_data, &dummy) == -1) {
> -    perror (rw->name);
> -    exit (EXIT_FAILURE);
> -  }
> +  /* file_synch_read called exit() on error */
> +  cb.callback (cb.user_data, &dummy);
>  }
> 
>  static void
> @@ -606,10 +604,8 @@ file_asynch_write (struct rw *rw,
> 
>    file_synch_write (rw, slice_ptr (command->slice),
>                      command->slice.len, command->offset);
> -  if (cb.callback (cb.user_data, &dummy) == -1) {
> -    perror (rw->name);
> -    exit (EXIT_FAILURE);
> -  }
> +  /* file_synch_write called exit() on error */
> +  cb.callback (cb.user_data, &dummy);
>  }
> 
>  static bool
> @@ -620,10 +616,7 @@ file_asynch_zero (struct rw *rw, struct command
*command,
> 
>    if (!file_synch_zero (rw, command->offset, command->slice.len,
allocate))
>      return false;
> -  if (cb.callback (cb.user_data, &dummy) == -1) {
> -    perror (rw->name);
> -    exit (EXIT_FAILURE);
> -  }
> +  cb.callback (cb.user_data, &dummy);
>    return true;
>  }
> 
> diff --git a/copy/multi-thread-copying.c b/copy/multi-thread-copying.c
> index 815b8a02..2aa74d63 100644
> --- a/copy/multi-thread-copying.c
> +++ b/copy/multi-thread-copying.c
> @@ -28,6 +28,7 @@
>  #include <errno.h>
>  #include <assert.h>
>  #include <sys/stat.h>
> +#include <inttypes.h>
> 
>  #include <pthread.h>
> 
> @@ -376,6 +377,13 @@ finished_read (void *vp, int *error)
>  {
>    struct command *command = vp;
> 
> +  /* XXX - is it worth retrying a failed command? */
> +  if (*error) {
> +    fprintf (stderr, "read at offset 0x%" PRIx64 "failed:
%s\n",
Like Nir said, it should be '" failed..."'. (I'm neutral
on PRIx64 vs.
PRIu64.)

Reviewed-by: Laszlo Ersek <lersek at redhat.com>

Thanks
Laszlo
> +             command->offset, strerror (*error));
> +    exit (EXIT_FAILURE);
> +  }
> +
>    if (allocated || sparse_size == 0) {
>      /* If sparseness detection (see below) is turned off then we write
>       * the whole command.
> @@ -548,6 +556,13 @@ free_command (void *vp, int *error)
>    struct command *command = vp;
>    struct buffer *buffer = command->slice.buffer;
> 
> +  /* XXX - is it worth retrying a failed command? */
> +  if (*error) {
> +    fprintf (stderr, "write at offset 0x%" PRIx64 "failed:
%s\n",
> +             command->offset, strerror (*error));
> +    exit (EXIT_FAILURE);
> +  }
> +
>    if (buffer != NULL) {
>      if (--buffer->refs == 0) {
>        free (buffer->data);
> diff --git a/copy/nbdcopy.h b/copy/nbdcopy.h
> index e7fe1eab..c070f8d7 100644
> --- a/copy/nbdcopy.h
> +++ b/copy/nbdcopy.h
> @@ -1,5 +1,5 @@
>  /* NBD client library in userspace.
> - * Copyright (C) 2020-2021 Red Hat Inc.
> + * Copyright (C) 2020-2022 Red Hat Inc.
>   *
>   * This library is free software; you can redistribute it and/or
>   * modify it under the terms of the GNU Lesser General Public
> @@ -157,7 +157,8 @@ struct rw_ops {
>                        bool allocate);
> 
>    /* Asynchronous I/O operations.  These start the operation and call
> -   * 'cb' on completion.
> +   * 'cb' on completion.  'cb' will return 1, for
auto-retiring with
> +   * asynchronous libnbd calls.
>     *
>     * The file_ops versions are actually implemented synchronously, but
>     * still call 'cb'.
> @@ -173,7 +174,7 @@ struct rw_ops {
>                          nbd_completion_callback cb);
> 
>    /* Asynchronously zero.  command->slice.buffer is not used.  If not
possible,
> -   * returns false.
> +   * returns false.  'cb' must be called only if returning true.
>     */
>    bool (*asynch_zero) (struct rw *rw, struct command *command,
>                         nbd_completion_callback cb, bool allocate);
> diff --git a/copy/null-ops.c b/copy/null-ops.c
> index 11cd5116..5f1fda50 100644
> --- a/copy/null-ops.c
> +++ b/copy/null-ops.c
> @@ -119,10 +119,7 @@ null_asynch_write (struct rw *rw,
>  {
>    int dummy = 0;
> 
> -  if (cb.callback (cb.user_data, &dummy) == -1) {
> -    perror (rw->name);
> -    exit (EXIT_FAILURE);
> -  }
> +  cb.callback (cb.user_data, &dummy);
>  }
> 
>  static bool
> @@ -131,10 +128,7 @@ null_asynch_zero (struct rw *rw, struct command
*command,
>  {
>    int dummy = 0;
> 
> -  if (cb.callback (cb.user_data, &dummy) == -1) {
> -    perror (rw->name);
> -    exit (EXIT_FAILURE);
> -  }
> +  cb.callback (cb.user_data, &dummy);
>    return true;
>  }
>

Eric Blake

2022-Feb-04 14:44 UTC

head link

[Libguestfs] [libnbd PATCH v2 5/5] copy: CVE-2022-0485: Fail nbdcopy if NBD read or write fails

On Fri, Feb 04, 2022 at 10:09:26AM +0100, Laszlo Ersek
wrote:> On 02/03/22 21:25, Eric Blake wrote:
> > nbdcopy has a nasty bug when performing multi-threaded copies using
> > asynchronous nbd calls - it was blindly treating the completion of an
> > asynchronous command as successful, rather than checking the *error
> > parameter.  This can result in the silent creation of a corrupted
> > image in two different ways: when a read fails, we blindly wrote
> > garbage to the destination; when a write fails, we did not flag that
> > the destination was not written.
> > 
> > +  /* XXX - is it worth retrying a failed command? */
> > +  if (*error) {
> > +    fprintf (stderr, "read at offset 0x%" PRIx64
"failed: %s\n",
> 
> Like Nir said, it should be '" failed..."'. (I'm
neutral on PRIx64 vs.
> PRIu64.)
> 
> Reviewed-by: Laszlo Ersek <lersek at redhat.com>
> 
> Thanks
> Laszlo
Now pushed upstream as a865526..8d444b4, with tweaks to patches 1 and
5 content and patch 4 commit message per review comments.

I'm starting the backport process to stable branches, and will
followup with a top-level post as the security announcement (although
given my schedule today, the announcement may be delayed to Monday).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Libguestfs - Feb 2022 - [libnbd PATCH v2 5/5] copy: CVE-2022-0485: Fail nbdcopy if NBD read or write fails

[Libguestfs] [libnbd PATCH v2 5/5] copy: CVE-2022-0485: Fail nbdcopy if NBD read or write fails

[Libguestfs] [libnbd PATCH v2 5/5] copy: CVE-2022-0485: Fail nbdcopy if NBD read or write fails