Nir Soffer
2022-Jan-26 23:49 UTC
[Libguestfs] [PATCH libnbd 2/2] examples: copy-libev.c: Fix error handling
This example failed to check the *error parameter to the completion and extent callbacks. - If the source NBD server failed a read, we wrote stale data from the request buffer to the destination image, corrupting the image. - If the destination NBD server failed a write or zero command, we ignored the error, leaving previous content on the destination image, corrupting the image. - Error in the extents callbacks were ignored. I'm not sure if this was a real problem, but it is a very bad example. In all cases, the copy would end with zero exit code creating a corrupted image. Fixed by failing the copy if read, write, zero commands completed with non-zero *error parameter. If an extent callback completed with non-zero *error, we log the error and abort the callback, leaving the extents array as NULL. If extents completion callback completed with non-zero *error we don't need to check it, since we already handle a NULL extents array by disabling can_extents. Here is an example failure: $ ./copy-libev nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock copy-libev: r0: read failed: Input/output error $ echo $? 1 Signed-off-by: Nir Soffer <nsoffer at redhat.com> --- examples/copy-libev.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/examples/copy-libev.c b/examples/copy-libev.c index 13db898a..ad50b64c 100644 --- a/examples/copy-libev.c +++ b/examples/copy-libev.c @@ -184,20 +184,26 @@ static int extent_callback (void *user_data, const char *metacontext, uint64_t offset, uint32_t *entries, size_t nr_entries, int *error) { struct request *r = user_data; if (strcmp (metacontext, LIBNBD_CONTEXT_BASE_ALLOCATION) != 0) { DEBUG ("Unexpected meta context: %s", metacontext); return 1; } + if (*error) { + DEBUG ("r%zu: extent callback for %s failed: %s", + r->index, LIBNBD_CONTEXT_BASE_ALLOCATION, strerror (*error)); + return 1; + } + /* Libnbd returns uint32_t pair (length, flags) for each extent. */ extents_len = nr_entries / 2; extents = malloc (extents_len * sizeof *extents); if (extents == NULL) FAIL ("Cannot allocated extents: %s", strerror (errno)); /* Copy libnbd entries to extents array. */ for (int i = 0, j = 0; i < extents_len; i++, j=i*2) { extents[i].length = entries[j]; @@ -240,20 +246,26 @@ static void wake_up_requests () static int extents_completed (void *user_data, int *error) { struct request *r = (struct request *)user_data; DEBUG ("r%zu: extents completed time=%.6f", r->index, ev_now (loop) - r->started); extents_in_progress = false; + /* + * If extents failed (*error != 0), the extent callback was not + * called, or called with an error, so we did not allocate new + * extents array. + */ + if (extents == NULL) { DEBUG ("r%zu: received no extents, disabling extents", r->index); src.can_extents = false; } /* Start the request to process recvievd extents. This must be done on the * next loop iteration, to avoid deadlock if we need to start a read. */ start_request_soon (r); @@ -434,20 +446,23 @@ start_read(struct request *r) 0); if (cookie == -1) FAIL ("Cannot start read: %s", nbd_get_error ()); } static int read_completed (void *user_data, int *error) { struct request *r = (struct request *)user_data; + if (*error) + FAIL ("r%zu: read failed: %s", r->index, strerror (*error)); + DEBUG ("r%zu: read completed offset=%" PRIi64 " len=%zu", r->index, r->offset, r->length); if (dst.can_zero && is_zero (r->data, r->length)) start_zero (r); else start_write (r); return 1; } @@ -489,20 +504,24 @@ start_zero(struct request *r) if (cookie == -1) FAIL ("Cannot start zero: %s", nbd_get_error ()); } /* Called when async copy or zero request completed. */ static int request_completed (void *user_data, int *error) { struct request *r = (struct request *)user_data; + if (*error) + FAIL ("r%zu: %s failed: %s", + r->index, request_state (r), strerror (*error)); + written += r->length; DEBUG ("r%zu: %s completed offset=%" PRIi64 " len=%zu, time=%.6f", r->index, request_state (r), r->offset, r->length, ev_now (loop) - r->started); if (written == size) { /* The last write completed. Stop all watchers and break out * from the event loop. */ -- 2.34.1
Richard W.M. Jones
2022-Jan-27 08:49 UTC
[Libguestfs] [PATCH libnbd 2/2] examples: copy-libev.c: Fix error handling
ACK series Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-df lists disk usage of guests without needing to install any software inside the virtual machine. Supports Linux and Windows. http://people.redhat.com/~rjones/virt-df/
Eric Blake
2022-Jan-27 18:35 UTC
[Libguestfs] [PATCH libnbd 2/2] examples: copy-libev.c: Fix error handling
On Thu, Jan 27, 2022 at 01:49:31AM +0200, Nir Soffer wrote:> This example failed to check the *error parameter to the completion and > extent callbacks. > > - If the source NBD server failed a read, we wrote stale data from the > request buffer to the destination image, corrupting the > image. > > - If the destination NBD server failed a write or zero command, we > ignored the error, leaving previous content on the destination image, > corrupting the image. > > - Error in the extents callbacks were ignored. I'm not sure if this was > a real problem, but it is a very bad example. > > In all cases, the copy would end with zero exit code creating a > corrupted image.Oh well. This silent data corruption affects more than copy-libev.c; I'm in process of getting a CVE assigned for the same bug in nbdcopy. (I had been hoping to keep it under embargo if needed, but the cat's out of the bag now, so watch for similar patches to other affected clients.) -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org