Nir Soffer
2022-Mar-06 20:27 UTC
[Libguestfs] [PATCH libnbd 3/3] copy: Do not initialize read buffer
nbdcopy checks pread error now, so we will never leak uninitialized data
from the heap to the destination server. Testing show 3-8% speedup when
copying a real image.

On laptop with 12 cores and 2 consumer NVMe drives:

$ qemu-nbd --read-only --persistent --shared 8 --cache none --aio native \
    --socket /tmp/src.sock --format raw fedora-35-data.raw &

$ qemu-nbd --persistent --shared 8 --cache none --aio native --discard unmap \
    --socket /tmp/dst.sock --format raw dst.raw &

$ hyperfine -p "sleep 5" "nbdcopy $SRC $DST" ".libs/nbdcopy $SRC $DST"

Benchmark 1: nbdcopy nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock
  Time (mean ± σ):     2.065 s ±  0.057 s    [User: 0.296 s, System: 1.414 s]
  Range (min … max):   2.000 s …  2.163 s    10 runs

Benchmark 2: .libs/nbdcopy nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock
  Time (mean ± σ):     1.911 s ±  0.050 s    [User: 0.059 s, System: 1.544 s]
  Range (min … max):   1.827 s …  1.980 s    10 runs

Summary
  '.libs/nbdcopy nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock' ran
    1.08 ± 0.04 times faster than 'nbdcopy nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock'

On server with 80 cores and a fast NVMe drive:

$ hyperfine "./nbdcopy-init $SRC $DST" "./nbdcopy-no-init $SRC $DST"

Benchmark 1: ./nbdcopy-init nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock
  Time (mean ± σ):     2.652 s ±  0.033 s    [User: 0.345 s, System: 1.306 s]
  Range (min … max):   2.619 s …  2.709 s    10 runs

Benchmark 2: ./nbdcopy-no-init nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock
  Time (mean ± σ):     2.572 s ±  0.031 s    [User: 0.055 s, System: 1.409 s]
  Range (min … max):   2.537 s …  2.629 s    10 runs

Summary
  './nbdcopy-no-init nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock' ran
    1.03 ± 0.02 times faster than './nbdcopy-init nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock'

Signed-off-by: Nir Soffer <nsoffer at redhat.com>
---
 copy/nbd-ops.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/copy/nbd-ops.c b/copy/nbd-ops.c
index adfe4de5..1218d6d0 100644
--- a/copy/nbd-ops.c
+++ b/copy/nbd-ops.c
@@ -52,20 +52,21 @@ static void
 open_one_nbd_handle (struct rw_nbd *rwn)
 {
   struct nbd_handle *nbd;

   nbd = nbd_create ();
   if (nbd == NULL) {
     fprintf (stderr, "%s: %s\n", prog, nbd_get_error ());
     exit (EXIT_FAILURE);
   }

+  nbd_set_pread_initialize (nbd, false);
   nbd_set_debug (nbd, verbose);

   /* Set the handle name for debugging.  We could use rwn->rw.name
    * here but it is usually set to the lengthy NBD URI
    * (eg. "nbd://localhost:10809") which makes debug messages very
    * long.
    */
   if (verbose) {
     char *name;
     const size_t index = rwn->handles.len;
-- 
2.35.1
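
The safety argument in the commit message is that skipping buffer
initialization is only acceptable because every nbd_pread() result is
checked before the buffer is used.  The stand-alone sketch below
illustrates that pattern with the synchronous libnbd API; it is not part
of the patch, and the URI and request size are illustrative placeholders.

/* Sketch only: read one block with pread-buffer initialization disabled.
 * This is safe solely because the nbd_pread() return value is checked;
 * on failure the buffer may still hold stale heap bytes and is never used.
 */
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <libnbd.h>

int
main (void)
{
  struct nbd_handle *nbd;
  unsigned char *buf;

  nbd = nbd_create ();
  if (nbd == NULL) {
    fprintf (stderr, "%s\n", nbd_get_error ());
    exit (EXIT_FAILURE);
  }

  /* Ask libnbd not to pre-zero buffers passed to nbd_pread(). */
  nbd_set_pread_initialize (nbd, false);

  if (nbd_connect_uri (nbd, "nbd://localhost") == -1) {
    fprintf (stderr, "%s\n", nbd_get_error ());
    exit (EXIT_FAILURE);
  }

  buf = malloc (65536);              /* intentionally not zeroed */
  if (buf == NULL) {
    perror ("malloc");
    exit (EXIT_FAILURE);
  }

  /* This error check is what makes the optimization safe: a failed read
   * must never let the uninitialized buffer be treated as valid data.
   */
  if (nbd_pread (nbd, buf, 65536, 0, 0) == -1) {
    fprintf (stderr, "%s\n", nbd_get_error ());
    exit (EXIT_FAILURE);
  }

  /* ... use buf ... */

  free (buf);
  nbd_close (nbd);
  return 0;
}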
Richard W.M. Jones
2022-Mar-06 20:37 UTC
[Libguestfs] [PATCH libnbd 3/3] copy: Do not initialize read buffer
On Sun, Mar 06, 2022 at 10:27:30PM +0200, Nir Soffer wrote:
> nbdcopy checks pread error now, so we will never leak uninitialized data
> from the heap to the destination server. Testing show 3-8% speedup when
> copying a real image.
>
> On laptop with 12 cores and 2 consumer NVMe drives:
>
> $ qemu-nbd --read-only --persistent --shared 8 --cache none --aio native \
>     --socket /tmp/src.sock --format raw fedora-35-data.raw &
>
> $ qemu-nbd --persistent --shared 8 --cache none --aio native --discard unmap \
>     --socket /tmp/dst.sock --format raw dst.raw &
>
> $ hyperfine -p "sleep 5" "nbdcopy $SRC $DST" ".libs/nbdcopy $SRC $DST"
>
> Benchmark 1: nbdcopy nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock
>   Time (mean ± σ):     2.065 s ±  0.057 s    [User: 0.296 s, System: 1.414 s]
>   Range (min … max):   2.000 s …  2.163 s    10 runs
>
> Benchmark 2: .libs/nbdcopy nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock
>   Time (mean ± σ):     1.911 s ±  0.050 s    [User: 0.059 s, System: 1.544 s]
>   Range (min … max):   1.827 s …  1.980 s    10 runs
>
> Summary
>   '.libs/nbdcopy nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock' ran
>     1.08 ± 0.04 times faster than 'nbdcopy nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock'
>
> On server with 80 cores and a fast NVMe drive:
>
> $ hyperfine "./nbdcopy-init $SRC $DST" "./nbdcopy-no-init $SRC $DST"
>
> Benchmark 1: ./nbdcopy-init nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock
>   Time (mean ± σ):     2.652 s ±  0.033 s    [User: 0.345 s, System: 1.306 s]
>   Range (min … max):   2.619 s …  2.709 s    10 runs
>
> Benchmark 2: ./nbdcopy-no-init nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock
>   Time (mean ± σ):     2.572 s ±  0.031 s    [User: 0.055 s, System: 1.409 s]
>   Range (min … max):   2.537 s …  2.629 s    10 runs
>
> Summary
>   './nbdcopy-no-init nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock' ran
>     1.03 ± 0.02 times faster than './nbdcopy-init nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock'
>
> Signed-off-by: Nir Soffer <nsoffer at redhat.com>
> ---
>  copy/nbd-ops.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/copy/nbd-ops.c b/copy/nbd-ops.c
> index adfe4de5..1218d6d0 100644
> --- a/copy/nbd-ops.c
> +++ b/copy/nbd-ops.c
> @@ -52,20 +52,21 @@ static void
>  open_one_nbd_handle (struct rw_nbd *rwn)
>  {
>    struct nbd_handle *nbd;
>
>    nbd = nbd_create ();
>    if (nbd == NULL) {
>      fprintf (stderr, "%s: %s\n", prog, nbd_get_error ());
>      exit (EXIT_FAILURE);
>    }
>
> +  nbd_set_pread_initialize (nbd, false);

I feel this is worth a small comment too even though it's not an
example.  Something simple like this is enough:

+  /* This is only safe because we always check for errors from pread. */
+  nbd_set_pread_initialize (nbd, false);

ACK series - with comments added.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/
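
With the requested comment folded in, the call site in copy/nbd-ops.c would
read roughly as below.  This is only a sketch of the reviewer's suggestion;
the final wording is up to the patch author.

  nbd = nbd_create ();
  if (nbd == NULL) {
    fprintf (stderr, "%s: %s\n", prog, nbd_get_error ());
    exit (EXIT_FAILURE);
  }

  /* This is only safe because we always check for errors from pread. */
  nbd_set_pread_initialize (nbd, false);
  nbd_set_debug (nbd, verbose);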
Eric Blake
2022-Mar-10 15:58 UTC
[Libguestfs] [PATCH libnbd 3/3] copy: Do not initialize read buffer
On Sun, Mar 06, 2022 at 10:27:30PM +0200, Nir Soffer wrote:
> nbdcopy checks pread error now, so we will never leak uninitialized data
> from the heap to the destination server. Testing show 3-8% speedup when
> copying a real image.
>
> +++ b/copy/nbd-ops.c
> @@ -52,20 +52,21 @@ static void
>  open_one_nbd_handle (struct rw_nbd *rwn)
>  {
>    struct nbd_handle *nbd;
>
>    nbd = nbd_create ();
>    if (nbd == NULL) {
>      fprintf (stderr, "%s: %s\n", prog, nbd_get_error ());
>      exit (EXIT_FAILURE);
>    }
>
> +  nbd_set_pread_initialize (nbd, false);
>    nbd_set_debug (nbd, verbose);

Pre-existing that we did not check for failure from nbd_set_debug(), so
it is not made worse by not checking for failure of
nbd_set_pread_initialize().

Then again, nbd_set_debug() is currently documented as being able to
fail, but in practice cannot - we do not restrict it to a subset of
states, and its implementation is dirt-simple in lib/debug.c.  We may
want (as a separate patch) to tweak this function to be marked as
may_set_error=false, the way nbd_get_debug() is (as long as such a
change does not impact the API).  Similarly, nbd_set_pread_initialize()
has no restrictions on which states it can be used in, so maybe we
should also mark it as may_set_error=false.  Contrast that with things
like nbd_set_request_block_size(), which really do make sense to limit
to certain states (once negotiation is done, changing the flag has no
effect).

So we may have further cleanups to do, but once you add the comments
requested by Rich throughout the series, and the error checking I
suggested in 2/3, I'm fine with the series.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
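
Until such a may_set_error=false annotation exists, a caller that wanted to
be strict could check these setup calls like any other libnbd call.  A
hypothetical sketch, not the exact change discussed for patch 2/3:

  /* Hypothetical: treat the setup calls as fallible and fail loudly.
   * In practice neither call fails today, as noted above, which is why
   * nbdcopy simply ignores their return values.
   */
  if (nbd_set_pread_initialize (nbd, false) == -1 ||
      nbd_set_debug (nbd, verbose) == -1) {
    fprintf (stderr, "%s: %s\n", prog, nbd_get_error ());
    exit (EXIT_FAILURE);
  }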
On Sun, Mar 06, 2022 at 10:27:30PM +0200, Nir Soffer wrote:> nbdcopy checks pread error now, so we will never leak uninitialized data > from the heap to the destination server. Testing show 3-8% speedup when > copying a real image. >> +++ b/copy/nbd-ops.c > @@ -52,20 +52,21 @@ static void > open_one_nbd_handle (struct rw_nbd *rwn) > { > struct nbd_handle *nbd; > > nbd = nbd_create (); > if (nbd == NULL) { > fprintf (stderr, "%s: %s\n", prog, nbd_get_error ()); > exit (EXIT_FAILURE); > } > > + nbd_set_pread_initialize (nbd, false); > nbd_set_debug (nbd, verbose);Pre-existing that we did not check for failure from nbd_set_debug(), so it is not made worse by not checking for failure of nbd_set_pread_initialize(). Then again, nbd_set_debug() is currently documented as being able to fail, but in practice cannot - we do not restrict it to a subset of states, and its implementation is dirt-simple in lib/debug.c. We may want (as a separate patch) to tweak this function to be mared as may_set_error=false, the way nbd_get_debug() is (as long as such change does not impact the API). Similarly, nbd_set_pread_initialize() has no restrictions on which states it can be used in, so maybe we should also mark it as may_set_error=false. Contrast that with things like nbd_set_request_block_size(), which really do make sense to limit to certain states (once negotiation is done, changing the flag has no effect). So we may have further cleanups to do, but once you add the comments requested by Rich throughout the series, and the error checking I suggested in 2/3, with the series. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org