Nir Soffer
2022-Mar-06 20:27 UTC
[Libguestfs] [PATCH libnbd 0/3] Disable pread buffer initialization
Disabling pread initialization is actually measurable and gives a small
speedup in nbdcopy and some examples. Some cases are safer than others:
in the copy-libev example, we allocate all buffers with calloc() and
reuse them for the entire copy, so the initialization is completely
useless. In nbdcopy and the Go aio_copy example, we allocate a new
buffer using malloc(), so when working with a bad server that does not
return all data chunks in a structured reply, we can leak uninitialized
memory to the destination server.

I think this issue should be solved by libnbd; it should verify that
the server returns all the expected chunks and fail the request if not.

Nir Soffer (3):
  golang: examples: Do not initialize pread buffer
  examples: copy-libev: Do not initialize pread buffer
  copy: Do not initialize read buffer

 copy/nbd-ops.c                             | 1 +
 examples/copy-libev.c                      | 6 ++++++
 golang/examples/aio_copy/aio_copy.go       | 5 +++++
 golang/examples/simple_copy/simple_copy.go | 5 +++++
 4 files changed, 17 insertions(+)

-- 
2.35.1
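[Editor's note: as background for the series, here is a minimal
synchronous sketch of opting out of pread buffer initialization. It is
not part of the patches; it uses only the documented libnbd API, and
the URI is a placeholder. The key point is that skipping initialization
is only safe when every nbd_pread error is checked, because on error
the buffer contents are undefined.]

    /* Sketch (not from the series): disable pread buffer zeroing,
     * then treat the buffer as valid only when nbd_pread succeeds. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <stdbool.h>
    #include <libnbd.h>

    int
    main (void)
    {
      char buf[512];
      struct nbd_handle *nbd = nbd_create ();

      if (nbd == NULL) {
        fprintf (stderr, "%s\n", nbd_get_error ());
        exit (EXIT_FAILURE);
      }

      /* Skip the zeroing of buf that libnbd does by default. */
      nbd_set_pread_initialize (nbd, false);

      if (nbd_connect_uri (nbd, "nbd+unix:///?socket=/tmp/src.sock") == -1 ||
          nbd_pread (nbd, buf, sizeof buf, 0, 0) == -1) {
        /* buf may hold uninitialized data here; never use it. */
        fprintf (stderr, "%s\n", nbd_get_error ());
        exit (EXIT_FAILURE);
      }

      nbd_close (nbd);
      return 0;
    }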
Nir Soffer
2022-Mar-06 20:27 UTC
[Libguestfs] [PATCH libnbd 1/3] golang: examples: Do not initialize pread buffer
The aio_copy example checks errors properly, so it will not leak
uninitialized memory to the destination image.

Testing shows a 5% speedup when copying a real image.

$ qemu-nbd --read-only --persistent --shared 8 --cache none --aio native \
    --socket /tmp/src.sock --format raw fedora-35-data.raw &

$ hyperfine -p "sleep 5" "./aio_copy-init $SRC >/dev/null" "./aio_copy-no-init $SRC >/dev/null"
Benchmark 1: ./aio_copy-init nbd+unix:///?socket=/tmp/src.sock >/dev/null
  Time (mean ± σ):      1.452 s ±  0.027 s    [User: 0.330 s, System: 0.489 s]
  Range (min … max):    1.426 s …  1.506 s    10 runs

Benchmark 2: ./aio_copy-no-init nbd+unix:///?socket=/tmp/src.sock >/dev/null
  Time (mean ± σ):      1.378 s ±  0.009 s    [User: 0.202 s, System: 0.484 s]
  Range (min … max):    1.369 s …  1.399 s    10 runs

Summary
  './aio_copy-no-init nbd+unix:///?socket=/tmp/src.sock >/dev/null' ran
    1.05 ± 0.02 times faster than './aio_copy-init nbd+unix:///?socket=/tmp/src.sock >/dev/null'

Signed-off-by: Nir Soffer <nsoffer at redhat.com>
---
 golang/examples/aio_copy/aio_copy.go       | 5 +++++
 golang/examples/simple_copy/simple_copy.go | 5 +++++
 2 files changed, 10 insertions(+)

diff --git a/golang/examples/aio_copy/aio_copy.go b/golang/examples/aio_copy/aio_copy.go
index bb20b478..89eac4df 100644
--- a/golang/examples/aio_copy/aio_copy.go
+++ b/golang/examples/aio_copy/aio_copy.go
@@ -84,20 +84,25 @@ func main() {
 	err = h.ConnectUri(flag.Arg(0))
 	if err != nil {
 		panic(err)
 	}
 
 	size, err := h.GetSize()
 	if err != nil {
 		panic(err)
 	}
 
+	err = h.SetPreadInitialize(false)
+	if err != nil {
+		panic(err)
+	}
+
 	var offset uint64
 
 	for offset < size || queue.Len() > 0 {
 
 		for offset < size && inflightRequests() < *requests {
 			length := *requestSize
 			if size-offset < uint64(length) {
 				length = uint(size - offset)
 			}
 			startRead(offset, length)
diff --git a/golang/examples/simple_copy/simple_copy.go b/golang/examples/simple_copy/simple_copy.go
index e8fa1f76..2a2ed0ff 100644
--- a/golang/examples/simple_copy/simple_copy.go
+++ b/golang/examples/simple_copy/simple_copy.go
@@ -63,20 +63,25 @@ func main() {
 	err = h.ConnectUri(flag.Arg(0))
 	if err != nil {
 		panic(err)
 	}
 
 	size, err := h.GetSize()
 	if err != nil {
 		panic(err)
 	}
 
+	err = h.SetPreadInitialize(false)
+	if err != nil {
+		panic(err)
+	}
+
 	buf := make([]byte, *requestSize)
 	var offset uint64
 
 	for offset < size {
 		if size-offset < uint64(len(buf)) {
 			buf = buf[:size-offset]
 		}
 
 		err = h.Pread(buf, offset, nil)
 		if err != nil {
-- 
2.35.1
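[Editor's note: the error checking this commit message relies on
happens in the read completion callback. Below is a hedged C sketch of
the same pattern; the Go SetPreadInitialize binding wraps
nbd_set_pread_initialize, and struct request and start_write here are
hypothetical stand-ins for the example's request handling, not code
from the patch.]

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <stdint.h>
    #include <libnbd.h>

    struct request {          /* hypothetical per-request state */
      char *data;
      uint64_t offset;
      size_t length;
    };

    static void start_write (struct request *r);  /* hypothetical */

    /* Completion callback for nbd_aio_pread.  With pread
     * initialization disabled, r->data is valid only when
     * *error == 0, so a failed read must never reach the
     * destination. */
    static int
    read_completed (void *user_data, int *error)
    {
      struct request *r = user_data;

      if (*error) {
        fprintf (stderr, "read failed: %s\n", strerror (*error));
        exit (EXIT_FAILURE);
      }

      start_write (r);
      return 1;               /* retire the command */
    }

[The read itself would be started with something like
nbd_aio_pread (h, r->data, r->length, r->offset,
(nbd_completion_callback) { .callback = read_completed,
.user_data = r }, 0).]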
Nir Soffer
2022-Mar-06 20:27 UTC
[Libguestfs] [PATCH libnbd 2/3] examples: copy-libev: Do not initialize pread buffer
This safety feature is not needed since we handle pread errors, and
even if we did not, we use initialized buffers.

Testing shows a 10% speedup when copying a real image.

$ qemu-nbd --read-only --persistent --shared 8 --cache none --aio native \
    --socket /tmp/src.sock --format raw fedora-35-data.raw &

$ qemu-nbd --persistent --shared 8 --cache none --aio native --discard unmap \
    --socket /tmp/dst.sock --format raw dst.raw &

$ hyperfine -p "sleep 5" "./copy-libev-init $SRC $DST" "./copy-libev-no-init $SRC $DST"
Benchmark 1: ./copy-libev-init nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock
  Time (mean ± σ):      2.895 s ±  0.026 s    [User: 0.387 s, System: 1.645 s]
  Range (min … max):    2.853 s …  2.933 s    10 runs

Benchmark 2: ./copy-libev-no-init nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock
  Time (mean ± σ):      2.636 s ±  0.034 s    [User: 0.052 s, System: 1.628 s]
  Range (min … max):    2.557 s …  2.677 s    10 runs

Summary
  './copy-libev-no-init nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock' ran
    1.10 ± 0.02 times faster than './copy-libev-init nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock'

Signed-off-by: Nir Soffer <nsoffer at redhat.com>
---
 examples/copy-libev.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/examples/copy-libev.c b/examples/copy-libev.c
index ad50b64c..f36939a7 100644
--- a/examples/copy-libev.c
+++ b/examples/copy-libev.c
@@ -647,20 +647,26 @@ main (int argc, char *argv[])
 
     size = nbd_get_size (src.nbd);
 
     if (size > nbd_get_size (dst.nbd))
        FAIL ("Destination is not large enough\n");
 
     /* Check destination server capabilities. */
 
     dst.can_zero = nbd_can_zero (dst.nbd) > 0;
 
+    /* Disable pread buffer initialization. This is not needed since we
+     * handle pread errors, and even if we fail to handle errors, we use
+     * initialized buffers. */
+
+    nbd_set_pread_initialize (src.nbd, false);
+
     /* Start the copy "loop". When a request completes, it starts the
      * next request, until the entire image is copied. */
 
     for (i = 0; i < MAX_REQUESTS; i++) {
         struct request *r = &requests[i];
         r->index = i;
 
         /*
          * Clear the buffer before starting the copy, so if we fail to
          * handle a read error we will not write uninitialized data to
-- 
2.35.1
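[Editor's note: for reference, a sketch of the allocation pattern the
commit message describes. The sizes and names are illustrative, not
copied from copy-libev.c.]

    #include <stdlib.h>

    #define MAX_REQUESTS 16               /* illustrative */
    #define REQUEST_SIZE (1024 * 1024)    /* illustrative */

    struct request {
      char *data;
    };

    static struct request requests[MAX_REQUESTS];

    /* Buffers are zeroed once by calloc() and then reused for every
     * request, so even a mishandled read error could only ever leak
     * zeroes or data from an earlier read of the same image. */
    static void
    init_requests (void)
    {
      for (int i = 0; i < MAX_REQUESTS; i++) {
        requests[i].data = calloc (1, REQUEST_SIZE);
        if (requests[i].data == NULL)
          abort ();
      }
    }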
Nir Soffer
2022-Mar-06 20:27 UTC
[Libguestfs] [PATCH libnbd 3/3] copy: Do not initialize read buffer
nbdcopy checks pread errors now, so we will never leak uninitialized
data from the heap to the destination server.

Testing shows a 3-8% speedup when copying a real image.

On a laptop with 12 cores and 2 consumer NVMe drives:

$ qemu-nbd --read-only --persistent --shared 8 --cache none --aio native \
    --socket /tmp/src.sock --format raw fedora-35-data.raw &

$ qemu-nbd --persistent --shared 8 --cache none --aio native --discard unmap \
    --socket /tmp/dst.sock --format raw dst.raw &

$ hyperfine -p "sleep 5" "nbdcopy $SRC $DST" ".libs/nbdcopy $SRC $DST"
Benchmark 1: nbdcopy nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock
  Time (mean ± σ):      2.065 s ±  0.057 s    [User: 0.296 s, System: 1.414 s]
  Range (min … max):    2.000 s …  2.163 s    10 runs

Benchmark 2: .libs/nbdcopy nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock
  Time (mean ± σ):      1.911 s ±  0.050 s    [User: 0.059 s, System: 1.544 s]
  Range (min … max):    1.827 s …  1.980 s    10 runs

Summary
  '.libs/nbdcopy nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock' ran
    1.08 ± 0.04 times faster than 'nbdcopy nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock'

On a server with 80 cores and a fast NVMe drive:

$ hyperfine "./nbdcopy-init $SRC $DST" "./nbdcopy-no-init $SRC $DST"
Benchmark 1: ./nbdcopy-init nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock
  Time (mean ± σ):      2.652 s ±  0.033 s    [User: 0.345 s, System: 1.306 s]
  Range (min … max):    2.619 s …  2.709 s    10 runs

Benchmark 2: ./nbdcopy-no-init nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock
  Time (mean ± σ):      2.572 s ±  0.031 s    [User: 0.055 s, System: 1.409 s]
  Range (min … max):    2.537 s …  2.629 s    10 runs

Summary
  './nbdcopy-no-init nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock' ran
    1.03 ± 0.02 times faster than './nbdcopy-init nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock'

Signed-off-by: Nir Soffer <nsoffer at redhat.com>
---
 copy/nbd-ops.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/copy/nbd-ops.c b/copy/nbd-ops.c
index adfe4de5..1218d6d0 100644
--- a/copy/nbd-ops.c
+++ b/copy/nbd-ops.c
@@ -52,20 +52,21 @@ static void
 open_one_nbd_handle (struct rw_nbd *rwn)
 {
   struct nbd_handle *nbd;
 
   nbd = nbd_create ();
   if (nbd == NULL) {
     fprintf (stderr, "%s: %s\n", prog, nbd_get_error ());
     exit (EXIT_FAILURE);
   }
 
+  nbd_set_pread_initialize (nbd, false);
   nbd_set_debug (nbd, verbose);
 
   /* Set the handle name for debugging. We could use rwn->rw.name
    * here but it is usually set to the lengthy NBD URI
    * (eg. "nbd://localhost:10809") which makes debug messages very
    * long.
    */
   if (verbose) {
     char *name;
     const size_t index = rwn->handles.len;
-- 
2.35.1
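[Editor's note: one design point worth making explicit: the setting is
per handle and defaults to true, so every handle nbdcopy creates must
opt out individually, which is why the hunk above places the call right
after nbd_create. A caller can also query the current state; a small
sketch, assuming a valid handle nbd:]

    /* nbd_get_pread_initialize returns 1 when libnbd will zero the
     * buffer before each pread, 0 when the application opted out. */
    if (nbd_get_pread_initialize (nbd) == 1)
      nbd_set_pread_initialize (nbd, false);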