Nir Soffer
2022-Mar-06 20:27 UTC
[Libguestfs] [PATCH libnbd 0/3] Disable pread buffer initialization
Disabling pread buffer initialization is actually measurable and gives a
small speedup in nbdcopy and some of the examples. Some cases are safer
than others; in the copy-libev example, we allocate all buffers with
calloc() and reuse them for the entire copy, so the initialization is
completely useless. In nbdcopy and the Go aio_copy example, we allocate
a new buffer using malloc(), so when working with a bad server that does
not return all data chunks in a structured reply, we can leak
uninitialized memory to the destination server. I think this issue
should be solved by libnbd; it should verify that the server returned
all the expected chunks and fail the request if not.

Nir Soffer (3):
  golang: examples: Do not initialize pread buffer
  examples: copy-libev: Do not initialize pread buffer
  copy: Do not initialize read buffer

 copy/nbd-ops.c                             | 1 +
 examples/copy-libev.c                      | 6 ++++++
 golang/examples/aio_copy/aio_copy.go       | 5 +++++
 golang/examples/simple_copy/simple_copy.go | 5 +++++
 4 files changed, 17 insertions(+)

--
2.35.1
Nir Soffer
2022-Mar-06 20:27 UTC
[Libguestfs] [PATCH libnbd 1/3] golang: examples: Do not initialize pread buffer
The aio_copy example checks errors properly, so it will not leak
uninitialized memory to the destination image. Testing shows a 5% speedup
when copying a real image.
$ qemu-nbd --read-only --persistent --shared 8 --cache none --aio native \
--socket /tmp/src.sock --format raw fedora-35-data.raw &
$ hyperfine -p "sleep 5" "./aio_copy-init $SRC >/dev/null" \
    "./aio_copy-no-init $SRC >/dev/null"
Benchmark 1: ./aio_copy-init nbd+unix:///?socket=/tmp/src.sock >/dev/null
  Time (mean ± σ):      1.452 s ±  0.027 s    [User: 0.330 s, System: 0.489 s]
  Range (min … max):    1.426 s …  1.506 s    10 runs

Benchmark 2: ./aio_copy-no-init nbd+unix:///?socket=/tmp/src.sock >/dev/null
  Time (mean ± σ):      1.378 s ±  0.009 s    [User: 0.202 s, System: 0.484 s]
  Range (min … max):    1.369 s …  1.399 s    10 runs

Summary
  './aio_copy-no-init nbd+unix:///?socket=/tmp/src.sock >/dev/null' ran
    1.05 ± 0.02 times faster than './aio_copy-init nbd+unix:///?socket=/tmp/src.sock >/dev/null'
Signed-off-by: Nir Soffer <nsoffer at redhat.com>
---
golang/examples/aio_copy/aio_copy.go | 5 +++++
golang/examples/simple_copy/simple_copy.go | 5 +++++
2 files changed, 10 insertions(+)
diff --git a/golang/examples/aio_copy/aio_copy.go
b/golang/examples/aio_copy/aio_copy.go
index bb20b478..89eac4df 100644
--- a/golang/examples/aio_copy/aio_copy.go
+++ b/golang/examples/aio_copy/aio_copy.go
@@ -84,20 +84,25 @@ func main() {
err = h.ConnectUri(flag.Arg(0))
if err != nil {
panic(err)
}
size, err := h.GetSize()
if err != nil {
panic(err)
}
+ err = h.SetPreadInitialize(false)
+ if err != nil {
+ panic(err)
+ }
+
var offset uint64
for offset < size || queue.Len() > 0 {
for offset < size && inflightRequests() < *requests {
length := *requestSize
if size-offset < uint64(length) {
length = uint(size - offset)
}
startRead(offset, length)
diff --git a/golang/examples/simple_copy/simple_copy.go
b/golang/examples/simple_copy/simple_copy.go
index e8fa1f76..2a2ed0ff 100644
--- a/golang/examples/simple_copy/simple_copy.go
+++ b/golang/examples/simple_copy/simple_copy.go
@@ -63,20 +63,25 @@ func main() {
err = h.ConnectUri(flag.Arg(0))
if err != nil {
panic(err)
}
size, err := h.GetSize()
if err != nil {
panic(err)
}
+ err = h.SetPreadInitialize(false)
+ if err != nil {
+ panic(err)
+ }
+
buf := make([]byte, *requestSize)
var offset uint64
for offset < size {
if size-offset < uint64(len(buf)) {
buf = buf[:size-offset]
}
err = h.Pread(buf, offset, nil)
if err != nil {
--
2.35.1
Nir Soffer
2022-Mar-06 20:27 UTC
[Libguestfs] [PATCH libnbd 2/3] examples: copy-libev: Do not initialize pread buffer
This safety feature is not needed since we handle pread errors, and even
if we did not, we use initialized buffers. Testing shows 10% speedup
when copying a real image.
$ qemu-nbd --read-only --persistent --shared 8 --cache none --aio native \
--socket /tmp/src.sock --format raw fedora-35-data.raw &
$ qemu-nbd --persistent --shared 8 --cache none --aio native --discard unmap \
--socket /tmp/dst.sock --format raw dst.raw &
$ hyperfine -p "sleep 5" "./copy-libev-init $SRC $DST" \
    "./copy-libev-no-init $SRC $DST"

Benchmark 1: ./copy-libev-init nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock
  Time (mean ± σ):      2.895 s ±  0.026 s    [User: 0.387 s, System: 1.645 s]
  Range (min … max):    2.853 s …  2.933 s    10 runs

Benchmark 2: ./copy-libev-no-init nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock
  Time (mean ± σ):      2.636 s ±  0.034 s    [User: 0.052 s, System: 1.628 s]
  Range (min … max):    2.557 s …  2.677 s    10 runs

Summary
  './copy-libev-no-init nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock' ran
    1.10 ± 0.02 times faster than './copy-libev-init nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock'
Signed-off-by: Nir Soffer <nsoffer at redhat.com>
---
examples/copy-libev.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/examples/copy-libev.c b/examples/copy-libev.c
index ad50b64c..f36939a7 100644
--- a/examples/copy-libev.c
+++ b/examples/copy-libev.c
@@ -647,20 +647,26 @@ main (int argc, char *argv[])
size = nbd_get_size (src.nbd);
if (size > nbd_get_size (dst.nbd))
FAIL ("Destination is not large enough\n");
/* Check destination server capabilities. */
dst.can_zero = nbd_can_zero (dst.nbd) > 0;
+ /* Disable pread buffer initialization. This is not needed since we
+ * handle pread errors, and even if we fail to handle errors, we use
+ * initialized buffers. */
+
+ nbd_set_pread_initialize (src.nbd, false);
+
/* Start the copy "loop". When request completes, it starts the
* next request, until entire image was copied. */
for (i = 0; i < MAX_REQUESTS; i++) {
struct request *r = &requests[i];
r->index = i;
/*
* Clear the buffer before starting the copy, so if we fail to
* handle a read error we will not write uninitialized data to
--
2.35.1
Nir Soffer
2022-Mar-06 20:27 UTC
[Libguestfs] [PATCH libnbd 3/3] copy: Do not initialize read buffer
nbdcopy checks pread errors now, so we will never leak uninitialized data
from the heap to the destination server. Testing shows a 3-8% speedup when
copying a real image.
On laptop with 12 cores and 2 consumer NVMe drives:
$ qemu-nbd --read-only --persistent --shared 8 --cache none --aio native \
--socket /tmp/src.sock --format raw fedora-35-data.raw &
$ qemu-nbd --persistent --shared 8 --cache none --aio native --discard unmap \
--socket /tmp/dst.sock --format raw dst.raw &
$ hyperfine -p "sleep 5" "nbdcopy $SRC $DST" ".libs/nbdcopy $SRC $DST"

Benchmark 1: nbdcopy nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock
  Time (mean ± σ):      2.065 s ±  0.057 s    [User: 0.296 s, System: 1.414 s]
  Range (min … max):    2.000 s …  2.163 s    10 runs

Benchmark 2: .libs/nbdcopy nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock
  Time (mean ± σ):      1.911 s ±  0.050 s    [User: 0.059 s, System: 1.544 s]
  Range (min … max):    1.827 s …  1.980 s    10 runs

Summary
  '.libs/nbdcopy nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock' ran
    1.08 ± 0.04 times faster than 'nbdcopy nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock'
On server with 80 cores and a fast NVMe drive:
$ hyperfine "./nbdcopy-init $SRC $DST" "./nbdcopy-no-init $SRC $DST"

Benchmark 1: ./nbdcopy-init nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock
  Time (mean ± σ):      2.652 s ±  0.033 s    [User: 0.345 s, System: 1.306 s]
  Range (min … max):    2.619 s …  2.709 s    10 runs

Benchmark 2: ./nbdcopy-no-init nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock
  Time (mean ± σ):      2.572 s ±  0.031 s    [User: 0.055 s, System: 1.409 s]
  Range (min … max):    2.537 s …  2.629 s    10 runs

Summary
  './nbdcopy-no-init nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock' ran
    1.03 ± 0.02 times faster than './nbdcopy-init nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock'
Signed-off-by: Nir Soffer <nsoffer at redhat.com>
---
copy/nbd-ops.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/copy/nbd-ops.c b/copy/nbd-ops.c
index adfe4de5..1218d6d0 100644
--- a/copy/nbd-ops.c
+++ b/copy/nbd-ops.c
@@ -52,20 +52,21 @@ static void
open_one_nbd_handle (struct rw_nbd *rwn)
{
struct nbd_handle *nbd;
nbd = nbd_create ();
if (nbd == NULL) {
fprintf (stderr, "%s: %s\n", prog, nbd_get_error ());
exit (EXIT_FAILURE);
}
+ nbd_set_pread_initialize (nbd, false);
nbd_set_debug (nbd, verbose);
/* Set the handle name for debugging. We could use rwn->rw.name
* here but it is usually set to the lengthy NBD URI
* (eg. "nbd://localhost:10809") which makes debug messages very
* long.
*/
if (verbose) {
char *name;
const size_t index = rwn->handles.len;
--
2.35.1