Eric Blake
2022-Jun-10 15:55 UTC
[Libguestfs] [nbdkit PATCH 0/3] Fix some NBDKIT_EMULATE_ bugs
Rich pointed out an assertion failure in the luks filter caused by a bug in backend.c's handling of NBDKIT_EMULATE_ZERO for .can_zero; while fixing it, I found a different bug in NBDKIT_EMULATE_CACHE. Eric Blake (3): server: Fix NBDKIT_ZERO_EMULATE from filters server: Fix NBDKIT_CACHE_EMULATE tests: Add regression test for NBDKIT_EMULATE_CACHE fix docs/nbdkit-filter.pod | 8 ++-- docs/nbdkit-plugin.pod | 3 +- tests/Makefile.am | 2 + server/backend.c | 38 +++++++++++++++++- filters/nozero/nozero.c | 39 ++---------------- tests/test-eval-cache.sh | 85 ++++++++++++++++++++++++++++++++++++++++ tests/test-nozero.sh | 78 ++++++++++++++++-------------------- 7 files changed, 167 insertions(+), 86 deletions(-) create mode 100755 tests/test-eval-cache.sh -- 2.36.1
Eric Blake
2022-Jun-10 15:55 UTC
[Libguestfs] [nbdkit PATCH 1/3] server: Fix NBDKIT_ZERO_EMULATE from filters
When we turned the result of .can_zero into a tri-state for filters back in v1.16, the intention was that the backend would emulate by calling into .pwrite instead of .zero. The nozero filter already had an implementation of that algorithm, but it needs to live in backend.c to be used by all filters, rather than repeatedly recoded in each filter that wants .zero support by .pwrite emulation. Discovered because the luks filter asked for .pwrite emulation in .can_zero without providing a .zero override, resulting in: $ qemu-img create -f luks --object secret,data=LETMEPASS,id=sec0 -o key-secret=sec0 encrypted.img 100M Formatting 'encrypted.img', fmt=luks size=104857600 key-secret=sec0 $ rm -f data.img $ truncate -s 100M data.img $ nbdkit file encrypted.img --filter=luks passphrase=LETMEPASS --run 'nbdcopy data.img $nbd' nbdkit: backend.c:718: backend_zero: Assertion `c->can_zero > NBDKIT_ZERO_NONE' failed. write at offset 0 failed: Transport endpoint is not connected nbdkit: nbdkit command was killed by signal 6 As a result of moving it into the backend, the nozero filter is simplified, but the corresponding test has to change. Since --filter=log does not alter .can_zero, the emulation actually occurs earlier in the stack (prior to calling into any filter, regardless of whether log is placed before or after nozero, rather than at the point of the nozero filter), so there is now no discernable difference between sock3 and sock4 (we can eliminate the duplicate), and the old sock5a no longer shows a Zero request. [Historically, when the nozero filter was first written, we were relying on qemu to send packets, and the use of the log filter was essential to ensure that we were getting the desired NBD_CMD_WRITE_ZEROES across various versions of qemu; but now that the test uses nbdkit, we know that h.zero() is doing exactly that, so the reduced length of the testsuite is no longer risky.] Reported-by: Richard W.M. Jones <rjones at redhat.com> Fixes: fd361554 ("filters: Cache and change semantics of can_zero", v1.15.1) --- docs/nbdkit-filter.pod | 4 +-- server/backend.c | 37 ++++++++++++++++++- filters/nozero/nozero.c | 39 +++------------------ tests/test-nozero.sh | 78 +++++++++++++++++------------------------ 4 files changed, 75 insertions(+), 83 deletions(-) diff --git a/docs/nbdkit-filter.pod b/docs/nbdkit-filter.pod index 4ce5bda0..c4ec1004 100644 --- a/docs/nbdkit-filter.pod +++ b/docs/nbdkit-filter.pod @@ -898,8 +898,8 @@ This intercepts the plugin C<.zero> method and can be used to modify zero requests. This function will not be called if C<.can_zero> returned -C<NBDKIT_ZERO_NONE>; in turn, the filter should not call -C<next-E<gt>zero> if C<next-E<gt>can_zero> returned +C<NBDKIT_ZERO_NONE> or C<NBDKIT_ZERO_EMULATE>; in turn, the filter +should not call C<next-E<gt>zero> if C<next-E<gt>can_zero> returned C<NBDKIT_ZERO_NONE>. On input, the parameter C<flags> may include C<NBDKIT_FLAG_MAY_TRIM> diff --git a/server/backend.c b/server/backend.c index 7d46b99d..30ec4c24 100644 --- a/server/backend.c +++ b/server/backend.c @@ -728,7 +728,42 @@ backend_zero (struct context *c, b->name, count, offset, !!(flags & NBDKIT_FLAG_MAY_TRIM), fua, fast); - r = b->zero (c, count, offset, flags, err); + if (c->can_zero == NBDKIT_ZERO_NATIVE) + r = b->zero (c, count, offset, flags, err); + else { /* NBDKIT_ZERO_EMULATE */ + int writeflags = 0; + bool need_flush = false; + + if (fast) { + *err = ENOTSUP; + return -1; + } + + if (fua) { + if (c->can_fua == NBDKIT_FUA_EMULATE) + need_flush = true; + else + writeflags = NBDKIT_FLAG_FUA; + } + + while (count) { + /* Always contains zeroes, but we can't use const or else gcc 9 + * will use .rodata instead of .bss and inflate the binary size. + */ + static /* const */ char buffer[MAX_REQUEST_SIZE]; + uint32_t limit = MIN (count, MAX_REQUEST_SIZE); + + if (limit == count && need_flush) + writeflags = NBDKIT_FLAG_FUA; + + if (backend_pwrite (c, buffer, limit, offset, writeflags, err) == -1) + return -1; + offset += limit; + count -= limit; + } + r = 0; + } + if (r == -1) { assert (*err); if (!fast) diff --git a/filters/nozero/nozero.c b/filters/nozero/nozero.c index 7c5d67c4..5c04975b 100644 --- a/filters/nozero/nozero.c +++ b/filters/nozero/nozero.c @@ -1,5 +1,5 @@ /* nbdkit - * Copyright (C) 2018-2020 Red Hat Inc. + * Copyright (C) 2018-2022 Red Hat Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are @@ -49,8 +49,6 @@ #undef IGNORE #endif -#define MAX_WRITE (64 * 1024 * 1024) - static enum ZeroMode { NONE, EMULATE, @@ -159,14 +157,10 @@ nozero_zero (nbdkit_next *next, void *handle, uint32_t count, uint64_t offs, uint32_t flags, int *err) { - int writeflags = 0; - bool need_flush = false; - - assert (zeromode != NONE); + assert (zeromode != NONE && zeromode != EMULATE); if (flags & NBDKIT_FLAG_FAST_ZERO) { assert (fastzeromode != NOFAST); - if (fastzeromode == SLOW || - (fastzeromode == DEFAULT && zeromode == EMULATE)) { + if (fastzeromode == SLOW) { *err = ENOTSUP; return -1; } @@ -177,32 +171,7 @@ nozero_zero (nbdkit_next *next, if (zeromode == NOTRIM) flags &= ~NBDKIT_FLAG_MAY_TRIM; - if (zeromode != EMULATE) - return next->zero (next, count, offs, flags, err); - - if (flags & NBDKIT_FLAG_FUA) { - if (next->can_fua (next) == NBDKIT_FUA_EMULATE) - need_flush = true; - else - writeflags = NBDKIT_FLAG_FUA; - } - - while (count) { - /* Always contains zeroes, but we can't use const or else gcc 9 - * will use .rodata instead of .bss and inflate the binary size. - */ - static /* const */ char buffer[MAX_WRITE]; - uint32_t size = MIN (count, MAX_WRITE); - - if (size == count && need_flush) - writeflags = NBDKIT_FLAG_FUA; - - if (next->pwrite (next, buffer, size, offs, writeflags, err) == -1) - return -1; - offs += size; - count -= size; - } - return 0; + return next->zero (next, count, offs, flags, err); } static struct nbdkit_filter filter = { diff --git a/tests/test-nozero.sh b/tests/test-nozero.sh index 61911a42..5940ad33 100755 --- a/tests/test-nozero.sh +++ b/tests/test-nozero.sh @@ -1,6 +1,6 @@ #!/usr/bin/env bash # nbdkit -# Copyright (C) 2018-2020 Red Hat Inc. +# Copyright (C) 2018-2022 Red Hat Inc. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are @@ -36,17 +36,15 @@ set -x sock2=$(mktemp -u /tmp/nbdkit-test-sock.XXXXXX) sock3=$(mktemp -u /tmp/nbdkit-test-sock.XXXXXX) -sock4=$(mktemp -u /tmp/nbdkit-test-sock.XXXXXX) -sock5a=$(mktemp -u /tmp/nbdkit-test-sock.XXXXXX) -sock5b=$(mktemp -u /tmp/nbdkit-test-sock.XXXXXX) -sock6=$(mktemp -u /tmp/nbdkit-test-sock.XXXXXX) +sock4a=$(mktemp -u /tmp/nbdkit-test-sock.XXXXXX) +sock4b=$(mktemp -u /tmp/nbdkit-test-sock.XXXXXX) +sock5=$(mktemp -u /tmp/nbdkit-test-sock.XXXXXX) files="nozero1.img nozero1.log nozero2.img nozero2.log $sock2 nozero2.pid nozero3.img nozero3.log $sock3 nozero3.pid - nozero4.img nozero4.log $sock4 nozero4.pid - nozero5.img nozero5a.log nozero5b.log $sock5a $sock5b - nozero5a.pid nozero5b.pid - nozero6.img nozero6.log $sock6 nozero6.pid" + nozero4.img nozero4a.log nozero4b.log $sock4a $sock4b + nozero4a.pid nozero4b.pid + nozero5.img nozero5.log $sock5 nozero5.pid" rm -f $files fail=0 @@ -60,14 +58,12 @@ cleanup () cat nozero2.log || : echo "Log 3 file contents:" cat nozero3.log || : - echo "Log 4 file contents:" - cat nozero4.log || : - echo "Log 5a file contents:" - cat nozero5a.log || : - echo "Log 5b file contents:" - cat nozero5b.log || : - echo "Log 6 file contents:" - cat nozero6.log || : + echo "Log 4a file contents:" + cat nozero4a.log || : + echo "Log 4b file contents:" + cat nozero4b.log || : + echo "Log 5 file contents:" + cat nozero5.log || : rm -f $files } cleanup_fn cleanup @@ -79,10 +75,9 @@ cp nozero1.img nozero2.img cp nozero1.img nozero3.img cp nozero1.img nozero4.img cp nozero1.img nozero5.img -cp nozero1.img nozero6.img # Debug number of blocks and block size in the images. -for f in {1..6}; do +for f in {1..5}; do stat -c "%n: %b allocated blocks of size %B bytes, total size %s" \ nozero$f.img sizes[$f]=$(stat -c %b nozero$f.img) @@ -99,24 +94,21 @@ fi # Run several parallel nbdkit; to compare the logs and see what changes. # 1: unfiltered (above), check that nbdsh sends ZERO request and plugin trims # 2: log before filter with zeromode=none (default), to ensure no ZERO request -# 3: log before filter with zeromode=emulate, to ensure ZERO from client -# 4: log after filter with zeromode=emulate, to ensure no ZERO to plugin -# 5a/b: both sides of nbd plugin: even though server side does not advertise +# 3: log after filter with zeromode=emulate, to ensure no ZERO to plugin +# 4a/b: both sides of nbd plugin: even though server side does not advertise # ZERO, the client side still exposes it, and just skips calling nbd's .zero -# 6: log after filter with zeromode=notrim, to ensure plugin does not trim +# 5: log after filter with zeromode=notrim, to ensure plugin does not trim start_nbdkit -P nozero2.pid -U $sock2 --filter=log --filter=nozero \ file logfile=nozero2.log nozero2.img -start_nbdkit -P nozero3.pid -U $sock3 --filter=log --filter=nozero \ +start_nbdkit -P nozero3.pid -U $sock3 --filter=nozero --filter=log \ file logfile=nozero3.log nozero3.img zeromode=emulate -start_nbdkit -P nozero4.pid -U $sock4 --filter=nozero --filter=log \ - file logfile=nozero4.log nozero4.img zeromode=emulate -# Start 5b before 5a so that cleanup visits the client before the server -start_nbdkit -P nozero5b.pid -U $sock5b --filter=log \ - nbd logfile=nozero5b.log socket=$sock5a -start_nbdkit -P nozero5a.pid -U $sock5a --filter=log --filter=nozero \ - file logfile=nozero5a.log nozero5.img -start_nbdkit -P nozero6.pid -U $sock6 --filter=nozero --filter=log \ - file logfile=nozero6.log nozero6.img zeromode=notrim +# Start 4b before 4a so that cleanup visits the client before the server +start_nbdkit -P nozero4b.pid -U $sock4b --filter=log \ + nbd logfile=nozero4b.log socket=$sock4a +start_nbdkit -P nozero4a.pid -U $sock4a --filter=log --filter=nozero \ + file logfile=nozero4a.log nozero4.img +start_nbdkit -P nozero5.pid -U $sock5 --filter=nozero --filter=log \ + file logfile=nozero5.log nozero5.img zeromode=notrim # Perform the zero write. nbdsh -u "nbd+unix://?socket=$sock2" -c ' @@ -124,38 +116,34 @@ assert not h.can_zero() h.pwrite (bytearray(1024*1024), 0) ' nbdsh -u "nbd+unix://?socket=$sock3" -c 'h.zero(1024*1024, 0)' -nbdsh -u "nbd+unix://?socket=$sock4" -c 'h.zero(1024*1024, 0)' -nbdsh -u "nbd+unix://?socket=$sock5b" -c 'h.zero(1024*1024, 0)' -nbdsh -u "nbd+unix://?socket=$sock6" -c 'h.zero(1024*1024, 0)' +nbdsh -u "nbd+unix://?socket=$sock4b" -c 'h.zero(1024*1024, 0)' +nbdsh -u "nbd+unix://?socket=$sock5" -c 'h.zero(1024*1024, 0)' # Check for expected ZERO vs. WRITE results -grep 'connection=1 Zero' nozero1.log +grep 'connection=1 Zero' nozero1.log || fail=1 if grep 'connection=1 Zero' nozero2.log; then echo "filter should have prevented zero" fail=1 fi -grep 'connection=1 Zero' nozero3.log -if grep 'connection=1 Zero' nozero4.log; then +if grep 'connection=1 Zero' nozero3.log; then echo "filter should have converted zero into write" fail=1 fi -grep 'connection=1 Zero' nozero5b.log -if grep 'connection=1 Zero' nozero5a.log; then +if grep 'connection=1 Zero' nozero4a.log; then echo "nbdkit should have converted zero into write before nbd plugin" fail=1 fi -grep 'connection=1 Zero' nozero6.log +grep 'connection=1 Zero' nozero5.log # Sanity check on contents - all 5 files should read identically cmp nozero1.img nozero2.img cmp nozero2.img nozero3.img cmp nozero3.img nozero4.img cmp nozero4.img nozero5.img -cmp nozero5.img nozero6.img -# Sanity check on sparseness: images 2-6 should not be sparse (although the +# Sanity check on sparseness: images 2-5 should not be sparse (although the # filesystem may have reserved additional space due to our writes) -for i in {2..6}; do +for i in {2..5}; do if test "$(stat -c %b nozero$i.img)" -lt "${sizes[$i]}"; then echo "nozero$i.img was trimmed by mistake" fail=1 -- 2.36.1
Eric Blake
2022-Jun-10 15:55 UTC
[Libguestfs] [nbdkit PATCH 2/3] server: Fix NBDKIT_CACHE_EMULATE
When support for NBD_CMD_CACHE was first introduced, I coded up a generic fallback under NBDKIT_CACHE_EMULATE that defers to .pread. But due to a bug, the fallback repeatedly re-reads the first 64M of the request, rather than actually advancing over the entire cache request. Sadly, I didn't even notice the bug when moving the NBDKIT_CACHE_EMULATE code to the backend for use by all filters and plugins, in 14eb056c (v1.15.1). The next patch will add testsuite coverage. Fixes: fdf06817 ("server: Internal hooks for implementing NBD_CMD_CACHE", v1.13.4) --- docs/nbdkit-filter.pod | 4 ++-- docs/nbdkit-plugin.pod | 3 ++- server/backend.c | 1 + 3 files changed, 5 insertions(+), 3 deletions(-) diff --git a/docs/nbdkit-filter.pod b/docs/nbdkit-filter.pod index c4ec1004..f3d83b6e 100644 --- a/docs/nbdkit-filter.pod +++ b/docs/nbdkit-filter.pod @@ -1054,8 +1054,8 @@ cache requests. This function will not be called if C<.can_cache> returned C<NBDKIT_CACHE_NONE> or C<NBDKIT_CACHE_EMULATE>; in turn, the filter -should not call C<next-E<gt>cache> unless -C<next-E<gt>can_cache> returned C<NBDKIT_CACHE_NATIVE>. +should not call C<next-E<gt>cache> if +C<next-E<gt>can_cache> returned C<NBDKIT_CACHE_NONE>. The parameter C<flags> exists in case of future NBD protocol extensions; at this time, it will be 0 on input, and the filter should diff --git a/docs/nbdkit-plugin.pod b/docs/nbdkit-plugin.pod index 383dcffe..6a1d5ea5 100644 --- a/docs/nbdkit-plugin.pod +++ b/docs/nbdkit-plugin.pod @@ -1101,7 +1101,8 @@ result. =item C<NBDKIT_CACHE_NATIVE> -The C<.cache> callback will be called. +Cache support is advertised to the client. The C<.cache> callback will +be called if it exists, otherwise all cache requests instantly succeed. =back diff --git a/server/backend.c b/server/backend.c index 30ec4c24..3bbba601 100644 --- a/server/backend.c +++ b/server/backend.c @@ -828,6 +828,7 @@ backend_cache (struct context *c, if (backend_pread (c, buf, limit, offset, flags, err) == -1) return -1; count -= limit; + offset += limit; } return 0; } -- 2.36.1
Eric Blake
2022-Jun-10 15:55 UTC
[Libguestfs] [nbdkit PATCH 3/3] tests: Add regression test for NBDKIT_EMULATE_CACHE fix
Add testsuite coverage for the bug fixed in the previous patch (done separately, to make it easier to prove the test fails without the patch). --- tests/Makefile.am | 2 + tests/test-eval-cache.sh | 85 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 87 insertions(+) create mode 100755 tests/test-eval-cache.sh diff --git a/tests/Makefile.am b/tests/Makefile.am index 67e7618b..2ed21dc2 100644 --- a/tests/Makefile.am +++ b/tests/Makefile.am @@ -712,11 +712,13 @@ TESTS += \ test-eval.sh \ test-eval-file.sh \ test-eval-exports.sh \ + test-eval-cache.sh \ $(NULL) EXTRA_DIST += \ test-eval.sh \ test-eval-file.sh \ test-eval-exports.sh \ + test-eval-cache.sh \ $(NULL) # file plugin test. diff --git a/tests/test-eval-cache.sh b/tests/test-eval-cache.sh new file mode 100755 index 00000000..f523e03c --- /dev/null +++ b/tests/test-eval-cache.sh @@ -0,0 +1,85 @@ +#!/usr/bin/env bash +# nbdkit +# Copyright (C) 2022 Red Hat Inc. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions are +# met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# +# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in the +# documentation and/or other materials provided with the distribution. +# +# * Neither the name of Red Hat nor the names of its contributors may be +# used to endorse or promote products derived from this software without +# specific prior written permission. +# +# THIS SOFTWARE IS PROVIDED BY RED HAT AND CONTRIBUTORS ''AS IS'' AND +# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, +# THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A +# PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL RED HAT OR +# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF +# USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND +# ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT +# OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +# SUCH DAMAGE. + +# Check that NBDKIT_CACHE_EMULATE works on large requests. + +source ./functions.sh +set -e +set -x + +requires_plugin eval +requires_nbdsh_uri + +files="eval-cache.witness eval-cache.cache" +rm -f $files +cleanup_fn rm -f $files +fail=0 + +# This plugin requests nbdkit to emulate caching with pread. When the witness +# file exists, cache reads; when absent, reads fail if not already cached. +export witness="$PWD/eval-cache.witness" +export cache="$PWD/eval-cache.cache" +export script=' +import os +import errno + +witness = os.getenv("witness") + +def touch(path): + open(path, "a").close() + +# Test that uncached read fails +try: + h.pread(1024 * 1024, 0) +except nbd.Error as ex: + assert ex.errnum == errno.EIO + +# Cache the entire image; nbdkit should break it into 64k preads +touch(witness) +h.cache(h.get_size(), 0) +os.unlink(witness) + +# Now read should succeed +buf = h.pread(64 * 1024 * 1024, 64 * 1024 * 1024) +if hasattr(buf, "is_zero"): + assert buf.is_zero() +' +nbdkit -U - -v eval \ + get_size='echo 128M' can_cache='echo emulate' open='touch "$cache"' \ + pread=' + if test -f "$witness"; then + echo "$3 $4" >> "$cache" + elif ! grep -q "^$3 $4$" "$cache"; then + echo EIO >&2; exit 1 + fi + dd if=/dev/zero count=$3 iflag=count_bytes + ' --run 'nbdsh -u "$uri" -c "$script"' -- 2.36.1