Eric Blake
2019-Oct-03 02:50 UTC
[Libguestfs] [nbdkit PATCH 0/4] More work with retry safety
I'm still working on another set of patches to have reopen call
.finalize/.prepare (so that another filter can safely appear between
retry and the plugin), but for tonight, these are the patches I think
are ready to go.

Eric Blake (4):
  retry: Handle can_fua and can_fast_zero changes
  tests: Test retry with different fua/fast-zero flags
  server: Close backends if a filter's .open fails
  server: Better documentation of .open ordering

 docs/nbdkit-filter.pod         |  17 +++--
 server/backend.c               |   5 +-
 server/connections.c           |   2 +-
 server/filters.c               |   8 ++-
 filters/retry/retry.c          |  20 ++++++
 tests/test-layers-filter.c     |   5 +-
 tests/test-layers.c            |  14 ++--
 tests/Makefile.am              |   2 +
 tests/test-retry-zero-flags.sh | 126 +++++++++++++++++++++++++++++++++
 9 files changed, 183 insertions(+), 16 deletions(-)
 create mode 100755 tests/test-retry-zero-flags.sh

-- 
2.21.0
Eric Blake
2019-Oct-03 02:50 UTC
[Libguestfs] [nbdkit PATCH 1/4] retry: Handle can_fua and can_fast_zero changes
Although it is less common for a plugin to change its mind on whether
FUA or fast zeroes are supported, the backend will assert if the
original connection advertised support to the client but a retry
passes a client request carrying that flag through to a reopened
plugin that no longer supports it. For FUA, act the same as can_write
and similar failures, forcing a retry to see if the support reappears
(rather than trying to fall back to a direct flush call). For fast
zero, fail fast without any retry.

As with other retry fixes, the test will be in the next patch.

Fixes: f0f0ec49
Signed-off-by: Eric Blake <eblake@redhat.com>
---
 filters/retry/retry.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/filters/retry/retry.c b/filters/retry/retry.c
index ad31552c..840d7383 100644
--- a/filters/retry/retry.c
+++ b/filters/retry/retry.c
@@ -230,6 +230,11 @@ retry_pwrite (struct nbdkit_next_ops *next_ops, void *nxdata,
     *err = EROFS;
     r = -1;
   }
+  else if (flags & NBDKIT_FLAG_FUA &&
+           next_ops->can_fua (nxdata) <= NBDKIT_FUA_NONE) {
+    *err = EIO;
+    r = -1;
+  }
   else
     r = next_ops->pwrite (nxdata, buf, count, offset, flags, err);
   if (r == -1 && do_retry (h, &data, next_ops, nxdata, err)) goto again;
@@ -257,6 +262,11 @@ retry_trim (struct nbdkit_next_ops *next_ops, void *nxdata,
     *err = EROFS;
     r = -1;
   }
+  else if (flags & NBDKIT_FLAG_FUA &&
+           next_ops->can_fua (nxdata) <= NBDKIT_FUA_NONE) {
+    *err = EIO;
+    r = -1;
+  }
   else
     r = next_ops->trim (nxdata, count, offset, flags, err);
   if (r == -1 && do_retry (h, &data, next_ops, nxdata, err)) goto again;
@@ -302,10 +312,20 @@ retry_zero (struct nbdkit_next_ops *next_ops, void *nxdata,
     *err = EROFS;
     return -1;
   }
+  if (flags & NBDKIT_FLAG_FAST_ZERO &&
+      next_ops->can_fast_zero (nxdata) != 1) {
+    *err = EOPNOTSUPP;
+    return -1;
+  }
   if (next_ops->can_zero (nxdata) <= NBDKIT_ZERO_NONE) {
     *err = EROFS;
     r = -1;
   }
+  else if (flags & NBDKIT_FLAG_FUA &&
+           next_ops->can_fua (nxdata) <= NBDKIT_FUA_NONE) {
+    *err = EIO;
+    r = -1;
+  }
   else
     r = next_ops->zero (nxdata, count, offset, flags, err);
   if (r == -1 && do_retry (h, &data, next_ops, nxdata, err)) goto again;
-- 
2.21.0
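Seen from the client side, the effect of this patch is that a fast-zero
request which lands on a reopened plugin lacking fast-zero support now
fails cleanly with ENOTSUP instead of tripping the backend assertion.
The following libnbd sketch is purely illustrative and not part of the
series: the socket path and request size are invented, it assumes the
original connection did advertise fast zero, and it needs a libnbd new
enough to define LIBNBD_CMD_FLAG_FAST_ZERO.

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>

#include <libnbd.h>

int
main (void)
{
  struct nbd_handle *nbd = nbd_create ();

  if (nbd == NULL ||
      nbd_connect_unix (nbd, "/tmp/nbdkit.sock") == -1) {
    fprintf (stderr, "%s\n", nbd_get_error ());
    exit (EXIT_FAILURE);
  }

  /* If the retry filter reopened a plugin that no longer supports
   * fast zero, this request is refused with ENOTSUP rather than
   * asserting inside nbdkit.
   */
  if (nbd_zero (nbd, 512, 0, LIBNBD_CMD_FLAG_FAST_ZERO) == -1 &&
      nbd_get_errno () == ENOTSUP)
    fprintf (stderr, "fast zero refused, falling back to a plain zero\n");

  nbd_close (nbd);
  exit (EXIT_SUCCESS);
}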
Eric Blake
2019-Oct-03 02:50 UTC
[Libguestfs] [nbdkit PATCH 2/4] tests: Test retry with different fua/fast-zero flags
Add coverage for the previous patch. Although most plugins don't
change can_fua or can_fast_zero on the fly, the code should still not
abort when calling .zero with a flag that was permitted by the
previous open but not permitted by the current open.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 tests/Makefile.am              |   2 +
 tests/test-retry-zero-flags.sh | 126 +++++++++++++++++++++++++++++++++
 2 files changed, 128 insertions(+)
 create mode 100755 tests/test-retry-zero-flags.sh

diff --git a/tests/Makefile.am b/tests/Makefile.am
index 60cba6c5..9d95d567 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -123,6 +123,7 @@ EXTRA_DIST = \
        test-retry-extents.sh \
        test-retry-readonly.sh \
        test-retry-reopen-fail.sh \
+       test-retry-zero-flags.sh \
        test-shutdown.sh \
        test-ssh.sh \
        test.tcl \
@@ -1062,6 +1063,7 @@ TESTS += \
        test-retry-readonly.sh \
        test-retry-extents.sh \
        test-retry-reopen-fail.sh \
+       test-retry-zero-flags.sh \
        $(NULL)
 
 # truncate filter tests.
diff --git a/tests/test-retry-zero-flags.sh b/tests/test-retry-zero-flags.sh
new file mode 100755
index 00000000..df2ea0c3
--- /dev/null
+++ b/tests/test-retry-zero-flags.sh
@@ -0,0 +1,126 @@
+#!/usr/bin/env bash
+# nbdkit
+# Copyright (C) 2018-2019 Red Hat Inc.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions are
+# met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+#
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in the
+# documentation and/or other materials provided with the distribution.
+#
+# * Neither the name of Red Hat nor the names of its contributors may be
+# used to endorse or promote products derived from this software without
+# specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY RED HAT AND CONTRIBUTORS ''AS IS'' AND
+# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+# THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
+# PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL RED HAT OR
+# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
+# USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+# ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
+# OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+# SUCH DAMAGE.
+
+source ./functions.sh
+set -e
+set -x
+
+requires nbdsh -c 'i = nbd.CMD_FLAG_FAST_ZERO
+exit(not h.supports_uri())'
+
+files="retry-zero-flags-count retry-zero-flags-open-count"
+rm -f $files
+cleanup_fn rm -f $files
+
+touch retry-zero-flags-count retry-zero-flags-open-count
+start_t=$SECONDS
+
+# Create a custom plugin which will test retrying.
+nbdkit -v -U - \
+       sh - \
+       --filter=retry retry-delay=1 \
+       --run 'nbdsh --uri $uri -c "
+h.zero (512, 0)
+try:
+    h.zero (512, 0,
+            nbd.CMD_FLAG_FUA | nbd.CMD_FLAG_NO_HOLE | nbd.CMD_FLAG_FAST_ZERO)
+except nbd.Error as ex:
+    assert ex.errno == \"ENOTSUP\"
+h.zero (512, 0, nbd.CMD_FLAG_FUA)
+       "' <<'EOF'
+#!/usr/bin/env bash
+case "$1" in
+    open)
+        # Count how many times the connection is (re-)opened.
+        read i < retry-zero-flags-open-count
+        echo $((i+1)) > retry-zero-flags-open-count
+        ;;
+    can_write | can_zero) exit 0 ;;
+    can_fua)
+        read i < retry-zero-flags-open-count
+        case $i in
+            1 | 4 | 5) echo native ;;
+            *) echo none ;;
+        esac
+        ;;
+    can_fast_zero)
+        read i < retry-zero-flags-open-count
+        case $i in
+            1 | 2 | 5) exit 0 ;;
+            *) exit 3 ;;
+        esac
+        ;;
+    zero)
+        # First zero fails, thereafter it works
+        read i < retry-zero-flags-count
+        ((i++))
+        echo $i > retry-zero-flags-count
+        if [ $i -le 1 ]; then
+            echo "EIO zero failed" >&2
+            exit 1
+        fi
+        ;;
+
+    get_size) echo 512 ;;
+    *) exit 2 ;;
+esac
+EOF
+
+# In this test we should see the following pattern:
+#   both fua and fast_zero supported
+#   first zero FAILS
+#   retry and wait 1 seconds
+#   only fast_zero supported
+#   first zero succeeds
+#   second zero FAILS due to missing fua support
+#   retry and wait 1 seconds
+#   neither fua nor fast_zero supported
+#   second zero FAILS fast due to missing fast_zero support
+#   third zero FAILS due to missing fua support
+#   retry and wait 1 seconds
+#   only fua supported
+#   third zero succeeds
+
+# The minimum time for the test should be 1+1+1 = 3 seconds.
+end_t=$SECONDS
+if [ $((end_t - start_t)) -lt 3 ]; then
+    echo "$0: test ran too quickly"
+    exit 1
+fi
+
+# Check the handle was opened 4 times (first open + one reopen for
+# each retry).
+read open_count < retry-zero-flags-open-count
+if [ $open_count -ne 4 ]; then
+    echo "$0: open-count ($open_count) != 4"
+    exit 1
+fi
-- 
2.21.0
Eric Blake
2019-Oct-03 02:50 UTC
[Libguestfs] [nbdkit PATCH 3/4] server: Close backends if a filter's .open fails
We already had code that ensured that a filter's .open cannot succeed
unless its backend is also open. However, we did not have the
converse: if a filter's .open fails, the backend handle could still be
open. As long as we don't use a connection after a filter's open
failure, the plugin stays open and the backend eventually gets closed
when the connection goes away. But now that we allow retries after
NBD_OPT_GO failure, a second attempt from the same connection could
run into a backend that is still open - fix it by closing the backend
right away, rather than delaying until connection close. (This one is
not easy to trigger without a custom client.)

Conversely, now that we have the retry filter, we can end up in the
situation where a filter is open, but the backend is not, when reopen
fails. That scenario becomes more common with the above change, since
a failed filter .open now leaves the plugin closed. Since our
connection code only called the close chain if the plugin was still
open, this can result in memory leaks, as shown by:

  $ NBDKIT_VALGRIND=1 ./nbdkit -U- --filter=log --filter=retry \
      null size=512 retries=1 retry-delay=1 \
      --run 'qemu-io -f raw $nbd -c "r 0 1"' logfile=/dev/stderr

Fix it by always running the close chain regardless of whether the
plugin is open.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 server/backend.c     | 5 ++++-
 server/connections.c | 2 +-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/server/backend.c b/server/backend.c
index 64267bfe..5cbbe2c1 100644
--- a/server/backend.c
+++ b/server/backend.c
@@ -186,8 +186,11 @@ backend_open (struct backend *b, struct connection *conn, int readonly)
     if (b->i) /* A filter must not succeed unless its backend did also */
       assert (conn->handles[b->i - 1].handle);
   }
-  else
+  else {
     assert (h->handle == NULL);
+    if (b->i) /* Do not strand backend if this layer failed */
+      backend_close (b->next, conn);
+  }
 
   return r;
 }
diff --git a/server/connections.c b/server/connections.c
index 27cf202b..df5e09af 100644
--- a/server/connections.c
+++ b/server/connections.c
@@ -360,7 +360,7 @@ free_connection (struct connection *conn)
    * thread will be in the process of unloading it.  The plugin.unload
    * callback should always be called.
    */
-  if (!quit && connection_get_handle (conn, 0)) {
+  if (!quit) {
     lock_request (conn);
     backend_close (backend, conn);
     unlock_request (conn);
-- 
2.21.0
Eric Blake
2019-Oct-03 02:50 UTC
[Libguestfs] [nbdkit PATCH 4/4] server: Better documentation of .open ordering
Because we pass a next() pointer to .open, the filter can choose
whether to call it first (inner-to-outer completion) or last
(outer-to-inner). However, all our filters currently call it first,
and this logically mirrors the outer-to-inner ordering used during
.close.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 docs/nbdkit-filter.pod     | 17 +++++++++++++----
 server/filters.c           |  8 ++++++--
 tests/test-layers-filter.c |  5 +++--
 tests/test-layers.c        | 14 ++++++++------
 4 files changed, 30 insertions(+), 14 deletions(-)

diff --git a/docs/nbdkit-filter.pod b/docs/nbdkit-filter.pod
index 00be376e..3c98d89a 100644
--- a/docs/nbdkit-filter.pod
+++ b/docs/nbdkit-filter.pod
@@ -310,14 +310,21 @@
 read-only backend.  However, it is also acceptable to attempt write
 access to the plugin even if this filter is readonly, such as when a
 file system mounted read-only still requires write access to the
 underlying device in case a journal needs to be replayed for
-consistency as part of the mounting process.
+consistency as part of the mounting process.  The filter should
+generally call C<next> as its first step, to allocate from the plugin
+outwards, so that C<.close> running from the outer filter to the
+plugin will be in reverse.
 
 =head2 C<.close>
 
  void (*close) (void *handle);
 
 This is called when the client closes the connection.  It should clean
-up any per-connection resources used by the filter.
+up any per-connection resources used by the filter.  It is called
+beginning with the outermost filter and ending with the plugin (the
+opposite order of C<.open> if all filters call C<next> first),
+although this order technically does not matter since the callback
+cannot report failures or access the underlying plugin.
 
 =head2 C<.prepare>
@@ -348,8 +355,10 @@ Unlike other filter methods, prepare and finalize are not chained
 through the C<next_ops> structure.  Instead the core nbdkit server
 calls the prepare and finalize methods of all filters.  Prepare
 methods are called starting with the filter closest to the plugin and
-proceeding outwards.  Finalize methods are called in the reverse order
-of prepare methods.
+proceeding outwards (matching the order of C<.open> if all filters
+call C<next> before doing anything locally).  Finalize methods are
+called in the reverse order of prepare methods, with the outermost
+filter first (and matching the order of C<.close>).
 
 If there is an error, both callbacks should call C<nbdkit_error> with
 an error message and return C<-1>.
diff --git a/server/filters.c b/server/filters.c
index 78e32bc5..37e8b51e 100644
--- a/server/filters.c
+++ b/server/filters.c
@@ -205,6 +205,9 @@ filter_open (struct backend *b, struct connection *conn, int readonly)
   struct b_conn nxdata = { .b = b->next, .conn = conn };
   void *handle;
 
+  /* Most filters will call next_open first, resulting in
+   * inner-to-outer ordering.
+   */
   if (f->filter.open) {
     handle = f->filter.open (next_open, &nxdata, readonly);
     if (handle == NULL)
@@ -225,6 +228,7 @@ filter_close (struct backend *b, struct connection *conn)
   struct backend_filter *f = container_of (b, struct backend_filter, backend);
   void *handle = connection_get_handle (conn, b->i);
 
+  /* outer-to-inner order, opposite .open */
   if (handle && f->filter.close)
     f->filter.close (handle);
   backend_close (b->next, conn);
@@ -409,7 +413,7 @@ filter_prepare (struct backend *b, struct connection *conn, int readonly)
   struct b_conn nxdata = { .b = b->next, .conn = conn };
 
   /* Call these in order starting from the filter closest to the
-   * plugin.
+   * plugin, similar to typical .open order.
    */
   if (backend_prepare (b->next, conn) == -1)
     return -1;
@@ -431,7 +435,7 @@ filter_finalize (struct backend *b, struct connection *conn)
   debug ("%s: finalize", b->name);
 
   /* Call these in reverse order to .prepare above, starting from the
-   * filter furthest away from the plugin.
+   * filter furthest away from the plugin, and matching .close order.
    */
   if (f->filter.finalize &&
       f->filter.finalize (&next_ops, &nxdata, handle) == -1)
diff --git a/tests/test-layers-filter.c b/tests/test-layers-filter.c
index cccfb654..d590cf95 100644
--- a/tests/test-layers-filter.c
+++ b/tests/test-layers-filter.c
@@ -79,11 +79,12 @@ test_layers_filter_open (nbdkit_next_open *next, void *nxdata, int readonly)
 {
   static int handle;
 
-  DEBUG_FUNCTION;
-
   if (next (nxdata, readonly) == -1)
     return NULL;
 
+  /* Debug after recursing, to show opposite order from .close */
+  DEBUG_FUNCTION;
+
   return &handle;
 }
diff --git a/tests/test-layers.c b/tests/test-layers.c
index 93b7770c..eac7f152 100644
--- a/tests/test-layers.c
+++ b/tests/test-layers.c
@@ -282,19 +282,20 @@ main (int argc, char *argv[])
      "test_layers_plugin_config_complete",
      NULL);
 
-  /* open methods called in order. */
+  /* open methods called in outer-to-inner order, but thanks to next
+   * pointer, complete in inner-to-outer order. */
   log_verify_seen_in_order
     ("testlayersfilter3: open readonly=0",
-     "filter3: test_layers_filter_open",
      "testlayersfilter2: open readonly=0",
-     "filter2: test_layers_filter_open",
      "testlayersfilter1: open readonly=0",
-     "filter1: test_layers_filter_open",
      "testlayersplugin: open readonly=0",
      "test_layers_plugin_open",
+     "filter1: test_layers_filter_open",
+     "filter2: test_layers_filter_open",
+     "filter3: test_layers_filter_open",
      NULL);
 
-  /* prepare methods called in order.
+  /* prepare methods called in inner-to-outer order.
    *
    * Note that prepare methods only exist for filters, and they must
    * be called from inner to outer (but finalize methods below are
@@ -595,7 +596,8 @@ main (int argc, char *argv[])
      "filter1: test_layers_filter_finalize",
      NULL);
 
-  /* close methods called in order */
+  /* close methods called outer-to-inner, which is reverse of completion
+   * of open */
   log_verify_seen_in_order
     ("filter3: test_layers_filter_close",
      "filter2: test_layers_filter_close",
-- 
2.21.0
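For filter authors, the convention documented above boils down to:
call next before doing any of the filter's own per-connection setup,
and in .close release only what the filter itself allocated. The
sketch below follows that pattern; it is illustrative only, and the
filter name, handle struct, and fields are invented rather than taken
from this series.

#include <stdlib.h>

#include <nbdkit-filter.h>

struct example_handle {
  int requests;               /* whatever per-connection state is needed */
};

/* Open the layers below us first, then allocate this filter's own
 * handle.  Because every filter recurses before allocating, opens
 * complete from the plugin outwards and .close (outermost first)
 * unwinds them in reverse.
 */
static void *
example_open (nbdkit_next_open *next, void *nxdata, int readonly)
{
  struct example_handle *h;

  if (next (nxdata, readonly) == -1)
    return NULL;

  h = calloc (1, sizeof *h);
  if (h == NULL) {
    nbdkit_error ("calloc: %m");
    return NULL;
  }
  return h;
}

static void
example_close (void *handle)
{
  /* Free only what this filter allocated; the layers below are
   * closed by the server after this returns.
   */
  free (handle);
}

static struct nbdkit_filter filter = {
  .name  = "example",
  .open  = example_open,
  .close = example_close,
};

NBDKIT_REGISTER_FILTER (filter)

Recursing first also means that if a lower layer fails to open, this
filter has not yet allocated anything, so there is nothing to unwind on
the error path; if the calloc fails after next has succeeded, the
layers below are left open, which is the case patch 3 makes the server
clean up instead of stranding the backend.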
Richard W.M. Jones
2019-Oct-03 07:46 UTC
Re: [Libguestfs] [nbdkit PATCH 2/4] tests: Test retry with different fua/fast-zero flags
On Wed, Oct 02, 2019 at 09:50:45PM -0500, Eric Blake wrote:
> +case "$1" in
> +    open)
> +        # Count how many times the connection is (re-)opened.
> +        read i < retry-zero-flags-open-count
> +        echo $((i+1)) > retry-zero-flags-open-count
> +        ;;
> +    can_write | can_zero) exit 0 ;;
> +    can_fua)
> +        read i < retry-zero-flags-open-count
> +        case $i in
> +            1 | 4 | 5) echo native ;;
> +            *) echo none ;;
> +        esac

Whitespace problem?

This answers the question I had in the first patch: how can it
possibly happen that can_fua changes on a retry. It seems unlikely a
real plugin would be written like this, but obviously it's possible :-(

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top
Richard W.M. Jones
2019-Oct-03 07:48 UTC
Re: [Libguestfs] [nbdkit PATCH 4/4] server: Better documentation of .open ordering
The series looks good so:

ACK series

Thanks,

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v
Richard W.M. Jones
2019-Oct-03 19:38 UTC
Re: [Libguestfs] [nbdkit PATCH 3/4] server: Close backends if a filter's .open fails
$ ./nbdkit -s memory 1M < fuzzing/testcase_dir/newstyle-cflags
NBDMAGICIHAVEOPTnbdkit: plugins.c:274: plugin_close: Assertion `connection_get_handle (conn, 0)' failed.
Aborted (core dumped)

git bisect implicates this patch:

2f80ce1209d5898cb9a567c0b29e7736ff4d03eb is the first bad commit

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html