thr3ads.net - Libguestfs - [Libguestfs] [nbdkit PATCH 6/6] tests: Test retry after partial extents [Oct 2019]

If this information is useful, please help other people find it:
Share via:

Eric Blake

2019-Oct-01 03:17 UTC

[Libguestfs] [nbdkit PATCH v2 0/6] Improve retry filter

Includes a rework of the previously posted patch for --run
improvements (mostly with improved comments and commit message; I
decided that waiting for the captive nbdkit to exit was overkill), and
four new patches. The tests are intentionally separate, to allow
rearranging the order of the series to see the failures being fixed.

Eric Blake (6):
  server: Propagate unexpected nbdkit failure with --run
  tests: Enhance captive test
  retry: Check next_ops->can_FOO on retry
  tests: Test for retry-readonly behavior
  retry: Avoid assertion during retried extents
  tests: Test retry after partial extents

 server/captive.c             |  47 +++++++++++----
 filters/retry/retry.c        |  80 ++++++++++++++++++++++--
 tests/Makefile.am            |   4 ++
 tests/test-captive.sh        |  46 ++++++++++++--
 tests/test-retry-extents.sh  | 114 +++++++++++++++++++++++++++++++++++
 tests/test-retry-readonly.sh |  96 +++++++++++++++++++++++++++++
 6 files changed, 366 insertions(+), 21 deletions(-)
 create mode 100755 tests/test-retry-extents.sh
 create mode 100755 tests/test-retry-readonly.sh

-- 
2.21.0

Eric Blake

2019-Oct-01 03:17 UTC

head link

[Libguestfs] [nbdkit PATCH v2 1/6] server: Propagate unexpected nbdkit failure with --run

When running captive, we were blindly calling kill(pid) of the captive
nbdkit child process and ignoring failures, even if that process has
already exited unexpectedly (most likely, from an assertion failure).
Thankfully, because we did not wait on the process, we are guaranteed
that the nbdkit child process is either still running or in zombie
state, so no other process can recycle the pid yet.

Merely sending SIGTERM should be enough to cause a properly-working
nbdkit to exit with status 0.  We could make the code more complex to
sleep for a bit and follow up with SIGKILL, but it's not worth
blocking the code right now (and if we do exit before the child, the
child becomes an orphan process auto-reaped by init).

While at it, fix the fact that system() can fail with a value that is
not appropriate to hand to WIFEXITED() if the child was not even
spawned, but cannot fail with WIFSTOPPED.  Prefer EXIT_FAILURE over
hard-coded 1.  Use exit() instead of _exit() to allow I/O to flush and
any atexit handlers to run. Also, reflect death from signal to a
status > 128 rather than 1 (we could be even fancier and also re-raise
the signal so that we die from the same thing, but again, it's not
obvious we need that much work...).

Signed-off-by: Eric Blake <eblake at redhat.com>
---
 server/captive.c | 47 ++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 36 insertions(+), 11 deletions(-)

diff --git a/server/captive.c b/server/captive.c
index c4cec238..4bb738fc 100644
--- a/server/captive.c
+++ b/server/captive.c
@@ -54,7 +54,7 @@ run_command (void)
   FILE *fp;
   char *cmd = NULL;
   size_t len = 0;
-  int r;
+  int r, status;
   pid_t pid;

   if (!run)
@@ -149,22 +149,47 @@ run_command (void)

   if (pid > 0) {              /* Parent process is the run command. */
     r = system (cmd);
-    if (WIFEXITED (r))
+    if (r == -1) {
+      nbdkit_error ("failure to execute external command: %m");
+      r = EXIT_FAILURE;
+    }
+    else if (WIFEXITED (r))
       r = WEXITSTATUS (r);
-    else if (WIFSIGNALED (r)) {
+    else {
+      assert (WIFSIGNALED (r));
       fprintf (stderr, "%s: external command was killed by signal
%d\n",
                program_name, WTERMSIG (r));
-      r = 1;
-    }
-    else if (WIFSTOPPED (r)) {
-      fprintf (stderr, "%s: external command was stopped by signal
%d\n",
-               program_name, WSTOPSIG (r));
-      r = 1;
+      r = WTERMSIG (r) + 128;
     }

-    kill (pid, SIGTERM);        /* Kill captive nbdkit. */
+    switch (waitpid (pid, &status, WNOHANG)) {
+    case -1:
+      nbdkit_error ("waitpid: %m");
+      r = EXIT_FAILURE;
+      break;
+    case 0:
+      /* Captive nbdkit still running; kill it, but no need to wait
+       * for it, as the captive program's exit status is good enough
+       * (if the captive fails to exit after SIGTERM, we have a bigger
+       * bug to fix).
+       */
+      kill (pid, SIGTERM);
+      break;
+    default:
+      /* Captive nbdkit exited unexpectedly; update the exit status. */
+      if (WIFEXITED (status)) {
+        if (r == 0)
+          r = WEXITSTATUS (status);
+      }
+      else {
+        assert (WIFSIGNALED (status));
+        fprintf (stderr, "%s: nbdkit command was killed by signal
%d\n",
+                 program_name, WTERMSIG (status));
+        r = WTERMSIG (status) + 128;
+      }
+    }

-    _exit (r);
+    exit (r);
   }

   free (cmd);
-- 
2.21.0

Eric Blake

2019-Oct-01 03:17 UTC

head link

[Libguestfs] [nbdkit PATCH v2 2/6] tests: Enhance captive test

Test the just-fixed bug in --run failing to detect an nbdkit assertion
failure.

While at it, sleep less when we don't actually need to wait for the
socket to be opened.

Signed-off-by: Eric Blake <eblake at redhat.com>
---
 tests/test-captive.sh | 46 +++++++++++++++++++++++++++++++++++++++----
 1 file changed, 42 insertions(+), 4 deletions(-)

diff --git a/tests/test-captive.sh b/tests/test-captive.sh
index e89c387d..88c0d818 100755
--- a/tests/test-captive.sh
+++ b/tests/test-captive.sh
@@ -1,6 +1,6 @@
 #!/usr/bin/env bash
 # nbdkit
-# Copyright (C) 2014-2018 Red Hat Inc.
+# Copyright (C) 2014-2019 Red Hat Inc.
 #
 # Redistribution and use in source and binary forms, with or without
 # modification, are permitted provided that the following conditions are
@@ -36,13 +36,15 @@ set -x

 # Test nbdkit --run (captive nbdkit) option.

+fail=0
+
 sock=`mktemp -u`
-files="$sock captive.out"
+files="$sock captive.out captive.pid"
 rm -f $files
 cleanup_fn rm -f $files

 nbdkit -U $sock example1 --run '
-    sleep 5; echo nbd=$nbd; echo port=$port; echo socket=$unixsocket
+    sleep 1; echo nbd=$nbd; echo port=$port; echo socket=$unixsocket
   ' > captive.out

 # Check the output.
@@ -51,5 +53,41 @@ port socket=$sock" ]; then
     echo "$0: unexpected output"
     cat captive.out
-    exit 1
+    fail=1
 fi
+
+# Check that a failed --run process affects exit status
+status=0
+nbdkit -U - example1 --run 'exit 2' > captive.out || status=$?
+if test $status != 2; then
+    echo "$0: unexpected exit status $status"
+    fail=1
+fi
+if test -s captive.out; then
+    echo "$0: unexpected output"
+    cat captive.out
+    fail=1
+fi
+
+# Check that nbdkit death from unhandled signal affects exit status
+status=0
+nbdkit -U - -P captive.pid example1 --run '
+test ! -s captive.pid || sleep 1
+if test ! -s captive.pid; then
+  echo "no pidfile yet"
+  exit 10
+fi
+kill -s ABRT $(cat captive.pid) || exit 10
+sleep 1
+' > captive.out || status=$?
+if test $status != $(( 128 + $(kill -l ABRT) )); then
+    echo "$0: unexpected exit status $status"
+    fail=1
+fi
+if test -s captive.out; then
+    echo "$0: unexpected output"
+    cat captive.out
+    fail=1
+fi
+
+exit $fail
-- 
2.21.0

Eric Blake

2019-Oct-01 03:17 UTC

head link

[Libguestfs] [nbdkit PATCH 3/6] retry: Check next_ops->can_FOO on retry

After a retry, if the second connection has fewer permissions than the
first, but we blindly call next_ops->FOO, we end up triggering an
assertion failure in backend.c.  This is particularly noticeable when
the force_readonly flag is in effect, as that makes it much easier for
there to be fewer permissions than before.

Add testsuite coverage of pwrite to demonstrate.

Signed-off-by: Eric Blake <eblake at redhat.com>
---
 filters/retry/retry.c | 58 ++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 52 insertions(+), 6 deletions(-)

diff --git a/filters/retry/retry.c b/filters/retry/retry.c
index 59e7d4b8..00dbd091 100644
--- a/filters/retry/retry.c
+++ b/filters/retry/retry.c
@@ -106,6 +106,7 @@ retry_config (nbdkit_next_config *next, void *nxdata,

 struct retry_handle {
   int readonly;                 /* Save original readonly setting. */
+  unsigned reopens;
 };

 static void *
@@ -123,6 +124,7 @@ retry_open (nbdkit_next_open *next, void *nxdata, int
readonly)
   }

   h->readonly = readonly;
+  h->reopens = 0;

   return h;
 }
@@ -132,6 +134,7 @@ retry_close (void *handle)
 {
   struct retry_handle *h = handle;

+  nbdkit_debug ("reopens needed: %u", h->reopens);
   free (h);
 }

@@ -179,6 +182,7 @@ do_retry (struct retry_handle *h,
     data->delay *= 2;

   /* Reopen the connection. */
+  h->reopens++;
   if (next_ops->reopen (nxdata, h->readonly || force_readonly) == -1) {
     /* If the reopen fails we treat it the same way as a command
      * failing.
@@ -218,7 +222,16 @@ retry_pwrite (struct nbdkit_next_ops *next_ops, void
*nxdata,
   int r;

  again:
-  r = next_ops->pwrite (nxdata, buf, count, offset, flags, err);
+  if (h->reopens && force_readonly) {
+    *err = EROFS;
+    return -1;
+  }
+  if (next_ops->can_write (nxdata) != 1) {
+    *err = EROFS;
+    r = -1;
+  }
+  else
+    r = next_ops->pwrite (nxdata, buf, count, offset, flags, err);
   if (r == -1 && do_retry (h, &data, next_ops, nxdata, err)) goto
again;

   return r;
@@ -236,7 +249,16 @@ retry_trim (struct nbdkit_next_ops *next_ops, void *nxdata,
   int r;

  again:
-  r = next_ops->trim (nxdata, count, offset, flags, err);
+  if (h->reopens && force_readonly) {
+    *err = EROFS;
+    return -1;
+  }
+  if (next_ops->can_trim (nxdata) != 1) {
+    *err = EROFS;
+    r = -1;
+  }
+  else
+    r = next_ops->trim (nxdata, count, offset, flags, err);
   if (r == -1 && do_retry (h, &data, next_ops, nxdata, err)) goto
again;

   return r;
@@ -253,7 +275,12 @@ retry_flush (struct nbdkit_next_ops *next_ops, void
*nxdata,
   int r;

  again:
-  r = next_ops->flush (nxdata, flags, err);
+  if (next_ops->can_flush (nxdata) != 1) {
+    *err = EIO;
+    r = -1;
+  }
+  else
+    r = next_ops->flush (nxdata, flags, err);
   if (r == -1 && do_retry (h, &data, next_ops, nxdata, err)) goto
again;

   return r;
@@ -271,7 +298,16 @@ retry_zero (struct nbdkit_next_ops *next_ops, void *nxdata,
   int r;

  again:
-  r = next_ops->zero (nxdata, count, offset, flags, err);
+  if (h->reopens && force_readonly) {
+    *err = EROFS;
+    return -1;
+  }
+  if (next_ops->can_zero (nxdata) < NBDKIT_ZERO_EMULATE) {
+    *err = EROFS;
+    r = -1;
+  }
+  else
+    r = next_ops->zero (nxdata, count, offset, flags, err);
   if (r == -1 && do_retry (h, &data, next_ops, nxdata, err)) goto
again;

   return r;
@@ -289,7 +325,12 @@ retry_extents (struct nbdkit_next_ops *next_ops, void
*nxdata,
   int r;

  again:
-  r = next_ops->extents (nxdata, count, offset, flags, extents, err);
+  if (next_ops->can_extents (nxdata) != 1) {
+    *err = EIO;
+    r = -1;
+  }
+  else
+    r = next_ops->extents (nxdata, count, offset, flags, extents, err);
   if (r == -1 && do_retry (h, &data, next_ops, nxdata, err)) goto
again;

   return r;
@@ -307,7 +348,12 @@ retry_cache (struct nbdkit_next_ops *next_ops, void
*nxdata,
   int r;

  again:
-  r = next_ops->cache (nxdata, count, offset, flags, err);
+  if (next_ops->can_cache (nxdata) != 1) {
+    *err = EIO;
+    r = -1;
+  }
+  else
+    r = next_ops->cache (nxdata, count, offset, flags, err);
   if (r == -1 && do_retry (h, &data, next_ops, nxdata, err)) goto
again;

   return r;
-- 
2.21.0

Eric Blake

2019-Oct-01 03:17 UTC

head link

[Libguestfs] [nbdkit PATCH 4/6] tests: Test for retry-readonly behavior

Add testsuite coverage for the previous patch.  When the retry filter
forces a readonly reconnection, all subsequent attempts to read should
instantly fail without trying further reconnects.

Signed-off-by: Eric Blake <eblake at redhat.com>
---
 tests/Makefile.am            |  2 +
 tests/test-retry-readonly.sh | 96 ++++++++++++++++++++++++++++++++++++
 2 files changed, 98 insertions(+)
 create mode 100755 tests/test-retry-readonly.sh

diff --git a/tests/Makefile.am b/tests/Makefile.am
index 325ca84d..4cf0325e 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -120,6 +120,7 @@ EXTRA_DIST = \
 	test.rb \
 	test-readahead-copy.sh \
 	test-retry.sh \
+	test-retry-readonly.sh \
 	test-retry-reopen-fail.sh \
 	test-shutdown.sh \
 	test-ssh.sh \
@@ -1057,6 +1058,7 @@ test_readahead_LDADD = libtest.la $(LIBGUESTFS_LIBS)
 # retry filter test.
 TESTS += \
 	test-retry.sh \
+	test-retry-readonly.sh \
 	test-retry-reopen-fail.sh \
 	$(NULL)

diff --git a/tests/test-retry-readonly.sh b/tests/test-retry-readonly.sh
new file mode 100755
index 00000000..710c2e93
--- /dev/null
+++ b/tests/test-retry-readonly.sh
@@ -0,0 +1,96 @@
+#!/usr/bin/env bash
+# nbdkit
+# Copyright (C) 2018-2019 Red Hat Inc.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions are
+# met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+#
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in the
+# documentation and/or other materials provided with the distribution.
+#
+# * Neither the name of Red Hat nor the names of its contributors may be
+# used to endorse or promote products derived from this software without
+# specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY RED HAT AND CONTRIBUTORS ''AS
IS'' AND
+# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+# THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
+# PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL RED HAT OR
+# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
+# USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+# ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
+# OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+# SUCH DAMAGE.
+
+source ./functions.sh
+set -e
+set -x
+
+requires qemu-io --version
+
+files="retry-readonly-count retry-readonly-open-count"
+rm -f $files
+cleanup_fn rm -f $files
+
+touch retry-readonly-count retry-readonly-open-count
+start_t=$SECONDS
+
+# Create a custom plugin which will test retrying.
+nbdkit -v -U - \
+       sh - \
+       --filter=retry retry-delay=1 retry-readonly=yes \
+       --run 'qemu-io -f raw -c "w 0 512" -c "w 0 512"
$nbd || :' <<'EOF'
+#!/usr/bin/env bash
+case "$1" in
+    open)
+        # Count how many times the connection is (re-)opened.
+        read i < retry-readonly-open-count
+        echo $((i+1)) > retry-readonly-open-count
+        ;;
+    can_write) ;;
+    pwrite)
+        # Fail 3 times then succeed - however, force-readonly means we will
+        # never return here after the first failure.
+        read i < retry-readonly-count
+        ((i++))
+        echo $i > retry-readonly-count
+        if [ $i -le 3 ]; then
+            echo "EIO pwrite failed" >&2
+            exit 1
+        else
+            : drop data
+        fi
+        ;;
+
+    get_size) echo 512 ;;
+    *) exit 2 ;;
+esac
+EOF
+
+# In this test we should see 3 failures:
+# first pwrite FAILS
+# retry and wait 1 seconds
+# first pwrite FAILS immediately now that connection is readonly
+# second pwrite FAILS immediately
+
+# The minimum time for the test should be 1 second.
+end_t=$SECONDS
+if [ $((end_t - start_t)) -lt 1 ]; then
+    echo "$0: test ran too quickly"
+    exit 1
+fi
+
+# Check the handle was opened 2 times (first open + one reopen).
+read open_count < retry-readonly-open-count
+if [ $open_count -ne 2 ]; then
+    echo "$0: open-count ($open_count) != 2"
+    exit 1
+fi
-- 
2.21.0

Eric Blake

2019-Oct-01 03:17 UTC

head link

[Libguestfs] [nbdkit PATCH 5/6] retry: Avoid assertion during retried extents

If the plugin's .extents made progress before failing, retrying the
call with the original offset that differs from the offset expected by
extents will cause an assertion failure.  We have to perform each
retry iteration with a fresh extents object, and copy it on success.

The sh plugin could trigger this with an extents callback that
produces valid data followed by garbage, as will be shown in the next
patch.

Fixes: f0f0ec49
Signed-off-by: Eric Blake <eblake at redhat.com>
---
 filters/retry/retry.c | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/filters/retry/retry.c b/filters/retry/retry.c
index 00dbd091..9747b4eb 100644
--- a/filters/retry/retry.c
+++ b/filters/retry/retry.c
@@ -322,17 +322,39 @@ retry_extents (struct nbdkit_next_ops *next_ops, void
*nxdata,
 {
   struct retry_handle *h = handle;
   struct retry_data data = {0};
+  CLEANUP_EXTENTS_FREE struct nbdkit_extents *extents2 = NULL;
   int r;
+  size_t i;

  again:
   if (next_ops->can_extents (nxdata) != 1) {
     *err = EIO;
     r = -1;
   }
-  else
-    r = next_ops->extents (nxdata, count, offset, flags, extents, err);
+  else {
+    /* Each retry must begin with extents reset to the right beginning. */
+    nbdkit_extents_free (extents2);
+    extents2 = nbdkit_extents_new (offset, next_ops->get_size (nxdata));
+    if (extents2 == NULL) {
+      *err = errno;
+      return -1; /* Not worth a retry after ENOMEM. */
+    }
+    r = next_ops->extents (nxdata, count, offset, flags, extents2, err);
+  }
   if (r == -1 && do_retry (h, &data, next_ops, nxdata, err)) goto
again;

+  if (r == 0) {
+    /* Transfer the successful extents back to the caller. */
+    for (i = 0; i < nbdkit_extents_count (extents2); ++i) {
+      struct nbdkit_extent e = nbdkit_get_extent (extents2, i);
+
+      if (nbdkit_add_extent (extents, e.offset, e.length, e.type) == -1) {
+        *err = errno;
+        return -1;
+      }
+    }
+  }
+
   return r;
 }

-- 
2.21.0

Eric Blake

2019-Oct-01 03:17 UTC

head link

[Libguestfs] [nbdkit PATCH 6/6] tests: Test retry after partial extents

Add coverage for the previous patch.  Each retry of .extents must be
into a fresh extents object, to avoid mismatch between provided and
expected offsets.

Signed-off-by: Eric Blake <eblake at redhat.com>
---
 tests/Makefile.am           |   2 +
 tests/test-retry-extents.sh | 114 ++++++++++++++++++++++++++++++++++++
 2 files changed, 116 insertions(+)
 create mode 100755 tests/test-retry-extents.sh

diff --git a/tests/Makefile.am b/tests/Makefile.am
index 4cf0325e..60cba6c5 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -120,6 +120,7 @@ EXTRA_DIST = \
 	test.rb \
 	test-readahead-copy.sh \
 	test-retry.sh \
+	test-retry-extents.sh \
 	test-retry-readonly.sh \
 	test-retry-reopen-fail.sh \
 	test-shutdown.sh \
@@ -1059,6 +1060,7 @@ test_readahead_LDADD = libtest.la $(LIBGUESTFS_LIBS)
 TESTS += \
 	test-retry.sh \
 	test-retry-readonly.sh \
+	test-retry-extents.sh \
 	test-retry-reopen-fail.sh \
 	$(NULL)

diff --git a/tests/test-retry-extents.sh b/tests/test-retry-extents.sh
new file mode 100755
index 00000000..ffcef311
--- /dev/null
+++ b/tests/test-retry-extents.sh
@@ -0,0 +1,114 @@
+#!/usr/bin/env bash
+# nbdkit
+# Copyright (C) 2018-2019 Red Hat Inc.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions are
+# met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+#
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in the
+# documentation and/or other materials provided with the distribution.
+#
+# * Neither the name of Red Hat nor the names of its contributors may be
+# used to endorse or promote products derived from this software without
+# specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY RED HAT AND CONTRIBUTORS ''AS
IS'' AND
+# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+# THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
+# PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL RED HAT OR
+# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
+# USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+# ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
+# OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+# SUCH DAMAGE.
+
+source ./functions.sh
+set -e
+set -x
+
+requires nbdsh --base-allocation -c 'quit()'
+
+files="retry-extents-count retry-extents-open-count"
+rm -f $files
+cleanup_fn rm -f $files
+
+touch retry-extents-count retry-extents-open-count
+start_t=$SECONDS
+
+# Create a custom plugin which will test retrying.
+nbdkit -v -U - \
+       sh - \
+       --filter=retry retry-delay=1 \
+       --run 'nbdsh --base-allocation --uri $uri -c "
+entries = []
+def f (metacontext, offset, e, err):
+    global entries
+    assert err.value == 0
+    assert metacontext == nbd.CONTEXT_BASE_ALLOCATION
+    entries = e
+h.block_status (1024, 0, f)
+assert entries == [ 512, 0,
+                    512, 3]
+       "' <<'EOF'
+#!/usr/bin/env bash
+case "$1" in
+    open)
+        # Count how many times the connection is (re-)opened.
+        read i < retry-extents-open-count
+        echo $((i+1)) > retry-extents-open-count
+        ;;
+    extents)
+        # Fail in three different ways then succeed.
+        read i < retry-extents-count
+        ((i++))
+        echo $i > retry-extents-count
+        case $i in
+          1) echo "EIO pread failed" >&2; exit 1;;
+          2) echo "garbage"
+             exit 0;;
+          3) echo "0 512 0"
+             echo "garbage"
+             exit 0;;
+          *) echo "0 512 0"
+             echo "512 512 3"
+             exit 0;;
+        esac
+        ;;
+
+    can_extents) exit 0 ;;
+    get_size) echo 1024 ;;
+    *) exit 2 ;;
+esac
+EOF
+
+# In this test we should see 3 failures:
+# extents FAILS
+# retry and wait 1 seconds
+# extents FAILS
+# retry and wait 2 seconds
+# extents FAILS
+# retry and wait 4 seconds
+# extents succeeds
+
+# The minimum time for the test should be 1+2+4 = 7 seconds.
+end_t=$SECONDS
+if [ $((end_t - start_t)) -lt 7 ]; then
+    echo "$0: test ran too quickly"
+    exit 1
+fi
+
+# Check the handle was opened 4 times (first open + one reopen for
+# each retry).
+read open_count < retry-extents-open-count
+if [ $open_count -ne 4 ]; then
+    echo "$0: open-count ($open_count) != 4"
+    exit 1
+fi
-- 
2.21.0

Richard W.M. Jones

2019-Oct-01 16:24 UTC

head link

Re: [Libguestfs] [nbdkit PATCH v2 1/6] server: Propagate unexpected nbdkit failure with --run

On Mon, Sep 30, 2019 at 10:17:01PM -0500, Eric Blake
wrote:> Use exit() instead of _exit() to allow I/O to flush and
> any atexit handlers to run.
Ah ha, I think this may explain why I have seen disappearing
error messages ...

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v

Richard W.M. Jones

2019-Oct-01 16:31 UTC

head link

Re: [Libguestfs] [nbdkit PATCH 6/6] tests: Test retry after partial extents

Yes this series looks good, and also passes tests & valgrind, so:

ACK series

Thanks,
Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW

Eric Blake

2019-Oct-02 16:22 UTC

head link

Re: [Libguestfs] [nbdkit PATCH 3/6] retry: Check next_ops->can_FOO on retry

On 9/30/19 10:17 PM, Eric Blake wrote:> After a retry, if the second connection has fewer permissions than the
> first, but we blindly call next_ops->FOO, we end up triggering an
> assertion failure in backend.c.  This is particularly noticeable when
> the force_readonly flag is in effect, as that makes it much easier for
> there to be fewer permissions than before.
> 
> Add testsuite coverage of pwrite to demonstrate.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
>   filters/retry/retry.c | 58 ++++++++++++++++++++++++++++++++++++++-----
>   1 file changed, 52 insertions(+), 6 deletions(-)
> @@ -307,7 +348,12 @@ retry_cache (struct nbdkit_next_ops *next_ops, void
*nxdata,
>     int r;
> 
>    again:
> -  r = next_ops->cache (nxdata, count, offset, flags, err);
> +  if (next_ops->can_cache (nxdata) != 1) {
This actually needs to be <= NBDKIT_CACHE_NONE (otherwise, it penalizes 
a change between EMULATE and NATIVE).

I'm also suspecting that I'll have to check can_fua and can_fast_zero 
for proper usage, but that will be a followup patch.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Seemingly Similar Threads

Search for more possibly parallel threads

Libguestfs - Oct 2019 - [nbdkit PATCH 6/6] tests: Test retry after partial extents

[Libguestfs] [nbdkit PATCH v2 0/6] Improve retry filter

[Libguestfs] [nbdkit PATCH v2 1/6] server: Propagate unexpected nbdkit failure with --run

[Libguestfs] [nbdkit PATCH v2 2/6] tests: Enhance captive test

[Libguestfs] [nbdkit PATCH 3/6] retry: Check next_ops->can_FOO on retry

[Libguestfs] [nbdkit PATCH 4/6] tests: Test for retry-readonly behavior

[Libguestfs] [nbdkit PATCH 5/6] retry: Avoid assertion during retried extents

[Libguestfs] [nbdkit PATCH 6/6] tests: Test retry after partial extents

Re: [Libguestfs] [nbdkit PATCH v2 1/6] server: Propagate unexpected nbdkit failure with --run

Re: [Libguestfs] [nbdkit PATCH 6/6] tests: Test retry after partial extents

Re: [Libguestfs] [nbdkit PATCH 3/6] retry: Check next_ops->can_FOO on retry

Seemingly Similar Threads