Eric Blake
2019-Jun-29 13:28 UTC
[Libguestfs] [libnbd PATCH 0/6] new APIs: aio_in_flight, aio_FOO_notify
I still need to wire in the use of *_notify functions into nbdkit to prove whether it makes the code any faster or easier to maintain, but at least the added example shows one good use case for the new API. Eric Blake (6): api: Add nbd_aio_in_flight generator: Allow DEAD state actions to run generator: Allow Int64 in callbacks states: Prepare for aio notify callback api: Add new nbd_aio_FOO_notify functions examples: New example for strict read validations .gitignore | 1 + docs/libnbd.pod | 22 +- examples/Makefile.am | 14 + examples/batched-read-write.c | 17 +- examples/strict-structured-reads.c | 270 +++++++++++++ generator/generator | 375 ++++++++++++++++-- generator/states-connect.c | 24 +- generator/states-issue-command.c | 4 +- generator/states-magic.c | 6 +- generator/states-newstyle-opt-export-name.c | 8 +- generator/states-newstyle-opt-go.c | 20 +- .../states-newstyle-opt-set-meta-context.c | 24 +- generator/states-newstyle-opt-starttls.c | 18 +- .../states-newstyle-opt-structured-reply.c | 10 +- generator/states-newstyle.c | 6 +- generator/states-oldstyle.c | 4 +- generator/states-reply-simple.c | 2 +- generator/states-reply-structured.c | 58 +-- generator/states-reply.c | 16 +- generator/states.c | 33 +- lib/aio.c | 9 + lib/internal.h | 6 +- lib/rw.c | 105 ++++- tests/aio-parallel-load.c | 29 +- tests/aio-parallel.c | 15 +- tests/server-death.c | 17 +- 26 files changed, 925 insertions(+), 188 deletions(-) create mode 100644 examples/strict-structured-reads.c -- 2.20.1
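For illustration only (this sketch is not part of the series): assuming an already-connected handle nbd and the nbd_aio_pread_notify prototype proposed in patch 5, with read_done/done/cookie as made-up names, the intended usage pattern looks roughly like:

  /* Completion callback: invoked by the state machine, with the libnbd
   * lock held, as soon as the server's reply to this command has been
   * processed.  It must not call nbd_* APIs on the same handle.
   */
  static int
  read_done (void *data, int64_t handle, int *error)
  {
    *(int64_t *) data = handle;     /* flag completion for the poll loop */
    return 0;                       /* keep the command's own status */
  }

  ...

  char buf[512];
  int64_t done = 0, cookie;

  cookie = nbd_aio_pread_notify (nbd, buf, sizeof buf, 0 /* offset */,
                                 &done, read_done, 0 /* flags */);
  /* Drive the poll loop; once read_done has fired (done == cookie),
   * retire the command with nbd_aio_command_completed (nbd, cookie),
   * and use nbd_aio_in_flight (nbd) to learn when a clean
   * nbd_aio_disconnect is possible.
   */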
Eric Blake
2019-Jun-29 13:28 UTC
[Libguestfs] [libnbd PATCH 1/6] api: Add nbd_aio_in_flight
Some clients need to know when it is safe to issue NBD_CMD_DISC, or to decide whether calling poll(POLLIN) will block indefinitely because the server isn't expected to respond. Make this easier to learn by tracking the count of commands we have queued up to send, as well as the count of commands where we are waiting on the server's response. Update tests/aio-parallel* and examples/batched-read-write to use nbd's own in-flight counter instead of reimplementing it ourselves. Note that h->in_flight is only ever updated while the lock is held; but we may want to consider also making it atomic and therefore readable as a lock-less function. --- examples/batched-read-write.c | 17 +++++--------- generator/generator | 42 ++++++++++++++++++++++++++--------- lib/aio.c | 9 ++++++++ lib/internal.h | 4 +++- lib/rw.c | 6 +++++ tests/aio-parallel-load.c | 29 +++++++++++++----------- tests/aio-parallel.c | 15 +++++-------- 7 files changed, 77 insertions(+), 45 deletions(-) diff --git a/examples/batched-read-write.c b/examples/batched-read-write.c index 90dfe86..194ad1c 100644 --- a/examples/batched-read-write.c +++ b/examples/batched-read-write.c @@ -48,26 +48,22 @@ try_deadlock (void *arg) struct pollfd fds[1]; size_t i; int64_t handles[2], done; - size_t in_flight; /* counts number of requests in flight */ int dir, r; /* Issue commands. */ - in_flight = 0; handles[0] = nbd_aio_pread (nbd, in, packetsize, 0, 0); if (handles[0] == -1) { fprintf (stderr, "%s\n", nbd_get_error ()); goto error; } - in_flight++; handles[1] = nbd_aio_pwrite (nbd, out, packetsize, packetsize, 0); if (handles[1] == -1) { fprintf (stderr, "%s\n", nbd_get_error ()); goto error; } - in_flight++; /* Now wait for commands to retire, or for deadlock to occur */ - while (in_flight > 0) { + while (nbd_aio_in_flight (nbd) > 0) { if (nbd_aio_is_dead (nbd) || nbd_aio_is_closed (nbd)) { fprintf (stderr, "connection is dead or closed\n"); goto error; @@ -96,23 +92,20 @@ try_deadlock (void *arg) /* If a command is ready to retire, retire it. */ while ((done = nbd_aio_peek_command_completed (nbd)) > 0) { - for (i = 0; i < in_flight; ++i) { + for (i = 0; i < sizeof handles / sizeof handles[0]; ++i) { if (handles[i] == done) { r = nbd_aio_command_completed (nbd, handles[i]); if (r == -1) { fprintf (stderr, "%s\n", nbd_get_error ()); goto error; } - assert (r); - memmove (&handles[i], &handles[i+1], - sizeof (handles[0]) * (in_flight - i - 1)); - break; + assert (r == 1); + handles[i] = 0; } } - assert (i < in_flight); - in_flight--; } } + assert (nbd_aio_in_flight (nbd) == 0); printf ("finished OK\n"); diff --git a/generator/generator b/generator/generator index 9192988..5cc3e80 100755 --- a/generator/generator +++ b/generator/generator @@ -1749,6 +1749,12 @@ not a normal command because NBD servers are not obliged to send a reply. Instead you should wait for C<nbd_aio_is_closed> to become true on the connection. +Although libnbd does not prevent you from issuing this command while +still waiting on the replies to previous commands, the NBD protocol +recommends that you wait until there are no other commands in flight +(see C<nbd_aio_in_flight>), to give the server a better chance at a +clean shutdown. + The C<flags> parameter must be C<0> for now (it exists for future NBD protocol extensions). There is no direct synchronous counterpart; however, C<nbd_shutdown> will call this function if appropriate."; @@ -1867,16 +1873,16 @@ you would set C<events = POLLIN>. 
If C<revents> returns C<POLLIN> or C<POLLHUP> you would then call C<nbd_aio_notify_read>. Note that once libnbd reaches C<nbd_aio_is_ready>, this direction is -returned even before a command is issued via C<nbd_aio_pwrite> and -friends. In a single-threaded use of libnbd, it is not worth polling -until after issuing a command, as otherwise the server will never wake -up the poll. In a multi-threaded scenario, you can have one thread -begin a polling loop prior to any commands, but any other thread that -issues a command will need a way to kick the polling thread out of -poll in case issuing the command changes the needed polling -direction. Possible ways to do this include polling for activity on a -pipe-to-self, or using L<pthread_kill(3)> to send a signal that is -masked except during L<ppoll(2)>. +returned even when there are no commands in flight (see +C<nbd_aio_in_flight>). In a single-threaded use of libnbd, it is not +worth polling until after issuing a command, as otherwise the server +will never wake up the poll. In a multi-threaded scenario, you can +have one thread begin a polling loop prior to any commands, but any +other thread that issues a command will need a way to kick the +polling thread out of poll in case issuing the command changes the +needed polling direction. Possible ways to do this include polling +for activity on a pipe-to-self, or using L<pthread_kill(3)> to send +a signal that is masked except during L<ppoll(2)>. =item C<LIBNBD_AIO_DIRECTION_WRITE> = 2 @@ -2012,6 +2018,22 @@ C<nbd_aio_command_completed> to actually retire the command and learn whether the command was successful."; }; + "aio_in_flight", { + default_call with + args = []; ret = RInt; + permitted_states = [ Connected; Closed; Dead ]; + (* XXX is_locked = false ? *) + shortdesc = "check how many aio commands are still in flight"; + longdesc = "\ +Return the number of in-flight aio commands that are still awaiting a +response from the server before they can be retired. If this returns +a non-zero value when requesting a disconnect from the server (see +C<nbd_aio_disconnect> and C<nbd_shutdown>), libnbd does not try to +wait for those commands to complete gracefully; if the server strands +commands while shutting down, C<nbd_aio_command_completed> will not +be able to report status on those commands."; + }; + "connection_state", { default_call with args = []; ret = RConstString; diff --git a/lib/aio.c b/lib/aio.c index c68a059..b29378b 100644 --- a/lib/aio.c +++ b/lib/aio.c @@ -23,6 +23,7 @@ #include <stdbool.h> #include <errno.h> #include <inttypes.h> +#include <assert.h> #include "internal.h" @@ -84,6 +85,8 @@ nbd_unlocked_aio_command_completed (struct nbd_handle *h, prev_cmd->next = cmd->next; else h->cmds_done = cmd->next; + h->in_flight--; + assert (h->in_flight >= 0); free (cmd); @@ -110,3 +113,9 @@ nbd_unlocked_aio_peek_command_completed (struct nbd_handle *h) set_error (EINVAL, "no commands are in flight"); return -1; } + +int +nbd_unlocked_aio_in_flight (struct nbd_handle *h) +{ + return h->in_flight; +} diff --git a/lib/internal.h b/lib/internal.h index 5aa9f22..15f4b64 100644 --- a/lib/internal.h +++ b/lib/internal.h @@ -186,9 +186,11 @@ struct nbd_handle { * to be issued. The second list contains commands which have been * issued and waiting for replies. The third list contains commands * which we have received replies, waiting for the main program to - * acknowledge them. + * acknowledge them. in_flight tracks the combined length of the + * first two lists. 
*/ struct command_in_flight *cmds_to_issue, *cmds_in_flight, *cmds_done; + int in_flight; /* Current command during a REPLY cycle */ struct command_in_flight *reply_cmd; diff --git a/lib/rw.c b/lib/rw.c index 6b57f11..53cd521 100644 --- a/lib/rw.c +++ b/lib/rw.c @@ -25,6 +25,7 @@ #include <inttypes.h> #include <errno.h> #include <assert.h> +#include <limits.h> #include "internal.h" @@ -167,6 +168,10 @@ nbd_internal_command_common (struct nbd_handle *h, set_error (EINVAL, "cannot request more commands after NBD_CMD_DISC"); return -1; } + if (h->in_flight == INT_MAX) { + set_error (ENOMEM, "too many commands already in flight"); + return -1; + } switch (type) { /* Commands which send or receive data are limited to MAX_REQUEST_SIZE. */ @@ -236,6 +241,7 @@ nbd_internal_command_common (struct nbd_handle *h, return -1; } + h->in_flight++; return cmd->handle; } diff --git a/tests/aio-parallel-load.c b/tests/aio-parallel-load.c index 7922acd..a68c714 100644 --- a/tests/aio-parallel-load.c +++ b/tests/aio-parallel-load.c @@ -189,7 +189,6 @@ start_thread (void *arg) size_t i; uint64_t offset, handle; uint64_t handles[MAX_IN_FLIGHT]; - size_t in_flight; /* counts number of requests in flight */ int dir, r, cmd; time_t t; bool expired = false; @@ -231,8 +230,8 @@ start_thread (void *arg) assert (nbd_read_only (nbd) == 0); /* Issue commands. */ - in_flight = 0; - while (!expired || in_flight > 0) { + assert (nbd_aio_in_flight (nbd) == 0); + while (!expired || nbd_aio_in_flight (nbd) > 0) { if (nbd_aio_is_dead (nbd) || nbd_aio_is_closed (nbd)) { fprintf (stderr, "thread %zu: connection is dead or closed\n", status->i); @@ -243,12 +242,12 @@ start_thread (void *arg) time (&t); if (t > status->end_time) { expired = true; - if (!in_flight) + if (nbd_aio_in_flight (nbd) <= 0) break; } /* If we can issue another request, do so. */ - while (!expired && in_flight < MAX_IN_FLIGHT) { + while (!expired && nbd_aio_in_flight (nbd) < MAX_IN_FLIGHT) { offset = rand () % (EXPORTSIZE - buf_size); cmd = rand () & 1; if (cmd == 0) { @@ -263,10 +262,14 @@ start_thread (void *arg) fprintf (stderr, "%s\n", nbd_get_error ()); goto error; } - handles[in_flight] = handle; - in_flight++; - if (in_flight > status->most_in_flight) - status->most_in_flight = in_flight; + for (i = 0; i < MAX_IN_FLIGHT; i++) { + if (handles[i] == 0) { + handles[i] = handle; + break; + } + } + if (nbd_aio_in_flight (nbd) > status->most_in_flight) + status->most_in_flight = nbd_aio_in_flight (nbd); } fds[0].fd = nbd_aio_get_fd (nbd); @@ -291,16 +294,16 @@ start_thread (void *arg) nbd_aio_notify_write (nbd); /* If a command is ready to retire, retire it. */ - for (i = 0; i < in_flight; ++i) { + for (i = 0; i < MAX_IN_FLIGHT; ++i) { + if (handles[i] == 0) + continue; r = nbd_aio_command_completed (nbd, handles[i]); if (r == -1) { fprintf (stderr, "%s\n", nbd_get_error ()); goto error; } if (r) { - memmove (&handles[i], &handles[i+1], - sizeof (handles[0]) * (in_flight - i - 1)); - in_flight--; + handles[i] = 0; status->requests++; } } diff --git a/tests/aio-parallel.c b/tests/aio-parallel.c index a9b0fd9..b5d126a 100644 --- a/tests/aio-parallel.c +++ b/tests/aio-parallel.c @@ -198,7 +198,6 @@ start_thread (void *arg) size_t i; int64_t offset, handle; char *buf; - size_t in_flight; /* counts number of requests in flight */ int dir, r, cmd; time_t t; bool expired = false; @@ -238,8 +237,8 @@ start_thread (void *arg) assert (nbd_read_only (nbd) == 0); /* Issue commands. 
*/ - in_flight = 0; - while (!expired || in_flight > 0) { + assert (nbd_aio_in_flight (nbd) == 0); + while (!expired || nbd_aio_in_flight (nbd) > 0) { if (nbd_aio_is_dead (nbd) || nbd_aio_is_closed (nbd)) { fprintf (stderr, "thread %zu: connection is dead or closed\n", status->i); @@ -250,12 +249,12 @@ start_thread (void *arg) time (&t); if (t > status->end_time) { expired = true; - if (!in_flight) + if (nbd_aio_in_flight (nbd) <= 0) break; } /* If we can issue another request, do so. */ - while (!expired && in_flight < MAX_IN_FLIGHT) { + while (!expired && nbd_aio_in_flight (nbd) < MAX_IN_FLIGHT) { /* Find a free command slot. */ for (i = 0; i < MAX_IN_FLIGHT; ++i) if (commands[status->i][i].offset == -1) @@ -282,9 +281,8 @@ start_thread (void *arg) commands[status->i][i].offset = offset; commands[status->i][i].handle = handle; commands[status->i][i].cmd = cmd; - in_flight++; - if (in_flight > status->most_in_flight) - status->most_in_flight = in_flight; + if (nbd_aio_in_flight (nbd) > status->most_in_flight) + status->most_in_flight = nbd_aio_in_flight (nbd); } fds[0].fd = nbd_aio_get_fd (nbd); @@ -329,7 +327,6 @@ start_thread (void *arg) } commands[status->i][i].offset = -1; - in_flight--; status->requests++; } } -- 2.20.1
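For illustration, here is a fragment (not part of the patch) showing how a caller might drain outstanding work before disconnecting; it assumes an already-connected handle nbd, needs <poll.h>, and mirrors the poll pattern already used in examples/batched-read-write.c:

  int dir;
  struct pollfd fd = { .fd = nbd_aio_get_fd (nbd) };

  /* nbd_aio_in_flight counts commands that are queued or still awaiting
   * a server reply, so reaching 0 means NBD_CMD_DISC can be sent
   * without stranding anything.
   */
  while (nbd_aio_in_flight (nbd) > 0) {
    if (nbd_aio_is_dead (nbd) || nbd_aio_is_closed (nbd))
      break;                                /* remaining commands are lost */
    dir = nbd_aio_get_direction (nbd);
    fd.events = 0;
    if (dir & LIBNBD_AIO_DIRECTION_READ)
      fd.events |= POLLIN;
    if (dir & LIBNBD_AIO_DIRECTION_WRITE)
      fd.events |= POLLOUT;
    if (poll (&fd, 1, -1) == -1)
      break;
    if ((dir & LIBNBD_AIO_DIRECTION_READ) && (fd.revents & POLLIN))
      nbd_aio_notify_read (nbd);
    else if ((dir & LIBNBD_AIO_DIRECTION_WRITE) && (fd.revents & POLLOUT))
      nbd_aio_notify_write (nbd);
  }
  if (nbd_aio_in_flight (nbd) == 0)
    nbd_aio_disconnect (nbd, 0);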
Eric Blake
2019-Jun-29 13:28 UTC
[Libguestfs] [libnbd PATCH 2/6] generator: Allow DEAD state actions to run
Most of the states were calling SET_NEXT_STATE(%.DEAD) then using return -1 on error, to reflect the fact that they had also called set_error() and wanted the caller to notice the failure. Unfortunately, the state machine engine refuses to run the entry code of the next state when the current state returned -1, which meant the DEAD state entry code never runs. A concrete example of the problems this creates: qemu-nbd defaults to allowing only one connection at a time (newer nbdkit with the noparallel filter can do the same, althoug it's not as picky on export names). The DEAD state claims to close() the fd as soon as we detect a problem, where an example problem is requesting an export name not present on the server (NBD_OPT_GO fails, so we move to the DEAD state since we can't retry nbd_set_export_name without a new handle). But since the libnbd code gave up by returning -1, the DEAD state cleanup never runs, and we end up leaving the connection fd open instead, blocking out a second connection until we use nbd_close() to free all resources. Worse, we haven't yet wired up nbd_close into the python bindings; until that happens, there is no way in nbdsh to wipe out an existing failed connection and replace it with another attempt where we set the correct export name, short of completely exiting nbdsh: $ qemu-nbd -k /tmp/a -f raw -x a file $ ./run nbdsh nbd> h.connect_unix('/tmp/a') Traceback (most recent call last): File "/usr/lib64/python3.7/code.py", line 90, in runcode exec(code, self.locals) File "<console>", line 1, in <module> File "/home/eblake/libnbd/python/nbd.py", line 340, in connect_unix return libnbdmod.connect_unix (self._o, unixsocket) nbd.Error: nbd_connect_unix: handshake: server has no export named '': No such file or directory (ENOENT) nbd> h = nbd.NBD() nbd> h.set_export_name('a') nbd> h.connect_unix('/tmp/a') ... HANGS ... This patch does not address the missing Python binding, but does fix the cleanup problem by ensuring that any transition to DEAD closes the client end of the fd right away, rather than waiting until the final nbd_close(), while still returning an error to the caller. This is done by mandating that other states never move to DEAD without first setting an error but returning 0, then letting DEAD's code return -1. --- generator/generator | 30 +++++++--- generator/states-connect.c | 24 ++++---- generator/states-issue-command.c | 4 +- generator/states-magic.c | 6 +- generator/states-newstyle-opt-export-name.c | 8 +-- generator/states-newstyle-opt-go.c | 20 +++---- .../states-newstyle-opt-set-meta-context.c | 24 ++++---- generator/states-newstyle-opt-starttls.c | 18 +++--- .../states-newstyle-opt-structured-reply.c | 10 ++-- generator/states-newstyle.c | 6 +- generator/states-oldstyle.c | 4 +- generator/states-reply-simple.c | 2 +- generator/states-reply-structured.c | 58 +++++++++---------- generator/states-reply.c | 8 +-- generator/states.c | 4 +- 15 files changed, 121 insertions(+), 105 deletions(-) diff --git a/generator/generator b/generator/generator index 5cc3e80..45a030f 100755 --- a/generator/generator +++ b/generator/generator @@ -47,15 +47,26 @@ if not (Sys.file_exists "lib/handle.c") then * Each handle starts in the top level START state. * * When you enter a state, the associated C code for that state - * runs. If the C code calls SET_NEXT_STATE then the connection - * enters the next state without blocking. 
If the C code does _not_ - * call SET_NEXT_STATE before returning then the state machine - * blocks and will not be re-entered until an external event - * happens (see below). + * runs. If the C code calls SET_NEXT_STATE and returns 0 then + * the connection enters the next state without blocking. If the + * C code calls SET_NEXT_STATE_AND_BLOCK and returns 0 then the + * connection blocks, but will resume with the code for the next + * state on the next external event. If the C code does _not_ + * call either macro but returns 0, the state machine is blocked + * and will not be re-entered until an external event happens + * (see below), where the same C code will be executed again on + * re-entry. If the C code calls returns -1 after using + * set_error(), then the state machine blocks and the caller + * should report failure; the next external event will resume the + * state machine according to whether SET_NEXT_STATE was used. * * There are various end states such as CLOSED and DEAD. These - * are not special, it's just that they have no way to move to - * another state. + * are not special in relation to the above state transition rules, + * it's just that they have no way to move to another state. However, + * the DEAD state expects that set_error() was used in the previous + * state, and will return -1 itself after performing cleanup actions; + * the earlier state that wants to transition to DEAD should return 0 + * rather than -1, so as not to bypass this cleanup. * * An external event is something like the file descriptor being * ready to read or write, or the main program calling a function @@ -2708,7 +2719,10 @@ let generate_lib_states_c () pr " abort (); /* Should never happen, but keeps GCC happy. */\n"; pr " }\n"; pr "\n"; - pr " if (r == -1) return -1;\n"; + pr " if (r == -1) {\n"; + pr " assert (nbd_get_error () != NULL);\n"; + pr " return -1;\n"; + pr " }\n"; pr " } while (!blocked);\n"; pr " return 0;\n"; pr "}\n"; diff --git a/generator/states-connect.c b/generator/states-connect.c index c6a4ae7..9e2e1d4 100644 --- a/generator/states-connect.c +++ b/generator/states-connect.c @@ -55,12 +55,12 @@ disable_nagle (int sock) if (fd == -1) { SET_NEXT_STATE (%.DEAD); set_error (errno, "socket"); - return -1; + return 0; } h->sock = nbd_internal_socket_create (fd); if (!h->sock) { SET_NEXT_STATE (%.DEAD); - return -1; + return 0; } disable_nagle (fd); @@ -70,7 +70,7 @@ disable_nagle (int sock) if (errno != EINPROGRESS) { SET_NEXT_STATE (%.DEAD); set_error (errno, "connect"); - return -1; + return 0; } } return 0; @@ -83,7 +83,7 @@ disable_nagle (int sock) SOL_SOCKET, SO_ERROR, &status, &len) == -1) { SET_NEXT_STATE (%.DEAD); set_error (errno, "getsockopt: SO_ERROR"); - return -1; + return 0; } /* This checks the status of the original connect call. 
*/ if (status == 0) { @@ -93,7 +93,7 @@ disable_nagle (int sock) else { SET_NEXT_STATE (%.DEAD); set_error (status, "connect"); - return -1; + return 0; } CONNECT_UNIX.START: @@ -107,7 +107,7 @@ disable_nagle (int sock) if (namelen > sizeof sun.sun_path) { set_error (ENAMETOOLONG, "socket name too long: %s", h->unixsocket); SET_NEXT_STATE (%.DEAD); - return -1; + return 0; } memcpy (sun.sun_path, h->unixsocket, namelen); len = sizeof sun; @@ -175,7 +175,7 @@ disable_nagle (int sock) h->sock = nbd_internal_socket_create (fd); if (!h->sock) { SET_NEXT_STATE (%.DEAD); - return -1; + return 0; } disable_nagle (fd); @@ -198,7 +198,7 @@ disable_nagle (int sock) SOL_SOCKET, SO_ERROR, &status, &len) == -1) { SET_NEXT_STATE (%.DEAD); set_error (errno, "getsockopt: SO_ERROR"); - return -1; + return 0; } /* This checks the status of the original connect call. */ if (status == 0) @@ -228,7 +228,7 @@ disable_nagle (int sock) if (socketpair (AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0, sv) == -1) { SET_NEXT_STATE (%.DEAD); set_error (errno, "socketpair"); - return -1; + return 0; } pid = fork (); @@ -237,7 +237,7 @@ disable_nagle (int sock) set_error (errno, "fork"); close (sv[0]); close (sv[1]); - return -1; + return 0; } if (pid == 0) { /* child - run command */ close (0); @@ -268,14 +268,14 @@ disable_nagle (int sock) fcntl (sv[0], F_SETFL, flags|O_NONBLOCK) == -1) { SET_NEXT_STATE (%.DEAD); close (sv[0]); - return -1; + return 0; } h->sock = nbd_internal_socket_create (sv[0]); if (!h->sock) { SET_NEXT_STATE (%.DEAD); close (sv[0]); - return -1; + return 0; } /* The sockets are connected already, we can jump directly to diff --git a/generator/states-issue-command.c b/generator/states-issue-command.c index 35f3c79..8890e1c 100644 --- a/generator/states-issue-command.c +++ b/generator/states-issue-command.c @@ -54,7 +54,7 @@ ISSUE_COMMAND.SEND_REQUEST: switch (send_from_wbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 0: SET_NEXT_STATE (%PREPARE_WRITE_PAYLOAD); } return 0; @@ -85,7 +85,7 @@ ISSUE_COMMAND.SEND_WRITE_PAYLOAD: switch (send_from_wbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 0: SET_NEXT_STATE (%FINISH); } return 0; diff --git a/generator/states-magic.c b/generator/states-magic.c index 93c92fc..de8d235 100644 --- a/generator/states-magic.c +++ b/generator/states-magic.c @@ -27,7 +27,7 @@ MAGIC.RECV_MAGIC: switch (recv_into_rbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 0: SET_NEXT_STATE (%CHECK_MAGIC); } return 0; @@ -38,7 +38,7 @@ if (strncmp (h->sbuf.new_handshake.nbdmagic, "NBDMAGIC", 8) != 0) { SET_NEXT_STATE (%.DEAD); set_error (0, "handshake: server did not send expected NBD magic"); - return -1; + return 0; } version = be64toh (h->sbuf.new_handshake.version); @@ -49,7 +49,7 @@ else { SET_NEXT_STATE (%.DEAD); set_error (0, "handshake: server is not either an oldstyle or fixed newstyle NBD server"); - return -1; + return 0; } return 0; diff --git a/generator/states-newstyle-opt-export-name.c b/generator/states-newstyle-opt-export-name.c index 968cea8..ec73136 100644 --- a/generator/states-newstyle-opt-export-name.c +++ b/generator/states-newstyle-opt-export-name.c @@ -31,7 +31,7 @@ NEWSTYLE.OPT_EXPORT_NAME.SEND: switch (send_from_wbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 0: h->wbuf = h->export_name; h->wlen = strlen (h->export_name); @@ -41,7 
+41,7 @@ NEWSTYLE.OPT_EXPORT_NAME.SEND_EXPORT: switch (send_from_wbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 0: h->rbuf = &h->sbuf; h->rlen = sizeof h->sbuf.export_name_reply; @@ -53,7 +53,7 @@ NEWSTYLE.OPT_EXPORT_NAME.RECV_REPLY: switch (recv_into_rbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 0: SET_NEXT_STATE (%CHECK_REPLY); } return 0; @@ -66,7 +66,7 @@ eflags = be16toh (h->sbuf.export_name_reply.eflags); if (nbd_internal_set_size_and_flags (h, exportsize, eflags) == -1) { SET_NEXT_STATE (%.DEAD); - return -1; + return 0; } SET_NEXT_STATE (%.READY); return 0; diff --git a/generator/states-newstyle-opt-go.c b/generator/states-newstyle-opt-go.c index e245c75..49875a5 100644 --- a/generator/states-newstyle-opt-go.c +++ b/generator/states-newstyle-opt-go.c @@ -34,7 +34,7 @@ const uint32_t exportnamelen = strlen (h->export_name); switch (send_from_wbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 0: h->sbuf.len = htobe32 (exportnamelen); h->wbuf = &h->sbuf; @@ -46,7 +46,7 @@ NEWSTYLE.OPT_GO.SEND_EXPORTNAMELEN: switch (send_from_wbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 0: h->wbuf = h->export_name; h->wlen = strlen (h->export_name); @@ -57,7 +57,7 @@ NEWSTYLE.OPT_GO.SEND_EXPORT: switch (send_from_wbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 0: h->sbuf.nrinfos = 0; h->wbuf = &h->sbuf; @@ -68,7 +68,7 @@ NEWSTYLE.OPT_GO.SEND_NRINFOS: switch (send_from_wbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 0: h->rbuf = &h->sbuf; h->rlen = sizeof h->sbuf.or.option_reply; @@ -78,11 +78,11 @@ NEWSTYLE.OPT_GO.RECV_REPLY: switch (recv_into_rbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 0: if (prepare_for_reply_payload (h, NBD_OPT_GO) == -1) { SET_NEXT_STATE (%.DEAD); - return -1; + return 0; } SET_NEXT_STATE (%RECV_REPLY_PAYLOAD); } @@ -90,7 +90,7 @@ NEWSTYLE.OPT_GO.RECV_REPLY_PAYLOAD: switch (recv_into_rbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 0: SET_NEXT_STATE (%CHECK_REPLY); } return 0; @@ -121,13 +121,13 @@ if (len != sizeof h->sbuf.or.payload.export) { SET_NEXT_STATE (%.DEAD); set_error (0, "handshake: incorrect NBD_INFO_EXPORT option reply length"); - return -1; + return 0; } exportsize = be64toh (h->sbuf.or.payload.export.exportsize); eflags = be16toh (h->sbuf.or.payload.export.eflags); if (nbd_internal_set_size_and_flags (h, exportsize, eflags) == -1) { SET_NEXT_STATE (%.DEAD); - return -1; + return 0; } break; default: @@ -177,7 +177,7 @@ } } SET_NEXT_STATE (%.DEAD); - return -1; + return 0; } } /* END STATE MACHINE */ diff --git a/generator/states-newstyle-opt-set-meta-context.c b/generator/states-newstyle-opt-set-meta-context.c index a00a411..7904fe7 100644 --- a/generator/states-newstyle-opt-set-meta-context.c +++ b/generator/states-newstyle-opt-set-meta-context.c @@ -53,7 +53,7 @@ NEWSTYLE.OPT_SET_META_CONTEXT.SEND: switch (send_from_wbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 0: h->sbuf.len = htobe32 (strlen (h->export_name)); h->wbuf = &h->sbuf.len; @@ -65,7 +65,7 @@ NEWSTYLE.OPT_SET_META_CONTEXT.SEND_EXPORTNAMELEN: switch 
(send_from_wbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 0: h->wbuf = h->export_name; h->wlen = strlen (h->export_name); @@ -76,7 +76,7 @@ NEWSTYLE.OPT_SET_META_CONTEXT.SEND_EXPORTNAME: switch (send_from_wbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 0: h->sbuf.nrqueries htobe32 (nbd_internal_string_list_length (h->request_meta_contexts)); @@ -89,7 +89,7 @@ NEWSTYLE.OPT_SET_META_CONTEXT.SEND_NRQUERIES: switch (send_from_wbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 0: h->querynum = 0; SET_NEXT_STATE (%PREPARE_NEXT_QUERY); @@ -115,7 +115,7 @@ const char *query = h->request_meta_contexts[h->querynum]; switch (send_from_wbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 0: h->wbuf = query; h->wlen = strlen (query); @@ -125,7 +125,7 @@ NEWSTYLE.OPT_SET_META_CONTEXT.SEND_QUERY: switch (send_from_wbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 0: h->querynum++; SET_NEXT_STATE (%PREPARE_NEXT_QUERY); @@ -140,11 +140,11 @@ NEWSTYLE.OPT_SET_META_CONTEXT.RECV_REPLY: switch (recv_into_rbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 0: if (prepare_for_reply_payload (h, NBD_OPT_SET_META_CONTEXT) == -1) { SET_NEXT_STATE (%.DEAD); - return -1; + return 0; } SET_NEXT_STATE (%RECV_REPLY_PAYLOAD); } @@ -152,7 +152,7 @@ NEWSTYLE.OPT_SET_META_CONTEXT.RECV_REPLY_PAYLOAD: switch (recv_into_rbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 0: SET_NEXT_STATE (%CHECK_REPLY); } return 0; @@ -178,7 +178,7 @@ if (meta_context == NULL) { set_error (errno, "malloc"); SET_NEXT_STATE (%.DEAD); - return -1; + return 0; } meta_context->context_id be32toh (h->sbuf.or.payload.context.context.context_id); @@ -189,7 +189,7 @@ set_error (errno, "strdup"); SET_NEXT_STATE (%.DEAD); free (meta_context); - return -1; + return 0; } debug (h, "negotiated %s with context ID %" PRIu32, meta_context->name, meta_context->context_id); @@ -202,7 +202,7 @@ /* Anything else is an error, ignore it */ if (handle_reply_error (h) == -1) { SET_NEXT_STATE (%.DEAD); - return -1; + return 0; } debug (h, "handshake: unexpected error from " diff --git a/generator/states-newstyle-opt-starttls.c b/generator/states-newstyle-opt-starttls.c index 61f254f..0a18db0 100644 --- a/generator/states-newstyle-opt-starttls.c +++ b/generator/states-newstyle-opt-starttls.c @@ -36,7 +36,7 @@ NEWSTYLE.OPT_STARTTLS.SEND: switch (send_from_wbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 0: h->rbuf = &h->sbuf; h->rlen = sizeof (h->sbuf.or.option_reply); @@ -46,11 +46,11 @@ NEWSTYLE.OPT_STARTTLS.RECV_REPLY: switch (recv_into_rbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 0: if (prepare_for_reply_payload (h, NBD_OPT_STARTTLS) == -1) { SET_NEXT_STATE (%.DEAD); - return -1; + return 0; } SET_NEXT_STATE (%RECV_REPLY_PAYLOAD); } @@ -58,7 +58,7 @@ NEWSTYLE.OPT_STARTTLS.RECV_REPLY_PAYLOAD: switch (recv_into_rbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 0: SET_NEXT_STATE (%CHECK_REPLY); } return 0; @@ -73,7 +73,7 @@ new_sock = nbd_internal_crypto_create_session (h, 
h->sock); if (new_sock == NULL) { SET_NEXT_STATE (%.DEAD); - return -1; + return 0; } h->sock = new_sock; if (nbd_internal_crypto_is_reading (h)) @@ -85,7 +85,7 @@ default: if (handle_reply_error (h) == -1) { SET_NEXT_STATE (%.DEAD); - return -1; + return 0; } /* Server refused to upgrade to TLS. If h->tls is not require (2) @@ -95,7 +95,7 @@ SET_NEXT_STATE (%.DEAD); set_error (ENOTSUP, "handshake: server refused TLS, " "but handle TLS setting is require (2)"); - return -1; + return 0; } debug (h, @@ -112,7 +112,7 @@ r = nbd_internal_crypto_handshake (h); if (r == -1) { SET_NEXT_STATE (%.DEAD); - return -1; + return 0; } if (r == 0) { /* Finished handshake. */ @@ -135,7 +135,7 @@ r = nbd_internal_crypto_handshake (h); if (r == -1) { SET_NEXT_STATE (%.DEAD); - return -1; + return 0; } if (r == 0) { /* Finished handshake. */ diff --git a/generator/states-newstyle-opt-structured-reply.c b/generator/states-newstyle-opt-structured-reply.c index 65d5958..d932248 100644 --- a/generator/states-newstyle-opt-structured-reply.c +++ b/generator/states-newstyle-opt-structured-reply.c @@ -30,7 +30,7 @@ NEWSTYLE.OPT_STRUCTURED_REPLY.SEND: switch (send_from_wbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 0: h->rbuf = &h->sbuf; h->rlen = sizeof h->sbuf.or.option_reply; @@ -40,11 +40,11 @@ NEWSTYLE.OPT_STRUCTURED_REPLY.RECV_REPLY: switch (recv_into_rbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 0: if (prepare_for_reply_payload (h, NBD_OPT_STRUCTURED_REPLY) == -1) { SET_NEXT_STATE (%.DEAD); - return -1; + return 0; } SET_NEXT_STATE (%RECV_REPLY_PAYLOAD); } @@ -52,7 +52,7 @@ NEWSTYLE.OPT_STRUCTURED_REPLY.RECV_REPLY_PAYLOAD: switch (recv_into_rbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 0: SET_NEXT_STATE (%CHECK_REPLY); } return 0; @@ -69,7 +69,7 @@ default: if (handle_reply_error (h) == -1) { SET_NEXT_STATE (%.DEAD); - return -1; + return 0; } debug (h, "structured replies are not supported by this server"); diff --git a/generator/states-newstyle.c b/generator/states-newstyle.c index 912ecb5..c8f817e 100644 --- a/generator/states-newstyle.c +++ b/generator/states-newstyle.c @@ -119,7 +119,7 @@ handle_reply_error (struct nbd_handle *h) NEWSTYLE.RECV_GFLAGS: switch (recv_into_rbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 0: SET_NEXT_STATE (%CHECK_GFLAGS); } return 0; @@ -133,7 +133,7 @@ handle_reply_error (struct nbd_handle *h) SET_NEXT_STATE (%.DEAD); set_error (ENOTSUP, "handshake: server is not fixed newstyle, " "but handle TLS setting is require (2)"); - return -1; + return 0; } cflags = h->gflags & (NBD_FLAG_FIXED_NEWSTYLE|NBD_FLAG_NO_ZEROES); @@ -145,7 +145,7 @@ handle_reply_error (struct nbd_handle *h) NEWSTYLE.SEND_CFLAGS: switch (send_from_wbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 0: /* Start sending options. 
*/ if ((h->gflags & NBD_FLAG_FIXED_NEWSTYLE) == 0) diff --git a/generator/states-oldstyle.c b/generator/states-oldstyle.c index b5618af..668931b 100644 --- a/generator/states-oldstyle.c +++ b/generator/states-oldstyle.c @@ -32,7 +32,7 @@ OLDSTYLE.RECV_REMAINING: switch (recv_into_rbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 0: SET_NEXT_STATE (%CHECK); } return 0; @@ -51,7 +51,7 @@ if (nbd_internal_set_size_and_flags (h, exportsize, eflags) == -1) { SET_NEXT_STATE (%.DEAD); - return -1; + return 0; } SET_NEXT_STATE (%.READY); diff --git a/generator/states-reply-simple.c b/generator/states-reply-simple.c index cab72d6..23b6b5f 100644 --- a/generator/states-reply-simple.c +++ b/generator/states-reply-simple.c @@ -52,7 +52,7 @@ struct command_in_flight *cmd = h->reply_cmd; switch (recv_into_rbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 1: save_reply_state (h); SET_NEXT_STATE (%.READY); diff --git a/generator/states-reply-structured.c b/generator/states-reply-structured.c index 91c6215..9a8677d 100644 --- a/generator/states-reply-structured.c +++ b/generator/states-reply-structured.c @@ -37,7 +37,7 @@ REPLY.STRUCTURED_REPLY.RECV_REMAINING: switch (recv_into_rbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 1: save_reply_state (h); SET_NEXT_STATE (%.READY); @@ -70,14 +70,14 @@ if (length > MAX_REQUEST_SIZE + sizeof h->sbuf.sr.payload.offset_data) { set_error (0, "invalid server reply length"); SET_NEXT_STATE (%.DEAD); - return -1; + return 0; } if (NBD_REPLY_TYPE_IS_ERR (type)) { if (length < sizeof h->sbuf.sr.payload.error.error) { SET_NEXT_STATE (%.DEAD); set_error (0, "too short length in structured reply error"); - return -1; + return 0; } h->rbuf = &h->sbuf.sr.payload.error.error; h->rlen = sizeof h->sbuf.sr.payload.error.error; @@ -88,12 +88,12 @@ if (length != 0) { SET_NEXT_STATE (%.DEAD); set_error (0, "invalid length in NBD_REPLY_TYPE_NONE"); - return -1; + return 0; } if (!(flags & NBD_REPLY_FLAG_DONE)) { SET_NEXT_STATE (%.DEAD); set_error (0, "NBD_REPLY_FLAG_DONE must be set in NBD_REPLY_TYPE_NONE"); - return -1; + return 0; } SET_NEXT_STATE (%FINISH); return 0; @@ -109,12 +109,12 @@ "cmd->type=%" PRIu16 ", " "this is likely to be a bug in the server", cmd->type); - return -1; + return 0; } if (length < sizeof h->sbuf.sr.payload.offset_data) { SET_NEXT_STATE (%.DEAD); set_error (0, "too short length in NBD_REPLY_TYPE_OFFSET_DATA"); - return -1; + return 0; } h->rbuf = &h->sbuf.sr.payload.offset_data; h->rlen = sizeof h->sbuf.sr.payload.offset_data; @@ -128,12 +128,12 @@ "cmd->type=%" PRIu16 ", " "this is likely to be a bug in the server", cmd->type); - return -1; + return 0; } if (length != sizeof h->sbuf.sr.payload.offset_hole) { SET_NEXT_STATE (%.DEAD); set_error (0, "invalid length in NBD_REPLY_TYPE_OFFSET_HOLE"); - return -1; + return 0; } h->rbuf = &h->sbuf.sr.payload.offset_hole; h->rlen = sizeof h->sbuf.sr.payload.offset_hole; @@ -147,18 +147,18 @@ "cmd->type=%" PRIu16 ", " "this is likely to be a bug in the server", cmd->type); - return -1; + return 0; } /* XXX We should be able to skip the bad reply in these two cases. 
*/ if (length < 12 || ((length-4) & 7) != 0) { SET_NEXT_STATE (%.DEAD); set_error (0, "invalid length in NBD_REPLY_TYPE_BLOCK_STATUS"); - return -1; + return 0; } if (cmd->cb.fn.extent == NULL) { SET_NEXT_STATE (%.DEAD); set_error (0, "not expecting NBD_REPLY_TYPE_BLOCK_STATUS here"); - return -1; + return 0; } /* We read the context ID followed by all the entries into a * single array and deal with it at the end. @@ -168,7 +168,7 @@ if (h->bs_entries == NULL) { SET_NEXT_STATE (%.DEAD); set_error (errno, "malloc"); - return -1; + return 0; } h->rbuf = h->bs_entries; h->rlen = length; @@ -178,14 +178,14 @@ else { SET_NEXT_STATE (%.DEAD); set_error (0, "unknown structured reply type (%" PRIu16 ")", type); - return -1; + return 0; } REPLY.STRUCTURED_REPLY.RECV_ERROR: uint32_t length, msglen; switch (recv_into_rbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 1: save_reply_state (h); SET_NEXT_STATE (%.READY); @@ -197,7 +197,7 @@ msglen > sizeof h->sbuf.sr.payload.error.msg) { SET_NEXT_STATE (%.DEAD); set_error (0, "error message length too large"); - return -1; + return 0; } h->rbuf = h->sbuf.sr.payload.error.msg; @@ -211,7 +211,7 @@ uint16_t type; switch (recv_into_rbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 1: save_reply_state (h); SET_NEXT_STATE (%.READY); @@ -235,14 +235,14 @@ if (length != 0) { SET_NEXT_STATE (%.DEAD); set_error (0, "error payload length too large"); - return -1; + return 0; } break; case NBD_REPLY_TYPE_ERROR_OFFSET: if (length != sizeof h->sbuf.sr.payload.error.offset) { SET_NEXT_STATE (%.DEAD); set_error (0, "invalid error payload length"); - return -1; + return 0; } h->rbuf = &h->sbuf.sr.payload.error.offset; break; @@ -258,7 +258,7 @@ uint16_t type; switch (recv_into_rbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 1: save_reply_state (h); SET_NEXT_STATE (%.READY); @@ -289,7 +289,7 @@ "cmd->count=%" PRIu32 ", " "this is likely to be a bug in the server", offset, cmd->offset, cmd->count); - return -1; + return 0; } if (cmd->type == NBD_CMD_READ && cmd->cb.fn.read) { int scratch = error; @@ -319,7 +319,7 @@ uint32_t length; switch (recv_into_rbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 1: save_reply_state (h); SET_NEXT_STATE (%.READY); @@ -343,7 +343,7 @@ "offset=%" PRIu64 ", cmd->offset=%" PRIu64 ", " "this is likely to be a bug in the server", offset, cmd->offset); - return -1; + return 0; } /* Now this is the byte offset in the read buffer. */ offset -= cmd->offset; @@ -355,7 +355,7 @@ "cmd->count=%" PRIu32 ", " "this is likely to be a bug in the server", offset, length, cmd->count); - return -1; + return 0; } /* Set up to receive the data directly to the user buffer. */ @@ -371,7 +371,7 @@ uint32_t length; switch (recv_into_rbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 1: save_reply_state (h); SET_NEXT_STATE (%.READY); @@ -401,7 +401,7 @@ uint32_t length; switch (recv_into_rbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 1: save_reply_state (h); SET_NEXT_STATE (%.READY); @@ -422,7 +422,7 @@ "offset=%" PRIu64 ", cmd->offset=%" PRIu64 ", " "this is likely to be a bug in the server", offset, cmd->offset); - return -1; + return 0; } /* Now this is the byte offset in the read buffer. 
*/ offset -= cmd->offset; @@ -434,7 +434,7 @@ "cmd->count=%" PRIu32 ", " "this is likely to be a bug in the server", offset, length, cmd->count); - return -1; + return 0; } /* The spec states that 0-length requests are unspecified, but @@ -464,7 +464,7 @@ struct meta_context *meta_context; switch (recv_into_rbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 1: save_reply_state (h); SET_NEXT_STATE (%.READY); diff --git a/generator/states-reply.c b/generator/states-reply.c index 6fb0a7a..88274bc 100644 --- a/generator/states-reply.c +++ b/generator/states-reply.c @@ -85,7 +85,7 @@ save_reply_state (struct nbd_handle *h) /* sock->ops->recv called set_error already. */ SET_NEXT_STATE (%.DEAD); - return -1; + return 0; } if (r == 0) { SET_NEXT_STATE (%.CLOSED); @@ -99,7 +99,7 @@ save_reply_state (struct nbd_handle *h) REPLY.RECV_REPLY: switch (recv_into_rbuf (h)) { - case -1: SET_NEXT_STATE (%.DEAD); return -1; + case -1: SET_NEXT_STATE (%.DEAD); return 0; case 1: SET_NEXT_STATE (%.READY); return 0; case 0: SET_NEXT_STATE (%CHECK_SIMPLE_OR_STRUCTURED_REPLY); } @@ -120,7 +120,7 @@ save_reply_state (struct nbd_handle *h) else { SET_NEXT_STATE (%.DEAD); /* We've probably lost synchronization. */ set_error (0, "invalid reply magic"); - return -1; + return 0; } /* NB: This works for both simple and structured replies because the @@ -141,7 +141,7 @@ save_reply_state (struct nbd_handle *h) SET_NEXT_STATE (%.DEAD); set_error (0, "no matching handle found for server reply, " "this is probably a bug in the server"); - return -1; + return 0; } h->reply_cmd = cmd; return 0; diff --git a/generator/states.c b/generator/states.c index b0dab83..deea73c 100644 --- a/generator/states.c +++ b/generator/states.c @@ -125,11 +125,13 @@ send_from_wbuf (struct nbd_handle *h) return 0; DEAD: + /* The caller should have used set_error() before reaching here */ + assert (nbd_get_error ()); if (h->sock) { h->sock->ops->close (h->sock); h->sock = NULL; } - return 0; + return -1; CLOSED: if (h->sock) { -- 2.20.1
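To make the new convention concrete, a hypothetical state handler (the state names are invented; this is the generator's state DSL, not standalone C) now reports a fatal error like this:

 EXAMPLE.RECV_REPLY:
  switch (recv_into_rbuf (h)) {
  case -1:
    /* recv_into_rbuf has already called set_error(); returning 0 lets
     * the state machine run the DEAD entry code, which closes the
     * socket and itself returns -1 to the caller.
     */
    SET_NEXT_STATE (%.DEAD);
    return 0;
  case 0: SET_NEXT_STATE (%CHECK_REPLY);
  }
  return 0;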
Eric Blake
2019-Jun-29 13:28 UTC
[Libguestfs] [libnbd PATCH 3/6] generator: Allow Int64 in callbacks
An upcoming patch to add callbacks for aio completion notification wants to expose Int64 as a callback parameter. It's time to wire that up. --- generator/generator | 25 +++++++++++++++---------- 1 file changed, 15 insertions(+), 10 deletions(-) diff --git a/generator/generator b/generator/generator index 45a030f..c5988e2 100755 --- a/generator/generator +++ b/generator/generator @@ -3508,7 +3508,8 @@ let print_python_binding name { args; ret } pr " for (size_t i = 0; i < %s; ++i)\n" len; pr " PyList_SET_ITEM (py_%s, i, PyLong_FromUnsignedLong (%s[i]));\n" n n | BytesIn _ - | Int _ -> () + | Int _ + | Int64 _ -> () | Mutable (Int n) -> pr " PyObject *py_%s_modname = PyUnicode_FromString (\"ctypes\");\n" n; pr " if (!py_%s_modname) { PyErr_PrintEx (0); return -1; }\n" n; @@ -3525,7 +3526,7 @@ let print_python_binding name { args; ret } | ArrayAndLen _ | Bool _ | BytesOut _ | BytesPersistIn _ | BytesPersistOut _ | Callback _ | CallbackPersist _ - | Flags _ | Int64 _ | Mutable _ + | Flags _ | Mutable _ | Path _ | SockAddrAndLen _ | StringList _ | UInt _ | UInt32 _ -> assert false ) args; @@ -3537,6 +3538,7 @@ let print_python_binding name { args; ret } | ArrayAndLen (UInt32 n, len) -> pr " \"O\"" | BytesIn (n, len) -> pr " \"y#\"" | Int n -> pr " \"i\"" + | Int64 n -> pr " \"L\"" | Mutable (Int n) -> pr " \"O\"" | Opaque n -> pr " \"O\"" | String n -> pr " \"s\"" @@ -3545,7 +3547,7 @@ let print_python_binding name { args; ret } | ArrayAndLen _ | Bool _ | BytesOut _ | BytesPersistIn _ | BytesPersistOut _ | Callback _ | CallbackPersist _ - | Flags _ | Int64 _ | Mutable _ + | Flags _ | Mutable _ | Path _ | SockAddrAndLen _ | StringList _ | UInt _ | UInt32 _ -> assert false ) args; @@ -3556,14 +3558,14 @@ let print_python_binding name { args; ret } | BytesIn (n, len) -> pr ", %s, (int) %s" n len | Mutable (Int n) -> pr ", py_%s" n | Opaque _ -> pr ", _data->data" - | Int n + | Int n | Int64 n | String n | UInt64 n -> pr ", %s" n (* The following not yet implemented for callbacks XXX *) | ArrayAndLen _ | Bool _ | BytesOut _ | BytesPersistIn _ | BytesPersistOut _ | Callback _ | CallbackPersist _ - | Flags _ | Int64 _ | Mutable _ + | Flags _ | Mutable _ | Path _ | SockAddrAndLen _ | StringList _ | UInt _ | UInt32 _ -> assert false ) args; @@ -3599,7 +3601,7 @@ let print_python_binding name { args; ret } pr " Py_DECREF (py_%s_ret);\n" n; pr " Py_DECREF (py_%s);\n" n | BytesIn _ - | Int _ + | Int _ | Int64 _ | Opaque _ | String _ | UInt64 _ -> () @@ -3607,7 +3609,7 @@ let print_python_binding name { args; ret } | ArrayAndLen _ | Bool _ | BytesOut _ | BytesPersistIn _ | BytesPersistOut _ | Callback _ | CallbackPersist _ - | Flags _ | Int64 _ | Mutable _ + | Flags _ | Mutable _ | Path _ | SockAddrAndLen _ | StringList _ | UInt _ | UInt32 _ -> assert false ) args; @@ -4345,13 +4347,14 @@ let print_ocaml_binding (name, { args; ret }) List.map ( function | ArrayAndLen (UInt32 n, _) | BytesIn (n, _) - | Int n | Mutable (Int n) | Opaque n | String n | UInt64 n -> + | Int n | Int64 n + | Mutable (Int n) | Opaque n | String n | UInt64 n -> n ^ "v" (* The following not yet implemented for callbacks XXX *) | ArrayAndLen _ | Bool _ | BytesOut _ | BytesPersistIn _ | BytesPersistOut _ | Callback _ | CallbackPersist _ - | Flags _ | Int64 _ | Path _ | Mutable _ + | Flags _ | Path _ | Mutable _ | SockAddrAndLen _ | StringList _ | UInt _ | UInt32 _ -> assert false ) args in @@ -4379,6 +4382,8 @@ let print_ocaml_binding (name, { args; ret }) pr " memcpy (String_val (%sv), %s, %s);\n" n n len | Int n -> pr " %sv = 
Val_int (%s);\n" n n + | Int64 n -> + pr " %sv = caml_copy_int64 (%s);\n" n n | String n -> pr " %sv = caml_copy_string (%s);\n" n n | UInt64 n -> @@ -4394,7 +4399,7 @@ let print_ocaml_binding (name, { args; ret }) | ArrayAndLen _ | Bool _ | BytesOut _ | BytesPersistIn _ | BytesPersistOut _ | Callback _ | CallbackPersist _ - | Flags _ | Int64 _ | Mutable _ + | Flags _ | Mutable _ | Path _ | SockAddrAndLen _ | StringList _ | UInt _ | UInt32 _ -> assert false ) args; -- 2.20.1
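For context, the immediate consumer of this is the completion callback added by the next patch, whose 64-bit command handle is exactly the Int64 argument now expressible in callbacks; per this patch the Python binding marshals it with the "L" format (a Python int) and the OCaml binding via caml_copy_int64 (an int64). The C-level shape is:

  /* The 64-bit handle argument is the Int64 callback parameter that the
   * generator can now express.
   */
  typedef int (*notify_fn) (void *data, int64_t handle, int *error);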
Eric Blake
2019-Jun-29 13:28 UTC
[Libguestfs] [libnbd PATCH 4/6] states: Prepare for aio notify callback
Having the client polling thread perform an O(n) loop over all known in-flight commands after each time the poll woke up is somewhat inefficient, and in a multi-threaded setup requires additional locking beyond libnbd to track the set of known command handles. Better is a way for aio commands to call a notify callback the moment a specific command is ready to complete, and then a separate thread can gather the final completion status using just libnbd's locking, making the polling loop more efficient. This also provides an opportunity to clean up any opaque data and/or change the final command status (for example, writing a strict validator for nbd_aio_pread_structured can change the command from success to failure if the server violated protocol by not returning chunks to cover the entire read). We also want the client to be aware of any issued/in-flight commands that failed because they were stranded when the state machine moved to CLOSED or DEAD. Previously, nbd_aio_command_completed() would never locate such stranded commands, but adding a common point to fire the notifier for such commands makes it also possible to move those commands to the completion queue. This patch sets up the framework, with observable effects for stranded commands per the testsuite changes, but nothing yet actually sets the notify callback; that will come in the next patch. --- generator/states-reply.c | 8 ++++++++ generator/states.c | 29 +++++++++++++++++++++++++++++ lib/internal.h | 2 ++ tests/server-death.c | 17 ++++++++++++++--- 4 files changed, 53 insertions(+), 3 deletions(-) diff --git a/generator/states-reply.c b/generator/states-reply.c index 88274bc..c5cd790 100644 --- a/generator/states-reply.c +++ b/generator/states-reply.c @@ -180,6 +180,14 @@ save_reply_state (struct nbd_handle *h) else h->cmds_done = cmd; + /* Notify the user */ + if (cmd->cb.notify) { + int error = cmd->error; + + if (cmd->cb.notify (cmd->cb.opaque, handle, &error) == -1 && error) + cmd->error = error; + } + SET_NEXT_STATE (%.READY); return 0; diff --git a/generator/states.c b/generator/states.c index deea73c..c9c3ef7 100644 --- a/generator/states.c +++ b/generator/states.c @@ -111,6 +111,31 @@ send_from_wbuf (struct nbd_handle *h) return 0; /* move to next state */ } +/* Forcefully fail any remaining in-flight commands in list */ +void abort_commands (struct nbd_handle *h, + struct command_in_flight **list) +{ + struct command_in_flight *prev_cmd, *cmd; + + for (cmd = *list, prev_cmd = NULL; + cmd != NULL; + prev_cmd = cmd, cmd = cmd->next) { + if (cmd->cb.notify && cmd->type != NBD_CMD_DISC) { + int error = cmd->error ? cmd->error : ENOTCONN; + + if (cmd->cb.notify (cmd->cb.opaque, cmd->handle, &error) == -1 && error) + cmd->error = error; + } + if (cmd->error == 0) + cmd->error = ENOTCONN; + } + if (prev_cmd) { + prev_cmd->next = h->cmds_done; + h->cmds_done = *list; + *list = NULL; + } +} + /*----- End of prologue. 
-----*/ /* STATE MACHINE */ { @@ -127,6 +152,8 @@ send_from_wbuf (struct nbd_handle *h) DEAD: /* The caller should have used set_error() before reaching here */ assert (nbd_get_error ()); + abort_commands (h, &h->cmds_to_issue); + abort_commands (h, &h->cmds_in_flight); if (h->sock) { h->sock->ops->close (h->sock); h->sock = NULL; @@ -134,6 +161,8 @@ send_from_wbuf (struct nbd_handle *h) return -1; CLOSED: + abort_commands (h, &h->cmds_to_issue); + abort_commands (h, &h->cmds_in_flight); if (h->sock) { h->sock->ops->close (h->sock); h->sock = NULL; diff --git a/lib/internal.h b/lib/internal.h index 15f4b64..59074c2 100644 --- a/lib/internal.h +++ b/lib/internal.h @@ -239,6 +239,7 @@ typedef int (*extent_fn) (void *data, const char *metacontext, uint64_t offset, uint32_t *entries, size_t nr_entries, int *error); typedef int (*read_fn) (void *data, const void *buf, size_t count, uint64_t offset, int *error, int status); +typedef int (*notify_fn) (void *data, int64_t handle, int *error); struct command_cb { void *opaque; @@ -246,6 +247,7 @@ struct command_cb { extent_fn extent; read_fn read; } fn; + notify_fn notify; }; struct command_in_flight { diff --git a/tests/server-death.c b/tests/server-death.c index d490753..f8747e4 100644 --- a/tests/server-death.c +++ b/tests/server-death.c @@ -32,6 +32,8 @@ int main (int argc, char *argv[]) { struct nbd_handle *nbd; + int err; + const char *msg; char buf[512]; int64_t handle; char pidfile[] = "/tmp/libnbd-test-disconnectXXXXXX"; @@ -123,16 +125,25 @@ main (int argc, char *argv[]) goto fail; } - /* Proof that the read was stranded */ - if (nbd_aio_peek_command_completed (nbd) != 0) { + /* Detection of the dead server completes all remaining in-flight commands */ + if (nbd_aio_peek_command_completed (nbd) != handle) { fprintf (stderr, "%s: test failed: nbd_aio_peek_command_completed\n", argv[0]); goto fail; } - if (nbd_aio_command_completed (nbd, handle) != 0) { + if (nbd_aio_command_completed (nbd, handle) != -1) { fprintf (stderr, "%s: test failed: nbd_aio_command_completed\n", argv[0]); goto fail; } + msg = nbd_get_error (); + err = nbd_get_errno (); + printf ("error: \"%s\"\n", msg); + printf ("errno: %d (%s)\n", err, strerror (err)); + if (err != ENOTCONN) { + fprintf (stderr, "%s: test failed: unexpected errno %d (%s)\n", argv[0], + err, strerror (err)); + goto fail; + } close (fd); unlink (pidfile); -- 2.20.1
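As an illustration of the hook this sets up (a sketch; my_notify is an invented name and nothing yet registers it until the next patch), a notify callback that distinguishes a normal completion from a command stranded by connection loss might look like:

  #include <errno.h>
  #include <inttypes.h>
  #include <stdint.h>
  #include <stdio.h>

  /* Called by the state machine with the libnbd lock held, either when
   * the reply for 'handle' has been fully processed, or when the
   * connection moved to CLOSED/DEAD and the command was stranded (in
   * that case *error arrives as ENOTCONN unless the command already
   * carried an error).  It must not call nbd_* APIs on the same handle.
   */
  static int
  my_notify (void *data, int64_t handle, int *error)
  {
    if (*error == ENOTCONN)
      fprintf (stderr, "command %" PRIi64 " stranded by connection loss\n",
               handle);
    /* Assigning into *error and returning -1 would override the final
     * command status; returning 0 leaves it unchanged.
     */
    return 0;
  }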
Eric Blake
2019-Jun-29 13:28 UTC
[Libguestfs] [libnbd PATCH 5/6] api: Add new nbd_aio_FOO_notify functions
As mentioned in the previous patch, there are situations where an aio client wants instant notification when a given command is complete, rather than having to maintain a separate data structure to track all in-flight commands and then iterate over that structure to learn which commands are complete. It's also desirable when writing a server validation program (such as for checking structured reads for compliance) to be able to clean up the associated opaque data and have a final chance to change the overall command status. Introduce new nbd_aio_FOO_notify functions for each command. Rewire the existing nbd_aio_FOO to forward to the new command. (Perhaps the generator could reduce some of the boilerplate duplication, if a later patch wants to refactor this). --- docs/libnbd.pod | 22 +++- generator/generator | 278 +++++++++++++++++++++++++++++++++++++++++--- lib/rw.c | 99 ++++++++++++++-- 3 files changed, 374 insertions(+), 25 deletions(-) diff --git a/docs/libnbd.pod b/docs/libnbd.pod index ede2539..93e80d4 100644 --- a/docs/libnbd.pod +++ b/docs/libnbd.pod @@ -169,7 +169,27 @@ has completed: } For almost all high level synchronous calls (eg. C<nbd_pread>) there -is a low level asynchronous equivalent (eg. C<nbd_aio_pread>). +are two low level asynchronous equivalents (eg. C<nbd_aio_pread> for +starting a command, and C<nbd_aio_pread_notify> for also registering +a callback to be invoked right before the command is complete). + +=head1 CALLBACKS + +Some of the high-level commands (C<nbd_pread_structured>, +C<nbd_block_status>) involve the use of a callback function invoked by +the state machine at appropriate points in the server's reply before +the overall command is complete. Also, all of the low-level commands +have a notify variant that registers a callback function used right +before the command is marked complete. These callback functions +include a parameter C<error> containing the value of any error +detected so far; if the callback function fails, it should assign back +into C<error> and return C<-1> to change the resulting error of the +overall command. + +The callbacks are invoked at a point where the libnbd lock is held; as +such, it is unsafe for the callback to call any C<nbd_*> APIs on the +same nbd object, as it would cause deadlock. Functions that take two +callback pointers share the same opaque data for both calls. =head1 ERROR HANDLING diff --git a/generator/generator b/generator/generator index c5988e2..fe73c15 100755 --- a/generator/generator +++ b/generator/generator @@ -1708,9 +1708,40 @@ on the connection."; Issue a read command to the NBD server. This returns the unique positive 64 bit handle for this command, or C<-1> on error. To check if the command completed, call -C<nbd_aio_command_completed>. Note that you must ensure +C<nbd_aio_command_completed>, or use C<nbd_aio_pread_notify>. +Note that you must ensure C<buf> is valid until the command +has completed. Other parameters behave as documented in +C<nbd_pread>."; + }; + + "aio_pread_notify", { + default_call with + args = [ BytesPersistOut ("buf", "count"); UInt64 "offset"; + Opaque "data"; + CallbackPersist ("notify", [ Opaque "data"; Int64 "handle"; + Mutable (Int "error") ]); + Flags "flags" ]; + ret = RInt64; + permitted_states = [ Connected ]; + shortdesc = "read from the NBD server, and notify on completion"; + longdesc = "\ +Issue a read command to the NBD server. This returns the +unique positive 64 bit handle for this command, or C<-1> on +error. 
If this command returns a handle, then the C<notify> +callback will be called when the server is done replying, +although you must still use C<nbd_aio_command_completed> after +the callback to retire the command. Note that you must ensure C<buf> is valid until the command has completed. Other -parameters behave as documented in C<nbd_pread>."; +parameters behave as documented in C<nbd_pread>. + +The C<notify> callback is called with the same C<data> passed to +this function, C<handle> set to the return value of this function, +and C<error> containing the command's result so far. The callback +may modify the overall status of the command by storing into +C<error> and returning C<-1>, although attempts to undo non-zero +status back to zero are ignored. The callback cannot call C<nbd_*> +APIs on the same handle since it holds the handle lock and will +cause a deadlock."; }; "aio_pread_structured", { @@ -1730,8 +1761,43 @@ parameters behave as documented in C<nbd_pread>."; Issue a read command to the NBD server. This returns the unique positive 64 bit handle for this command, or C<-1> on error. To check if the command completed, call -C<nbd_aio_command_completed>. Parameters behave as documented -in C<nbd_pread_structured>."; +C<nbd_aio_command_completed>, or use +C<nbd_aio_pread_structured_notify>. Parameters behave as +documented in C<nbd_pread_structured>."; + }; + + "aio_pread_structured_notify", { + default_call with + args = [ BytesPersistOut ("buf", "count"); UInt64 "offset"; + Opaque "data"; + CallbackPersist ("chunk", [ Opaque "data"; + BytesIn ("subbuf", "count"); + UInt64 "offset"; + Mutable (Int "error"); + Int "status" ]); + CallbackPersist ("notify", [ Opaque "data"; Int64 "handle"; + Mutable (Int "error") ]); + Flags "flags" ]; + ret = RInt64; + permitted_states = [ Connected ]; + shortdesc = "read from the NBD server, and notify on completion"; + longdesc = "\ +Issue a read command to the NBD server. This returns the +unique positive 64 bit handle for this command, or C<-1> on +error. If this command returns a handle, then the C<notify> +callback will be called when the server is done replying, +although you must still use C<nbd_aio_command_completed> after +the callback to retire the command. Other parameters behave as +documented in C<nbd_pread_structured>. + +The C<notify> callback is called with the same C<data> passed to +this function, C<handle> set to the return value of this function, +and C<error> containing the command's result so far. The callback +may modify the overall status of the command by storing into +C<error> and returning C<-1>, although attempts to undo non-zero +status back to zero are ignored. The callback cannot call C<nbd_*> +APIs on the same handle since it holds the handle lock and will +cause a deadlock."; }; "aio_pwrite", { @@ -1744,9 +1810,40 @@ in C<nbd_pread_structured>."; Issue a write command to the NBD server. This returns the unique positive 64 bit handle for this command, or C<-1> on error. To check if the command completed, call -C<nbd_aio_command_completed>. Note that you must ensure +C<nbd_aio_command_completed>, or use C<nbd_aio_pwrite_notify>. +Note that you must ensure C<buf> is valid until the command +has completed. 
Other parameters behave as documented in +C<nbd_pwrite>."; + }; + + "aio_pwrite_notify", { + default_call with + args = [ BytesPersistIn ("buf", "count"); UInt64 "offset"; + Opaque "data"; + CallbackPersist ("notify", [ Opaque "data"; Int64 "handle"; + Mutable (Int "error") ]); + Flags "flags" ]; + ret = RInt64; + permitted_states = [ Connected ]; + shortdesc = "write to the NBD server, and notify on completion"; + longdesc = "\ +Issue a write command to the NBD server. This returns the +unique positive 64 bit handle for this command, or C<-1> on +error. If this command returns a handle, then the C<notify> +callback will be called when the server is done replying, +although you must still use C<nbd_aio_command_completed> after +the callback to retire the command. Note that you must ensure C<buf> is valid until the command has completed. Other -parameters behave as documented in C<nbd_pwrite>."; +parameters behave as documented in C<nbd_pwrite>. + +The C<notify> callback is called with the same C<data> passed to +this function, C<handle> set to the return value of this function, +and C<error> containing the command's result so far. The callback +may modify the overall status of the command by storing into +C<error> and returning C<-1>, although attempts to undo non-zero +status back to zero are ignored. The callback cannot call C<nbd_*> +APIs on the same handle since it holds the handle lock and will +cause a deadlock."; }; "aio_disconnect", { @@ -1780,8 +1877,36 @@ however, C<nbd_shutdown> will call this function if appropriate."; Issue the flush command to the NBD server. This returns the unique positive 64 bit handle for this command, or C<-1> on error. To check if the command completed, call -C<nbd_aio_command_completed>. Parameters behave as documented -in C<nbd_flush>."; +C<nbd_aio_command_completed>, or use C<nbd_aio_flush_notify>. +Parameters behave as documented in C<nbd_flush>."; + }; + + "aio_flush_notify", { + default_call with + args = [ Opaque "data"; + CallbackPersist ("notify", [ Opaque "data"; Int64 "handle"; + Mutable (Int "error") ]); + Flags "flags" ]; + ret = RInt64; + permitted_states = [ Connected ]; + shortdesc = "send flush command to the NBD server, and notify on completion"; + longdesc = "\ +Issue the flush command to the NBD server. This returns the +unique positive 64 bit handle for this command, or C<-1> on +error. If this command returns a handle, then the C<notify> +callback will be called when the server is done replying, +although you must still use C<nbd_aio_command_completed> after +the callback to retire the command. Other parameters behave as +documented in C<nbd_flush>. + +The C<notify> callback is called with the same C<data> passed to +this function, C<handle> set to the return value of this function, +and C<error> containing the command's result so far. The callback +may modify the overall status of the command by storing into +C<error> and returning C<-1>, although attempts to undo non-zero +status back to zero are ignored. The callback cannot call C<nbd_*> +APIs on the same handle since it holds the handle lock and will +cause a deadlock."; }; "aio_trim", { @@ -1794,8 +1919,37 @@ in C<nbd_flush>."; Issue a trim command to the NBD server. This returns the unique positive 64 bit handle for this command, or C<-1> on error. To check if the command completed, call -C<nbd_aio_command_completed>. Parameters behave as documented -in C<nbd_trim>."; +C<nbd_aio_command_completed>, or use C<nbd_aio_trim_notify>. 
+Parameters behave as documented in C<nbd_trim>."; + }; + + "aio_trim_notify", { + default_call with + args = [ UInt64 "count"; UInt64 "offset"; + Opaque "data"; + CallbackPersist ("notify", [ Opaque "data"; Int64 "handle"; + Mutable (Int "error") ]); + Flags "flags" ]; + ret = RInt64; + permitted_states = [ Connected ]; + shortdesc = "send trim command to the NBD server, and notify on completion"; + longdesc = "\ +Issue a trim command to the NBD server. This returns the +unique positive 64 bit handle for this command, or C<-1> on +error. If this command returns a handle, then the C<notify> +callback will be called when the server is done replying, +although you must still use C<nbd_aio_command_completed> after +the callback to retire the command. Other parameters behave as +documented in C<nbd_trim>. + +The C<notify> callback is called with the same C<data> passed to +this function, C<handle> set to the return value of this function, +and C<error> containing the command's result so far. The callback +may modify the overall status of the command by storing into +C<error> and returning C<-1>, although attempts to undo non-zero +status back to zero are ignored. The callback cannot call C<nbd_*> +APIs on the same handle since it holds the handle lock and will +cause a deadlock."; }; "aio_cache", { @@ -1808,8 +1962,37 @@ in C<nbd_trim>."; Issue the cache (prefetch) command to the NBD server. This returns the unique positive 64 bit handle for this command, or C<-1> on error. To check if the command completed, call -C<nbd_aio_command_completed>. Parameters behave as documented -in C<nbd_cache>."; +C<nbd_aio_command_completed>, or use C<nbd_aio_cache_notify>. +Parameters behave as documented in C<nbd_cache>."; + }; + + "aio_cache_notify", { + default_call with + args = [ UInt64 "count"; UInt64 "offset"; + Opaque "data"; + CallbackPersist ("notify", [ Opaque "data"; Int64 "handle"; + Mutable (Int "error") ]); + Flags "flags" ]; + ret = RInt64; + permitted_states = [ Connected ]; + shortdesc = "send cache (prefetch) command to the NBD server, and notify on completion"; + longdesc = "\ +Issue the cache (prefetch) command to the NBD server. This +returns the unique positive 64 bit handle for this command, or +C<-1> on error. If this command returns a handle, then the C<notify> +callback will be called when the server is done replying, +although you must still use C<nbd_aio_command_completed> after +the callback to retire the command. Other parameters behave as +documented in C<nbd_cache>. + +The C<notify> callback is called with the same C<data> passed to +this function, C<handle> set to the return value of this function, +and C<error> containing the command's result so far. The callback +may modify the overall status of the command by storing into +C<error> and returning C<-1>, although attempts to undo non-zero +status back to zero are ignored. The callback cannot call C<nbd_*> +APIs on the same handle since it holds the handle lock and will +cause a deadlock."; }; "aio_zero", { @@ -1822,8 +2005,37 @@ in C<nbd_cache>."; Issue a write zeroes command to the NBD server. This returns the unique positive 64 bit handle for this command, or C<-1> on error. To check if the command completed, call -C<nbd_aio_command_completed>. Parameters behave as documented -in C<nbd_zero>."; +C<nbd_aio_command_completed>, or use C<nbd_aio_zero_notify>. 
+Parameters behave as documented in C<nbd_zero>."; + }; + + "aio_zero_notify", { + default_call with + args = [ UInt64 "count"; UInt64 "offset"; + Opaque "data"; + CallbackPersist ("notify", [ Opaque "data"; Int64 "handle"; + Mutable (Int "error") ]); + Flags "flags" ]; + ret = RInt64; + permitted_states = [ Connected ]; + shortdesc = "send write zeroes command to the NBD server, and notify on completion"; + longdesc = "\ +Issue a write zeroes command to the NBD server. This returns the +unique positive 64 bit handle for this command, or C<-1> on +error. If this command returns a handle, then the C<notify> +callback will be called when the server is done replying, +although you must still use C<nbd_aio_command_completed> after +the callback to retire the command. Other parameters behave as +documented in C<nbd_zero>. + +The C<notify> callback is called with the same C<data> passed to +this function, C<handle> set to the return value of this function, +and C<error> containing the command's result so far. The callback +may modify the overall status of the command by storing into +C<error> and returning C<-1>, although attempts to undo non-zero +status back to zero are ignored. The callback cannot call C<nbd_*> +APIs on the same handle since it holds the handle lock and will +cause a deadlock."; }; "aio_block_status", { @@ -1843,8 +2055,42 @@ in C<nbd_zero>."; Send the block status command to the NBD server. This returns the unique positive 64 bit handle for this command, or C<-1> on error. To check if the command completed, call -C<nbd_aio_command_completed>. Parameters behave as documented -in C<nbd_block_status>."; +C<nbd_aio_command_completed>, or use C<nbd_aio_block_status_notify>. +Parameters behave as documented in C<nbd_block_status>."; + }; + + "aio_block_status_notify", { + default_call with + args = [ UInt64 "count"; UInt64 "offset"; + Opaque "data"; + CallbackPersist ("extent", [Opaque "data"; String "metacontext"; + UInt64 "offset"; + ArrayAndLen (UInt32 "entries", + "nr_entries"); + Mutable (Int "error") ]); + CallbackPersist ("notify", [ Opaque "data"; Int64 "handle"; + Mutable (Int "error") ]); + Flags "flags" ]; + ret = RInt64; + permitted_states = [ Connected ]; + shortdesc = "send block status command to the NBD server, and notify on completion"; + longdesc = "\ +Send the block status command to the NBD server. This returns the +unique positive 64 bit handle for this command, or C<-1> on +error. If this command returns a handle, then the C<notify> +callback will be called when the server is done replying, +although you must still use C<nbd_aio_command_completed> after +the callback to retire the command. Other parameters behave as +documented in C<nbd_block_status>. + +The C<notify> callback is called with the same C<data> passed to +this function, C<handle> set to the return value of this function, +and C<error> containing the command's result so far. The callback +may modify the overall status of the command by storing into +C<error> and returning C<-1>, although attempts to undo non-zero +status back to zero are ignored. 
The callback cannot call C<nbd_*> +APIs on the same handle since it holds the handle lock and will +cause a deadlock."; }; "aio_get_fd", { diff --git a/lib/rw.c b/lib/rw.c index 53cd521..93388a9 100644 --- a/lib/rw.c +++ b/lib/rw.c @@ -249,6 +249,17 @@ int64_t nbd_unlocked_aio_pread (struct nbd_handle *h, void *buf, size_t count, uint64_t offset, uint32_t flags) { + return nbd_unlocked_aio_pread_notify (h, buf, count, offset, NULL, NULL, + flags); +} + +int64_t +nbd_unlocked_aio_pread_notify (struct nbd_handle *h, void *buf, + size_t count, uint64_t offset, + void *opaque, notify_fn notify, uint32_t flags) +{ + struct command_cb cb = { .opaque = opaque, .notify = notify, }; + /* We could silently accept flag DF, but it really only makes sense * with callbacks, because otherwise there is no observable change * except that the server may fail where it would otherwise succeed. @@ -259,7 +270,7 @@ nbd_unlocked_aio_pread (struct nbd_handle *h, void *buf, } return nbd_internal_command_common (h, 0, NBD_CMD_READ, offset, count, - buf, NULL); + buf, &cb); } int64_t @@ -267,7 +278,18 @@ nbd_unlocked_aio_pread_structured (struct nbd_handle *h, void *buf, size_t count, uint64_t offset, void *opaque, read_fn read, uint32_t flags) { - struct command_cb cb = { .opaque = opaque, .fn.read = read, }; + return nbd_unlocked_aio_pread_structured_notify (h, buf, count, offset, + opaque, read, NULL, flags); +} + +int64_t +nbd_unlocked_aio_pread_structured_notify (struct nbd_handle *h, void *buf, + size_t count, uint64_t offset, + void *opaque, read_fn read, + notify_fn notify, uint32_t flags) +{ + struct command_cb cb = { .opaque = opaque, .fn.read = read, + .notify = notify, }; if ((flags & ~LIBNBD_CMD_FLAG_DF) != 0) { set_error (EINVAL, "invalid flag: %" PRIu32, flags); @@ -289,6 +311,17 @@ nbd_unlocked_aio_pwrite (struct nbd_handle *h, const void *buf, size_t count, uint64_t offset, uint32_t flags) { + return nbd_unlocked_aio_pwrite_notify (h, buf, count, offset, NULL, NULL, + flags); +} + +int64_t +nbd_unlocked_aio_pwrite_notify (struct nbd_handle *h, const void *buf, + size_t count, uint64_t offset, + void *opaque, notify_fn notify, uint32_t flags) +{ + struct command_cb cb = { .opaque = opaque, .notify = notify, }; + if (nbd_unlocked_read_only (h) == 1) { set_error (EINVAL, "server does not support write operations"); return -1; @@ -306,12 +339,21 @@ nbd_unlocked_aio_pwrite (struct nbd_handle *h, const void *buf, } return nbd_internal_command_common (h, flags, NBD_CMD_WRITE, offset, count, - (void *) buf, NULL); + (void *) buf, &cb); } int64_t nbd_unlocked_aio_flush (struct nbd_handle *h, uint32_t flags) { + return nbd_unlocked_aio_flush_notify (h, NULL, NULL, flags); +} + +int64_t +nbd_unlocked_aio_flush_notify (struct nbd_handle *h, void *opaque, + notify_fn notify, uint32_t flags) +{ + struct command_cb cb = { .opaque = opaque, .notify = notify, }; + if (nbd_unlocked_can_flush (h) != 1) { set_error (EINVAL, "server does not support flush operations"); return -1; @@ -323,7 +365,7 @@ nbd_unlocked_aio_flush (struct nbd_handle *h, uint32_t flags) } return nbd_internal_command_common (h, 0, NBD_CMD_FLUSH, 0, 0, - NULL, NULL); + NULL, &cb); } int64_t @@ -331,6 +373,16 @@ nbd_unlocked_aio_trim (struct nbd_handle *h, uint64_t count, uint64_t offset, uint32_t flags) { + return nbd_unlocked_aio_trim_notify (h, count, offset, NULL, NULL, flags); +} + +int64_t +nbd_unlocked_aio_trim_notify (struct nbd_handle *h, + uint64_t count, uint64_t offset, + void *opaque, notify_fn notify, uint32_t flags) +{ + struct 
command_cb cb = { .opaque = opaque, .notify = notify, }; + if (nbd_unlocked_read_only (h) == 1) { set_error (EINVAL, "server does not support write operations"); return -1; @@ -353,13 +405,23 @@ nbd_unlocked_aio_trim (struct nbd_handle *h, } return nbd_internal_command_common (h, flags, NBD_CMD_TRIM, offset, count, - NULL, NULL); + NULL, &cb); } int64_t nbd_unlocked_aio_cache (struct nbd_handle *h, uint64_t count, uint64_t offset, uint32_t flags) { + return nbd_unlocked_aio_cache_notify (h, count, offset, NULL, NULL, flags); +} + +int64_t +nbd_unlocked_aio_cache_notify (struct nbd_handle *h, + uint64_t count, uint64_t offset, + void *opaque, notify_fn notify, uint32_t flags) +{ + struct command_cb cb = { .opaque = opaque, .notify = notify, }; + /* Actually according to the NBD protocol document, servers do exist * that support NBD_CMD_CACHE but don't advertise the * NBD_FLAG_SEND_CACHE bit, but we ignore those. @@ -375,7 +437,7 @@ nbd_unlocked_aio_cache (struct nbd_handle *h, } return nbd_internal_command_common (h, 0, NBD_CMD_CACHE, offset, count, - NULL, NULL); + NULL, &cb); } int64_t @@ -383,6 +445,16 @@ nbd_unlocked_aio_zero (struct nbd_handle *h, uint64_t count, uint64_t offset, uint32_t flags) { + return nbd_unlocked_aio_zero_notify (h, count, offset, NULL, NULL, flags); +} + +int64_t +nbd_unlocked_aio_zero_notify (struct nbd_handle *h, + uint64_t count, uint64_t offset, + void *opaque, notify_fn notify, uint32_t flags) +{ + struct command_cb cb = { .opaque = opaque, .notify = notify, }; + if (nbd_unlocked_read_only (h) == 1) { set_error (EINVAL, "server does not support write operations"); return -1; @@ -405,7 +477,7 @@ nbd_unlocked_aio_zero (struct nbd_handle *h, } return nbd_internal_command_common (h, flags, NBD_CMD_WRITE_ZEROES, offset, - count, NULL, NULL); + count, NULL, &cb); } int64_t @@ -414,7 +486,18 @@ nbd_unlocked_aio_block_status (struct nbd_handle *h, void *data, extent_fn extent, uint32_t flags) { - struct command_cb cb = { .opaque = data, .fn.extent = extent, }; + return nbd_unlocked_aio_block_status_notify (h, count, offset, data, extent, + NULL, flags); +} + +int64_t +nbd_unlocked_aio_block_status_notify (struct nbd_handle *h, + uint64_t count, uint64_t offset, + void *data, extent_fn extent, + notify_fn notify, uint32_t flags) +{ + struct command_cb cb = { .opaque = data, .fn.extent = extent, + .notify = notify, }; if (!h->structured_replies) { set_error (ENOTSUP, "server does not support structured replies"); -- 2.20.1
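The semantics of the new notify callback are spread across the longdesc
strings above, so a minimal self-contained sketch may help.  It assumes
the C prototypes generated by this series (a notify callback of type
int (*) (void *data, int64_t handle, int *error)); the default URI and
the acceptance policy inside the callback are invented for illustration
only:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <inttypes.h>

#include <libnbd.h>

static char buf[512];

/* Called by the state machine when the server has finished replying,
 * while libnbd still holds the handle lock: record a verdict only, do
 * not call any nbd_* API from here. */
static int
check_read (void *data, int64_t handle, int *error)
{
  if (*error) {
    fprintf (stderr, "pread %" PRId64 " failed: %s\n",
             handle, strerror (*error));
    return 0;                   /* keep the server-reported error */
  }
  /* Invented acceptance policy, purely for illustration: treat a read
   * whose first byte is zero as a protocol-level failure. */
  if (((char *) data)[0] == 0) {
    *error = EPROTO;
    return -1;                  /* command now retires with EPROTO */
  }
  return 0;
}

int
main (int argc, char *argv[])
{
  struct nbd_handle *nbd;
  int64_t handle;
  int r;

  nbd = nbd_create ();
  if (nbd == NULL ||
      nbd_connect_uri (nbd, argc > 1 ? argv[1] : "nbd://localhost") == -1) {
    fprintf (stderr, "%s\n", nbd_get_error ());
    exit (EXIT_FAILURE);
  }

  handle = nbd_aio_pread_notify (nbd, buf, sizeof buf, 0, buf,
                                 check_read, 0);
  if (handle == -1) {
    fprintf (stderr, "%s\n", nbd_get_error ());
    exit (EXIT_FAILURE);
  }

  /* The notify callback fires inside the state machine; the command
   * must still be retired from the main loop afterwards. */
  while ((r = nbd_aio_command_completed (nbd, handle)) == 0) {
    if (nbd_poll (nbd, -1) == -1) {
      fprintf (stderr, "%s\n", nbd_get_error ());
      exit (EXIT_FAILURE);
    }
  }
  if (r == -1)
    fprintf (stderr, "read rejected: %s\n", nbd_get_error ());

  nbd_shutdown (nbd);
  nbd_close (nbd);
  exit (r == 1 ? EXIT_SUCCESS : EXIT_FAILURE);
}

It can be built the same way as the files under examples/, linking
against libnbd.  Because the callback runs while libnbd holds the
handle lock, it only records a verdict; retiring the command and every
other nbd_* call stays in the main loop.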
Eric Blake
2019-Jun-29 13:28 UTC
[Libguestfs] [libnbd PATCH 6/6] examples: New example for strict read validations
Demonstrate a use of the new nbd_pread_structured_verify API by writing a strict validation that a server's structured replies comply with the specification (well, 99% strict, as I did not check that the server does not return an error at the same offset twice). I was able to test that qemu-nbd is compliant. An example run: $ qemu-img create -f qcow2 file 32m $ for i in `seq 32`; do qemu-io -f qcow2 -d unmap -c "w -zu $((i-1))m 512k" file; done $ qemu-nbd -f qcow2 -p 10888 file $ ./examples/strict-structured-reads nbd://localhost:10888 totals: data chunks: 1768 data bytes: 1559232512 hole chunks: 1284 hole bytes: 537919488 all chunks: 3052 reads: 1000 bytes read: 2097152000 compliant: 1000 But since qemu-nbd always returns chunks in order, there may still be lurking bugs in my code to handle out-of-order replies. Maybe someday nbdkit will make it easy to write a server that returns out-of-order chunks. --- .gitignore | 1 + examples/Makefile.am | 14 ++ examples/strict-structured-reads.c | 270 +++++++++++++++++++++++++++++ 3 files changed, 285 insertions(+) create mode 100644 examples/strict-structured-reads.c diff --git a/.gitignore b/.gitignore index d4828fa..edbf941 100644 --- a/.gitignore +++ b/.gitignore @@ -44,6 +44,7 @@ Makefile.in /examples/threaded-reads-and-writes /examples/simple-fetch-first-sector /examples/simple-reads-and-writes +/examples/strict-structured-reads /generator/generator-cache.v1 /generator/stamp-generator /html/*.?.html diff --git a/examples/Makefile.am b/examples/Makefile.am index f0d03f1..7560855 100644 --- a/examples/Makefile.am +++ b/examples/Makefile.am @@ -24,6 +24,7 @@ noinst_PROGRAMS = \ simple-fetch-first-sector \ simple-reads-and-writes \ threaded-reads-and-writes \ + strict-structured-reads \ $(NULL) simple_fetch_first_sector_SOURCES = \ @@ -52,6 +53,19 @@ simple_reads_and_writes_LDADD = \ $(top_builddir)/lib/libnbd.la \ $(NULL) +strict_structured_reads_SOURCES = \ + strict-structured-reads.c \ + $(NULL) +strict_structured_reads_CPPFLAGS = \ + -I$(top_srcdir)/include \ + $(NULL) +strict_structured_reads_CFLAGS = \ + $(WARNINGS_CFLAGS) \ + $(NULL) +strict_structured_reads_LDADD = \ + $(top_builddir)/lib/libnbd.la \ + $(NULL) + threaded_reads_and_writes_SOURCES = \ threaded-reads-and-writes.c \ $(NULL) diff --git a/examples/strict-structured-reads.c b/examples/strict-structured-reads.c new file mode 100644 index 0000000..e75f5a3 --- /dev/null +++ b/examples/strict-structured-reads.c @@ -0,0 +1,270 @@ +/* Example usage with qemu-nbd: + * + * sock=`mktemp -u` + * qemu-nbd -f $format -k $sock -r image + * ./strict-structured-reads $sock + * + * This will perform read randomly over the image and check that all + * structured replies comply with the NBD spec (chunks may be out of + * order or interleaved, but no read succeeds unless chunks cover the + * entire region, with no overlapping or zero-length chunks). + */ + +#include <stdio.h> +#include <stdlib.h> +#include <inttypes.h> +#include <time.h> +#include <assert.h> +#include <errno.h> +#include <string.h> + +#include <libnbd.h> + +/* A linked list of ranges still not seen. */ +struct range { + uint64_t first; + uint64_t last; + struct range *next; +}; + +/* Per-read data. 
*/ +struct data { + uint64_t offset; + size_t count; + uint32_t flags; + size_t chunks; + struct range *remaining; +}; + +#define MAX_BUF (2 * 1024 * 1024) +static char buf[MAX_BUF]; + +/* Various statistics */ +static int total_data_chunks; +static int total_data_bytes; +static int total_hole_chunks; +static int total_hole_bytes; +static int total_chunks; +static int total_df_reads; +static int total_reads; +static int64_t total_bytes; +static int total_success; + +static int +read_chunk (void *opaque, const void *bufv, size_t count, uint64_t offset, + int *error, int status) +{ + struct data *data = opaque; + struct range *r, **prev; + + /* libnbd guarantees this: */ + assert (offset >= data->offset); + assert (offset + count <= data->offset + data->count); + + switch (status) { + case LIBNBD_READ_DATA: + total_data_chunks++; + total_data_bytes += count; + break; + case LIBNBD_READ_HOLE: + total_hole_chunks++; + total_hole_bytes += count; + break; + case LIBNBD_READ_ERROR: + assert (count == 0); + count = 1; /* Ensure no further chunks visit that offset */ + break; + default: + goto error; + } + data->chunks++; + if (count == 0) { + fprintf (stderr, "buggy server: chunk must have non-zero size\n"); + goto error; + } + + /* Find element in remaining, or the server is in error */ + for (prev = &data->remaining, r = *prev; r; prev = &r->next, r = r->next) { + if (offset >= r->first) + break; + } + if (r == NULL || offset + count > r->last) { + /* we fail to detect double errors reported at the same offset, + * but at least the read is already going to fail. + */ + if (status == LIBNBD_READ_ERROR) + return 0; + fprintf (stderr, "buggy server: chunk with overlapping range\n"); + goto error; + } + + /* Resize or split r to track new remaining bytes */ + if (offset == r->first) { + if (offset + count == r->last) { + *prev = r->next; + free (r); + } + else + r->first += count; + } + else if (offset + count == r->last) { + r->last -= count; + } + else { + struct range *n = malloc (sizeof *n); + assert (n); + n->next = r->next; + r->next = n; + n->last = r->last; + r->last = offset - r->first; + n->first = offset + count; + } + + return 0; + error: + *error = EPROTO; + return -1; +} + +static int +read_verify (void *opaque, int64_t handle, int *error) +{ + struct data *data = opaque; + int ret = -1; + + total_reads++; + total_chunks += data->chunks; + if (*error) + goto cleanup; + assert (data->chunks > 0); + if (data->flags & LIBNBD_CMD_FLAG_DF) { + total_df_reads++; + if (data->chunks > 1) { + fprintf (stderr, "buggy server: too many chunks for DF flag\n"); + *error = EPROTO; + goto cleanup; + } + } + if (data->remaining && !*error) { + fprintf (stderr, "buggy server: not enough chunks on success\n"); + *error = EPROTO; + goto cleanup; + } + total_bytes += data->count; + total_success++; + ret = 0; + + cleanup: + while (data->remaining) { + struct range *r = data->remaining; + data->remaining = r->next; + free (r); + } + free (data); + return ret; +} + +int +main (int argc, char *argv[]) +{ + struct nbd_handle *nbd; + size_t i; + int64_t exportsize; + int64_t maxsize = MAX_BUF; + uint64_t offset; + + srand (time (NULL)); + + if (argc != 2) { + fprintf (stderr, "%s socket|uri\n", argv[0]); + exit (EXIT_FAILURE); + } + + nbd = nbd_create (); + if (nbd == NULL) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + if (strstr (argv[1], "://")) { + if (nbd_connect_uri (nbd, argv[1]) == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + } + else if 
(nbd_connect_unix (nbd, argv[1]) == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + + exportsize = nbd_get_size (nbd); + if (exportsize == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + if (exportsize < 512) { + fprintf (stderr, "image is too small for useful testing\n"); + exit (EXIT_FAILURE); + } + if (exportsize <= maxsize) + maxsize = exportsize - 1; + + /* Queue up 1000 parallel reads. We are reusing the same buffer, + * which is not safe in real life, but okay here because we aren't + * validating contents, only server behavior. + */ + for (i = 0; i < 1000; ++i) { + uint32_t flags = 0; + struct data *d = malloc (sizeof *d); + struct range *r = malloc (sizeof *r); + + assert (d && r); + offset = rand () % (exportsize - maxsize); + if (rand() & 1) + flags = LIBNBD_CMD_FLAG_DF; + *r = (struct range) { .first = offset, .last = offset + maxsize, }; + *d = (struct data) { .offset = offset, .count = maxsize, .flags = flags, + .remaining = r, }; + if (nbd_aio_pread_structured_notify (nbd, buf, sizeof buf, offset, d, + read_chunk, read_verify, + flags) == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + } + + while (nbd_aio_in_flight (nbd) > 0) { + int64_t handle = nbd_aio_peek_command_completed (nbd); + + if (handle == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + if (handle == 0) { + if (nbd_poll (nbd, -1) == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + } + else + nbd_aio_command_completed (nbd, handle); + } + + if (nbd_shutdown (nbd) == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + + nbd_close (nbd); + + printf ("totals:\n"); + printf (" data chunks: %10d\n", total_data_chunks); + printf (" data bytes: %10d\n", total_data_bytes); + printf (" hole chunks: %10d\n", total_hole_chunks); + printf (" hole bytes: %10d\n", total_hole_bytes); + printf (" all chunks: %10d\n", total_chunks); + printf (" reads: %10d\n", total_reads); + printf (" bytes read: %10" PRId64 "\n", total_bytes); + printf (" compliant: %10d\n", total_success); + + exit (EXIT_SUCCESS); +} -- 2.20.1
Richard W.M. Jones
2019-Jun-29 17:13 UTC
Re: [Libguestfs] [libnbd PATCH 1/6] api: Add nbd_aio_in_flight
Simple and obvious, ACK. Agree that if we decide to make h->in_flight atomic in future then we could make the read function is_locked = false. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-p2v converts physical machines to virtual machines. Boot with a live CD or over the network (PXE) and turn machines into KVM guests. http://libguestfs.org/virt-v2v
Richard W.M. Jones
2019-Jun-29 17:17 UTC
Re: [Libguestfs] [libnbd PATCH 2/6] generator: Allow DEAD state actions to run
This wasn't exactly how I imagined it - I thought we'd change the generator so that ‘return -1’ wouldn't stop the state machine, but would save an error indication, keep running the machine until it blocks, then return an error. However this is fine, so ACK. If I was going to improve this patch in some way (and didn't implement the idea above) then: I'd add a new macro for entering the DEAD state, setting the error, and returning the right code all in one statement. On the basis that it removes the scope for programmer error. This would require a small change to the generator so the new macro is recognized as working like SET_NEXT_STATE; and probably a second change to the generator to prevent ordinary SET_NEXT_STATE from jumping to the DEAD state, to force everyone to use the new macro in this case. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com libguestfs lets you edit virtual machines. Supports shell scripting, bindings from many languages. http://libguestfs.org
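To make the idea concrete, here is a self-contained sketch of that
pattern with invented names (SET_DEAD, struct handle), not the libnbd
generator's actual macros: a single macro records the error, forces
the terminal state and leaves the handler, so none of the three steps
can be forgotten.

#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <sys/types.h>

enum state { STATE_RUNNING, STATE_DEAD };

struct handle {
  enum state state;
  int saved_errno;
  char errmsg[256];
};

/* Record the error, move to the terminal state and return, all in one
 * statement. */
#define SET_DEAD(h, err, msg)                                   \
  do {                                                          \
    (h)->saved_errno = (err);                                   \
    snprintf ((h)->errmsg, sizeof (h)->errmsg, "%s: %s",        \
              (msg), strerror (err));                           \
    (h)->state = STATE_DEAD;                                    \
    return -1;                                                  \
  } while (0)

static int
handle_recv (struct handle *h, ssize_t r)
{
  if (r == -1)
    SET_DEAD (h, errno, "recv");
  /* ... normal reply processing would continue here ... */
  return 0;
}

int
main (void)
{
  struct handle h = { STATE_RUNNING, 0, "" };

  errno = ECONNRESET;           /* simulate a failed recv */
  handle_recv (&h, -1);
  printf ("state=%s errno=%d msg=%s\n",
          h.state == STATE_DEAD ? "DEAD" : "RUNNING",
          h.saved_errno, h.errmsg);
  return 0;
}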
Richard W.M. Jones
2019-Jun-29 17:18 UTC
Re: [Libguestfs] [libnbd PATCH 3/6] generator: Allow Int64 in callbacks
Python & OCaml versions look fine, ACK. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-builder quickly builds VMs from scratch http://libguestfs.org/virt-builder.1.html
Richard W.M. Jones
2019-Jun-29 17:20 UTC
Re: [Libguestfs] [libnbd PATCH 4/6] states: Prepare for aio notify callback
Sensible, ACK. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-p2v converts physical machines to virtual machines. Boot with a live CD or over the network (PXE) and turn machines into KVM guests. http://libguestfs.org/virt-v2v
Richard W.M. Jones
2019-Jun-29 17:28 UTC
Re: [Libguestfs] [libnbd PATCH 5/6] api: Add new nbd_aio_FOO_notify functions
Obvious change given the previous patch, so ACK.

I do wonder if we should make the notify parameter mandatory (and
therefore avoid the duplicate functions).

Another trick you might do if feeling really keen is to
programmatically generate the notify variants of commands, something
like this:

let make_notify_variant_of_call ({ args; longdesc } as call) =
  let args = List.rev args in
  let flags, args = List.hd args, List.tl args in
  assert (flags = Flags "flags");
  let args = flags :: CallbackPersist ("notify", [ etc ]) :: args in
  let args = List.rev args in
  let longdesc = longdesc ^ "\nThe C<notify> callback blah blah ..." in
  { call with args; longdesc }

let aio_pwrite_call = {
  default_call with
  args = [ BytesPersistIn ("buf", "count"); UInt64 "offset";
           Flags "flags" ];
  (* etc copy the definition of aio_pwrite *)
}

let aio_pwrite_notify_call = make_notify_variant_of_call aio_pwrite_call

let handle_calls = [
  ...
  "aio_pwrite", aio_pwrite_call;
  "aio_pwrite_notify", aio_pwrite_notify_call;
  ...
]

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/
Richard W.M. Jones
2019-Jun-30 09:45 UTC
Re: [Libguestfs] [libnbd PATCH 6/6] examples: New example for strict read validations
ACK Thanks, Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com libguestfs lets you edit virtual machines. Supports shell scripting, bindings from many languages. http://libguestfs.org
Eric Blake
2019-Jul-02 14:48 UTC
Re: [Libguestfs] [libnbd PATCH 4/6] states: Prepare for aio notify callback
On 6/29/19 8:28 AM, Eric Blake wrote:> Having the client polling thread perform an O(n) loop over all known > in-flight commands after each time the poll woke up is somewhat > inefficient, and in a multi-threaded setup requires additional locking > beyond libnbd to track the set of known command handles. Better is a > way for aio commands to call a notify callback the moment a specific > command is ready to complete, and then a separate thread can gather > the final completion status using just libnbd's locking, making the > polling loop more efficient. This also provides an opportunity to > clean up any opaque data and/or change the final command status (for > example, writing a strict validator for nbd_aio_pread_structured can > change the command from success to failure if the server violated > protocol by not returning chunks to cover the entire read). > > We also want the client to be aware of any issued/in-flight commands > that failed because they were stranded when the state machine moved to > CLOSED or DEAD. Previously, nbd_aio_command_completed() would never > locate such stranded commands, but adding a common point to fire the > notifier for such commands makes it also possible to move those > commands to the completion queue. > > This patch sets up the framework, with observable effects for stranded > commands per the testsuite changes, but nothing yet actually sets the > notify callback; that will come in the next patch. > ---> +/* Forcefully fail any remaining in-flight commands in list */ > +void abort_commands (struct nbd_handle *h, > + struct command_in_flight **list) > +{ > + struct command_in_flight *prev_cmd, *cmd; > + > + for (cmd = *list, prev_cmd = NULL; > + cmd != NULL; > + prev_cmd = cmd, cmd = cmd->next) { > + if (cmd->cb.notify && cmd->type != NBD_CMD_DISC) { > + int error = cmd->error ? cmd->error : ENOTCONN; > + > + if (cmd->cb.notify (cmd->cb.opaque, cmd->handle, &error) == -1 && error) > + cmd->error = error; > + }Note that this special-cases NBD_CMD_DISC - since we did not return a handle to the user, nor add an nbd_aio_disconnect_notify() variant, and since the server (if compliant) does not reply to NBD_CMD_DISC, I decided it did not make sense to call a notify callback for that command (instead, you know the disconnect command completed when the state machine moves to CLOSED or DEAD). But my special-casing was incomplete; I'm squashing this in before pushing: diff --git a/lib/aio.c b/lib/aio.c index b29378b..748665e 100644 --- a/lib/aio.c +++ b/lib/aio.c @@ -69,7 +69,7 @@ nbd_unlocked_aio_command_completed (struct nbd_handle *h, if (cmd->handle == handle) break; } - if (!cmd) + if (!cmd || cmd->type == NBD_CMD_DISC) return 0; type = cmd->type; @@ -103,6 +103,14 @@ nbd_unlocked_aio_command_completed (struct nbd_handle *h, int64_t nbd_unlocked_aio_peek_command_completed (struct nbd_handle *h) { + /* Special case NBD_CMD_DISC, as it does not have a user-visible handle */ + if (h->cmds_done && h->cmds_done->type == NBD_CMD_DISC) { + struct command_in_flight *cmd = h->cmds_done; + + h->cmds_done = cmd->next; + free (cmd); + } + if (h->cmds_done != NULL) return h->cmds_done->handle; diff --git a/lib/disconnect.c b/lib/disconnect.c index 53de386..5bbc64b 100644 --- a/lib/disconnect.c +++ b/lib/disconnect.c @@ -60,10 +60,11 @@ nbd_unlocked_aio_disconnect (struct nbd_handle *h, uint32_t flags) return -1; h->disconnect_request = true; - /* This will leave the command on the in-flight list. Is this a - * problem? Probably it isn't. 
If it is, we could add a flag to - * the command struct to tell SEND_REQUEST not to add it to the - * in-flight list. + /* As the server does not reply to this command, it is left + * in-flight until the cleanup performed when moving to CLOSED or + * DEAD. We don't return a handle to the user, and thus also + * special case things so that the user cannot request the status of + * this command during aio_[peek_]command_completed. */ return 0; } diff --git a/tests/server-death.c b/tests/server-death.c index f8747e4..18ca5f8 100644 --- a/tests/server-death.c +++ b/tests/server-death.c @@ -145,6 +145,27 @@ main (int argc, char *argv[]) goto fail; } + /* With all commands retired, no further command should be pending */ + if (nbd_aio_in_flight (nbd) != 0) { + fprintf (stderr, "%s: test failed: nbd_aio_in_flight\n", + argv[0]); + goto fail; + } + if (nbd_aio_peek_command_completed (nbd) != -1) { + fprintf (stderr, "%s: test failed: nbd_aio_peek_command_completed\n", + argv[0]); + goto fail; + } + msg = nbd_get_error (); + err = nbd_get_errno (); + printf ("error: \"%s\"\n", msg); + printf ("errno: %d (%s)\n", err, strerror (err)); + if (err != EINVAL) { + fprintf (stderr, "%s: test failed: unexpected errno %d (%s)\n", argv[0], + err, strerror (err)); + goto fail; + } + close (fd); unlink (pidfile); nbd_close (nbd); -- 2.20.1 -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Eric Blake
2019-Jul-02 18:53 UTC
Re: [Libguestfs] [libnbd PATCH 1/6] api: Add nbd_aio_in_flight
On 6/29/19 8:28 AM, Eric Blake wrote:> Some clients need to know when it is safe to issue NBD_CMD_DISC, or to > decide whether calling poll(POLLIN) will block indefinitely because > the server isn't expected to respond. Make this easier to learn by > tracking the count of commands we have queued up to send, as well as > the count of commands where we are waiting on the server's response. > > Update tests/aio-parallel* and examples/batched-read-write to use > nbd's own in-flight counter instead of reimplementing it ourselves. > > Note that h->in_flight is only ever updated while the lock is held; > but we may want to consider also making it atomic and therefore > readable as a lock-less function. > ---> +++ b/tests/aio-parallel-load.c > @@ -189,7 +189,6 @@ start_thread (void *arg) > size_t i; > uint64_t offset, handle; > uint64_t handles[MAX_IN_FLIGHT];This array is uninitialized. Previously, it did not matter,> @@ -291,16 +294,16 @@ start_thread (void *arg) > nbd_aio_notify_write (nbd); > > /* If a command is ready to retire, retire it. */ > - for (i = 0; i < in_flight; ++i) { > + for (i = 0; i < MAX_IN_FLIGHT; ++i) { > + if (handles[i] == 0) > + continue; > r = nbd_aio_command_completed (nbd, handles[i]); > if (r == -1) { > fprintf (stderr, "%s\n", nbd_get_error ()); > goto error; > } > if (r) { > - memmove (&handles[i], &handles[i+1], > - sizeof (handles[0]) * (in_flight - i - 1));...because we never accessed an element without first setting it up; but now valgrind is able to report a conditional branch on an uninit variable. I'm pushing an obvious fix. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Richard W.M. Jones
2019-Jul-15 11:06 UTC
[Libguestfs] [libnbd] notify API changes (was: Re: [libnbd PATCH 5/6] api: Add new nbd_aio_FOO_notify functions)
On Sat, Jun 29, 2019 at 08:28:28AM -0500, Eric Blake wrote:> As mentioned in the previous patch, there are situations where an aio > client wants instant notification when a given command is complete, > rather than having to maintain a separate data structure to track all > in-flight commands and then iterate over that structure to learn which > commands are complete. It's also desirable when writing a server > validation program (such as for checking structured reads for > compliance) to be able to clean up the associated opaque data and have > a final chance to change the overall command status. > > Introduce new nbd_aio_FOO_notify functions for each command. Rewire > the existing nbd_aio_FOO to forward to the new command. (Perhaps the > generator could reduce some of the boilerplate duplication, if a later > patch wants to refactor this).I'm writing some code now using these new nbd_aio_<CMD>_notify functions, and I temporarily confused myself because these are similar to nbd_notify_read and nbd_notify_write (the functions used to signal to the state machine that the socket is ready for reading/writing). I wonder if we should rename something here. My suggestions are either of the following or both: (I) Rename nbd_notify_read / nbd_notify_write to nbd_ready_to_read / nbd_ready_to_write. (II) Rename nbd_aio_<CMD>_notify to nbd_aio_<CMD>_callback. What do you think? Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-df lists disk usage of guests without needing to install any software inside the virtual machine. Supports Linux and Windows. http://people.redhat.com/~rjones/virt-df/
Eric Blake
2019-Jul-15 14:48 UTC
Re: [Libguestfs] [libnbd PATCH 5/6] api: Add new nbd_aio_FOO_notify functions
On 6/29/19 8:28 AM, Eric Blake wrote:> As mentioned in the previous patch, there are situations where an aio > client wants instant notification when a given command is complete, > rather than having to maintain a separate data structure to track all > in-flight commands and then iterate over that structure to learn which > commands are complete. It's also desirable when writing a server > validation program (such as for checking structured reads for > compliance) to be able to clean up the associated opaque data and have > a final chance to change the overall command status. > > Introduce new nbd_aio_FOO_notify functions for each command. Rewire > the existing nbd_aio_FOO to forward to the new command. (Perhaps the > generator could reduce some of the boilerplate duplication, if a later > patch wants to refactor this). > --- > docs/libnbd.pod | 22 +++- > generator/generator | 278 +++++++++++++++++++++++++++++++++++++++++--- > lib/rw.c | 99 ++++++++++++++-- > 3 files changed, 374 insertions(+), 25 deletions(-)Responding here to track what we found on IRC:> + > + "aio_pread_structured_notify", { > + default_call with > + args = [ BytesPersistOut ("buf", "count"); UInt64 "offset"; > + Opaque "data"; > + CallbackPersist ("chunk", [ Opaque "data"; > + BytesIn ("subbuf", "count"); > + UInt64 "offset"; > + Mutable (Int "error"); > + Int "status" ]); > + CallbackPersist ("notify", [ Opaque "data"; Int64 "handle"; > + Mutable (Int "error") ]);The code generated for this function,> + "aio_block_status_notify", { > + default_call with > + args = [ UInt64 "count"; UInt64 "offset"; > + Opaque "data"; > + CallbackPersist ("extent", [Opaque "data"; String "metacontext"; > + UInt64 "offset"; > + ArrayAndLen (UInt32 "entries", > + "nr_entries"); > + Mutable (Int "error") ]); > + CallbackPersist ("notify", [ Opaque "data"; Int64 "handle"; > + Mutable (Int "error") ]); > + Flags "flags" ];and for this is broken. Right now, looking at the generated python/methods.c, the generator malloc()s two separate wrapper structs, but only populates the chunk->data (or extent->data) field pertaining to the shared Opaque "data". Which means when we eventually get around to calling the notify callback notify(notify_data->data, ...), we are passing an uninitialized pointer instead of the malloc()d C wrapper holding the user's Python pointer. Of course, deferring the cleanup to nbd_add_close_callback is also an issue (if a client issues 1000 nbd_aio_block_status commands, we've queued up 1000 malloc()d wrappers that don't get freed until the connection hangs up, rather than freeing each one as soon as possible by using the C nbd_aio_block_stattus_callback even for the Python nbd.aio_block_status). And it doesn't help that I failed to add a python/t/5??-pread-callback.py unit test (and similar for all the other APIs), where a pread-structured and/or block-status unit test would have caught the bug. Looks like we've got some more generator tweaking to do. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Eric Blake
2019-Jul-18 13:47 UTC
Re: [Libguestfs] [libnbd PATCH 1/6] api: Add nbd_aio_in_flight
On 6/29/19 8:28 AM, Eric Blake wrote:> Some clients need to know when it is safe to issue NBD_CMD_DISC, or to > decide whether calling poll(POLLIN) will block indefinitely because > the server isn't expected to respond. Make this easier to learn by > tracking the count of commands we have queued up to send, as well as > the count of commands where we are waiting on the server's response. > > Update tests/aio-parallel* and examples/batched-read-write to use > nbd's own in-flight counter instead of reimplementing it ourselves. > > Note that h->in_flight is only ever updated while the lock is held; > but we may want to consider also making it atomic and therefore > readable as a lock-less function. > ---> +++ b/lib/aio.c > @@ -23,6 +23,7 @@ > #include <stdbool.h> > #include <errno.h> > #include <inttypes.h> > +#include <assert.h> > > #include "internal.h" > > @@ -84,6 +85,8 @@ nbd_unlocked_aio_command_completed (struct nbd_handle *h, > prev_cmd->next = cmd->next; > else > h->cmds_done = cmd->next; > + h->in_flight--; > + assert (h->in_flight >= 0);We guard against underflow...> +++ b/lib/rw.c> @@ -236,6 +241,7 @@ nbd_internal_command_common (struct nbd_handle *h,nbd_internal_run (h, cmd_issue) == -1)> return -1; > } > > + h->in_flight++;...but even though we inserted in the list, we fail to increment if nbd_internal_run() encountered an error. Also, the count is too low if the server managed to get a reply to us without us blocking. Obvious fix pushed. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Eric Blake
2019-Jul-18 14:22 UTC
Re: [Libguestfs] [libnbd PATCH 4/6] states: Prepare for aio notify callback
On 6/29/19 8:28 AM, Eric Blake wrote:> > We also want the client to be aware of any issued/in-flight commands > that failed because they were stranded when the state machine moved to > CLOSED or DEAD. Previously, nbd_aio_command_completed() would never > locate such stranded commands, but adding a common point to fire the > notifier for such commands makes it also possible to move those > commands to the completion queue. >> +++ b/generator/states.c > @@ -111,6 +111,31 @@ send_from_wbuf (struct nbd_handle *h) > return 0; /* move to next state */ > } > > +/* Forcefully fail any remaining in-flight commands in list */ > +void abort_commands (struct nbd_handle *h, > + struct command_in_flight **list) > +{ > + struct command_in_flight *prev_cmd, *cmd; > + > + for (cmd = *list, prev_cmd = NULL; > + cmd != NULL; > + prev_cmd = cmd, cmd = cmd->next) { > + if (cmd->cb.notify && cmd->type != NBD_CMD_DISC) { > + int error = cmd->error ? cmd->error : ENOTCONN; > + > + if (cmd->cb.notify (cmd->cb.opaque, cmd->handle, &error) == -1 && error) > + cmd->error = error; > + } > + if (cmd->error == 0) > + cmd->error = ENOTCONN; > + } > + if (prev_cmd) { > + prev_cmd->next = h->cmds_done; > + h->cmds_done = *list; > + *list = NULL; > + }This inserts the list to the head of cmds_done, which breaks its use as a FIFO queue for clients using nbd_aio_peek_command_completed to process messages in server order. I'll post a fix that keeps things in order. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
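The fix amounts to splicing the aborted commands onto the tail of the
completion queue instead of its head.  The underlying technique, shown
as a generic self-contained sketch rather than the committed libnbd
change:

#include <assert.h>
#include <stddef.h>

struct node { int id; struct node *next; };

/* Move the whole 'extra' list onto the end of 'queue', preserving the
 * order in which the items originally completed (FIFO). */
static void
append_list (struct node **queue, struct node **extra)
{
  struct node **tail = queue;

  while (*tail != NULL)
    tail = &(*tail)->next;      /* find the queue's tail pointer */
  *tail = *extra;               /* splice the new items at the end */
  *extra = NULL;
}

int
main (void)
{
  struct node c = { 3, NULL }, b = { 2, &c }, a = { 1, NULL };
  struct node *queue = &a, *aborted = &b;

  append_list (&queue, &aborted);
  assert (queue->id == 1 && queue->next->id == 2 &&
          queue->next->next->id == 3 && aborted == NULL);
  return 0;
}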
Eric Blake
2019-Jul-18 14:47 UTC
Re: [Libguestfs] [libnbd PATCH 1/6] api: Add nbd_aio_in_flight
On 6/29/19 8:28 AM, Eric Blake wrote:> Some clients need to know when it is safe to issue NBD_CMD_DISC, or to > decide whether calling poll(POLLIN) will block indefinitely because > the server isn't expected to respond. Make this easier to learn by > tracking the count of commands we have queued up to send, as well as > the count of commands where we are waiting on the server's response.This documents one thing...> > Update tests/aio-parallel* and examples/batched-read-write to use > nbd's own in-flight counter instead of reimplementing it ourselves. > > Note that h->in_flight is only ever updated while the lock is held; > but we may want to consider also making it atomic and therefore > readable as a lock-less function. > ---> @@ -2012,6 +2018,22 @@ C<nbd_aio_command_completed> to actually retire the command and learn > whether the command was successful."; > }; > > + "aio_in_flight", { > + default_call with > + args = []; ret = RInt; > + permitted_states = [ Connected; Closed; Dead ]; > + (* XXX is_locked = false ? *) > + shortdesc = "check how many aio commands are still in flight"; > + longdesc = "\ > +Return the number of in-flight aio commands that are still awaiting a > +response from the server before they can be retired. If this returns > +a non-zero value when requesting a disconnect from the server (see > +C<nbd_aio_disconnect> and C<nbd_shutdown>), libnbd does not try to > +wait for those commands to complete gracefully; if the server strands > +commands while shutting down, C<nbd_aio_command_completed> will not > +be able to report status on those commands.";and this concurs with the intent...> +++ b/lib/aio.c > @@ -23,6 +23,7 @@ > #include <stdbool.h> > #include <errno.h> > #include <inttypes.h> > +#include <assert.h> > > #include "internal.h" > > @@ -84,6 +85,8 @@ nbd_unlocked_aio_command_completed (struct nbd_handle *h, > prev_cmd->next = cmd->next; > else > h->cmds_done = cmd->next; > + h->in_flight--; > + assert (h->in_flight >= 0);...but this implementation is wrong. It counts commands that have a response but are not retired as being in-flight. Rather, we should be decrementing in_flight at the point a command moves into h->cmds_done. I'll post a fix. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org