We noticed while writing various libnbd tests that when the delay filter is in use, there are scenarios where we had to resort to SIGKILL to get rid of nbdkit, because it was non-responsive to SIGINT. I'm still trying to figure out the best way to add testsuite coverage of this, but have already proved to myself that it works from the command line, under two scenarios that both used to cause long delays:

1. Early client death:

  nbdkit -U - -vf --filter=delay null size=1m rdelay=10 --run \
    'timeout 1s nbdsh --connect "nbd+unix:///?socket=$unixsocket" -c "h.pread(1,0)"'

Pre-patch, the server detects the death of the client, but the worker thread stays busy for the remaining 9 seconds before nbdkit can finally exit. Post-patch, the server exits right after the client.

2. Early server death:

  timeout 1s nbdkit -U - -vf --filter=delay null size=1m rdelay=10 --run \
    'nbdsh --connect "nbd+unix:///?socket=$unixsocket" -c "h.pread(1,0)"'

Pre-patch, the server reacts to the signal and kills the client, but the worker thread stays busy for the remaining 9 seconds before nbdkit can finally exit. Post-patch, the server is able to finalize right after the signal.

Use of --run in the above tests lets you test things in one command line, but to some extent hides the longevity of the nbdkit process (you get the shell prompt back when the main thread exits, even though the detached threads are still around); if you avoid --run and instead keep nbdkit in the foreground in one terminal and use nbdsh in a different terminal, choosing which terminal gets ^C, the effects are more apparent.

Patch 3 needs porting to any platform lacking ppoll. I have ideas for that port, but don't want to spend time on it before knowing for sure we need the port.

And the fact that the pre-patch tests show output after the shell prompt returns means we still have cases where our detached threads are executing past the point where the main thread has tried to unload the plugin, which is never a nice thing.
We may still have more work ahead of us to ensure that we don't unload an in-use plugin.

Eric Blake (3):
  server: Add threadlocal_get_conn
  server: Add pipe for tracking disconnects
  server: Add and use nbdkit_nanosleep

 docs/nbdkit-plugin.pod  | 28 +++++++++++++++++
 configure.ac            |  1 +
 include/nbdkit-common.h |  1 +
 server/internal.h       |  3 ++
 filters/delay/delay.c   | 14 ++-------
 filters/rate/rate.c     | 10 +++----
 server/connections.c    | 66 ++++++++++++++++++++++++++++++++++++++++-
 server/public.c         | 61 +++++++++++++++++++++++++++++++++++++
 server/threadlocal.c    | 22 +++++++++++++-
 server/nbdkit.syms      |  1 +
 10 files changed, 188 insertions(+), 19 deletions(-)

-- 
2.20.1
Eric Blake
2019-Aug-03 16:01 UTC
[Libguestfs] [nbdkit PATCH 1/3] server: Add threadlocal_get_conn
We'd like to interrupt a sleep during a filter/plugin transaction if it is obvious that finishing the transaction is pointless (because the client has delivered NBD_CMD_DISC or hit EOF). But to do that, we need to get back at the connection object from a function called from the plugin thread. Passing the connection as an opaque object would be a major API change, so it's easier to just add a thread-local tracking the current connection.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 server/internal.h    |  2 ++
 server/connections.c |  4 ++++
 server/threadlocal.c | 22 +++++++++++++++++++++-
 3 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/server/internal.h b/server/internal.h
index 76f2139f..df4df0f2 100644
--- a/server/internal.h
+++ b/server/internal.h
@@ -342,6 +342,8 @@ extern void threadlocal_set_sockaddr (const struct sockaddr *addr,
 extern void threadlocal_set_error (int err);
 extern int threadlocal_get_error (void);
 extern void *threadlocal_buffer (size_t size);
+extern void threadlocal_set_conn (struct connection *conn);
+extern struct connection *threadlocal_get_conn (void);

 /* Declare program_name. */
 #if HAVE_DECL_PROGRAM_INVOCATION_SHORT_NAME == 1
diff --git a/server/connections.c b/server/connections.c
index f56590c2..f49a74ee 100644
--- a/server/connections.c
+++ b/server/connections.c
@@ -137,6 +137,7 @@ connection_worker (void *data)
   debug ("starting worker thread %s", name);
   threadlocal_new_server_thread ();
   threadlocal_set_name (name);
+  threadlocal_set_conn (conn);
   free (worker);

   while (!quit && connection_get_status (conn) > 0)
@@ -297,6 +298,8 @@ new_connection (int sockin, int sockout, int nworkers)
   conn->send = raw_send_other;
   conn->close = raw_close;

+  threadlocal_set_conn (conn);
+
   return conn;
 }

@@ -306,6 +309,7 @@ free_connection (struct connection *conn)
   if (!conn)
     return;

+  threadlocal_set_conn (NULL);
   conn->close (conn);

   /* Don't call the plugin again if quit has been set because the main
diff --git a/server/threadlocal.c b/server/threadlocal.c
index 49ae1ac2..a796fce8 100644
--- a/server/threadlocal.c
+++ b/server/threadlocal.c
@@ -1,5 +1,5 @@
 /* nbdkit
- * Copyright (C) 2013-2018 Red Hat Inc.
+ * Copyright (C) 2013-2019 Red Hat Inc.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions are
@@ -60,6 +60,7 @@ struct threadlocal {
   int err;
   void *buffer;
   size_t buffer_size;
+  struct connection *conn;
 };

 static pthread_key_t threadlocal_key;
@@ -226,3 +227,22 @@ threadlocal_buffer (size_t size)

   return threadlocal->buffer;
 }
+
+/* Set (or clear) the connection that is using the current thread */
+void
+threadlocal_set_conn (struct connection *conn)
+{
+  struct threadlocal *threadlocal = pthread_getspecific (threadlocal_key);
+
+  if (threadlocal)
+    threadlocal->conn = conn;
+}
+
+/* Get the connection associated with this thread, if available */
+struct connection *
+threadlocal_get_conn (void)
+{
+  struct threadlocal *threadlocal = pthread_getspecific (threadlocal_key);
+
+  return threadlocal ? threadlocal->conn : NULL;
+}
-- 
2.20.1
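[Editor's note: the patch above is the standard pthread_key_t thread-local idiom. As a rough, self-contained sketch of that idiom (all names here are hypothetical stand-ins, not nbdkit internals): each worker thread records its own connection pointer, and any helper called later on that same thread can retrieve it without the pointer being threaded through every function signature.]

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for nbdkit's struct connection. */
struct connection { int id; };

static pthread_key_t conn_key;
static pthread_once_t conn_once = PTHREAD_ONCE_INIT;

static void make_key (void) { pthread_key_create (&conn_key, NULL); }

/* Analogous to threadlocal_set_conn: remember which connection
 * owns the current thread (NULL clears it). */
static void set_conn (struct connection *conn)
{
  pthread_once (&conn_once, make_key);
  pthread_setspecific (conn_key, conn);
}

/* Analogous to threadlocal_get_conn: NULL if no connection set. */
static struct connection *get_conn (void)
{
  pthread_once (&conn_once, make_key);
  return pthread_getspecific (conn_key);
}

static void *worker (void *arg)
{
  struct connection *c = arg;
  set_conn (c);
  /* Deep inside a plugin callback we can now recover the connection
   * without it being an explicit parameter. */
  assert (get_conn () == c);
  set_conn (NULL);
  return NULL;
}

int demo (void)
{
  struct connection a = { 1 }, b = { 2 };
  pthread_t t1, t2;

  pthread_create (&t1, NULL, worker, &a);
  pthread_create (&t2, NULL, worker, &b);
  pthread_join (t1, NULL);
  pthread_join (t2, NULL);
  /* The main thread never called set_conn, so it sees NULL. */
  return get_conn () == NULL ? 0 : -1;
}
```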
Eric Blake
2019-Aug-03 16:01 UTC
[Libguestfs] [nbdkit PATCH 2/3] server: Add pipe for tracking disconnects
The next patch wants to add a utility function which will sleep until an appropriate amount of time has elapsed, except that it will short-circuit and report failure if the client is going away. The most obvious kernel interface for waiting for an action or a timeout is pselect or ppoll, but it requires a file descriptor as the witness of an action. We already have a pipe-to-self for when nbdkit itself is quitting because of a signal, but we also want a pipe-to-self for each multi-threaded connection so that detection of NBD_CMD_DISC or EOF in one thread can serve as the short circuit for another thread waiting for a timeout.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 server/internal.h    |  1 +
 server/connections.c | 62 +++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 62 insertions(+), 1 deletion(-)

diff --git a/server/internal.h b/server/internal.h
index df4df0f2..8c229816 100644
--- a/server/internal.h
+++ b/server/internal.h
@@ -152,6 +152,7 @@ struct connection {
   pthread_mutex_t write_lock;
   pthread_mutex_t status_lock;
   int status; /* 1 for more I/O with client, 0 for shutdown, -1 on error */
+  int status_pipe[2]; /* track status changes via poll when nworkers > 1 */
   void *crypto_session;
   int nworkers;

diff --git a/server/connections.c b/server/connections.c
index f49a74ee..c173df8d 100644
--- a/server/connections.c
+++ b/server/connections.c
@@ -39,8 +39,11 @@
 #include <string.h>
 #include <unistd.h>
 #include <sys/socket.h>
+#include <fcntl.h>
+#include <assert.h>

 #include "internal.h"
+#include "utils.h"

 /* Default number of parallel requests. */
 #define DEFAULT_PARALLEL_REQUESTS 16
@@ -114,8 +117,16 @@ connection_set_status (struct connection *conn, int value)
   if (conn->nworkers &&
       pthread_mutex_lock (&conn->status_lock))
     abort ();
-  if (value < conn->status)
+  if (value < conn->status) {
+    if (conn->nworkers && conn->status > 0) {
+      char c = 0;
+
+      assert (conn->status_pipe[1] >= 0);
+      if (write (conn->status_pipe[1], &c, 1) != 1 && errno != EAGAIN)
+        nbdkit_debug ("failed to notify pipe-to-self: %m");
+    }
     conn->status = value;
+  }
   if (conn->nworkers &&
       pthread_mutex_unlock (&conn->status_lock))
     abort ();
@@ -284,6 +295,50 @@ new_connection (int sockin, int sockout, int nworkers)
   conn->status = 1;
   conn->nworkers = nworkers;
+  if (nworkers) {
+#ifdef HAVE_PIPE2
+    if (pipe2 (conn->status_pipe, O_NONBLOCK | O_CLOEXEC)) {
+      perror ("pipe2");
+      free (conn);
+      return NULL;
+    }
+#else
+    /* If we were fully parallel, then this function could be
+     * accepting connections in one thread while another thread could
+     * be in a plugin trying to fork.  But plugins.c forced
+     * thread_model to serialize_all_requests when it detects a lack
+     * of atomic CLOEXEC, at which point, we can use a mutex to ensure
+     * we aren't accepting until the plugin is not running, making
+     * non-atomicity okay.
+     */
+    assert (backend->thread_model (backend) <=
+            NBDKIT_THREAD_MODEL_SERIALIZE_ALL_REQUESTS);
+    lock_request (NULL);
+    if (pipe (conn->status_pipe)) {
+      perror ("pipe");
+      free (conn);
+      unlock_request (NULL);
+      return NULL;
+    }
+    if (set_nonblock (set_cloexec (conn->status_pipe[0])) == -1) {
+      perror ("fcntl");
+      close (conn->status_pipe[1]);
+      free (conn);
+      unlock_request (NULL);
+      return NULL;
+    }
+    if (set_nonblock (set_cloexec (conn->status_pipe[1])) == -1) {
+      perror ("fcntl");
+      close (conn->status_pipe[0]);
+      free (conn);
+      unlock_request (NULL);
+      return NULL;
+    }
+    unlock_request (NULL);
+#endif
+  }
+  else
+    conn->status_pipe[0] = conn->status_pipe[1] = -1;

   conn->sockin = sockin;
   conn->sockout = sockout;
   pthread_mutex_init (&conn->request_lock, NULL);
@@ -324,6 +379,11 @@ free_connection (struct connection *conn)
     }
   }

+  if (conn->status_pipe[0] >= 0) {
+    close (conn->status_pipe[0]);
+    close (conn->status_pipe[1]);
+  }
+
   pthread_mutex_destroy (&conn->request_lock);
   pthread_mutex_destroy (&conn->read_lock);
   pthread_mutex_destroy (&conn->write_lock);
-- 
2.20.1
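[Editor's note: the pipe-to-self idiom this patch relies on can be sketched in isolation. This is an illustrative toy, not nbdkit code: one side writes a single byte to a non-blocking pipe to wake another thread that is blocked in poll on the read end.]

```c
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Create a non-blocking, close-on-exec pipe-to-self; 0 on success.
 * Without pipe2, the flags are set after the fact, which is not
 * atomic -- that is exactly why the real patch takes a lock in the
 * !HAVE_PIPE2 branch. */
static int make_self_pipe (int fds[2])
{
  if (pipe (fds) == -1)
    return -1;
  for (int i = 0; i < 2; i++) {
    if (fcntl (fds[i], F_SETFL, O_NONBLOCK) == -1 ||
        fcntl (fds[i], F_SETFD, FD_CLOEXEC) == -1)
      return -1;
  }
  return 0;
}

/* Notifier side: one byte is enough; EAGAIN just means the pipe is
 * already full of unread notifications, which is fine. */
static void notify (int wfd)
{
  char c = 0;
  (void) write (wfd, &c, 1);
}

/* Waiter side: 1 if woken by a notification, 0 on timeout. */
static int wait_for_event (int rfd, int timeout_ms)
{
  struct pollfd pfd = { .fd = rfd, .events = POLLIN };
  return poll (&pfd, 1, timeout_ms) > 0;
}

int demo (void)
{
  int fds[2];

  if (make_self_pipe (fds) == -1)
    return -1;
  notify (fds[1]);                   /* simulate connection_set_status */
  int woke = wait_for_event (fds[0], 5000);
  close (fds[0]);
  close (fds[1]);
  return woke;                       /* 1: the sleeper saw the event */
}
```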
Eric Blake
2019-Aug-03 16:01 UTC
[Libguestfs] [nbdkit PATCH 3/3] server: Add and use nbdkit_nanosleep
There are a couple of problems with filters trying to sleep. First, when it is time to shut down nbdkit, we wait until all pending transactions have had a chance to wind down. But consider what happens if one or more of those pending transactions are blocked in a sleep. POSIX says nanosleep is interrupted with EINTR if that thread handles a signal, but wiring up signal masks just to ensure a specific thread will get the signal is not pretty, and that means the thread processing SIGINT is NOT the thread blocked in nanosleep. Couple that with the fact that if multiple threads are sleeping, only one thread needs to process the signal, so the rest continue to sleep. Thus, even though we know the user wants nbdkit to quit, we are blocked waiting for a potentially long sleep to complete. This problem can be solved by realizing we already have a pipe-to-self to learn about a quit request or the end of a particular connection, and check for activities on those pipes in parallel with a timeout through pselect or ppoll to break our wait as soon as we know there is no reason to continue on with the transaction.

Second, if we ever add support for a signal that is not fatal to nbdkit (such as SIGHUP triggering a config reload), but don't mask which thread can process that signal, then if a thread servicing nanosleep gets the signal the sleep will terminate early. But in that case, we want execution to continue on. This problem can be solved by either setting a signal mask so as to avoid processing our non-fatal signal in the same thread as any filter/plugin processing, or by paying attention to EINTR results and restarting a sleep. And rather than making each sleeping client reimplement that logic, we might as well stick it in our common helper.

Thus, this patch introduces nbdkit_nanosleep, and implements it under the hood as a ppoll that redirects signals but wakes up as needed.

Including <time.h> for struct timespec in nbdkit-common.h is difficult (C++ is not required to provide it), so we just take the two arguments in pieces; we can rely on int being 32-bits, so don't need nsec to be a long, and in turn can avoid EINVAL for nsec outside the [0,1000000000) range by updating sec as needed. And sleeping longer than INT_MAX seconds is indistinguishable from having no timeout, at least when compared to human lifetimes, so we reject larger values.

Although it is more likely that filters will be the ones actually trying to slow I/O (and all that uses the new API in this commit), it is feasible that a third-party plugin may also want access to this behavior, so I added it in common and documented it for plugins.

Note that ppoll is not yet POSIX [1], so we may have to provide fallbacks to older interfaces. But rather than write those now, I just punted on the compilation, and we'll deal with it when we actually encounter platforms needing it.

[1] http://austingroupbugs.net/view.php?id=1263

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 docs/nbdkit-plugin.pod  | 28 +++++++++++++++++++
 configure.ac            |  1 +
 include/nbdkit-common.h |  1 +
 filters/delay/delay.c   | 14 ++--------
 filters/rate/rate.c     | 10 +++----
 server/public.c         | 61 +++++++++++++++++++++++++++++++++++++++
 server/nbdkit.syms      |  1 +
 7 files changed, 99 insertions(+), 17 deletions(-)

diff --git a/docs/nbdkit-plugin.pod b/docs/nbdkit-plugin.pod
index 9510253f..f36778cf 100644
--- a/docs/nbdkit-plugin.pod
+++ b/docs/nbdkit-plugin.pod
@@ -342,6 +342,34 @@ and returns C<NULL>.

 The returned string must be freed by the caller.

+=head2 C<nbdkit_nanosleep>
+
+ int nbdkit_nanosleep (unsigned sec, unsigned nsec);
+
+The utility function C<nbdkit_nanosleep> suspends the current thread,
+and returns 0 if it slept at least as many seconds and nanoseconds as
+requested, or -1 after calling C<nbdkit_error> if there is no point in
+continuing the current command.  Attempts to sleep more than
+C<INT_MAX> seconds are treated as an error.
+
+This is similar to L<nanosleep(3)>, although you specify the
+components as separate parameters rather than as a C<struct timespec>.
+This wrapper provides two benefits over the system library: in one
+direction, the system library has no easy way to abort a sleep early
+if other information determines that there is no point in finishing
+the sleep (handling a signal in the same thread as the sleep will do
+that, but you don't have full control over the signal masks of other
+threads to ensure that your thread will get the intended interrupting
+signal).  In the other direction, the system library has no easy way
+to avoid aborting a sleep early (you can restart the sleep with any
+remaining unslept time, but calculating this gets tedious; or you can
+use signal masks to avoid handling a signal, but risk making your
+thread non-responsive to signals that were important after all,
+stalling a timely shutdown of nbdkit).  The system call L<ppoll(2)>
+can solve these issues, but requires access to internal file
+descriptors that the plugin does not need access to, hence this
+function exists to do the work on your behalf.
+
 =head2 umask

 All plugins will see a L<umask(2)> of C<0022>.
diff --git a/configure.ac b/configure.ac
index 054ad64a..2a34bf8a 100644
--- a/configure.ac
+++ b/configure.ac
@@ -198,6 +198,7 @@ AC_CHECK_FUNCS([\
 	get_current_dir_name \
 	mkostemp \
 	pipe2 \
+	ppoll \
 	posix_fadvise])

 dnl Check whether printf("%m") works
diff --git a/include/nbdkit-common.h b/include/nbdkit-common.h
index 5257d992..e004aa18 100644
--- a/include/nbdkit-common.h
+++ b/include/nbdkit-common.h
@@ -82,6 +82,7 @@ extern int64_t nbdkit_parse_size (const char *str);
 extern int nbdkit_parse_bool (const char *str);
 extern int nbdkit_read_password (const char *value, char **password);
 extern char *nbdkit_realpath (const char *path);
+extern int nbdkit_nanosleep (unsigned sec, unsigned nsec);

 struct nbdkit_extents;
 extern int nbdkit_add_extent (struct nbdkit_extents *,
diff --git a/filters/delay/delay.c b/filters/delay/delay.c
index 498ab14d..c92a12d7 100644
--- a/filters/delay/delay.c
+++ b/filters/delay/delay.c
@@ -37,7 +37,6 @@
 #include <stdlib.h>
 #include <stdint.h>
 #include <string.h>
-#include <time.h>

 #include <nbdkit-filter.h>

@@ -83,16 +82,9 @@ parse_delay (const char *key, const char *value)
 static int
 delay (int ms, int *err)
 {
-  if (ms > 0) {
-    const struct timespec ts = {
-      .tv_sec = ms / 1000,
-      .tv_nsec = (ms % 1000) * 1000000,
-    };
-    if (nanosleep (&ts, NULL) == -1) {
-      nbdkit_error ("nanosleep: %m");
-      *err = errno;
-      return -1;
-    }
+  if (ms > 0 && nbdkit_nanosleep (ms / 1000, (ms % 1000) * 1000000) == -1) {
+    *err = errno;
+    return -1;
   }
   return 0;
 }
diff --git a/filters/rate/rate.c b/filters/rate/rate.c
index dbd92ad6..dca5e9fc 100644
--- a/filters/rate/rate.c
+++ b/filters/rate/rate.c
@@ -262,12 +262,10 @@ maybe_sleep (struct bucket *bucket, pthread_mutex_t *lock, uint32_t count,
       bits = bucket_run (bucket, bits, &ts);
     }

-    if (bits > 0)
-      if (nanosleep (&ts, NULL) == -1) {
-        nbdkit_error ("nanosleep: %m");
-        *err = errno;
-        return -1;
-      }
+    if (bits > 0 && nbdkit_nanosleep (ts.tv_sec, ts.tv_nsec) == -1) {
+      *err = errno;
+      return -1;
+    }
   }

   return 0;
 }
diff --git a/server/public.c b/server/public.c
index 523a31f9..8f860d73 100644
--- a/server/public.c
+++ b/server/public.c
@@ -47,6 +47,8 @@
 #include <limits.h>
 #include <termios.h>
 #include <errno.h>
+#include <poll.h>
+#include <signal.h>

 #include "get-current-dir-name.h"

@@ -297,3 +299,62 @@ nbdkit_realpath (const char *path)

   return ret;
 }
+
+
+int
+nbdkit_nanosleep (unsigned sec, unsigned nsec)
+{
+#ifndef HAVE_PPOLL
+# error "Please port this to your platform"
+  /* Porting ideas, in order of preference:
+   * - POSIX requires pselect; it's a bit clunkier to set up the poll,
+   *   but the same ability to atomically mask all signals and operate
+   *   on struct timespec makes it similar to the preferred ppoll
+   *   interface
+   * - calculate an end time target, then use poll in a loop on EINTR
+   *   with a recalculation of the timeout to still reach the end time
+   */
+  nbdkit_error ("nbdkit_nanosleep not yet ported to systems without ppoll");
+  errno = ENOSYS;
+  return -1;
+#else
+  struct timespec ts;
+  struct connection *conn = threadlocal_get_conn ();
+  struct pollfd fds[2] = {
+    [0].fd = quit_fd,
+    [0].events = POLLIN,
+    [1].fd = conn ? conn->status_pipe[0] : -1,
+    [1].events = POLLIN,
+  };
+  sigset_t all;
+
+  if (sec >= INT_MAX - nsec / 1000000000) {
+    nbdkit_error ("sleep request is too long");
+    errno = EINVAL;
+    return -1;
+  }
+  ts.tv_sec = sec + nsec / 1000000000;
+  ts.tv_nsec = nsec % 1000000000;
+
+  /* Block all signals to this thread during the poll, so we don't
+   * have to worry about EINTR
+   */
+  if (sigfillset(&all))
+    abort ();
+  switch (ppoll (fds, 2, &ts, &all)) {
+  case -1:
+    assert (errno != EINTR);
+    nbdkit_error ("poll: %m");
+    return -1;
+  case 0:
+    return 0;
+  }
+
+  /* We don't have to read the pipe-to-self; if poll returned an
+   * event, we know the connection should be shutting down.
+   */
+  assert (quit || (conn && connection_get_status (conn) < 1));
+  nbdkit_error ("aborting sleep to shut down");
+  errno = ESHUTDOWN;
+  return -1;
+#endif
+}
diff --git a/server/nbdkit.syms b/server/nbdkit.syms
index f8eef214..2a024ed5 100644
--- a/server/nbdkit.syms
+++ b/server/nbdkit.syms
@@ -46,6 +46,7 @@
     nbdkit_extents_free;
     nbdkit_extents_new;
     nbdkit_get_extent;
+    nbdkit_nanosleep;
     nbdkit_parse_bool;
     nbdkit_parse_size;
     nbdkit_read_password;
-- 
2.20.1
Richard W.M. Jones
2019-Aug-05 13:50 UTC
Re: [Libguestfs] [nbdkit PATCH 3/3] server: Add and use nbdkit_nanosleep
On Sat, Aug 03, 2019 at 11:01:44AM -0500, Eric Blake wrote:

[...]

ACK series, but ...

> +=head2 C<nbdkit_nanosleep>
> +

[...]

> +This is similar to L<nanosleep(3)>, although you specify the
> +components as separate parameters rather than as a C<struct timespec>.
> +This wrapper provides two benefits over the system library: in one
> +direction, the system library has no easy way to abort a sleep early
> +if other information determines that there is no point in finishing
> +the sleep (handling a signal in the same thread as the sleep will do
> +that, but you don't have full control over the signal masks of other
> +threads to ensure that your thread will get the intended interrupting
> +signal).  In the other direction, the system library has no easy way
> +to avoid aborting a sleep early (you can restart the sleep with any
> +remaining unslept time, but calculating this gets tedious; or you can
> +use signal masks to avoid handling a signal, but risk making your
> +thread non-responsive to signals that were important after all,
> +stalling a timely shutdown of nbdkit).  The system call L<ppoll(2)>
> +can solve these issues, but requires access to internal file
> +descriptors that the plugin does not need access to, hence this
> +function exists to do the work on your behalf.

Do we really need this paragraph?  It's explaining how nbdkit_nanosleep works internally, which might change in future and is largely irrelevant to plugin designers.  I'd say something along the lines of:

  Plugins should prefer this function instead of sleeping using system
  calls like L<sleep(2)> or L<nanosleep(3)>, since it allows nbdkit to
  shut down cleanly without delay.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v
Eric Blake
2019-Aug-05 22:16 UTC
Re: [Libguestfs] [nbdkit PATCH 3/3] server: Add and use nbdkit_nanosleep
On 8/3/19 11:01 AM, Eric Blake wrote:
> There are a couple of problems with filters trying to sleep.  First,
> when it is time to shut down nbdkit, we wait until all pending
> transactions have had a chance to wind down.  But consider what
> happens if one or more of those pending transactions are blocked in a
> sleep.  POSIX says nanosleep is interrupted with EINTR if that thread
> handles a signal, but wiring up signal masks just to ensure a specific
> thread will get the signal is not pretty, and that means the thread
> processing SIGINT is NOT the thread blocked in nanosleep.  Couple that
> with the fact that if multiple threads are sleeping, only one thread
> needs to process the signal, so the rest continue to sleep.  Thus,
> even though we know the user wants nbdkit to quit, we are blocked
> waiting for a potentially long sleep to complete.  This problem can be
> solved by realizing we already have a pipe-to-self to learn about a
> quit request or the end of a particular connection, and check for
> activities on those pipes in parallel with a timeout through pselect
> or ppoll to break our wait as soon as we know there is no reason to
> continue on with the transaction.
>

> +++ b/server/public.c

> +#else
> +  struct timespec ts;
> +  struct connection *conn = threadlocal_get_conn ();
> +  struct pollfd fds[2] = {
> +    [0].fd = quit_fd,
> +    [0].events = POLLIN,
> +    [1].fd = conn ? conn->status_pipe[0] : -1,
> +    [1].events = POLLIN,

In testing this, the code is responsive to a multi-threaded connection detecting client death on any other thread, but not responsive to a single-threaded connection detecting client death on the lone thread.  But even with a multi-threaded connection, if every single thread is tied up in a sleep, we lose responsiveness.

So I'm currently testing an amendment to this patch that uses fds[3], with conn->sockout being polled with .events = 0 for POLLHUP/POLLERR.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org
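[Editor's note: the fds[3] amendment mentioned above leans on the fact that poll always reports POLLHUP, POLLERR, and POLLNVAL even when .events is 0, so a thread can watch a socket for peer death without consuming any of its data. A minimal, hypothetical illustration with a socketpair (the exact revents bits on peer close are platform-dependent; the behavior below matches Linux):]

```c
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Watch a socket for peer death without reading from it: POLLHUP and
 * POLLERR are reported by poll regardless of .events.  Returns 1 if
 * the peer looks gone, 0 otherwise (after waiting up to timeout_ms). */
int peer_is_dead (int sock, int timeout_ms)
{
  struct pollfd pfd = { .fd = sock, .events = 0 };

  if (poll (&pfd, 1, timeout_ms) > 0 &&
      (pfd.revents & (POLLHUP | POLLERR)))
    return 1;
  return 0;
}

int demo (void)
{
  int sv[2];

  if (socketpair (AF_UNIX, SOCK_STREAM, 0, sv) == -1)
    return -1;
  int early = peer_is_dead (sv[0], 0);   /* peer still open: expect 0 */
  close (sv[1]);                         /* simulate client death */
  int late = peer_is_dead (sv[0], 0);    /* peer closed: expect 1 */
  close (sv[0]);
  return early == 0 && late == 1;
}
```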