Eric Blake
2017-Nov-17 03:13 UTC
[Libguestfs] [nbdkit PATCH 0/4] thread-safety issues prior to parallel handling
These patches should be ready to go in now; I will also post my work-in-progress for enabling full parallel handling that depends on these, but with that series, I was still getting crashes or hangs with test-socket-activation (I think I've nailed all the crashes I've seen, but the hang is rather insidious; see my other email https://www.redhat.com/archives/libguestfs/2017-November/msg00139.html) Eric Blake (4): errors: Avoid interleaved errors from parallel threads threadlocal: Copy thread name plugins: Make plugin_name() reliable past unload sockets: Fix lifetime of thread_data src/errors.c | 30 +++++++++++++++++++++++++++++- src/plugins.c | 8 ++++++-- src/sockets.c | 27 ++++++++++++++++++--------- src/threadlocal.c | 17 +++++++++++++---- 4 files changed, 66 insertions(+), 16 deletions(-) -- 2.13.6
Eric Blake
2017-Nov-17 03:13 UTC
[Libguestfs] [nbdkit PATCH 1/4] errors: Avoid interleaved errors from parallel threads
Since we construct our error/debug messages via multiple calls to stdio primitives, we are at risk of multiple threads interleaving their output if they try to output at once. Add a mutex to group related outputs into an atomic chunk. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/errors.c | 30 +++++++++++++++++++++++++++++- 1 file changed, 29 insertions(+), 1 deletion(-) diff --git a/src/errors.c b/src/errors.c index 5f14315..2fc83ab 100644 --- a/src/errors.c +++ b/src/errors.c @@ -1,5 +1,5 @@ /* nbdkit - * Copyright (C) 2013 Red Hat Inc. + * Copyright (C) 2013-2017 Red Hat Inc. * All rights reserved. * * Redistribution and use in source and binary forms, with or without @@ -38,10 +38,30 @@ #include <stdarg.h> #include <string.h> #include <errno.h> +#include <assert.h> +#include <pthread.h> #include "nbdkit-plugin.h" #include "internal.h" +/* Used to group piecemeal message construction into atomic output. */ +static pthread_mutex_t errors_lock = PTHREAD_MUTEX_INITIALIZER; + +static void +lock (void) +{ + int r = pthread_mutex_lock(&errors_lock); + assert(!r); +} + +static void +unlock (void) +{ + int r = pthread_mutex_unlock(&errors_lock); + assert(!r); +} + +/* Called with lock taken. */ static void prologue (const char *type) { @@ -69,11 +89,13 @@ nbdkit_vdebug (const char *fs, va_list args) if (!verbose) return; + lock (); prologue ("debug"); vfprintf (stderr, fs, args); fprintf (stderr, "\n"); + unlock (); errno = err; } @@ -88,6 +110,7 @@ nbdkit_debug (const char *fs, ...) if (!verbose) return; + lock (); prologue ("debug"); va_start (args, fs); @@ -95,6 +118,7 @@ nbdkit_debug (const char *fs, ...) va_end (args); fprintf (stderr, "\n"); + unlock (); errno = err; } @@ -105,11 +129,13 @@ nbdkit_verror (const char *fs, va_list args) { int err = errno; + lock (); prologue ("error"); vfprintf (stderr, fs, args); fprintf (stderr, "\n"); + unlock (); errno = err; } @@ -121,6 +147,7 @@ nbdkit_error (const char *fs, ...) va_list args; int err = errno; + lock (); prologue ("error"); va_start (args, fs); @@ -128,6 +155,7 @@ nbdkit_error (const char *fs, ...) va_end (args); fprintf (stderr, "\n"); + unlock (); errno = err; } -- 2.13.6
Eric Blake
2017-Nov-17 03:13 UTC
[Libguestfs] [nbdkit PATCH 2/4] threadlocal: Copy thread name
We can't guarantee what storage duration the caller's request for a thread name has; and in fact, if the caller uses plugin_name() for their thread name, then the moment .unload is called, our threadlocal storage is pointing to la-la-land and we get a nice SEGV while trying to print any debug message. So copy the user's string instead. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/threadlocal.c | 17 +++++++++++++---- 1 file changed, 13 insertions(+), 4 deletions(-) diff --git a/src/threadlocal.c b/src/threadlocal.c index d6e3942..24c381d 100644 --- a/src/threadlocal.c +++ b/src/threadlocal.c @@ -1,5 +1,5 @@ /* nbdkit - * Copyright (C) 2013 Red Hat Inc. + * Copyright (C) 2013-2017 Red Hat Inc. * All rights reserved. * * Redistribution and use in source and binary forms, with or without @@ -55,7 +55,7 @@ */ struct threadlocal { - const char *name; /* Can be NULL. */ + char *name; /* Can be NULL. */ size_t instance_num; /* Can be 0. */ struct sockaddr *addr; socklen_t addrlen; @@ -69,6 +69,7 @@ free_threadlocal (void *threadlocalv) { struct threadlocal *threadlocal = threadlocalv; + free (threadlocal->name); free (threadlocal->addr); free (threadlocal); } @@ -104,8 +105,16 @@ threadlocal_set_name (const char *name) { struct threadlocal *threadlocal = pthread_getspecific (threadlocal_key); - if (threadlocal) - threadlocal->name = name; + /* Copy name, as the original may be residing in a module, but we + * want our thread name to persist even after unload. */ + if (threadlocal) { + free (threadlocal->name); + threadlocal->name = strdup (name); + if (threadlocal->name == NULL) { + perror ("malloc"); + exit (EXIT_FAILURE); + } + } } void -- 2.13.6
Eric Blake
2017-Nov-17 03:13 UTC
[Libguestfs] [nbdkit PATCH 3/4] plugins: Make plugin_name() reliable past unload
Directly returning storage in the module is risky, because it forces callers to strdup() if they aren't sure of the lifetime. Copying into the heap up front means that plugin_name() can now be safely used even if unload happens in the meantime. As long as we have only a single plugin in static storage (see TODO), we don't need to free the copy. Signed-off-by: Eric Blake <eblake@redhat.com> --- src/plugins.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/src/plugins.c b/src/plugins.c index f5056d9..e8c6b28 100644 --- a/src/plugins.c +++ b/src/plugins.c @@ -143,6 +143,12 @@ plugin_register (const char *_filename, exit (EXIT_FAILURE); } } + /* Copy the module's name into local storage, so that plugin.name + * survives past unload. */ + if (!(plugin.name = strdup (plugin.name))) { + perror ("strdup"); + exit (EXIT_FAILURE); + } debug ("registered %s (name %s)", filename, plugin.name); @@ -177,8 +183,6 @@ plugin_cleanup (void) const char * plugin_name (void) { - assert (dl); - return plugin.name; } -- 2.13.6
Eric Blake
2017-Nov-17 03:13 UTC
[Libguestfs] [nbdkit PATCH 4/4] sockets: Fix lifetime of thread_data
It is never a wise idea to pass stack-allocated storage to another thread during pthread_create() unless you can guarantee that the new thread will complete prior to the calling thread returning and ending the lifetime of that storage. We were violating this, with the result in a SEGV in the detached child thread during threadlocal_set_sockaddr() with parameters pointing into thread_data which was reached at the same time the main thread was trying to call exit(). Signed-off-by: Eric Blake <eblake@redhat.com> --- src/sockets.c | 27 ++++++++++++++++++--------- 1 file changed, 18 insertions(+), 9 deletions(-) diff --git a/src/sockets.c b/src/sockets.c index edb63b3..1d63523 100644 --- a/src/sockets.c +++ b/src/sockets.c @@ -261,6 +261,7 @@ start_thread (void *datav) handle_single_connection (data->sock, data->sock); + free (data); return NULL; } @@ -270,18 +271,25 @@ accept_connection (int listen_sock) int err; pthread_attr_t attrs; pthread_t thread; - struct thread_data thread_data; + struct thread_data *thread_data; static size_t instance_num = 1; - thread_data.instance_num = instance_num++; - thread_data.addrlen = sizeof thread_data.addr; + thread_data = malloc (sizeof *thread_data); + if (!thread_data) { + perror ("malloc"); + return; + } + + thread_data->instance_num = instance_num++; + thread_data->addrlen = sizeof thread_data->addr; again: - thread_data.sock = accept (listen_sock, - &thread_data.addr, &thread_data.addrlen); - if (thread_data.sock == -1) { + thread_data->sock = accept (listen_sock, + &thread_data->addr, &thread_data->addrlen); + if (thread_data->sock == -1) { if (errno == EINTR || errno == EAGAIN) goto again; perror ("accept"); + free (thread_data); return; } @@ -291,16 +299,17 @@ accept_connection (int listen_sock) */ pthread_attr_init (&attrs); pthread_attr_setdetachstate (&attrs, PTHREAD_CREATE_DETACHED); - err = pthread_create (&thread, &attrs, start_thread, &thread_data); + err = pthread_create (&thread, &attrs, start_thread, thread_data); pthread_attr_destroy (&attrs); if (err != 0) { fprintf (stderr, "%s: pthread_create: %s\n", program_name, strerror (err)); - close (thread_data.sock); + close (thread_data->sock); + free (thread_data); return; } /* If the thread starts successfully, then it is responsible for - * closing the socket. + * closing the socket and freeing thread_data. */ } -- 2.13.6
Richard W.M. Jones
2017-Nov-17 08:50 UTC
Re: [Libguestfs] [nbdkit PATCH 1/4] errors: Avoid interleaved errors from parallel threads
On Thu, Nov 16, 2017 at 09:13:55PM -0600, Eric Blake wrote:> Since we construct our error/debug messages via multiple calls > to stdio primitives, we are at risk of multiple threads interleaving > their output if they try to output at once. Add a mutex to group > related outputs into an atomic chunk. > > Signed-off-by: Eric Blake <eblake@redhat.com> > --- > src/errors.c | 30 +++++++++++++++++++++++++++++- > 1 file changed, 29 insertions(+), 1 deletion(-) > > diff --git a/src/errors.c b/src/errors.c > index 5f14315..2fc83ab 100644 > --- a/src/errors.c > +++ b/src/errors.c > @@ -1,5 +1,5 @@ > /* nbdkit > - * Copyright (C) 2013 Red Hat Inc. > + * Copyright (C) 2013-2017 Red Hat Inc. > * All rights reserved. > * > * Redistribution and use in source and binary forms, with or without > @@ -38,10 +38,30 @@ > #include <stdarg.h> > #include <string.h> > #include <errno.h> > +#include <assert.h> > +#include <pthread.h> > > #include "nbdkit-plugin.h" > #include "internal.h" > > +/* Used to group piecemeal message construction into atomic output. */ > +static pthread_mutex_t errors_lock = PTHREAD_MUTEX_INITIALIZER; > + > +static void > +lock (void) > +{ > + int r = pthread_mutex_lock(&errors_lock); > + assert(!r); > +} > + > +static void > +unlock (void) > +{ > + int r = pthread_mutex_unlock(&errors_lock); > + assert(!r);Spaces between assert (in 2 cases above) and parens. The rest of this patch is fine, so ACK. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com Fedora Windows cross-compiler. Compile Windows programs, test, and build Windows installers. Over 100 libraries supported. http://fedoraproject.org/wiki/MinGW
Richard W.M. Jones
2017-Nov-17 08:51 UTC
Re: [Libguestfs] [nbdkit PATCH 4/4] sockets: Fix lifetime of thread_data
ACK patches 2, 3 & 4. Thanks, Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-p2v converts physical machines to virtual machines. Boot with a live CD or over the network (PXE) and turn machines into KVM guests. http://libguestfs.org/virt-v2v
Eric Blake
2017-Nov-17 20:21 UTC
Re: [Libguestfs] [nbdkit PATCH 2/4] threadlocal: Copy thread name
On 11/16/2017 09:13 PM, Eric Blake wrote:> We can't guarantee what storage duration the caller's request > for a thread name has; and in fact, if the caller uses > plugin_name() for their thread name, then the moment .unload > is called, our threadlocal storage is pointing to la-la-land > and we get a nice SEGV while trying to print any debug message. > So copy the user's string instead. > > Signed-off-by: Eric Blake <eblake@redhat.com> > --- > src/threadlocal.c | 17 +++++++++++++---- > 1 file changed, 13 insertions(+), 4 deletions(-) >> @@ -104,8 +105,16 @@ threadlocal_set_name (const char *name) > { > struct threadlocal *threadlocal = pthread_getspecific (threadlocal_key); > > - if (threadlocal) > - threadlocal->name = name; > + /* Copy name, as the original may be residing in a module, but we > + * want our thread name to persist even after unload. */ > + if (threadlocal) { > + free (threadlocal->name); > + threadlocal->name = strdup (name); > + if (threadlocal->name == NULL) { > + perror ("malloc"); > + exit (EXIT_FAILURE); > + }I'm thinking (especially since I plan to create threads for each connection) that an exit() may be a bit harsh here (exit during startup is fine; but this code can be reached on more than just startup, and once we transition into accepting connections, we should make reasonable efforts to avoid killing off existing connections with an exit). Thus, I'm tweaking this function to just state that it is best-effort; if strdup() fails, we're better off printing debug messages without a name at all than we are exit()ing. (Then again, if strdup() fails, most likely something else will fail soon anyways). -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Apparently Analagous Threads
- [nbdkit PATCH 1/4] errors: Avoid interleaved errors from parallel threads
- [nbdkit PATCH] errors: Use lighter-weight locking
- [nbdkit PATCH 0/4] thread-safety issues prior to parallel handling
- [PATCH nbdkit 5/9] cache: Allow this filter to serve requests in parallel.
- [PATCH] nouveau: safen up nouveau_device list usage against concurrent access