thr3ads.net - Libguestfs - [Libguestfs] [nbdkit PATCH] nbd: Fix race during close [Nov 2018]

Eric Blake

2018-Nov-08 02:22 UTC

[Libguestfs] [nbdkit PATCH] nbd: Fix race during close

ThreadSanitizer [1] pointed out that in the nbd plugin, nbd_close() can
attempt close() in the main thread while the worker thread is still
attempting to start a read().  Normally, if the read() loses the race,
it will get a harmless EBADF that exits the worker thread (which is what
we want, as we are closing the connection anyway); but if another
connection happens to start in that window, we could end up read()ing
from the fd opened by the new connection, with disastrous results on the
second connection.

[1] ./configure CFLAGS=-fsanitize=thread LDFLAGS=-fsanitize=thread

Commits c70616f8 and 430f8141 tried to clean up deadlock during
shutdown, but missed that without some sort of locking, a
close-before-read was still possible. Swap lines so that pthread_join()
now serves as the locking to ensure close is not attempted while
another thread may be about to use the fd.

Thanks: Richard W.M. Jones
Signed-off-by: Eric Blake <eblake@redhat.com>
---

It took me a while to decipher how ThreadSanitizer actually tests
this race, which gets reported as a Write guarded by a mutex [caused
by close()] racing with an earlier Read [caused by read()].  I
finally realized that linking with libtsan installs wrappers around
the syscalls for read(), close(), etc. where the wrappers create an
underlying mutex and read/write operations on sentinel memory, so that
it can then reuse the memory race analysis it already has for more typical
data races.  The wrappers thus cause odd-looking reports for fd races
(the report ends up claiming that Thread 1 performing close() lost a
race to Thread 2 performing read() - even though the ACTUAL data race
is only a bug when Thread 2 loses the race and read()s on an fd
close()d by Thread 1 and possibly reused by Thread 3 in the meantime).

 plugins/nbd/nbd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/plugins/nbd/nbd.c b/plugins/nbd/nbd.c
index b9a4523..9130642 100644
--- a/plugins/nbd/nbd.c
+++ b/plugins/nbd/nbd.c
@@ -575,9 +575,9 @@ nbd_close (void *handle)
     nbd_request_raw (h, 0, NBD_CMD_DISC, 0, 0, 0, NULL);
     shutdown (h->fd, SHUT_WR);
   }
-  close (h->fd);
   if ((errno = pthread_join (h->reader, NULL)))
     nbdkit_debug ("failed to join reader thread: %m");
+  close (h->fd);
   pthread_mutex_destroy (&h->write_lock);
   pthread_mutex_destroy (&h->trans_lock);
   free (h);
-- 
2.17.2
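
The ordering the patch settles on (join the reader, then close the fd) can be sketched outside of nbdkit. This is a minimal illustration only, not the plugin's code: struct handle, reader_thread() and demo_join_before_close() are hypothetical names, and a socketpair() stands in for the NBD server connection, with the peer's close() playing the role that shutdown(SHUT_WR) plus the server's disconnect play in the plugin.

```c
#include <pthread.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the plugin's per-connection handle. */
struct handle {
  int fd;               /* socket shared with the reader thread */
  pthread_t reader;     /* worker thread looping on read() */
  size_t bytes_seen;    /* written by the reader, read only after join */
};

/* Worker: keep reading until EOF or error, then exit. */
static void *
reader_thread (void *arg)
{
  struct handle *h = arg;
  char buf[64];
  ssize_t r;

  while ((r = read (h->fd, buf, sizeof buf)) > 0)
    h->bytes_seen += (size_t) r;
  return NULL;
}

/* Returns 0 on success, -1 on failure.  On success *seen holds the
 * number of bytes the reader consumed before it exited.
 */
int
demo_join_before_close (size_t *seen)
{
  int sv[2];
  struct handle h = { .fd = -1, .bytes_seen = 0 };
  ssize_t w;
  int join_err;

  if (socketpair (AF_UNIX, SOCK_STREAM, 0, sv) == -1)
    return -1;
  h.fd = sv[0];
  if (pthread_create (&h.reader, NULL, reader_thread, &h) != 0) {
    close (sv[0]);
    close (sv[1]);
    return -1;
  }

  /* The peer sends data and hangs up, so the reader will see EOF. */
  w = write (sv[1], "hello", 5);
  close (sv[1]);

  /* Join first: once this returns, no other thread can be inside or
   * about to enter read() on h.fd.
   */
  join_err = pthread_join (h.reader, NULL);

  /* Only now is it safe to release the descriptor number. */
  close (h.fd);

  if (w != 5 || join_err != 0)
    return -1;
  *seen = h.bytes_seen;
  return 0;
}
```

Calling close() before pthread_join() here would reintroduce the window described in the commit message: the reader could still be about to call read() on a descriptor number that no longer belongs to it.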

Richard W.M. Jones

2018-Nov-08 10:50 UTC

Re: [Libguestfs] [nbdkit PATCH] nbd: Fix race during close

On Wed, Nov 07, 2018 at 08:22:30PM -0600, Eric Blake wrote:
> ThreadSanitizer [1] pointed out that in the nbd plugin, nbd_close() can
> attempt close() in the main thread while the worker thread is still
> attempting to start a read().  Normally, if the read() loses the race,
> it will get a harmless EBADF that exits the worker thread (which is what
> we want, as we are closing the connection anyway); but if another
> connection happens to start in that window, we could end up read()ing
> from the fd opened by the new connection, with disastrous results on the
> second connection.
> 
> [1] ./configure CFLAGS=-fsanitize=thread LDFLAGS=-fsanitize=thread
> 
> Commits c70616f8 and 430f8141 tried to clean up deadlock during
> shutdown, but missed that without some sort of locking, a
> close-before-read was still possible. Swap lines so that pthread_join()
> now serves as the locking to ensure close is not attempted while
> another thread may be about to use the fd.
> 
> Thanks: Richard W.M. Jones
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
> 
> It took me a while to decipher how ThreadSanitizer actually tests
> this race, which gets reported as a Write guarded by a mutex [caused
> by close()] racing with an earlier Read [caused by read()].  I
> finally realized that linking with libtsan installs wrappers around
> the syscalls for read(), close(), etc. where the wrappers create an
> underlying mutex and read/write operations on sentinel memory, so that
> it can then reuse the memory race analysis it already has for more typical
> data races.  The wrappers thus cause odd-looking reports for fd races
> (the report ends up claiming that Thread 1 performing close() lost a
> race to Thread 2 performing read() - even though the ACTUAL data race
> is only a bug when Thread 2 loses the race and read()s on an fd
> close()d by Thread 1 and possibly reused by Thread 3 in the meantime).
> 
>  plugins/nbd/nbd.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/plugins/nbd/nbd.c b/plugins/nbd/nbd.c
> index b9a4523..9130642 100644
> --- a/plugins/nbd/nbd.c
> +++ b/plugins/nbd/nbd.c
> @@ -575,9 +575,9 @@ nbd_close (void *handle)
>      nbd_request_raw (h, 0, NBD_CMD_DISC, 0, 0, 0, NULL);
>      shutdown (h->fd, SHUT_WR);
>    }
> -  close (h->fd);
>    if ((errno = pthread_join (h->reader, NULL)))
>      nbdkit_debug ("failed to join reader thread: %m");
> +  close (h->fd);
>    pthread_mutex_destroy (&h->write_lock);
>    pthread_mutex_destroy (&h->trans_lock);
>    free (h);
> -- 
Well spotted :-)  Thanks.

ACK.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org
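
The "disastrous results on the second connection" scenario in the commit message rests on descriptor numbers being recycled: POSIX has open() return the lowest-numbered descriptor that is not currently open, so a number freed by close() is typically handed straight back out. A small single-threaded sketch of that behaviour follows; fd_number_is_reused() is a hypothetical helper, and /dev/null is assumed to exist and be readable.

```c
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if the descriptor number released by close() is handed out
 * again by the next open(), 0 if it is not, and -1 on error.
 * Assumes no other thread opens or closes descriptors concurrently.
 */
int
fd_number_is_reused (void)
{
  int first, second, reused;

  first = open ("/dev/null", O_RDONLY);
  if (first == -1)
    return -1;
  close (first);                /* the number is now free again */

  second = open ("/dev/null", O_RDONLY);
  if (second == -1)
    return -1;

  /* A thread still holding the stale value of 'first' would now be
   * operating on whatever 'second' refers to.
   */
  reused = (second == first);
  close (second);
  return reused;
}
```

In the plugin's case the stale holder is the reader thread and the new owner is a second connection's socket, which is why the close() has to wait until the reader has been joined.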
