Eric Blake
2018-Apr-19 02:51 UTC
[Libguestfs] [nbdkit PATCH 0/2] Fix testsuite deadlocks during close
Commit 9e6d990f exposed a pre-existing deadlock between the nbd plugin as client and parallel nbdkit as server. Prior to that commit, the deadlock was "resolved" because we unloaded the .so in parallel with a .close callback that never completed (yes, it's nasty that this usually(?) let the testsuite pass), but now we correctly refuse to unload a plugin that has not returned from .close, which makes 'test-nbd' and 'test-parallel-nbd.sh' hang.

Here are two patches, each of which independently breaks the deadlock; both should be applied so that we are robust against non-nbdkit partners that exhibit the same behavior as our unpatched code.

Since testsuite hangs make automated testing less than useful, I'm going ahead and pushing them now.

Eric Blake (2):
  nbd: Inform server of intent to quit
  nbd: Don't read after client sends NBD_CMD_DISC

 src/connections.c | 5 +++--
 plugins/nbd/nbd.c | 4 +++-
 2 files changed, 6 insertions(+), 3 deletions(-)

-- 
2.14.3
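The deadlock described in the cover letter can be reproduced in miniature. The following standalone sketch is not nbdkit code; the socketpair, the thread, and the alarm() watchdog are purely illustrative. It shows both ends of a connection blocked in read(), each waiting for the other side to either write or hang up, which is exactly the situation the two patches break.

/* Standalone illustration (not nbdkit code): both sides of a socketpair
 * block in read(), each waiting for the other.  The alarm() is only a
 * watchdog so the demo terminates instead of hanging forever. */
#include <stdio.h>
#include <unistd.h>
#include <signal.h>
#include <pthread.h>
#include <sys/socket.h>

static int fds[2];

/* "Server" side: blocks waiting for a request that will never arrive. */
static void *
server_side (void *arg)
{
  char c;
  ssize_t r = read (fds[1], &c, 1);
  fprintf (stderr, "server read returned %zd\n", r);
  return NULL;
}

static void
on_alarm (int sig)
{
  static const char msg[] = "deadlock: both sides still blocked in read()\n";
  write (STDERR_FILENO, msg, sizeof msg - 1);
  _exit (1);
}

int
main (void)
{
  pthread_t tid;
  char c;
  ssize_t r;

  if (socketpair (AF_UNIX, SOCK_STREAM, 0, fds) == -1) {
    perror ("socketpair");
    return 1;
  }
  signal (SIGALRM, on_alarm);
  alarm (2);

  pthread_create (&tid, NULL, server_side, NULL);

  /* "Client" side: has sent its last command and now waits for a reply
   * (or EOF) that the server will never send.  Neither read() can make
   * progress, so the watchdog fires. */
  r = read (fds[0], &c, 1);
  fprintf (stderr, "client read returned %zd\n", r);

  pthread_join (tid, NULL);
  return 0;
}

Build with 'cc -pthread demo.c'; the run ends with the watchdog message, since neither read() ever returns on its own.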
Eric Blake
2018-Apr-19 02:51 UTC
[Libguestfs] [nbdkit PATCH 1/2] nbd: Inform server of intent to quit
During .close, the nbd plugin waits for pthread_join() to collect the reader thread, ensuring we have read any remaining replies from the server before giving up control. However, after we have sent NBD_CMD_DISC, we end up in a blocking read() that won't progress until the server hangs up its end (the NBD protocol says the server won't reply to NBD_CMD_DISC, but it may still be flushing replies to other pending requests). But if the server is likewise stuck in a read(), waiting to see if we (erroneously) send anything after NBD_CMD_DISC (which is the case when nbdkit is the remote server using parallel threads, where the next read() is started before the command just read off the wire has been processed), then our nbd plugin and the remote NBD server are deadlocked, with both sides trying to read from the other.

Prior to commit 9e6d990f, this deadlock was (typically?) not visible in the testsuite: after the client quit, nbdkit would unload the nbd plugin without waiting for .close to complete, at which point process exit() broke the deadlock (unloading the .so while code is still running might also explain some harder-to-reproduce crashes that sometimes plagued the testsuite). But now that an ongoing .close inhibits an early unload (which is a GOOD thing), the deadlock hangs the testsuite on tests involving the nbd plugin whenever the server doesn't immediately hang up after NBD_CMD_DISC.

Using nonblocking reads with a poll() loop that handles both reads and writes from a single thread could probably solve the issue (then nbd_close wouldn't have to wait to join a thread stuck on blocking I/O), but refactoring the code to support nonblocking I/O is rather invasive. Instead, we can use shutdown() to inform the remote server that we will not be writing any more data to the socket; if the remote server is blocked on read(), it now sees an immediate EOF rather than waiting forever for something we will never write, and can proceed along its cleanup paths to eventually close its end, so that our reader thread likewise completes quickly and lets us leave nbd_close(); then the .so can be unloaded with no plugin code running.

Note that the deadlock also goes away if the server hangs up automatically, after flushing all prior pending requests, when NBD_CMD_DISC is received; nbdkit will be patched along those lines in the next patch. But while that fixes the testsuite in our own setup, it's better to fix our plugin not to rely on that server behavior, since we can't guarantee that all other servers behave the same way as nbdkit.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 plugins/nbd/nbd.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/plugins/nbd/nbd.c b/plugins/nbd/nbd.c
index a4a1f12..b9e72bc 100644
--- a/plugins/nbd/nbd.c
+++ b/plugins/nbd/nbd.c
@@ -568,8 +568,10 @@ nbd_close (void *handle)
 {
   struct handle *h = handle;
 
-  if (!h->dead)
+  if (!h->dead) {
     nbd_request_raw (h, 0, NBD_CMD_DISC, 0, 0, 0, NULL);
+    shutdown (h->fd, SHUT_WR);
+  }
   close (h->fd);
   if ((errno = pthread_join (h->reader, NULL)))
     nbdkit_debug ("failed to join reader thread: %m");
-- 
2.14.3
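The fix relies on the half-close semantics of shutdown(2): after shutdown(fd, SHUT_WR) the peer's blocked read() returns EOF at once, while data the peer had already queued for us can still be read. A minimal standalone sketch of that behavior follows; it is not nbdkit code, and the socketpair merely stands in for the client/server connection.

/* Standalone illustration (not nbdkit code) of the half-close the patch
 * adds: after shutdown(SHUT_WR) the peer's read() sees EOF immediately,
 * yet a reply the peer had already queued can still be drained. */
#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>

int
main (void)
{
  int fds[2];
  char buf[16];

  if (socketpair (AF_UNIX, SOCK_STREAM, 0, fds) == -1) {
    perror ("socketpair");
    return 1;
  }

  /* The "server" (fds[1]) still has a reply in flight for us. */
  if (write (fds[1], "reply", 5) != 5) {
    perror ("write");
    return 1;
  }

  /* The "client" (fds[0]) announces it will not write anything more,
   * mirroring the shutdown (h->fd, SHUT_WR) added to nbd_close. */
  shutdown (fds[0], SHUT_WR);

  /* The server's read no longer blocks: it gets EOF (0) immediately. */
  printf ("server read: %zd (0 means EOF)\n", read (fds[1], buf, sizeof buf));

  /* The client can still drain the reply that was already queued. */
  printf ("client read: %zd bytes\n", read (fds[0], buf, sizeof buf));

  close (fds[0]);
  close (fds[1]);
  return 0;
}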
Eric Blake
2018-Apr-19 02:51 UTC
[Libguestfs] [nbdkit PATCH 2/2] nbd: Don't read after client sends NBD_CMD_DISC
According to the NBD spec, the server should close the connection rather than replying to NBD_CMD_DISC (after flushing any pending replies to earlier commands), and a compliant client should not send any data after that command. However, when nbdkit is running multithreaded, we already have multiple threads competing for a read lock, and each of those threads would try to read() from the socket, which will never make progress unless the client hangs up so that the read fails with EOF.

But if the client waits to close its end until after seeing EOF from the server (as was the case with the nbd plugin as the client until the previous patch), then both sides are deadlocked on a read. We already short-circuit a read attempt when the socket is closed; we should likewise avoid a read attempt after the client has requested a disconnect, so that we aren't at the mercy of the client properly using shutdown().

Note that this patch in isolation without the previous patch to the nbd plugin is sufficient to make the testsuite deadlock go away.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 src/connections.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/connections.c b/src/connections.c
index 46f2cd4..6c1f8cd 100644
--- a/src/connections.c
+++ b/src/connections.c
@@ -1056,8 +1056,9 @@ recv_request_send_reply (struct connection *conn)
   /* Read the request packet. */
   {
     ACQUIRE_LOCK_FOR_CURRENT_SCOPE (&conn->read_lock);
-    if (get_status (conn) < 0)
-      return -1;
+    r = get_status (conn);
+    if (r <= 0)
+      return r;
     r = conn->recv (conn, &request, sizeof request);
     if (r == -1) {
       nbdkit_error ("read request: %m");
-- 
2.14.3
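The change leans on a tri-state status: negative for a fatal error, zero once the client has requested a disconnect, positive while the connection is still being served; only the positive case should ever reach conn->recv(). A simplified sketch of that pattern follows; struct fake_conn, recv_one_request, and this get_status are illustrative stand-ins, not nbdkit's actual internals.

/* Simplified sketch (not nbdkit's actual internals) of the tri-state
 * status check: -1 means a fatal error, 0 means the client has already
 * requested a disconnect, 1 means keep serving.  Only the positive case
 * may block in a read on the socket. */
#include <stdio.h>

struct fake_conn {
  int status;                   /* -1 error, 0 disconnecting, 1 running */
};

static int
get_status (struct fake_conn *conn)
{
  return conn->status;
}

static int
recv_one_request (struct fake_conn *conn)
{
  int r = get_status (conn);
  if (r <= 0)
    return r;                   /* propagate 0 (clean quit) or -1 (error) */

  /* In the real server this is where conn->recv() would block waiting
   * for the next request header off the wire. */
  printf ("reading next request...\n");
  conn->status = 0;             /* pretend that request was NBD_CMD_DISC */
  return 1;
}

int
main (void)
{
  struct fake_conn conn = { .status = 1 };
  int r;

  while ((r = recv_one_request (&conn)) > 0)
    ;                           /* serve until disconnect or error */

  printf ("loop exited with status %d (0 = clean disconnect)\n", r);
  return r < 0;
}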
Richard W.M. Jones
2018-Apr-19 13:54 UTC
Re: [Libguestfs] [nbdkit PATCH 2/2] nbd: Don't read after client sends NBD_CMD_DISC
On Wed, Apr 18, 2018 at 09:51:08PM -0500, Eric Blake wrote:
> According to the NBD spec, the server should close the connection
> rather than replying to NBD_CMD_DISC (after flushing any pending
> replies to earlier commands), and a compliant client should not
> send any data after that command. However, when nbdkit is
> running multithreaded, we already have multiple threads competing
> for a read lock, and each of those threads would try to read()
> from the socket, which will never make progress unless the client
> hangs up so that the read fails with EOF.
>
> But if the client waits to close its end until after seeing EOF
> from the server (as was the case with the nbd plugin as the client
> until the previous patch), then both sides are deadlocked on a
> read. We already short-circuit a read attempt when the socket
> is closed; we should likewise avoid a read attempt after the
> client has requested a disconnect, so that we aren't at the
> mercy of the client properly using shutdown().
>
> Note that this patch in isolation without the previous patch to
> the nbd plugin is sufficient to make the testsuite deadlock go
> away.
>
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
>  src/connections.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/src/connections.c b/src/connections.c
> index 46f2cd4..6c1f8cd 100644
> --- a/src/connections.c
> +++ b/src/connections.c
> @@ -1056,8 +1056,9 @@ recv_request_send_reply (struct connection *conn)
>    /* Read the request packet. */
>    {
>      ACQUIRE_LOCK_FOR_CURRENT_SCOPE (&conn->read_lock);
> -    if (get_status (conn) < 0)
> -      return -1;
> +    r = get_status (conn);
> +    if (r <= 0)
> +      return r;
>      r = conn->recv (conn, &request, sizeof request);
>      if (r == -1) {
>        nbdkit_error ("read request: %m");

These look fine thanks, so:

ACK series.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines. Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top