Displaying 20 results from an estimated 3000 matches similar to: "[nbdkit PATCH 0/2] Improve shutdown race in nbd plugin"
2020 Mar 27
1
Re: [nbdkit PATCH 2/2] nbd: Reorder cleanup to avoid getting stuck in poll()
On Fri, Mar 27, 2020 at 05:33:28PM -0500, Eric Blake wrote:
> We have been seeing sporadic hangs on test-nbd-tls-psk.sh, where even
> though the client to the 'nbdkit nbd' process has cleanly exited,
> things are stalled in .close where nbd is trying to pthread_join() the
> reader thread, while the reader thread is itself blocked on a poll()
> that will never make additional
2020 Mar 30
4
[libnbd PATCH 0/2] fix hangs against nbdkit 1.2
nbdkit 1.2 as a server waits for read() to see EOF, even after the
client has sent NBD_CMD_DISC. That was fixed in nbdkit 1.4; and most
modern NBD servers are smarter than this (they close() the write end
of their traffic soon after NBD_CMD_DISC). But it's easy enough to
revert nbdkit commit c70616f8 to get back to a server with the same
behavior as the older nbdkit, at which point both
2020 Mar 27
0
[nbdkit PATCH 2/2] nbd: Reorder cleanup to avoid getting stuck in poll()
We have been seeing sporadic hangs on test-nbd-tls-psk.sh, where even
though the client to the 'nbdkit nbd' process has cleanly exited,
things are stalled in .close where nbd is trying to pthread_join() the
reader thread, while the reader thread is itself blocked on a poll()
that will never make additional progress. Tracing the race is
difficult: nbd_shutdown() sends NBD_CMD_DISC to the
2019 Jun 28
1
[libnbd PATCH] disconnect: Prevent any further commands
Once the client has requested NBD_CMD_DISC, the protocol states that
it must not send any further information to the server (further writes
may still be needed for a clean TLS shutdown, but that's a different
matter requiring more states).
Our state machine can prevent some of this if we have moved to CLOSED,
but that's not foolproof because we can queue commands that can't be
written
2020 Sep 11
3
[libnbd PATCH] api: Add LIBNBD_SHUTDOWN_IMMEDIATE flag
As mentioned in commits 176fc4ea and 609c25f0, our original plan in
adding a flags argument to nbd_shutdown was to let us specify
different behaviors at the libnbd level, rather than NBD protocol
flags (for that, the user has nbd_aio_disconnect). But when we later
parameterized OFlags to accept various bitmasks (commit f891340b), we
failed to mark nbd_shutdown as using a different bitmask than
2020 Mar 27
0
[nbdkit PATCH 1/2] nbd: Don't reference stale errno in reader loop
When switching to libnbd in commit ab7760fcfd, I mistakenly assumed
that after a POLLIN event fires on the pipe-to-self, then read() will
return either 1 or -1. But this is not true; read() can also return 0
(if the pipe hits EOF), in which case POSIX says errno has an
unspecified value, and we should not be deciding whether to log an
error based on a random value. I did not manage to fix the
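
The read() behaviour described in this snippet can be illustrated with a small sketch. It is not the code from the patch: the function name drain_wake_pipe and the wake_result enum are hypothetical, chosen only for illustration. The point it shows is that errno is examined solely when read() returns -1, and a return of 0 (EOF) is handled as its own case.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical classification of one read() from a pipe-to-self. */
enum wake_result { WAKE_BYTE, WAKE_EOF, WAKE_RETRY, WAKE_ERROR };

/* Read one byte from fd and classify the result.  errno is consulted
 * only when read() returned -1; after a return of 0 (EOF) its value
 * is unspecified and must not drive any decision. */
static enum wake_result
drain_wake_pipe (int fd)
{
  char c;
  ssize_t r = read (fd, &c, 1);

  if (r == 1)
    return WAKE_BYTE;           /* a wake-up byte was consumed */
  if (r == 0)
    return WAKE_EOF;            /* writer closed the pipe; ignore errno */
  if (errno == EINTR || errno == EAGAIN)
    return WAKE_RETRY;          /* transient; caller may poll again */
  return WAKE_ERROR;            /* genuine failure; errno is meaningful */
}
```

A caller would typically log an error only for WAKE_ERROR and treat WAKE_EOF as a normal shutdown signal.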
2019 May 22
12
[libnbd PATCH v3 0/7] Avoid deadlock with in-flight commands
Since v2:
- rebase to Rich's new API calls
- more refactoring in patch 1 (retitled)
- new patches 3 and 4
- fix data corruption in patch 6 (was 4)
- more tweaks to the reproducer example (including using new API from 3)
Eric Blake (7):
lib: Refactor command_common() to do more common work
commands: Allow for a command queue
commands: Expose FIFO ordering of server completions
2019 May 22
10
[libnbd PATCH v2 0/5] Avoid deadlock with in-flight commands
On v1, we discussed whether cmds_to_issue needed to be a list, since
it never had more than one element. I played with the idea of making
it a list, and allowing the client to queue up new commands regardless
of whether the state machine is currently in READY. I also polished up
the tmp demo into a bit more full-fledged example file, worth
including since it also let me discover a hard-to-hit race
2017 Nov 21
6
[nbdkit PATCH v2 0/4] enable parallel nbd forwarding
With this, I am finally able to get the nbd plugin to do out-of-order
responses to the client. Once this series goes in, we should be
ready for Rich to cut a release.
Eric Blake (4):
nbd: Split reading into separate thread
nbd: Protect writes with mutex
nbd: Enable parallel handling
tests: Test parallel nbd behavior
plugins/nbd/nbd.c | 217
2019 Jun 29
19
[libnbd PATCH 0/6] new APIs: aio_in_flight, aio_FOO_notify
I still need to wire in the use of *_notify functions into nbdkit to
prove whether it makes the code any faster or easier to maintain, but
at least the added example shows one good use case for the new API.
Eric Blake (6):
api: Add nbd_aio_in_flight
generator: Allow DEAD state actions to run
generator: Allow Int64 in callbacks
states: Prepare for aio notify callback
api: Add new
2020 Mar 28
0
[nbdkit PATCH v2] nbd: Avoid stuck poll() in nbdplug_close_handle()
We have been seeing sporadic hangs on test-nbd-tls-psk.sh and
test-nbd-tls.sh (on my machine, running those two in a loop with
commits 0a76cae4 and 09e34ba2 reverted would fail within 100
attempts), where even though the client to the 'nbdkit nbd' process
has cleanly exited, things are stalled in .close where nbd is trying
to pthread_join() the reader thread, while the reader thread is
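
The hang pattern described here (pthread_join() waiting on a thread parked in poll()) can be avoided in general by giving the reader a second descriptor that the closing thread can signal. The sketch below is a generic illustration of that idea, not the change made by this patch; struct reader, reader_start and reader_stop are hypothetical names, and the "socket" handling is reduced to discarding data.

```c
#include <errno.h>
#include <poll.h>
#include <pthread.h>
#include <unistd.h>

/* Hypothetical reader state, for illustration only. */
struct reader {
  int sock_fd;       /* descriptor the reader normally waits on */
  int wake_fd[2];    /* pipe-to-self: [0] is polled, [1] is signalled */
  pthread_t thread;
};

static void *
reader_loop (void *arg)
{
  struct reader *r = arg;
  struct pollfd fds[2] = {
    { .fd = r->sock_fd,    .events = POLLIN },
    { .fd = r->wake_fd[0], .events = POLLIN },
  };
  char buf[512];

  for (;;) {
    if (poll (fds, 2, -1) == -1) {
      if (errno == EINTR)
        continue;
      break;
    }
    if (fds[1].revents != 0)    /* wake byte or hang-up: stop */
      break;
    /* A real reader would process the data; this sketch discards it. */
    if (fds[0].revents != 0 && read (r->sock_fd, buf, sizeof buf) <= 0)
      break;
  }
  return NULL;
}

static int
reader_start (struct reader *r)
{
  if (pipe (r->wake_fd) == -1)
    return -1;
  if (pthread_create (&r->thread, NULL, reader_loop, r) != 0) {
    close (r->wake_fd[0]);
    close (r->wake_fd[1]);
    return -1;
  }
  return 0;
}

/* Wake the reader first, then join it.  Closing the write end also
 * raises POLLHUP on the read end, so the reader wakes even if the
 * write itself failed. */
static int
reader_stop (struct reader *r)
{
  char c = 0;
  int ret = 0;

  if (write (r->wake_fd[1], &c, 1) != 1)
    ret = -1;
  close (r->wake_fd[1]);
  if (pthread_join (r->thread, NULL) != 0)
    ret = -1;
  close (r->wake_fd[0]);
  return ret;
}
```

The ordering in reader_stop is the design choice that matters: the wake-up is delivered before pthread_join() is called, so the join cannot wait on a poll() that has nothing left to report.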
2020 Mar 19
2
Re: Anyone seen build hangs (esp armv7, s390x) in Fedora?
[Dropping devel, adding libguestfs]
This can be reproduced on x86-64 so I can reproduce it locally. It
only appears to happen when the tests are run under rpmbuild, not when
I run them as ‘make check’, but I'm unclear why this is.
As Eric described earlier, the test runs two copies of nbdkit and a
client, connected like this:
qemu-img info ===> nbdkit nbd ===> nbdkit example1
2018 Nov 08
1
[nbdkit PATCH] nbd: Fix race during close
ThreadSanitizer [1] pointed out that in the nbd plugin, nbd_close() can
attempt close() in the main thread while the worker thread is still
attempting to start a read(). Normally, if the read() loses the race,
it will get a harmless EBADF that exits the worker thread (which is what
we want, as we are closing the connection anyway); but if another
connection happens to start in that window, we
2018 Apr 19
3
[nbdkit PATCH 0/2] Fix testsuite deadlocks during close
Commit 9e6d990f exposed a pre-existing deadlock between the nbd
plugin as client and parallel nbdkit as server. Prior to that
commit, the deadlock was "resolved" because we unloaded the .so
in parallel to a .close callback that never completed (yes, it's
nasty that it usually? let the testsuite pass), but now we
correctly refuse to unload a plugin that has not returned from
.close,
2017 Nov 14
8
[nbdkit PATCH v2 0/2] add nbd plugin
I'm still working on the interleaving (and Rich reminded me on IRC
that we still don't have THREAD_MODEL_PARALLEL working anywhere
yet, anyways). Since nbdkit doesn't really have a parallel plugin
yet, my testing on that front will have to use qemu-nbd as the
original server, as well as qemu-io as the driver (qemu-io's
aio_read and aio_write commands can be used to trigger
2019 Jun 04
9
[PATCH libnbd v2 0/4] api: Implement concurrent writer.
v1:
https://www.redhat.com/archives/libguestfs/2019-June/msg00014.html
I pushed a few bits which are uncontroversial. The main
changes since v1 are:
An extra patch removes the want_to_send / check for nbd_aio_is_ready
in examples/threaded-reads-and-writes.c. This logic was wrong since
commit 6af72b87 as was pointed out by Eric in his review. Comments
and structure of
2019 Sep 26
5
[PATCH libnbd 1/2] lib: Avoid killing subprocess twice.
If the user calls nbd_kill_subprocess, we shouldn't kill the process
again when we close the handle (since the process has likely gone and
we might be killing a different process).
---
lib/handle.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/lib/handle.c b/lib/handle.c
index 2af25fe..5ad818e 100644
--- a/lib/handle.c
+++ b/lib/handle.c
@@ -315,6 +315,8 @@
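
The hunk body is not shown in this listing, so the following is only a sketch of the general idea stated in the commit message, not the contents of the patch. It assumes a hypothetical struct handle that records the child's pid and uses -1 to mean "no child left to signal"; the function names are likewise made up for illustration.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical handle; pid is -1 once the child must not be signalled. */
struct handle {
  pid_t pid;
};

/* Kill and reap the child, then forget its pid so that the same
 * number (possibly reused by an unrelated process) is never signalled
 * again.  Returns the number of signals sent (0 or 1), or -1 on error. */
static int
handle_kill_subprocess (struct handle *h)
{
  if (h->pid <= 0)
    return 0;
  if (kill (h->pid, SIGKILL) == -1)
    return -1;
  waitpid (h->pid, NULL, 0);
  h->pid = -1;
  return 1;
}

/* Closing the handle kills the child only if that has not already
 * happened. */
static int
handle_close (struct handle *h)
{
  return handle_kill_subprocess (h);
}
```

With this shape, calling handle_kill_subprocess followed by handle_close sends exactly one signal; the second call finds pid set to -1 and does nothing.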
2020 Sep 11
10
[libnbd PATCH v2 0/5] Add knobs for client- vs. server-side validation
In v2:
- now based on my proposal to add LIBNBD_SHUTDOWN_IMMEDIATE
- four flags instead of two: STRICT_FLAGS is new (patch 4),
and STRICT_BOUNDS is separate from STRICT_ZERO_SIZE (patch 5)
- various refactorings for more shared code and less duplication
Eric Blake (5):
api: Add xxx_MASK constant for each Flags type
generator: Refactor filtering of accepted OFlags
api: Add
2019 May 21
9
[libnbd PATCH 0/3] Avoid deadlock with in-flight commands
This might not be the final solution, but it certainly seems to solve
a deadlock for me that I could trigger by using 'nbdkit
--filter=noparallel memory 512k' and calling nbd_aio_pread for a
request larger than 256k (enough for the Linux kernel to block the
server until libnbd read()s), immediately followed by nbd_aio_pwrite
for a request larger than 256k (enough to block libnbd until the
2019 Sep 30
4
[PATCH libnbd v2 0/2] Implement systemd socket activation.
v1 was posted here:
https://www.redhat.com/archives/libguestfs/2019-September/thread.html#00337
v2:
- Drop the first patch.
- Hopefully fix the multiple issues with fork-safety and general
behaviour on error paths.
Note this requires execvpe for which there seems to be no equivalent
on FreeBSD, except some kind of tedious path parsing (but can we
assign to environ?)
Rich.