similar to: [nbdkit PATCH 0/2] Improve shutdown race in nbd plugin

Displaying 20 results from an estimated 3000 matches similar to: "[nbdkit PATCH 0/2] Improve shutdown race in nbd plugin"

2020 Mar 27
1
Re: [nbdkit PATCH 2/2] nbd: Reorder cleanup to avoid getting stuck in poll()
On Fri, Mar 27, 2020 at 05:33:28PM -0500, Eric Blake wrote: > We have been seeing sporadic hangs on test-nbd-tls-psk.sh, where even > though the client to the 'nbdkit nbd' process has cleanly exited, > things are stalled in .close where nbd is trying to pthread_join() the > reader thread, while the reader thread is itself blocked on a poll() > that will never make additional
2020 Mar 30
4
[libnbd PATCH 0/2] fix hangs against nbdkit 1.2
nbdkit 1.2 as a server waits for read() to see EOF, even after the client has sent NBD_CMD_DISC. That was fixed in nbdkit 1.4; and most modern NBD servers are smarter than this (they close() the write end of their traffic soon after NBD_CMD_DISC). But it's easy enough to revert nbdkit commit c70616f8 to get back to a server with the same behavior as the older nbdkit, at which point both
2020 Mar 27
0
[nbdkit PATCH 2/2] nbd: Reorder cleanup to avoid getting stuck in poll()
We have been seeing sporadic hangs on test-nbd-tls-psk.sh, where even though the client to the 'nbdkit nbd' process has cleanly exited, things are stalled in .close where nbd is trying to pthread_join() the reader thread, while the reader thread is itself blocked on a poll() that will never make additional progress. Tracing the race is difficult: nbd_shutdown() sends NBD_CMD_DISC to the
2019 Jun 28
1
[libnbd PATCH] disconnect: Prevent any further commands
Once the client has requested NBD_CMD_DISC, the protocol states that it must not send any further information to the server (further writes may still be needed for a clean TLS shutdown, but that's a different matter requiring more states). Our state machine can prevent some of this if we have moved to CLOSED, but that's not foolproof because we can queue commands that can't be written
2020 Sep 11
3
[libnbd PATCH] api: Add LIBNBD_SHUTDOWN_IMMEDIATE flag
As mentioned in commits 176fc4ea and 609c25f0, our original plan in adding a flags argument to nbd_shutdown was to let us specify different behaviors at the libnbd level, rather than NBD protocol flags (for that, the user has nbd_aio_disconnect). But when we later parameterized OFlags to accept various bitmasks (commit f891340b), we failed to mark nbd_shutdown as using a different bitmask than
2020 Mar 27
0
[nbdkit PATCH 1/2] nbd: Don't reference stale errno in reader loop
When switching to libnbd in commit ab7760fcfd, I mistakenly assumed that after a POLLIN event fires on the pipe-to-self, then read() will return either 1 or -1. But this is not true; read() can also return 0 (if the pipe hits EOF), in which case POSIX says errno has an unspecified value, and we should not be deciding whether to log an error based on a random value. I did not manage to fix the
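The point about read() having three possible outcomes on a pipe-to-self can be illustrated with a minimal sketch. This is not the code from the patch: the names drain_wake_pipe and enum wake_result are hypothetical, and the only assumption is a POSIX pipe whose read end is passed in as fd. errno is consulted only when read() returns -1, and is saved immediately so a later call cannot overwrite it.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical classification of one read() on a wake-up pipe. */
enum wake_result { WAKE_BYTE, WAKE_EOF, WAKE_RETRY, WAKE_ERROR };

/* Read one byte from the pipe-to-self and classify the outcome.
 * *err is set to the saved errno only for WAKE_ERROR, and to 0 otherwise. */
static enum wake_result
drain_wake_pipe (int fd, int *err)
{
  char c;
  ssize_t r = read (fd, &c, 1);

  *err = 0;
  if (r == 1)
    return WAKE_BYTE;          /* normal wake-up */
  if (r == 0)
    return WAKE_EOF;           /* write end closed; errno is NOT meaningful */

  assert (r == -1);            /* the only remaining result for a 1-byte read */
  if (errno == EINTR || errno == EAGAIN)
    return WAKE_RETRY;         /* transient; caller may poll again */
  *err = errno;                /* save before anything else can clobber it */
  return WAKE_ERROR;
}
```

A caller would log an error only for WAKE_ERROR, using the saved value in err, and would treat WAKE_EOF as a quiet shutdown signal rather than inspecting errno.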
2019 May 22
12
[libnbd PATCH v3 0/7] Avoid deadlock with in-flight commands
Since v2: - rebase to Rich's new API calls - more refactoring in patch 1 (retitled) - new patches 3 and 4 - fix data corruption in patch 6 (was 4) - more tweaks to the reproducer example (including using new API from 3) Eric Blake (7): lib: Refactor command_common() to do more common work commands: Allow for a command queue commands: Expose FIFO ordering of server completions
2019 May 22
10
[libnbd PATCH v2 0/5] Avoid deadlock with in-flight commands
On v1, we discussed whether cmds_to_issue needed to be a list, since it never had more than one element. I played with the idea of making it a list, and allowing the client to queue up new commands regardless of whether the state machine is currently in READY. I also polished up the tmp demo into a bit more full-fledged example file, worth including since it also let me discover a hard-to-hit race
2017 Nov 21
6
[nbdkit PATCH v2 0/4] enable parallel nbd forwarding
With this, I am finally able to get the nbd plugin to do out-of-order responses to the client. Once this series goes in, we should be ready for Rich to cut a release. Eric Blake (4): nbd: Split reading into separate thread nbd: Protect writes with mutex nbd: Enable parallel handling tests: Test parallel nbd behavior plugins/nbd/nbd.c | 217
2019 Jun 29
19
[libnbd PATCH 0/6] new APIs: aio_in_flight, aio_FOO_notify
I still need to wire in the use of *_notify functions into nbdkit to prove whether it makes the code any faster or easier to maintain, but at least the added example shows one good use case for the new API. Eric Blake (6): api: Add nbd_aio_in_flight generator: Allow DEAD state actions to run generator: Allow Int64 in callbacks states: Prepare for aio notify callback api: Add new
2020 Mar 28
0
[nbdkit PATCH v2] nbd: Avoid stuck poll() in nbdplug_close_handle()
We have been seeing sporadic hangs on test-nbd-tls-psk.sh and test-nbd-tls.sh (on my machine, running those two in a loop with commits 0a76cae4 and 09e34ba2 reverted would fail within 100 attempts), where even though the client to the 'nbdkit nbd' process has cleanly exited, things are stalled in .close where nbd is trying to pthread_join() the reader thread, while the reader thread is
2020 Mar 19
2
Re: Anyone seen build hangs (esp armv7, s390x) in Fedora?
[Dropping devel, adding libguestfs] This can be reproduced on x86-64 so I can reproduce it locally. It only appears to happen when the tests are run under rpmbuild, not when I run them as ‘make check’, but I'm unclear why this is. As Eric described earlier, the test runs two copies of nbdkit and a client, connected like this: qemu-img info ===> nbdkit nbd ===> nbdkit example1
2018 Nov 08
1
[nbdkit PATCH] nbd: Fix race during close
ThreadSanitizer [1] pointed out that in the nbd plugin, nbd_close() can attempt close() in the main thread while the worker thread is still attempting to start a read(). Normally, if the read() loses the race, it will get a harmless EBADF that exits the worker thread (which is what we want, as we are closing the connection anyway); but if another connection happens to start in that window, we
2018 Apr 19
3
[nbdkit PATCH 0/2] Fix testsuite deadlocks during close
Commit 9e6d990f exposed a pre-existing deadlock between the nbd plugin as client and parallel nbdkit as server. Prior to that commit, the deadlock was "resolved" because we unloaded the .so in parallel to a .close callback that never completed (yes, it's nasty that it usually? let the testsuite pass), but now we correctly refuse to unload a plugin that has not returned from .close,
2017 Nov 14
8
[nbdkit PATCH v2 0/2] add nbd plugin
I'm still working on the interleaving (and Rich reminded me on IRC that we still don't have THREAD_MODEL_PARALLEL working anywhere yet, anyways). Since nbdkit doesn't really have a parallel plugin yet, my testing on that front will have to use qemu-nbd as the original server, as well as qemu-io as the driver (qemu-io's aio_read and aio_write commands can be used to trigger
2019 Jun 04
9
[PATCH libnbd v2 0/4] api: Implement concurrent writer.
v1: https://www.redhat.com/archives/libguestfs/2019-June/msg00014.html I pushed a few bits which are uncontroversial. The main changes since v1 are: An extra patch removes the want_to_send / check for nbd_aio_is_ready in examples/threaded-reads-and-writes.c. This logic was wrong since commit 6af72b87 as was pointed out by Eric in his review. Comments and structure of
2019 Sep 26
5
[PATCH libnbd 1/2] lib: Avoid killing subprocess twice.
If the user calls nbd_kill_subprocess, we shouldn't kill the process again when we close the handle (since the process has likely gone and we might be killing a different process). --- lib/handle.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/lib/handle.c b/lib/handle.c index 2af25fe..5ad818e 100644 --- a/lib/handle.c +++ b/lib/handle.c @@ -315,6 +315,8 @@
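The idea of not signalling the same PID twice can be sketched as follows. The excerpt above truncates the actual diff, so this is not the libnbd change: struct demo_handle and kill_subprocess_once are hypothetical names, and the sketch simply assumes the handle forgets the PID once it has been signalled, so that a later close path has nothing left to kill.

```c
#include <signal.h>
#include <sys/types.h>

/* Hypothetical handle; pid <= 0 means "no subprocess to signal". */
struct demo_handle {
  pid_t pid;
};

/* Send signum to the subprocess at most once per handle.
 * Returns the result of kill(), or 0 if there was nothing to signal. */
static int
kill_subprocess_once (struct demo_handle *h, int signum)
{
  int r;

  if (h->pid <= 0)
    return 0;                  /* already signalled, or never started */

  r = kill (h->pid, signum);
  h->pid = -1;                 /* forget the PID: it may be reused later */
  return r;
}
```

With this shape, both an explicit kill request and the close path can call the same helper, and whichever runs second becomes a no-op.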
2020 Sep 11
10
[libnbd PATCH v2 0/5] Add knobs for client- vs. server-side validation
In v2: - now based on my proposal to add LIBNBD_SHUTDOWN_IMMEDIATE - four flags instead of two: STRICT_FLAGS is new (patch 4), and STRICT_BOUNDS is separate from STRICT_ZERO_SIZE (patch 5) - various refactorings for more shared code and less duplication Eric Blake (5): api: Add xxx_MASK constant for each Flags type generator: Refactor filtering of accepted OFlags api: Add
2019 May 21
9
[libnbd PATCH 0/3] Avoid deadlock with in-flight commands
This might not be the final solution, but it certainly seems to solve a deadlock for me that I could trigger by using 'nbdkit --filter=noparallel memory 512k' and calling nbd_aio_pread for a request larger than 256k (enough for the Linux kernel to block the server until libnbd read()s), immediately followed by nbd_aio_pwrite for a request larger than 256k (enough to block libnbd until the
2019 Sep 30
4
[PATCH libnbd v2 0/2] Implement systemd socket activation.
v1 was posted here: https://www.redhat.com/archives/libguestfs/2019-September/thread.html#00337 v2: - Drop the first patch. - Hopefully fix the multiple issues with fork-safety and general behaviour on error paths. Note this requires execvpe for which there seems to be no equivalent on FreeBSD, except some kind of tedious path parsing (but can we assign to environ?) Rich.