Richard W.M. Jones
2019-May-21 20:44 UTC
[Libguestfs] [PATCH libnbd] api: Synchronous connect waits til all connections are connected.
nbd_connect_unix|tcp had a tricky failure case. This is a consequence of allowing callers to mix synchronous and asynchronous calls, with multi-conn thrown into the mix. I think the new behaviour proposed here is better. We could do with a better way of classifying the state of connections, such as are they connectING. Rich.
Richard W.M. Jones
2019-May-21 20:44 UTC
[Libguestfs] [PATCH libnbd] api: Synchronous connect waits til all connections are connected.
If not using multi-conn then obviously the synchronous connection calls ‘nbd_connect_unix’, ‘nbd_connect_tcp’ and ‘nbd_connect_command’ should only return when the (one) connection object is connected. In the multi-conn case it's not very clear what these synchronous calls should do. Previously I had it so that they would return as soon as at least one connection was connected. However this is a problem if you are using them as a convenient way to set up a multi-threaded main loop, because it can be that some of them have not finished connecting, but then you issue commands on those connections and that will fail. The failure is pernicious because most of the time you won't see it, only if one connection is slow. So it's (probably) better that the synchronous ‘nbd_connect_unix’ and ‘nbd_connect_tcp’ should connect every connection object before returning. For ‘nbd_connect_command’, it essentially ignored multi-conn anyway, and I changed it so that it waits for conn[0] to get connected and returns, the other connections (if they exist) being ignored. It should probably be an error for the user to enable multi-conn on the handle and then use nbd_connect_command, but I did not make that change yet. --- generator/generator | 8 +++----- lib/connect.c | 48 +++++++++++++++++++++++++++++++++++---------- 2 files changed, 41 insertions(+), 15 deletions(-) diff --git a/generator/generator b/generator/generator index a1bf41d..13edbc4 100755 --- a/generator/generator +++ b/generator/generator @@ -1039,8 +1039,7 @@ C<\"qemu:dirty-bitmap:...\"> for qemu-nbd Connect (synchronously) over the named Unix domain socket (C<sockpath>) to an NBD server running on the same machine. This call returns when the connection has been made. If multi-conn is enabled, this -begins connecting all of the connections, and returns as soon as -any one of them is connected."; +returns when all of the connections are connected."; }; "connect_tcp", { @@ -1052,9 +1051,8 @@ Connect (synchronously) to the NBD server listening on C<hostname:port>. The C<port> may be a port name such as C<\"nbd\">, or it may be a port number as a string such as C<\"10809\">. This call returns when the connection -has been made. If multi-conn is enabled, this begins connecting -all of the connections, and returns as soon as -any one of them is connected."; +has been made. If multi-conn is enabled, this returns when +all of the connections are connected."; }; "connect_command", { diff --git a/lib/connect.c b/lib/connect.c index 0fef87d..d711fd7 100644 --- a/lib/connect.c +++ b/lib/connect.c @@ -33,22 +33,32 @@ #include "internal.h" static int -wait_one_connected (struct nbd_handle *h) +wait_all_connected (struct nbd_handle *h) { size_t i; for (;;) { - bool connected = false; + bool all_connected = true; - /* Are any connected? */ + /* Are any not yet connected? */ for (i = 0; i < h->multi_conn; ++i) { - if (nbd_unlocked_aio_is_ready (h->conns[i])) { - connected = true; + if (nbd_unlocked_aio_is_closed (h->conns[i])) { + set_error (0, "connection is closed"); + return -1; + } + if (nbd_unlocked_aio_is_dead (h->conns[i])) { + /* Don't set the error here, keep the error set when + * the connection died. + */ + return -1; + } + if (!nbd_unlocked_aio_is_ready (h->conns[i])) { + all_connected = false; break; } } - if (connected) + if (all_connected) break; if (nbd_unlocked_poll (h, -1) == -1) @@ -59,7 +69,7 @@ wait_one_connected (struct nbd_handle *h) } /* For all connections in the CREATED state, start connecting them to - * a Unix domain socket. Wait until at least one is in the READY + * a Unix domain socket. Wait until all connections are in the READY * state. */ int @@ -90,7 +100,7 @@ nbd_unlocked_connect_unix (struct nbd_handle *h, const char *sockpath) return -1; } - return wait_one_connected (h); + return wait_all_connected (h); } /* Connect to a TCP port. */ @@ -115,7 +125,7 @@ nbd_unlocked_connect_tcp (struct nbd_handle *h, return -1; } - return wait_one_connected (h); + return wait_all_connected (h); } /* Connect to a local command. Multi-conn doesn't make much sense @@ -132,7 +142,25 @@ nbd_unlocked_connect_command (struct nbd_handle *h, char **argv) if (nbd_unlocked_aio_connect_command (h->conns[0], argv) == -1) return -1; - return wait_one_connected (h); + for (;;) { + if (nbd_unlocked_aio_is_closed (h->conns[0])) { + set_error (0, "connection is closed"); + return -1; + } + if (nbd_unlocked_aio_is_dead (h->conns[0])) { + /* Don't set the error here, keep the error set when + * the connection died. + */ + return -1; + } + if (nbd_unlocked_aio_is_ready (h->conns[0])) + break; + + if (nbd_unlocked_poll (h, -1) == -1) + return -1; + } + + return 0; } int -- 2.21.0
Eric Blake
2019-May-21 22:08 UTC
Re: [Libguestfs] [PATCH libnbd] api: Synchronous connect waits til all connections are connected.
On 5/21/19 3:44 PM, Richard W.M. Jones wrote:> If not using multi-conn then obviously the synchronous connection > calls ‘nbd_connect_unix’, ‘nbd_connect_tcp’ and ‘nbd_connect_command’ > should only return when the (one) connection object is connected. > > In the multi-conn case it's not very clear what these synchronous > calls should do. Previously I had it so that they would return as > soon as at least one connection was connected. However this is a > problem if you are using them as a convenient way to set up a > multi-threaded main loop, because it can be that some of them have not > finished connecting, but then you issue commands on those connections > and that will fail. The failure is pernicious because most of the > time you won't see it, only if one connection is slow. So it's > (probably) better that the synchronous ‘nbd_connect_unix’ and > ‘nbd_connect_tcp’ should connect every connection object before > returning.Should this be a bool parameter for the caller to opt-in to? In one direction, returning as soon as possible allows the caller to issue further synchronous commands which will pick the ready connections and skip the ones still initializing, to at least get lower latency on the first commands and better performance later as more connections come up; in the other direction, waiting to return until all connections are made makes it easier to issue async commands on any connection without worrying if the connection is up. The choice of the parameter to pass in depends, then, on whether the caller plans to make future sync calls or async calls. Also, I'm playing with the idea of having some sort of witness whether a connection has ever reached the READY state (even if it is not there at the moment). Prior to that point, we don't want to accept commands (whether sync or async) because we aren't finished negotiating and thus cannot guarantee if the can_FOO() functions are accurate; after that point, it may be nice to queue up the commands for later issuance when we DO come back around to READY, rather than insisting that we be exactly in the READY state when issuing a command.> > For ‘nbd_connect_command’, it essentially ignored multi-conn anyway, > and I changed it so that it waits for conn[0] to get connected and > returns, the other connections (if they exist) being ignored. It > should probably be an error for the user to enable multi-conn on the > handle and then use nbd_connect_command, but I did not make that > change yet.Yes, that change (as a separate patch) makes sense.> --- > generator/generator | 8 +++----- > lib/connect.c | 48 +++++++++++++++++++++++++++++++++++---------- > 2 files changed, 41 insertions(+), 15 deletions(-) >-- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Seemingly Similar Threads
- [PATCH libnbd v2 1/6] api: Synchronous connect waits til all connections are connected.
- [PATCH libnbd] api: Synchronous connect waits til all connections are connected.
- [PATCH libnbd 2/4] lib: Split nbd_aio_is_* functions into internal.
- [PATCH libnbd v2 0/6] Test connection states.
- [PATCH libnbd 3/4] lib: Add set_state / get_state macros.