Noticed while thinking about the recent threads wondering if we need a more efficient lookup from cookie back to command. Both of these fix bugs, but are tricky enough that I'm posting for review. Eric Blake (2): lib: Decrement in_flight at response, not retirement lib: Do O(1) rather than O(n) queue insertion generator/states-issue-command.c | 2 ++ generator/states-reply.c | 14 +++++++------- generator/states.c | 11 +++++++++-- lib/aio.c | 6 ++++-- lib/internal.h | 2 ++ lib/rw.c | 10 ++++------ tests/server-death.c | 10 +++++----- 7 files changed, 33 insertions(+), 22 deletions(-) -- 2.20.1
Eric Blake
2019-Jul-18 16:40 UTC
[Libguestfs] [libnbd PATCH 1/2] lib: Decrement in_flight at response, not retirement
We had a mismatch between documentation/code comments (nbd_aio_in_flight counts commands not yet ready to retire) and actual code (we didn't decrement the count until nbd_aio_command_completed). As the documented behavior is useful (I shouldn't delay sending a command to the server merely because I haven't retired all the available commands), it is worth fixing the code to match the intent. This also means decrementing in_flight in bulk when commands are stranded due to a transition to CLOSED/DEAD. Update the testsuite to cover this. Fixes: a05b2237 --- generator/states-reply.c | 2 ++ generator/states.c | 2 ++ lib/aio.c | 2 -- tests/server-death.c | 10 +++++----- 4 files changed, 9 insertions(+), 7 deletions(-) diff --git a/generator/states-reply.c b/generator/states-reply.c index 05b2e58..4b22c39 100644 --- a/generator/states-reply.c +++ b/generator/states-reply.c @@ -179,6 +179,8 @@ save_reply_state (struct nbd_handle *h) } else h->cmds_done = cmd; + h->in_flight--; + assert (h->in_flight >= 0); /* Notify the user */ if (cmd->cb.callback) { diff --git a/generator/states.c b/generator/states.c index fa6df9f..bc2b4a6 100644 --- a/generator/states.c +++ b/generator/states.c @@ -156,6 +156,7 @@ void abort_commands (struct nbd_handle *h, assert (nbd_get_error ()); abort_commands (h, &h->cmds_to_issue); abort_commands (h, &h->cmds_in_flight); + h->in_flight = 0; if (h->sock) { h->sock->ops->close (h->sock); h->sock = NULL; @@ -165,6 +166,7 @@ void abort_commands (struct nbd_handle *h, CLOSED: abort_commands (h, &h->cmds_to_issue); abort_commands (h, &h->cmds_in_flight); + h->in_flight = 0; if (h->sock) { h->sock->ops->close (h->sock); h->sock = NULL; diff --git a/lib/aio.c b/lib/aio.c index 5df20da..2fae10d 100644 --- a/lib/aio.c +++ b/lib/aio.c @@ -85,8 +85,6 @@ nbd_unlocked_aio_command_completed (struct nbd_handle *h, prev_cmd->next = cmd->next; else h->cmds_done = cmd->next; - h->in_flight--; - assert (h->in_flight >= 0); free (cmd); diff --git a/tests/server-death.c b/tests/server-death.c index c9195c5..4093c20 100644 --- a/tests/server-death.c +++ b/tests/server-death.c @@ -128,6 +128,11 @@ main (int argc, char *argv[]) } /* Detection of the dead server completes all remaining in-flight commands */ + if (nbd_aio_in_flight (nbd) != 0) { + fprintf (stderr, "%s: test failed: nbd_aio_in_flight\n", + argv[0]); + goto fail; + } if (nbd_aio_peek_command_completed (nbd) != cookie) { fprintf (stderr, "%s: test failed: nbd_aio_peek_command_completed\n", argv[0]); @@ -148,11 +153,6 @@ main (int argc, char *argv[]) } /* With all commands retired, no further command should be pending */ - if (nbd_aio_in_flight (nbd) != 0) { - fprintf (stderr, "%s: test failed: nbd_aio_in_flight\n", - argv[0]); - goto fail; - } if (nbd_aio_peek_command_completed (nbd) != -1) { fprintf (stderr, "%s: test failed: nbd_aio_peek_command_completed\n", argv[0]); -- 2.20.1
Eric Blake
2019-Jul-18 16:40 UTC
[Libguestfs] [libnbd PATCH 2/2] lib: Do O(1) rather than O(n) queue insertion
We have no control over the user piling up lots of commands faster than the server can accept (h->cmds_to_issue), or delaying retiring those batched commands (h->cmds_in_flight), hence our use of O(n) list insertion can be noticeable, since the growth of n can be unbounded from our viewpoint. It's easy enough to track a tail pointer to keep insertion O(1), to match that these two lists are used as queues; cmds_to_issue is always consumed with O(1) lookup/deletion, and whether cmds_done is used via O(1) or O(n) lookup depends on whether the client uses nbd_aio_peek_command_completed. What's more, we documented that nbd_aio_peek_command_completed can be used to process commands strictly in the order that the server responded - but violated that promise if we strand any commands due to moving to DEAD when there are still outstanding successful commands waiting to be processed. Having the tail pointer makes it easier to fix that bug. Meanwhile, we already had O(1) insertion into h->cmds_in_flight, but O(n) lookup/deletion when correlating server responses back to the right command, since a server can reply out-of-order. It might be a slight optimization to keep this list in FIFO order (by using tail insertion), on the grounds that a server that does NOT reply out-of-order will result in O(1) lookup (because the server's reply to the oldest command is always at the head of the list) instead of the current O(n) lookup (where the oldest message is at the end of the list instead of the front); but I did not do it here because in this scenario, we are more likely to have a sanely-bounded n due to POLLOUT preventing us from flooding the server with more messages than it will handle in parallel, to the point that the O(n) is not notceable. Besides, we may later decide to implement a hash-table for amortized O(1) lookup of an arbitrary cookie not only for the in_flight list, but also for cmds_done or even a pre-submit list if we split aio command cookie generation from server submission. Fixes: 4e0448e38 --- generator/states-issue-command.c | 2 ++ generator/states-reply.c | 12 +++++------- generator/states.c | 9 +++++++-- lib/aio.c | 4 ++++ lib/internal.h | 2 ++ lib/rw.c | 10 ++++------ 6 files changed, 24 insertions(+), 15 deletions(-) diff --git a/generator/states-issue-command.c b/generator/states-issue-command.c index 8828f51..78b2ca1 100644 --- a/generator/states-issue-command.c +++ b/generator/states-issue-command.c @@ -105,6 +105,8 @@ cmd = h->cmds_to_issue; assert (cmd->cookie == be64toh (h->request.handle)); h->cmds_to_issue = cmd->next; + if (h->cmds_to_issue_tail == cmd) + h->cmds_to_issue_tail = NULL; cmd->next = h->cmds_in_flight; h->cmds_in_flight = cmd; SET_NEXT_STATE (%.READY); diff --git a/generator/states-reply.c b/generator/states-reply.c index 4b22c39..1a0c149 100644 --- a/generator/states-reply.c +++ b/generator/states-reply.c @@ -171,14 +171,12 @@ save_reply_state (struct nbd_handle *h) else h->cmds_in_flight = cmd->next; cmd->next = NULL; - if (h->cmds_done) { - prev_cmd = h->cmds_done; - while (prev_cmd->next) - prev_cmd = prev_cmd->next; - prev_cmd->next = cmd; + if (h->cmds_done_tail != NULL) + h->cmds_done_tail = h->cmds_done_tail->next = cmd; + else { + assert (h->cmds_done == NULL); + h->cmds_done = h->cmds_done_tail = cmd; } - else - h->cmds_done = cmd; h->in_flight--; assert (h->in_flight >= 0); diff --git a/generator/states.c b/generator/states.c index bc2b4a6..69aa431 100644 --- a/generator/states.c +++ b/generator/states.c @@ -132,8 +132,13 @@ void abort_commands (struct nbd_handle *h, cmd->error = ENOTCONN; } if (prev_cmd) { - prev_cmd->next = h->cmds_done; - h->cmds_done = *list; + if (h->cmds_done_tail) + h->cmds_done_tail->next = *list; + else { + assert (h->cmds_done == NULL); + h->cmds_done = *list; + } + h->cmds_done_tail = prev_cmd; *list = NULL; } } diff --git a/lib/aio.c b/lib/aio.c index 2fae10d..8d7cb8d 100644 --- a/lib/aio.c +++ b/lib/aio.c @@ -81,6 +81,10 @@ nbd_unlocked_aio_command_completed (struct nbd_handle *h, error = EIO; /* Retire it from the list and free it. */ + if (h->cmds_done_tail == cmd) { + assert (cmd->next == NULL); + h->cmds_done_tail = prev_cmd; + } if (prev_cmd != NULL) prev_cmd->next = cmd->next; else diff --git a/lib/internal.h b/lib/internal.h index e214d61..3f2d3f8 100644 --- a/lib/internal.h +++ b/lib/internal.h @@ -195,6 +195,7 @@ struct nbd_handle { * been issued they are moved to cmds_in_flight. */ struct command *cmds_to_issue; + struct command *cmds_to_issue_tail; /* Commands which have been issued and are waiting for replies. * Order does not matter here, since the server can reply out-of-order. @@ -207,6 +208,7 @@ struct nbd_handle { * replies in server order. */ struct command *cmds_done; + struct command *cmds_done_tail; /* length (cmds_to_issue) + length (cmds_in_flight). */ int in_flight; diff --git a/lib/rw.c b/lib/rw.c index 3c2166f..680b81a 100644 --- a/lib/rw.c +++ b/lib/rw.c @@ -163,7 +163,7 @@ nbd_internal_command_common (struct nbd_handle *h, uint64_t offset, uint64_t count, void *data, struct command_cb *cb) { - struct command *cmd, *prev_cmd; + struct command *cmd; if (h->disconnect_request) { set_error (EINVAL, "cannot request more commands after NBD_CMD_DISC"); @@ -231,13 +231,11 @@ nbd_internal_command_common (struct nbd_handle *h, h->in_flight++; if (h->cmds_to_issue != NULL) { assert (nbd_internal_is_state_processing (get_next_state (h))); - prev_cmd = h->cmds_to_issue; - while (prev_cmd->next) - prev_cmd = prev_cmd->next; - prev_cmd->next = cmd; + h->cmds_to_issue_tail = h->cmds_to_issue_tail->next = cmd; } else { - h->cmds_to_issue = cmd; + assert (h->cmds_to_issue_tail == NULL); + h->cmds_to_issue = h->cmds_to_issue_tail = cmd; if (nbd_internal_is_state_ready (get_next_state (h)) && nbd_internal_run (h, cmd_issue) == -1) return -1; -- 2.20.1
Richard W.M. Jones
2019-Jul-19 11:50 UTC
Re: [Libguestfs] [libnbd PATCH 2/2] lib: Do O(1) rather than O(n) queue insertion
Yes this series looks fine, ACK. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-builder quickly builds VMs from scratch http://libguestfs.org/virt-builder.1.html
Possibly Parallel Threads
- [libnbd PATCH v3 0/7] Avoid deadlock with in-flight commands
- [libnbd PATCH 0/6] new APIs: aio_in_flight, aio_FOO_notify
- [libnbd PATCH 0/3] Avoid deadlock with in-flight commands
- [libnbd PATCH] api: Add LIBNBD_SHUTDOWN_IMMEDIATE flag
- [libnbd PATCH 2/2] lib: Do O(1) rather than O(n) queue insertion