Eric Blake
2023-Jul-13 19:29 UTC
[Libguestfs] [libnbd PATCH 0/2] Fix docs and testing of completion callback
This is my proposal for fixing the documentation to match practice (namely, that completion.callback is not invoked in the cases where the aio call itself reports errors); we could instead try to go the other direction and tweak the generator to guarantee that both completion.callback and completion.free are reached no matter what, but that felt more invasive to me. Eric Blake (2): api: Tighten rules on completion.callback tests: Add coverage of nbd_aio_opt_* in wrong state docs/libnbd.pod | 26 +++-- lib/handle.c | 3 + lib/opt.c | 2 + tests/Makefile.am | 5 + tests/closure-lifetimes.c | 14 +++ tests/errors-not-negotiating-aio.c | 170 +++++++++++++++++++++++++++++ .gitignore | 1 + 7 files changed, 211 insertions(+), 10 deletions(-) create mode 100644 tests/errors-not-negotiating-aio.c -- 2.41.0
Eric Blake
2023-Jul-13 19:29 UTC
[Libguestfs] [libnbd PATCH 1/2] api: Tighten rules on completion.callback
The documentation has claimed since commit 6f4dcdab that any completion callback will be called exactly once; but this is not consistent with the code: if nbd_aio_* itself returns an error, then nothing is queued and the user does not need to wait for a completion callback to know how the command failed. We could tweak the generator to call completion.callback no matter what, but since the completion.free callback already serves that role, it's easier to fix the documentation to match reality. After all, one only needs completion status if an aio command returned success (if it returned failure, we know that there is nothing that is going to complete later). However, there was one place where we indeed fail to call completion.callback, even though the corresponding aio call returned success, which can strand a user that was depending on the callback to know that the pending aio command failed after all. That's when a call to nbd_close() interrupts a connection while commands are in flight. This problem appears to have been around even before commit 52b9b492 (v0.9.8) when we finally settled on having .free callbacks in the first place. Beef up the closure-lifetimes unit test to more robustly check the various conditions guaranteed by the updated documentation, and to expose the previous skip of a completion callback during nbd_close. In summary, the behavior we want (where sequence is important) is: - aio command fails: mid-command .callback: 0 calls completion .callback: 0 calls mid-command .free: 1 call completion .free: 1 call - aio command succeeds: mid-command .callback: 0, 1, or multiple calls completion .callback: 1 call mid-command .free: 1 call completion .free: 1 call Reported-by: Tage Johansson <tage.j.lists at posteo.net> Fixes: 6f4dcdab ("docs: Clarify how callbacks should handle errors", v1.11.8) Signed-off-by: Eric Blake <eblake at redhat.com> --- docs/libnbd.pod | 26 ++++++++++++++++---------- lib/handle.c | 3 +++ tests/closure-lifetimes.c | 14 ++++++++++++++ 3 files changed, 33 insertions(+), 10 deletions(-) diff --git a/docs/libnbd.pod b/docs/libnbd.pod index 72f74053..433479e6 100644 --- a/docs/libnbd.pod +++ b/docs/libnbd.pod @@ -904,9 +904,12 @@ same nbd object, as it would cause deadlock. =head2 Completion callbacks All of the asychronous commands have an optional completion callback -function that is used right before the command is marked complete, -after any mid-command callbacks have finished, and before any free -functions. +function that is used if the asynchronous command succeeded, right +before the command is marked complete, after any mid-command callbacks +have finished, and before any free functions. The completion callback +is not reached if the asynchronous command itself fails, while free +functions are reached regardless of the initial result of the +asynchronous command. When the completion callback returns C<1>, the command is automatically retired (there is no need to call @@ -946,13 +949,16 @@ mid-command callback may be reached more times than expected if the server is non-compliant. On the other hand, if a completion callback is supplied (only possible -with asynchronous commands), it will always be reached exactly once, -and the completion callback must not ignore the value pointed to by -C<error>. In particular, the content of a buffer passed to -L<nbd_aio_pread(3)> or L<nbd_aio_pread_structured(3)> is undefined -if C<*error> is non-zero on entry to the completion callback. It is -recommended that if you choose to use automatic command retirement -(where the completion callback returns C<1> to avoid needing to call +with asynchronous commands), it will not be called if the initial API +call fails (such as attempting an asynchronous command in the wrong +state - there is nothing to be completed since the command was not +queued), but will otherwise be called exactly once, and the completion +callback must not ignore the value pointed to by C<error>. In +particular, the content of a buffer passed to L<nbd_aio_pread(3)> or +L<nbd_aio_pread_structured(3)> is undefined if C<*error> is non-zero +on entry to the completion callback. It is recommended that if you +choose to use automatic command retirement (where the completion +callback returns C<1> to avoid needing to call L<nbd_aio_command_completed(3)> later), your completion function should return C<1> on all control paths, even when handling errors (note that with automatic retirement, assigning into C<error> is diff --git a/lib/handle.c b/lib/handle.c index 0f11bee5..1d66d585 100644 --- a/lib/handle.c +++ b/lib/handle.c @@ -37,7 +37,10 @@ free_cmd_list (struct command *list) struct command *cmd, *cmd_next; for (cmd = list; cmd != NULL; cmd = cmd_next) { + int error = cmd->error ? : ENOTCONN; + cmd_next = cmd->next; + CALL_CALLBACK (cmd->cb.completion, &error); nbd_internal_retire_and_free_command (cmd); } } diff --git a/tests/closure-lifetimes.c b/tests/closure-lifetimes.c index 9bb4e120..23691631 100644 --- a/tests/closure-lifetimes.c +++ b/tests/closure-lifetimes.c @@ -66,6 +66,7 @@ read_cb (void *opaque, uint64_t offset, unsigned status, int *error) { assert (!read_cb_freed); + assert (!completion_cb_called); read_cb_called++; return 0; } @@ -73,6 +74,7 @@ read_cb (void *opaque, static void read_cb_free (void *opaque) { + assert (!completion_cb_freed); read_cb_freed++; } @@ -81,6 +83,7 @@ block_status_cb (void *opaque, const char *meta, uint64_t offset, uint32_t *entries, size_t nr_entries, int *error) { assert (!block_status_cb_freed); + assert (!completion_cb_called); block_status_cb_called++; return 0; } @@ -88,6 +91,7 @@ block_status_cb (void *opaque, const char *meta, uint64_t offset, static void block_status_cb_free (void *opaque) { + assert (!completion_cb_freed); block_status_cb_freed++; } @@ -150,6 +154,10 @@ main (int argc, char *argv[]) cookie = nbd_aio_pread_structured (nbd, buf, sizeof buf, 0, chunk_callback, completion_callback, 0); if (cookie == -1) NBD_ERROR; + /* read_cb_called is indeterminate at this point, as state machine + * progress may vary based on task schduling and network speed factors. + */ + assert (completion_cb_called == 0); assert (read_cb_freed == 0); assert (completion_cb_freed == 0); while (!nbd_aio_command_completed (nbd, cookie)) { @@ -179,6 +187,8 @@ main (int argc, char *argv[]) nbd_kill_subprocess (nbd, 0); nbd_close (nbd); + /* read_cb_called is indeterminate based on timing of kill. */ + assert (completion_cb_called == 1); assert (read_cb_freed == 1); assert (completion_cb_freed == 1); @@ -198,6 +208,8 @@ main (int argc, char *argv[]) fprintf (stderr, "%s: Expecting block_status failure\n", argv[0]); exit (EXIT_FAILURE); } + assert (block_status_cb_called == 0); + assert (completion_cb_called == 0); assert (block_status_cb_freed == 1); assert (completion_cb_freed == 1); @@ -212,6 +224,8 @@ main (int argc, char *argv[]) fprintf (stderr, "%s: Expecting block_status failure\n", argv[0]); exit (EXIT_FAILURE); } + assert (block_status_cb_called == 0); + assert (completion_cb_called == 0); assert (block_status_cb_freed == 1); assert (completion_cb_freed == 1); -- 2.41.0
Eric Blake
2023-Jul-13 19:29 UTC
[Libguestfs] [libnbd PATCH 2/2] tests: Add coverage of nbd_aio_opt_* in wrong state
Enhance the regression tests to prove that the completion callback is not reached if the aio call itself reports an error; only the .free callback is guaranteed. Also add some asserts to the library code that may aid future readers in seeing how we track transfer semantics of a callback. Goes hand in hand with the documentation cleanups made in the previous patch, but done separately for ease of backporting as nbd_aio_opt calls are not in as many releases. Reported-by: Tage Johansson <tage.j.lists at posteo.net> Signed-off-by: Eric Blake <eblake at redhat.com> --- lib/opt.c | 2 + tests/Makefile.am | 5 + tests/errors-not-negotiating-aio.c | 170 +++++++++++++++++++++++++++++ .gitignore | 1 + 4 files changed, 178 insertions(+) create mode 100644 tests/errors-not-negotiating-aio.c diff --git a/lib/opt.c b/lib/opt.c index f58d5e19..1446eef3 100644 --- a/lib/opt.c +++ b/lib/opt.c @@ -223,6 +223,7 @@ nbd_unlocked_opt_list (struct nbd_handle *h, nbd_list_callback *list) if (nbd_unlocked_aio_opt_list (h, &l, &c) == -1) return -1; + assert (CALLBACK_IS_NULL (l)); SET_CALLBACK_TO_NULL (*list); if (wait_for_option (h) == -1) return -1; @@ -269,6 +270,7 @@ opt_meta_context_queries (struct nbd_handle *h, if (aio_opt_meta_context_queries (h, opt, queries, &l, &c) == -1) return -1; + assert (CALLBACK_IS_NULL (l)); SET_CALLBACK_TO_NULL (*context); if (wait_for_option (h) == -1) return -1; diff --git a/tests/Makefile.am b/tests/Makefile.am index 3a93251e..6eddcb7a 100644 --- a/tests/Makefile.am +++ b/tests/Makefile.am @@ -171,6 +171,7 @@ check_PROGRAMS += \ errors-connect-null \ errors-connect-twice \ errors-not-negotiating \ + errors-not-negotiating-aio \ errors-notify-not-blocked \ errors-bad-cookie \ errors-pread-structured \ @@ -243,6 +244,7 @@ TESTS += \ errors-connect-null \ errors-connect-twice \ errors-not-negotiating \ + errors-not-negotiating-aio \ errors-notify-not-blocked \ errors-bad-cookie \ errors-pread-structured \ @@ -332,6 +334,9 @@ errors_connect_twice_LDADD = $(top_builddir)/lib/libnbd.la errors_not_negotiating_SOURCES = errors-not-negotiating.c errors_not_negotiating_LDADD = $(top_builddir)/lib/libnbd.la +errors_not_negotiating_aio_SOURCES = errors-not-negotiating-aio.c +errors_not_negotiating_aio_LDADD = $(top_builddir)/lib/libnbd.la + errors_notify_not_blocked_SOURCES = errors-notify-not-blocked.c errors_notify_not_blocked_LDADD = $(top_builddir)/lib/libnbd.la diff --git a/tests/errors-not-negotiating-aio.c b/tests/errors-not-negotiating-aio.c new file mode 100644 index 00000000..b09cae82 --- /dev/null +++ b/tests/errors-not-negotiating-aio.c @@ -0,0 +1,170 @@ +/* NBD client library in userspace + * Copyright Red Hat + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +/* Deliberately provoke some errors and check the error messages from + * nbd_get_error etc look reasonable. + */ + +#include <config.h> + +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <errno.h> + +#include <libnbd.h> + +static char *progname; + +static void +check (int experr, const char *prefix) +{ + const char *msg = nbd_get_error (); + int errnum = nbd_get_errno (); + + fprintf (stderr, "error: \"%s\"\n", msg); + fprintf (stderr, "errno: %d (%s)\n", errnum, strerror (errnum)); + if (strncmp (msg, prefix, strlen (prefix)) != 0) { + fprintf (stderr, "%s: test failed: missing context prefix: %s\n", + progname, msg); + exit (EXIT_FAILURE); + } + if (errnum != experr) { + fprintf (stderr, "%s: test failed: " + "expected errno = %d (%s), but got %d\n", + progname, experr, strerror (experr), errnum); + exit (EXIT_FAILURE); + } +} + +struct progress { + int id; + bool call_freed; + bool comp_freed; +}; + +static int +list_call (void *user_data, const char *name, const char *description) +{ + struct progress *p = user_data; + + fprintf (stderr, "%s: list callback should not be reached, iter %d\n", + progname, p->id); + exit (EXIT_FAILURE); +} + +static void +list_free (void *user_data) +{ + struct progress *p = user_data; + + if (p->comp_freed) { + fprintf (stderr, "%s: list free called too late, iter %d\n", + progname, p->id); + exit (EXIT_FAILURE); + } + p->call_freed = true; +} + +static int +comp_call (void *user_data, int *error) +{ + struct progress *p = user_data; + + fprintf (stderr, "%s: list callback should not be reached, iter %d\n", + progname, p->id); + exit (EXIT_FAILURE); +} + +static void +comp_free (void *user_data) +{ + struct progress *p = user_data; + + if (!p->call_freed) { + fprintf (stderr, "%s: completion free called too early, iter %d\n", + progname, p->id); + exit (EXIT_FAILURE); + } + p->comp_freed = true; +} + +int +main (int argc, char *argv[]) +{ + struct nbd_handle *nbd; + const char *cmd[] = { + "nbdkit", "-s", "-v", "--exit-with-parent", "memory", "1048576", NULL + }; + struct progress p; + nbd_list_callback list_cb = { .callback = list_call, + .user_data = &p, + .free = list_free }; + nbd_completion_callback comp_cb = { .callback = comp_call, + .user_data = &p, + .free = comp_free }; + + progname = argv[0]; + + nbd = nbd_create (); + if (nbd == NULL) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + + /* Too early to try nbd_aio_opt_*. */ + p = (struct progress) { .id = 0 }; + if (nbd_aio_opt_list (nbd, list_cb, comp_cb) != -1) { + fprintf (stderr, "%s: test failed: " + "nbd_aio_opt_list did not reject attempt in wrong state\n", + argv[0]); + exit (EXIT_FAILURE); + } + check (ENOTCONN, "nbd_aio_opt_list: "); + if (!p.call_freed || !p.comp_freed) { + fprintf (stderr, "%s: test failed: " + "nbd_aio_opt_list did not call .free callbacks\n", + argv[0]); + exit (EXIT_FAILURE); + } + + /* Connect to the server without requesting negotiation mode. */ + if (nbd_connect_command (nbd, (char **)cmd) == -1) { + fprintf (stderr, "%s: %s\n", argv[0], nbd_get_error ()); + exit (EXIT_FAILURE); + } + + /* Too late to try nbd_aio_opt_*. */ + p = (struct progress) { .id = 1 }; + if (nbd_aio_opt_list (nbd, list_cb, comp_cb) != -1) { + fprintf (stderr, "%s: test failed: " + "nbd_aio_opt_list did not reject attempt in wrong state\n", + argv[0]); + exit (EXIT_FAILURE); + } + check (EINVAL, "nbd_aio_opt_list: "); + if (!p.call_freed || !p.comp_freed) { + fprintf (stderr, "%s: test failed: " + "nbd_aio_opt_list did not call .free callbacks\n", + argv[0]); + exit (EXIT_FAILURE); + } + + nbd_close (nbd); + exit (EXIT_SUCCESS); +} diff --git a/.gitignore b/.gitignore index efe3080a..b43e83c0 100644 --- a/.gitignore +++ b/.gitignore @@ -220,6 +220,7 @@ Makefile.in /tests/errors-name-too-long /tests/errors-not-connected /tests/errors-not-negotiating +/tests/errors-not-negotiating-aio /tests/errors-notify-not-blocked /tests/errors-poll-no-fd /tests/errors-pread-structured -- 2.41.0