Richard W.M. Jones
2020-Jun-22 15:49 UTC
[Libguestfs] [PATCH nbdkit 1/2] server: Add .after_fork callback, mainly for plugins to create threads.
If you have a plugin which either creates background threads itself or uses a library that creates background threads, it turns out you cannot create these in .get_ready (or earlier). The reason is that nbdkit forks when either daemonizing itself or using the --run option, and fork cancels all the background threads in the child process (the daemonized or captive nbdkit). The only good solution here is to add a new callback which I've called .after_fork which runs after either daemonization or the --run function. It should be used sparingly because it's not possible to report errors nicely from this callback. In particular when daemonizing we have lost the controlling terminal and errors go to syslog - it looks to the user as if nbdkit “died”. This adds the callback to the C and OCaml APIs, and also sh and eval (to make the tests work), but I didn't update the other language APIs yet. --- docs/nbdkit-filter.pod | 39 ++++++++++++++--------- docs/nbdkit-plugin.pod | 49 ++++++++++++++++++++++++++--- plugins/eval/nbdkit-eval-plugin.pod | 2 ++ plugins/sh/nbdkit-sh-plugin.pod | 4 +++ include/nbdkit-filter.h | 2 ++ include/nbdkit-plugin.h | 1 + server/internal.h | 1 + server/filters.c | 23 ++++++++++++++ server/main.c | 3 ++ server/plugins.c | 16 ++++++++++ plugins/ocaml/NBDKit.mli | 1 + plugins/sh/methods.h | 1 + plugins/ocaml/NBDKit.ml | 4 +++ plugins/eval/eval.c | 2 ++ plugins/ocaml/ocaml.c | 22 +++++++++++++ plugins/sh/methods.c | 25 +++++++++++++++ plugins/sh/sh.c | 1 + tests/test-eval.sh | 1 + tests/test-layers-filter.c | 9 ++++++ tests/test-layers-plugin.c | 8 +++++ tests/test-layers.c | 12 +++++++ 21 files changed, 207 insertions(+), 19 deletions(-) diff --git a/docs/nbdkit-filter.pod b/docs/nbdkit-filter.pod index 00f8e70d..510781e1 100644 --- a/docs/nbdkit-filter.pod +++ b/docs/nbdkit-filter.pod @@ -128,23 +128,23 @@ which is required. F<nbdkit-filter.h> defines some function types (C<nbdkit_next_config>, C<nbdkit_next_config_complete>, C<nbdkit_next_get_ready>, -C<nbdkit_next_preconnect>, C<nbdkit_next_open>) and a structure called -C<struct nbdkit_next_ops>. These abstract the next plugin or filter -in the chain. There is also an opaque pointer C<nxdata> which must be -passed along when calling these functions. The value of C<nxdata> -passed to C<.open> has a stable lifetime that lasts to the -corresponding C<.close>, with all intermediate functions (such as -C<.pread>) receiving the same value for convenience; the only -exceptions where C<nxdata> is not reused are C<.config>, -C<.config_complete>, C<.get_ready>, and C<.preconnect>, which are -called outside the lifetime of a connection. +C<nbdkit_next_after_fork>, C<nbdkit_next_preconnect>, +C<nbdkit_next_open>) and a structure called C<struct nbdkit_next_ops>. +These abstract the next plugin or filter in the chain. There is also +an opaque pointer C<nxdata> which must be passed along when calling +these functions. The value of C<nxdata> passed to C<.open> has a +stable lifetime that lasts to the corresponding C<.close>, with all +intermediate functions (such as C<.pread>) receiving the same value +for convenience; the only exceptions where C<nxdata> is not reused are +C<.config>, C<.config_complete>, C<.get_ready>, C<.after_fork> and +C<.preconnect>, which are called outside the lifetime of a connection. =head2 Next config, open and close -The filter’s C<.config>, C<.config_complete>, C<.get_ready> and -C<.open> methods may only call the next C<.config>, -C<.config_complete>, C<.get_ready> and C<.open> method in the chain -(optionally for C<.config>). +The filter’s C<.config>, C<.config_complete>, C<.get_ready>, +C<.after_fork> and C<.open> methods may only call the next C<.config>, +C<.config_complete>, C<.get_ready>, C<.after_fork> and C<.open> method +in the chain (optionally for C<.config>). The filter’s C<.close> method is called when an old connection closed, and this has no C<next> parameter because it cannot be @@ -304,6 +304,17 @@ filter to get ready to serve requests. If there is an error, C<.get_ready> should call C<nbdkit_error> with an error message and return C<-1>. +=head2 C<.after_fork> + + int (*after_fork) (nbdkit_next_after_fork *next, void *nxdata); + +This intercepts the plugin C<.after_fork> method and can be used by +the filter to start background threads (although these have limited +usefulness since they will not be able to access the plugin). + +If there is an error, C<.after_fork> should call C<nbdkit_error> with +an error message and return C<-1>. + =head2 C<.preconnect> int (*preconnect) (nbdkit_next_preconnect *next, void *nxdata, diff --git a/docs/nbdkit-plugin.pod b/docs/nbdkit-plugin.pod index 7b8a5393..b2f6197b 100644 --- a/docs/nbdkit-plugin.pod +++ b/docs/nbdkit-plugin.pod @@ -143,6 +143,9 @@ the plugin: └─────────┬────────┘ │ nbdkit forks into the background │ and starts serving clients + ┌─────────┴────────┐ + │ after_fork │ + └─────────┬────────┘ │ ┌──────────┴─────────────┬─ ─ ─ ─ ─ ─ ─ ─ ─ ┌──────┴─────┐ client #1 │ @@ -201,8 +204,29 @@ script must be prepared for a missing script). In normal operation, C<.get_ready> is called before the server starts serving. It is called before the server forks or changes directory. -It is the last chance to do any global preparation that is needed to -serve connections. +It is normally the last chance to do any global preparation that is +needed to serve connections. + +=item C<.after_fork> + +In normal operation, C<.after_fork> is called after the server has +forked into the background and changed UID and directory. If a plugin +needs to create background threads (or uses an external library that +creates threads) it should do so here, because background threads are +killed by fork. + +Because the server may have forked into the background, error messages +and failures from C<.after_fork> cannot be seen by the user unless +they look through syslog. An error in C<.after_fork> can appear to +the user as if nbdkit “just died”. So in almost all cases it is +better to use C<.get_ready> instead of this callback, or to do as much +preparation work as possible in C<.get_ready> and only start +background threads here. + +The server doesn't always fork (eg. if the I<-f> flag is used), but +even so this callback will be called. If you want to find out if the +server forked between C<.get_ready> and C<.after_fork> use +L<getpid(2)>. =item C<.preconnect> @@ -579,13 +603,28 @@ with an error message and return C<-1>. int get_ready (void); This optional callback is called before the server starts serving. It -is called before the server forks or changes directory. It is the -last chance to do any global preparation that is needed to serve -connections. +is called before the server forks or changes directory. It is +ordinarily the last chance to do any global preparation that is needed +to serve connections. If there is an error, C<.get_ready> should call C<nbdkit_error> with an error message and return C<-1>. +=head2 C<.after_fork> + + int after_ready (void); + +This optional callback is called before the server starts serving. It +is called after the server forks and changes directory. If a plugin +needs to create background threads (or uses an external library that +creates threads) it should do so here, because background threads are +killed by fork. However you should try to do as little as possible +here because error reporting is difficult. See L</Callback lifecycle> +above. + +If there is an error, C<.after_fork> should call C<nbdkit_error> with +an error message and return C<-1>. + =head2 C<.preconnect> int preconnect (int readonly); diff --git a/plugins/eval/nbdkit-eval-plugin.pod b/plugins/eval/nbdkit-eval-plugin.pod index c5bebb15..7e25a01f 100644 --- a/plugins/eval/nbdkit-eval-plugin.pod +++ b/plugins/eval/nbdkit-eval-plugin.pod @@ -68,6 +68,8 @@ features): =over 4 +=item B<after_fork=>SCRIPT + =item B<cache=>SCRIPT =item B<can_cache=>SCRIPT diff --git a/plugins/sh/nbdkit-sh-plugin.pod b/plugins/sh/nbdkit-sh-plugin.pod index 4d885a53..771c6bc0 100644 --- a/plugins/sh/nbdkit-sh-plugin.pod +++ b/plugins/sh/nbdkit-sh-plugin.pod @@ -258,6 +258,10 @@ with status C<1>; unrecognized output is ignored. /path/to/script get_ready +=item C<after_fork> + + /path/to/script after_fork + =item C<preconnect> /path/to/script preconnect <readonly> <exportname> diff --git a/include/nbdkit-filter.h b/include/nbdkit-filter.h index ca58e496..d81186f5 100644 --- a/include/nbdkit-filter.h +++ b/include/nbdkit-filter.h @@ -63,6 +63,7 @@ typedef int nbdkit_next_config (nbdkit_backend *nxdata, const char *key, const char *value); typedef int nbdkit_next_config_complete (nbdkit_backend *nxdata); typedef int nbdkit_next_get_ready (nbdkit_backend *nxdata); +typedef int nbdkit_next_after_fork (nbdkit_backend *nxdata); typedef int nbdkit_next_preconnect (nbdkit_backend *nxdata, int readonly); typedef int nbdkit_next_open (nbdkit_backend *nxdata, int readonly); @@ -145,6 +146,7 @@ struct nbdkit_filter { const char *config_help; int (*thread_model) (void); int (*get_ready) (nbdkit_next_get_ready *next, nbdkit_backend *nxdata); + int (*after_fork) (nbdkit_next_after_fork *next, nbdkit_backend *nxdata); int (*preconnect) (nbdkit_next_preconnect *next, nbdkit_backend *nxdata, int readonly); diff --git a/include/nbdkit-plugin.h b/include/nbdkit-plugin.h index 7ff7bcee..cc261635 100644 --- a/include/nbdkit-plugin.h +++ b/include/nbdkit-plugin.h @@ -138,6 +138,7 @@ struct nbdkit_plugin { int (*preconnect) (int readonly); int (*get_ready) (void); + int (*after_fork) (void); }; extern void nbdkit_set_error (int err); diff --git a/server/internal.h b/server/internal.h index f2b87439..f5653a72 100644 --- a/server/internal.h +++ b/server/internal.h @@ -362,6 +362,7 @@ struct backend { void (*config_complete) (struct backend *); const char *(*magic_config_key) (struct backend *); void (*get_ready) (struct backend *); + void (*after_fork) (struct backend *); int (*preconnect) (struct backend *, int readonly); void *(*open) (struct backend *, int readonly); int (*prepare) (struct backend *, void *handle, int readonly); diff --git a/server/filters.c b/server/filters.c index b4f05236..2d705e1e 100644 --- a/server/filters.c +++ b/server/filters.c @@ -193,6 +193,28 @@ filter_get_ready (struct backend *b) b->next->get_ready (b->next); } +static int +next_after_fork (struct backend *b) +{ + b->after_fork (b); + return 0; +} + +static void +filter_after_fork (struct backend *b) +{ + struct backend_filter *f = container_of (b, struct backend_filter, backend); + + debug ("%s: after_fork", b->name); + + if (f->filter.after_fork) { + if (f->filter.after_fork (next_after_fork, b->next) == -1) + exit (EXIT_FAILURE); + } + else + b->next->after_fork (b->next); +} + static int filter_preconnect (struct backend *b, int readonly) { @@ -516,6 +538,7 @@ static struct backend filter_functions = { .config_complete = filter_config_complete, .magic_config_key = plugin_magic_config_key, .get_ready = filter_get_ready, + .after_fork = filter_after_fork, .preconnect = filter_preconnect, .open = filter_open, .prepare = filter_prepare, diff --git a/server/main.c b/server/main.c index e9f95380..c432f5bd 100644 --- a/server/main.c +++ b/server/main.c @@ -959,6 +959,7 @@ start_serving (void) debug ("using socket activation, nr_socks = %zu", socks.size); change_user (); write_pidfile (); + top->after_fork (top); accept_incoming_connections (&socks); return; } @@ -967,6 +968,7 @@ start_serving (void) if (listen_stdin) { change_user (); write_pidfile (); + top->after_fork (top); threadlocal_new_server_thread (); handle_single_connection (saved_stdin, saved_stdout); return; @@ -986,6 +988,7 @@ start_serving (void) change_user (); fork_into_background (); write_pidfile (); + top->after_fork (top); accept_incoming_connections (&socks); } diff --git a/server/plugins.c b/server/plugins.c index fa572a6a..285569bb 100644 --- a/server/plugins.c +++ b/server/plugins.c @@ -159,6 +159,7 @@ plugin_dump_fields (struct backend *b) HAS (config_complete); HAS (config_help); HAS (get_ready); + HAS (after_fork); HAS (preconnect); HAS (open); HAS (close); @@ -249,6 +250,20 @@ plugin_get_ready (struct backend *b) exit (EXIT_FAILURE); } +static void +plugin_after_fork (struct backend *b) +{ + struct backend_plugin *p = container_of (b, struct backend_plugin, backend); + + debug ("%s: after_fork", b->name); + + if (!p->plugin.after_fork) + return; + + if (p->plugin.after_fork () == -1) + exit (EXIT_FAILURE); +} + static int plugin_preconnect (struct backend *b, int readonly) { @@ -695,6 +710,7 @@ static struct backend plugin_functions = { .config_complete = plugin_config_complete, .magic_config_key = plugin_magic_config_key, .get_ready = plugin_get_ready, + .after_fork = plugin_after_fork, .preconnect = plugin_preconnect, .open = plugin_open, .prepare = plugin_prepare, diff --git a/plugins/ocaml/NBDKit.mli b/plugins/ocaml/NBDKit.mli index 1f7a7e4b..3ebbf18f 100644 --- a/plugins/ocaml/NBDKit.mli +++ b/plugins/ocaml/NBDKit.mli @@ -75,6 +75,7 @@ type 'a plugin = { thread_model : (unit -> thread_model) option; get_ready : (unit -> unit) option; + after_fork : (unit -> unit) option; preconnect : (bool -> unit) option; open_connection : (bool -> 'a) option; (* required *) diff --git a/plugins/sh/methods.h b/plugins/sh/methods.h index f11e67e7..08a5ed17 100644 --- a/plugins/sh/methods.h +++ b/plugins/sh/methods.h @@ -42,6 +42,7 @@ extern const char *get_script (const char *method); extern void sh_dump_plugin (void); extern int sh_thread_model (void); extern int sh_get_ready (void); +extern int sh_after_fork (void); extern int sh_preconnect (int readonly); extern void *sh_open (int readonly); extern void sh_close (void *handle); diff --git a/plugins/ocaml/NBDKit.ml b/plugins/ocaml/NBDKit.ml index ae3d8fb6..9ce3bf3e 100644 --- a/plugins/ocaml/NBDKit.ml +++ b/plugins/ocaml/NBDKit.ml @@ -70,6 +70,7 @@ type 'a plugin = { thread_model : (unit -> thread_model) option; get_ready : (unit -> unit) option; + after_fork : (unit -> unit) option; preconnect : (bool -> unit) option; open_connection : (bool -> 'a) option; @@ -114,6 +115,7 @@ let default_callbacks = { thread_model = None; get_ready = None; + after_fork = None; preconnect = None; open_connection = None; @@ -157,6 +159,7 @@ external set_config_help : string -> unit = "ocaml_nbdkit_set_config_help" "noal external set_thread_model : (unit -> thread_model) -> unit = "ocaml_nbdkit_set_thread_model" external set_get_ready : (unit -> unit) -> unit = "ocaml_nbdkit_set_get_ready" +external set_after_fork : (unit -> unit) -> unit = "ocaml_nbdkit_set_after_fork" external set_preconnect : (bool -> unit) -> unit = "ocaml_nbdkit_set_preconnect" external set_open : (bool -> 'a) -> unit = "ocaml_nbdkit_set_open" @@ -219,6 +222,7 @@ let register_plugin plugin may set_thread_model plugin.thread_model; may set_get_ready plugin.get_ready; + may set_after_fork plugin.after_fork; may set_preconnect plugin.preconnect; may set_open plugin.open_connection; diff --git a/plugins/eval/eval.c b/plugins/eval/eval.c index df701f8f..51b57d09 100644 --- a/plugins/eval/eval.c +++ b/plugins/eval/eval.c @@ -54,6 +54,7 @@ static char *missing; static const char *known_methods[] = { + "after_fork", "cache", "can_cache", "can_extents", @@ -389,6 +390,7 @@ static struct nbdkit_plugin plugin = { .config_help = eval_config_help, .thread_model = sh_thread_model, .get_ready = sh_get_ready, + .after_fork = sh_after_fork, .preconnect = sh_preconnect, .open = sh_open, diff --git a/plugins/ocaml/ocaml.c b/plugins/ocaml/ocaml.c index 8e0e88f6..d8b61108 100644 --- a/plugins/ocaml/ocaml.c +++ b/plugins/ocaml/ocaml.c @@ -120,6 +120,7 @@ static value config_complete_fn; static value thread_model_fn; static value get_ready_fn; +static value after_fork_fn; static value preconnect_fn; static value open_fn; @@ -272,6 +273,25 @@ get_ready_wrapper (void) CAMLreturnT (int, 0); } +static int +after_fork_wrapper (void) +{ + CAMLparam0 (); + CAMLlocal1 (rv); + + caml_leave_blocking_section (); + + rv = caml_callback_exn (after_fork_fn, Val_unit); + if (Is_exception_result (rv)) { + nbdkit_error ("%s", caml_format_exception (Extract_exception (rv))); + caml_enter_blocking_section (); + CAMLreturnT (int, -1); + } + + caml_enter_blocking_section (); + CAMLreturnT (int, 0); +} + static int preconnect_wrapper (int readonly) { @@ -833,6 +853,7 @@ SET(config_complete) SET(thread_model) SET(get_ready) +SET(after_fork) SET(preconnect) SET(open) @@ -876,6 +897,7 @@ remove_roots (void) REMOVE (thread_model); REMOVE (get_ready); + REMOVE (after_fork); REMOVE (preconnect); REMOVE (open); diff --git a/plugins/sh/methods.c b/plugins/sh/methods.c index 10cd4100..8257103e 100644 --- a/plugins/sh/methods.c +++ b/plugins/sh/methods.c @@ -165,6 +165,31 @@ sh_get_ready (void) } } +int +sh_after_fork (void) +{ + const char *method = "after_fork"; + const char *script = get_script (method); + const char *args[] = { script, method, NULL }; + + switch (call (args)) { + case OK: + case MISSING: + return 0; + + case ERROR: + return -1; + + case RET_FALSE: + nbdkit_error ("%s: %s method returned unexpected code (3/false)", + script, method); + errno = EIO; + return -1; + + default: abort (); + } +} + int sh_preconnect (int readonly) { diff --git a/plugins/sh/sh.c b/plugins/sh/sh.c index eb677d5b..9e484823 100644 --- a/plugins/sh/sh.c +++ b/plugins/sh/sh.c @@ -297,6 +297,7 @@ static struct nbdkit_plugin plugin = { .magic_config_key = "script", .thread_model = sh_thread_model, .get_ready = sh_get_ready, + .after_fork = sh_after_fork, .preconnect = sh_preconnect, .open = sh_open, diff --git a/tests/test-eval.sh b/tests/test-eval.sh index cf88fcf7..07a9ea1e 100755 --- a/tests/test-eval.sh +++ b/tests/test-eval.sh @@ -65,6 +65,7 @@ grep 'in missing: thread_model' eval.missing grep 'in missing: can_write' eval.missing grep 'in missing: is_rotational' eval.missing grep 'in missing: get_ready' eval.missing +grep 'in missing: after_fork' eval.missing grep 'in missing: preconnect' eval.missing grep 'in missing: open' eval.missing grep 'in missing: close' eval.missing diff --git a/tests/test-layers-filter.c b/tests/test-layers-filter.c index 53427d2a..66d80835 100644 --- a/tests/test-layers-filter.c +++ b/tests/test-layers-filter.c @@ -89,6 +89,14 @@ test_layers_filter_get_ready (nbdkit_next_get_ready *next, return next (nxdata); } +static int +test_layers_filter_after_fork (nbdkit_next_after_fork *next, + void *nxdata) +{ + DEBUG_FUNCTION; + return next (nxdata); +} + static int test_layers_filter_preconnect (nbdkit_next_preconnect *next, void *nxdata, int readonly) @@ -358,6 +366,7 @@ static struct nbdkit_filter filter = { .config_complete = test_layers_filter_config_complete, .config_help = test_layers_filter_config_help, .get_ready = test_layers_filter_get_ready, + .after_fork = test_layers_filter_after_fork, .preconnect = test_layers_filter_preconnect, .open = test_layers_filter_open, .close = test_layers_filter_close, diff --git a/tests/test-layers-plugin.c b/tests/test-layers-plugin.c index 72ed03f5..ccb67fe4 100644 --- a/tests/test-layers-plugin.c +++ b/tests/test-layers-plugin.c @@ -79,6 +79,13 @@ test_layers_plugin_get_ready (void) return 0; } +static int +test_layers_plugin_after_fork (void) +{ + DEBUG_FUNCTION; + return 0; +} + static int test_layers_plugin_preconnect (int readonly) { @@ -239,6 +246,7 @@ static struct nbdkit_plugin plugin = { .config_complete = test_layers_plugin_config_complete, .config_help = test_layers_plugin_config_help, .get_ready = test_layers_plugin_get_ready, + .after_fork = test_layers_plugin_after_fork, .preconnect = test_layers_plugin_preconnect, .open = test_layers_plugin_open, .close = test_layers_plugin_close, diff --git a/tests/test-layers.c b/tests/test-layers.c index aa61d03b..9a0279f1 100644 --- a/tests/test-layers.c +++ b/tests/test-layers.c @@ -300,6 +300,18 @@ main (int argc, char *argv[]) "test_layers_plugin_get_ready", NULL); + /* after_fork methods called in order. */ + log_verify_seen_in_order + ("testlayersfilter3: after_fork", + "filter3: test_layers_filter_after_fork", + "testlayersfilter2: after_fork", + "filter2: test_layers_filter_after_fork", + "testlayersfilter1: after_fork", + "filter1: test_layers_filter_after_fork", + "testlayersplugin: after_fork", + "test_layers_plugin_after_fork", + NULL); + /* preconnect methods called in outer-to-inner order, complete * in inner-to-outer order. */ -- 2.25.0
Richard W.M. Jones
2020-Jun-22 15:49 UTC
[Libguestfs] [PATCH nbdkit 2/2] vddk: Defer library initialization to .after_fork().
VDDK creates background threads. fork kills these, resulting in the library hanging or failing completely in certain configurations. --- plugins/vddk/vddk.c | 28 ++++++++++++++++++++++------ 1 file changed, 22 insertions(+), 6 deletions(-) diff --git a/plugins/vddk/vddk.c b/plugins/vddk/vddk.c index e6639b4d..a481e8bf 100644 --- a/plugins/vddk/vddk.c +++ b/plugins/vddk/vddk.c @@ -431,12 +431,31 @@ vddk_config_complete (void) return 0; } +#define vddk_config_help \ + "[file=]<FILENAME> (required) The filename (eg. VMDK file) to serve.\n" \ + "Many optional parameters are supported, see nbdkit-vddk-plugin(3)." + static int vddk_get_ready (void) { - VixError err; - load_library (true); + return 0; +} + +/* Defer VDDK initialization until after fork because it is known to + * create background threads from VixDiskLib_InitEx. Unfortunately + * error reporting from this callback is difficult, but we have + * already checked in .get_ready that the library is dlopenable. + * + * For various hangs and failures which were caused by background + * threads and fork see: + * https://bugzilla.redhat.com/show_bug.cgi?id=1846309#c9 + * https://www.redhat.com/archives/libguestfs/2019-April/msg00090.html + */ +static int +vddk_after_fork (void) +{ + VixError err; /* Initialize VDDK library. */ DEBUG_CALL ("VixDiskLib_InitEx", @@ -457,10 +476,6 @@ vddk_get_ready (void) return 0; } -#define vddk_config_help \ - "[file=]<FILENAME> (required) The filename (eg. VMDK file) to serve.\n" \ - "Many optional parameters are supported, see nbdkit-vddk-plugin(3)." - static void vddk_dump_plugin (void) { @@ -959,6 +974,7 @@ static struct nbdkit_plugin plugin = { .magic_config_key = "file", .dump_plugin = vddk_dump_plugin, .get_ready = vddk_get_ready, + .after_fork = vddk_after_fork, .open = vddk_open, .close = vddk_close, .get_size = vddk_get_size, -- 2.25.0
Eric Blake
2020-Jun-22 19:00 UTC
Re: [Libguestfs] [PATCH nbdkit 1/2] server: Add .after_fork callback, mainly for plugins to create threads.
On 6/22/20 10:49 AM, Richard W.M. Jones wrote:> If you have a plugin which either creates background threads itself or > uses a library that creates background threads, it turns out you > cannot create these in .get_ready (or earlier). The reason is that > nbdkit forks when either daemonizing itself or using the --run option, > and fork cancels all the background threads in the child process (the > daemonized or captive nbdkit).Or more precisely, the main thread after the fork cannot interact with the helper threads stranded in the parent process (fork always creates a single-threaded process - you have to use Linux-specific clone to copy threads from parent to child, and it's not worth us trying to go that low-level).> > The only good solution here is to add a new callback which I've called > .after_fork which runs after either daemonization or the --run > function.Yeah, I'm not thinking of any other viable solutions.> > It should be used sparingly because it's not possible to report errors > nicely from this callback. In particular when daemonizing we have > lost the controlling terminal and errors go to syslog - it looks to > the user as if nbdkit “died”. > > This adds the callback to the C and OCaml APIs, and also sh and eval > (to make the tests work), but I didn't update the other language APIs > yet. > ---> +++ b/docs/nbdkit-filter.pod> +=head2 C<.after_fork> > + > + int (*after_fork) (nbdkit_next_after_fork *next, void *nxdata); > + > +This intercepts the plugin C<.after_fork> method and can be used by > +the filter to start background threads (although these have limited > +usefulness since they will not be able to access the plugin). > + > +If there is an error, C<.after_fork> should call C<nbdkit_error> with > +an error message and return C<-1>.Should the backend code guarantee that .after_fork is called for all layers in the backend chain? The only flexibility I see us gaining by letting the filter choose when .after_fork is called is fine-grained control over what threads the filter can start before or after the plugin's .after_fork, but is that flexibility really needed, or are we equally safe if the backend just always calls .after_fork and the filter doesn't have to worry about calling next(nxdata)? Of course, given that filters are in-tree, we can change this signature as needed later on, even if we make it void for now we can hook it into passing .after_fork later if we find a use for that finer-grained call semantics.> +=item C<.after_fork> > + > +In normal operation, C<.after_fork> is called after the server has > +forked into the background and changed UID and directory. If a plugin > +needs to create background threads (or uses an external library that > +creates threads) it should do so here, because background threads are > +killed by fork.s/killed/invalidated/ (they are still alive in the parent process, but after the fork we are no longer in the parent process to interact with them)> + > +Because the server may have forked into the background, error messages > +and failures from C<.after_fork> cannot be seen by the user unless > +they look through syslog. An error in C<.after_fork> can appear to > +the user as if nbdkit “just died”. So in almost all cases it is > +better to use C<.get_ready> instead of this callback, or to do as much > +preparation work as possible in C<.get_ready> and only start > +background threads here. > + > +The server doesn't always fork (eg. if the I<-f> flag is used), but > +even so this callback will be called. If you want to find out if the > +server forked between C<.get_ready> and C<.after_fork> use > +L<getpid(2)>.Thinking about back-compat: We promise that plugins compiled against older nbdkit will continue to run against newer nbdkit, and that is preserved here. What we do not promise is that plugins compiled against newer nbdkit will work wehn loaded by older nbdkit. This is one of those cases: anything you stick in .after_fork will not be called by older nbdkit which didn't know about the callback. At present, we silently ignore the tail of 'struct plugin' from any plugin compiled against newer nbdkit when loaded by older nbdkit. Should we tweak that to instead output a warning that the plugin might rely on features present only in newer nbdkit and therefore might not work in the current nbdkit version? Of course, that doesn't help the fact that existing older nbdkit is lacking this safety check. Do we need to instead add some sort of API for plugins to be able to learn which features are provided by the nbdkit loading the plugin, so that a plugin that requires the use of 1.22 nbdkit features (such as .after_fork) can gracefully fail during .config_complete (or similar) when loaded by a 1.20 nbdkit, rather than mysteriously failing to work when .after_fork is not called? [In a distro setting, the solution is version dependencies: you could make it so that installing the nbdkit-vddk-plugin package forces an upgrade of the nbdkit package to 1.22 - but this is more a question for situations not covered by distro packaging setups]> +=head2 C<.after_fork> > + > + int after_ready (void);Typo in the function name.> + > +This optional callback is called before the server starts serving. It > +is called after the server forks and changes directory. If a plugin > +needs to create background threads (or uses an external library that > +creates threads) it should do so here, because background threads are > +killed by fork. However you should try to do as little as possible > +here because error reporting is difficult. See L</Callback lifecycle> > +above. > + > +If there is an error, C<.after_fork> should call C<nbdkit_error> with > +an error message and return C<-1>. > +> +++ b/server/filters.c > @@ -193,6 +193,28 @@ filter_get_ready (struct backend *b) > b->next->get_ready (b->next); > } > > +static int > +next_after_fork (struct backend *b) > +{ > + b->after_fork (b); > + return 0; > +} > + > +static void > +filter_after_fork (struct backend *b) > +{ > + struct backend_filter *f = container_of (b, struct backend_filter, backend); > + > + debug ("%s: after_fork", b->name); > + > + if (f->filter.after_fork) { > + if (f->filter.after_fork (next_after_fork, b->next) == -1) > + exit (EXIT_FAILURE); > + } > + else > + b->next->after_fork (b->next); > +}This code changes if we decide that filters could live with 'void' instead of next_after_fork, because we mandate the chain traversal through the backend rather than leaving it up to the filters. Still a few tweaks needed, but the idea looks sane. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Eric Blake
2020-Jun-22 19:05 UTC
Re: [Libguestfs] [PATCH nbdkit 2/2] vddk: Defer library initialization to .after_fork().
On 6/22/20 10:49 AM, Richard W.M. Jones wrote:> VDDK creates background threads. fork kills these, resulting in thes/kills/strands/> library hanging or failing completely in certain configurations. > --- > plugins/vddk/vddk.c | 28 ++++++++++++++++++++++------ > 1 file changed, 22 insertions(+), 6 deletions(-) >ACK. Should we also modify our tests/libvixDiskLib.so dummy library to also spawn a thread, where .pread then checks for something provided by that helper thread, for testsuite coverage of this? That's a rather complicated change to make, so I'm not insisting, but if you're up to it, the more our dummy library behaves like the real thing, the harder it is to introduce further regressions. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Richard W.M. Jones
2020-Jun-22 20:15 UTC
Re: [Libguestfs] [PATCH nbdkit 1/2] server: Add .after_fork callback, mainly for plugins to create threads.
On Mon, Jun 22, 2020 at 02:00:50PM -0500, Eric Blake wrote:> >+++ b/docs/nbdkit-filter.pod > > >+=head2 C<.after_fork> > >+ > >+ int (*after_fork) (nbdkit_next_after_fork *next, void *nxdata); > >+ > >+This intercepts the plugin C<.after_fork> method and can be used by > >+the filter to start background threads (although these have limited > >+usefulness since they will not be able to access the plugin). > >+ > >+If there is an error, C<.after_fork> should call C<nbdkit_error> with > >+an error message and return C<-1>. > > Should the backend code guarantee that .after_fork is called for all > layers in the backend chain? The only flexibility I see us gaining > by letting the filter choose when .after_fork is called is > fine-grained control over what threads the filter can start before > or after the plugin's .after_fork, but is that flexibility really > needed, or are we equally safe if the backend just always calls > .after_fork and the filter doesn't have to worry about calling > next(nxdata)? > > Of course, given that filters are in-tree, we can change this > signature as needed later on, even if we make it void for now we can > hook it into passing .after_fork later if we find a use for that > finer-grained call semantics.In theory a filter that wanted background threads would have exactly the same problem as a plugin and would need to add an .after_fork callback to create them. I can't think of a situation where a filter would be correct to not call the plugin's .after_fork, but I suppose it's plausible it might want to do something before or after the plugin. The biggest practical problem at the moment however is ... At this point I would link to the TODO file on github, but it seems like github is down. If you look at the TODO file you'll see I mentioned a general problem with background threads in filters. Anyway as you say they're all in tree so we can worry about this later.> >+=item C<.after_fork> > >+ > >+In normal operation, C<.after_fork> is called after the server has > >+forked into the background and changed UID and directory. If a plugin > >+needs to create background threads (or uses an external library that > >+creates threads) it should do so here, because background threads are > >+killed by fork. > > s/killed/invalidated/ (they are still alive in the parent process, > but after the fork we are no longer in the parent process to > interact with them)OK, I'll change this. It's also of course a point that if a plugin creates background threads before the fork and they grab locks then that could lead to problems. So I need to add a note to .get_ready saying that you shouldn't create background threads there.> >+ > >+Because the server may have forked into the background, error messages > >+and failures from C<.after_fork> cannot be seen by the user unless > >+they look through syslog. An error in C<.after_fork> can appear to > >+the user as if nbdkit “just died”. So in almost all cases it is > >+better to use C<.get_ready> instead of this callback, or to do as much > >+preparation work as possible in C<.get_ready> and only start > >+background threads here. > >+ > >+The server doesn't always fork (eg. if the I<-f> flag is used), but > >+even so this callback will be called. If you want to find out if the > >+server forked between C<.get_ready> and C<.after_fork> use > >+L<getpid(2)>. > > Thinking about back-compat: > We promise that plugins compiled against older nbdkit will continue > to run against newer nbdkit, and that is preserved here. > What we do not promise is that plugins compiled against newer nbdkit > will work wehn loaded by older nbdkit. This is one of those cases: > anything you stick in .after_fork will not be called by older nbdkit > which didn't know about the callback. > > At present, we silently ignore the tail of 'struct plugin' from any > plugin compiled against newer nbdkit when loaded by older nbdkit. > Should we tweak that to instead output a warning that the plugin > might rely on features present only in newer nbdkit and therefore > might not work in the current nbdkit version? Of course, that > doesn't help the fact that existing older nbdkit is lacking this > safety check.It should probably only do that if the tail of the struct is non-zero, because who cares if it contains NULL pointers. But that's another patch.> Do we need to instead add some sort of API for plugins to be able to > learn which features are provided by the nbdkit loading the plugin, > so that a plugin that requires the use of 1.22 nbdkit features (such > as .after_fork) can gracefully fail during .config_complete (or > similar) when loaded by a 1.20 nbdkit, rather than mysteriously > failing to work when .after_fork is not called? [In a distro > setting, the solution is version dependencies: you could make it so > that installing the nbdkit-vddk-plugin package forces an upgrade of > the nbdkit package to 1.22 - but this is more a question for > situations not covered by distro packaging setups]TBH this is a non-supported case so if it fails you get to keep all the pieces.> >+=head2 C<.after_fork> > >+ > >+ int after_ready (void); > > Typo in the function name.Oops.> >+++ b/server/filters.c > >@@ -193,6 +193,28 @@ filter_get_ready (struct backend *b) > > b->next->get_ready (b->next); > > } > >+static int > >+next_after_fork (struct backend *b) > >+{ > >+ b->after_fork (b); > >+ return 0; > >+} > >+ > >+static void > >+filter_after_fork (struct backend *b) > >+{ > >+ struct backend_filter *f = container_of (b, struct backend_filter, backend); > >+ > >+ debug ("%s: after_fork", b->name); > >+ > >+ if (f->filter.after_fork) { > >+ if (f->filter.after_fork (next_after_fork, b->next) == -1) > >+ exit (EXIT_FAILURE); > >+ } > >+ else > >+ b->next->after_fork (b->next); > >+} > > This code changes if we decide that filters could live with 'void' > instead of next_after_fork, because we mandate the chain traversal > through the backend rather than leaving it up to the filters. > > Still a few tweaks needed, but the idea looks sane.Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-df lists disk usage of guests without needing to install any software inside the virtual machine. Supports Linux and Windows. http://people.redhat.com/~rjones/virt-df/
Apparently Analagous Threads
- [PATCH nbdkit 2/2] vddk: Defer library initialization to .after_fork().
- Re: [nbdkit PATCH v7 2/2] vddk: Drive library loading from libdir parameter.
- [nbdkit PATCH v2] vddk: Drop support for VDDK 5.1.1
- [PATCH nbdkit 1/2] vddk: Delay loading VDDK until config_complete.
- Re: [nbdkit PATCH v5 4/4] vddk: Drive library loading from libdir parameter.