This is a major rewrite compared to my v1 series, where I've tried a lot harder to ensure that we still accommodate building on Haiku (although I have not actually yet fired up a Haiku VM to try it for myself). I also managed to make the sh plugin fully parallel, on capable platforms. See also my question on patch 10 on whether I've picked the best naming convention. Eric Blake (17): maint: Rename server/utils.c to server/public.c plugins: Expose more thread_model details in --dump-plugin server: Add utility functions for setting CLOEXEC and NONBLOCK Revert "RHEL 5: Define O_CLOEXEC and SOCK_CLOEXEC." build: Audit existing use of SOCK_CLOEXEC cow, cache: Better mkostemp fallback build: Audit for use of pipe2 rate: Atomically set CLOEXEC on fds server: Use atomic CLOEXEC for nbdkit_read_password plugins: Add .fork_safe field server: Atomically set CLOEXEC on accept fds filters: Set CLOEXEC on files opened during .config python: Use CLOEXEC on script sh: Use pipe2 with CLOEXEC when possible sh: Enable parallel thread model, when possible sh: Test for fd leaks temp debug docs/nbdkit-plugin.pod | 32 ++++-- plugins/sh/nbdkit-sh-plugin.pod | 10 +- configure.ac | 2 + common/utils/utils.h | 2 + include/nbdkit-plugin.h | 1 + server/internal.h | 14 +-- common/utils/utils.c | 64 ++++++++++++ filters/cache/blk.c | 12 ++- filters/cow/blk.c | 12 ++- filters/log/log.c | 22 ++++- filters/rate/rate.c | 20 +++- filters/stats/stats.c | 21 +++- filters/xz/xzfile.c | 4 - plugins/curl/curl.c | 1 + plugins/data/data.c | 1 + plugins/example2/example2.c | 5 +- plugins/example3/example3.c | 1 + plugins/ext2/ext2.c | 1 + plugins/file/file.c | 5 +- plugins/floppy/floppy.c | 1 + plugins/full/full.c | 1 + plugins/guestfs/guestfs-plugin.c | 1 + plugins/gzip/gzip.c | 1 + plugins/iso/iso.c | 1 + plugins/libvirt/libvirt-plugin.c | 1 + plugins/linuxdisk/linuxdisk.c | 1 + plugins/memory/memory.c | 1 + plugins/nbd/nbd-standalone.c | 1 + plugins/nbd/nbd.c | 32 ++++++ plugins/null/null.c | 1 + plugins/partitioning/partitioning.c | 1 + plugins/pattern/pattern.c | 1 + plugins/python/python.c | 18 +++- plugins/random/random.c | 1 + plugins/sh/call.c | 34 ++++++- plugins/sh/sh.c | 9 +- plugins/split/split.c | 5 +- plugins/ssh/ssh.c | 1 + plugins/streaming/streaming.c | 5 +- plugins/zero/zero.c | 1 + server/locks.c | 4 +- server/plugins.c | 63 ++++++++---- server/{utils.c => public.c} | 19 +++- server/quit.c | 18 ++++ server/sockets.c | 40 +++++++- server/{test-utils.c => test-public.c} | 0 tests/test-layers.c | 4 +- tests/test-socket-activation.c | 12 +-- tests/test-streaming.c | 5 +- tests/web-server.c | 7 +- .gitignore | 2 +- server/Makefile.am | 18 ++-- tests/Makefile.am | 2 + tests/test-dump-plugin.sh | 32 +++++- tests/test-parallel-file.sh | 3 + tests/test-parallel-nbd.sh | 5 +- tests/test-parallel-sh.sh | 131 +++++++++++++++++++++++++ 57 files changed, 605 insertions(+), 108 deletions(-) rename server/{utils.c => public.c} (94%) rename server/{test-utils.c => test-public.c} (100%) create mode 100755 tests/test-parallel-sh.sh -- 2.20.1
Eric Blake
2019-Aug-02 19:26 UTC
[Libguestfs] [nbdkit PATCH v2 01/17] maint: Rename server/utils.c to server/public.c
Otherwise, it gets confusing with the fact that we also have common/utils/utils.c. Signed-off-by: Eric Blake <eblake@redhat.com> --- server/{utils.c => public.c} | 4 ++++ server/{test-utils.c => test-public.c} | 0 .gitignore | 2 +- server/Makefile.am | 18 +++++++++--------- 4 files changed, 14 insertions(+), 10 deletions(-) rename server/{utils.c => public.c} (98%) rename server/{test-utils.c => test-public.c} (100%) diff --git a/server/utils.c b/server/public.c similarity index 98% rename from server/utils.c rename to server/public.c index 5a2d471a..4b8b1d2b 100644 --- a/server/utils.c +++ b/server/public.c @@ -30,6 +30,10 @@ * SUCH DAMAGE. */ +/* This file contains the public utility APIs to be exported by nbdkit + * for use by filters and plugins, declared in nbdkit-common.h. + */ + #include <config.h> #include <stdio.h> diff --git a/server/test-utils.c b/server/test-public.c similarity index 100% rename from server/test-utils.c rename to server/test-public.c diff --git a/.gitignore b/.gitignore index 46ab5dc6..bf662193 100644 --- a/.gitignore +++ b/.gitignore @@ -66,7 +66,7 @@ Makefile.in /server/nbdkit /server/nbdkit.pc /server/synopsis.c -/server/test-utils +/server/test-public /stamp-h1 /tests/disk /tests/disk.gz diff --git a/server/Makefile.am b/server/Makefile.am index 8f6b0a56..29674866 100644 --- a/server/Makefile.am +++ b/server/Makefile.am @@ -55,13 +55,13 @@ nbdkit_SOURCES = \ protocol-handshake.c \ protocol-handshake-oldstyle.c \ protocol-handshake-newstyle.c \ + public.c \ quit.c \ signals.c \ socket-activation.c \ sockets.c \ threadlocal.c \ usergroup.c \ - utils.c \ $(top_srcdir)/include/nbdkit-plugin.h \ $(top_srcdir)/include/nbdkit-filter.h \ $(NULL) @@ -121,21 +121,21 @@ pkgconfig_DATA = nbdkit.pc # Unit testing -TESTS = test-utils +TESTS = test-public -check_PROGRAMS = test-utils +check_PROGRAMS = test-public -test_utils_SOURCES = \ - test-utils.c \ - utils.c \ +test_public_SOURCES = \ + test-public.c \ + public.c \ extents.c \ $(NULL) -test_utils_CPPFLAGS = \ +test_public_CPPFLAGS = \ -I$(top_srcdir)/include \ -I$(top_srcdir)/common/include \ -I$(top_srcdir)/common/utils \ $(NULL) -test_utils_CFLAGS = $(WARNINGS_CFLAGS) $(VALGRIND_CFLAGS) -test_utils_LDADD = \ +test_public_CFLAGS = $(WARNINGS_CFLAGS) $(VALGRIND_CFLAGS) +test_public_LDADD = \ $(top_builddir)/common/utils/libutils.la \ $(NULL) -- 2.20.1
Eric Blake
2019-Aug-02 19:26 UTC
[Libguestfs] [nbdkit PATCH v2 02/17] plugins: Expose more thread_model details in --dump-plugin
When we added .thread_model to plugins, we forgot to expose has_thread_model in the --dump-plugin output. What's more, it is useful to determine if the plugin itself is aware of tighter restrictions imposed by dynamic choices, by outputting both 'max_thread_model' (what the plugin was compiled as) and 'thread_model' (the resulting model based on runtime knowledge). Note that this means that a plugin's .thread_model can be called prior to .config, so the sh plugin must be prepared for a missing script. In fact, exposing additional information during --dump-plugin will make it easier for a future patch to make sh default to PARALLEL, and still have introspection on which model will actually be used. Fixes: 9911a9d3 Signed-off-by: Eric Blake <eblake@redhat.com> --- docs/nbdkit-plugin.pod | 11 +++++--- plugins/sh/nbdkit-sh-plugin.pod | 3 +- plugins/sh/sh.c | 3 ++ server/plugins.c | 49 ++++++++++++++++++++------------- tests/test-dump-plugin.sh | 31 ++++++++++++++++++++- 5 files changed, 72 insertions(+), 25 deletions(-) diff --git a/docs/nbdkit-plugin.pod b/docs/nbdkit-plugin.pod index 8ce702da..fe9ada87 100644 --- a/docs/nbdkit-plugin.pod +++ b/docs/nbdkit-plugin.pod @@ -128,16 +128,19 @@ is called once just after the plugin is loaded into memory. C<.config> is called zero or more times during command line parsing. C<.config_complete> is called once after all configuration information -has been passed to the plugin. +has been passed to the plugin (but not during C<nbdkit --dump-plugin>). Both are called after loading the plugin but before any connections are accepted. =item C<.thread_model> -C<.thread_model> is called once after all configuration information -has been passed to the plugin, and before any connections are -accepted. +In normal operation, C<.thread_model> is called once after +C<.config_complete> has validated all configuration information, and +before any connections are accepted. However, during C<nbdkit +--dump-plugin>, it is called after any C<.config> calls but without +C<.config_complete> (so a plugin which determines the results from a +script must be prepared for a missing script). =item C<.open> diff --git a/plugins/sh/nbdkit-sh-plugin.pod b/plugins/sh/nbdkit-sh-plugin.pod index dfa086b5..41b9e7c7 100644 --- a/plugins/sh/nbdkit-sh-plugin.pod +++ b/plugins/sh/nbdkit-sh-plugin.pod @@ -191,7 +191,8 @@ C<"serialize_requests">, or C<"parallel">. This method is I<not> required; if omitted, then the plugin will be executed under the default sh thread model (currently C<"serialize_all_requests">, which implies this callback only makes a -difference with an output of C<"serialize_connections">). If an error +difference with an output of C<"serialize_connections">; look for +'max_thread_model' in C<nbdkit --dump-plugin sh>). If an error occurs, the script should output an error message and exit with status C<1>; unrecognized output is ignored. diff --git a/plugins/sh/sh.c b/plugins/sh/sh.c index e3d3c2f1..5869d9a4 100644 --- a/plugins/sh/sh.c +++ b/plugins/sh/sh.c @@ -273,6 +273,9 @@ sh_thread_model (void) size_t slen; int r; + if (!script) + return THREAD_MODEL; + switch (call_read (&s, &slen, args)) { case OK: if (slen > 0 && s[slen-1] == '\n') diff --git a/server/plugins.c b/server/plugins.c index 9da0361b..3bb20c93 100644 --- a/server/plugins.c +++ b/server/plugins.c @@ -140,6 +140,33 @@ plugin_version (struct backend *b) return p->plugin.version; } +/* Map thread model to a string */ +static void +plugin_dump_thread_model (const char *prefix, int model) +{ + flockfile (stdout); + printf ("%s=", prefix); + switch (model) { + case NBDKIT_THREAD_MODEL_SERIALIZE_CONNECTIONS: + printf ("serialize_connections"); + break; + case NBDKIT_THREAD_MODEL_SERIALIZE_ALL_REQUESTS: + printf ("serialize_all_requests"); + break; + case NBDKIT_THREAD_MODEL_SERIALIZE_REQUESTS: + printf ("serialize_requests"); + break; + case NBDKIT_THREAD_MODEL_PARALLEL: + printf ("parallel"); + break; + default: + printf ("%d # unknown thread model!", model); + break; + } + printf ("\n"); + funlockfile (stdout); +} + /* This implements the --dump-plugin option. */ static void plugin_dump_fields (struct backend *b) @@ -157,25 +184,8 @@ plugin_dump_fields (struct backend *b) printf ("api_version=%d\n", p->plugin._api_version); printf ("struct_size=%" PRIu64 "\n", p->plugin._struct_size); - printf ("thread_model="); - switch (p->plugin._thread_model) { - case NBDKIT_THREAD_MODEL_SERIALIZE_CONNECTIONS: - printf ("serialize_connections"); - break; - case NBDKIT_THREAD_MODEL_SERIALIZE_ALL_REQUESTS: - printf ("serialize_all_requests"); - break; - case NBDKIT_THREAD_MODEL_SERIALIZE_REQUESTS: - printf ("serialize_requests"); - break; - case NBDKIT_THREAD_MODEL_PARALLEL: - printf ("parallel"); - break; - default: - printf ("%d # unknown thread model!", p->plugin._thread_model); - break; - } - printf ("\n"); + plugin_dump_thread_model ("max_thread_model", p->plugin._thread_model); + plugin_dump_thread_model ("thread_model", backend->thread_model (backend)); printf ("errno_is_preserved=%d\n", !!p->plugin.errno_is_preserved); if (p->plugin.magic_config_key) printf ("magic_config_key=%s\n", p->plugin.magic_config_key); @@ -213,6 +223,7 @@ plugin_dump_fields (struct backend *b) HAS (extents); HAS (can_cache); HAS (cache); + HAS (thread_model); #undef HAS /* Custom fields. */ diff --git a/tests/test-dump-plugin.sh b/tests/test-dump-plugin.sh index 1d58a3a3..982d5062 100755 --- a/tests/test-dump-plugin.sh +++ b/tests/test-dump-plugin.sh @@ -1,6 +1,6 @@ #!/usr/bin/env bash # nbdkit -# Copyright (C) 2014 Red Hat Inc. +# Copyright (C) 2014-2019 Red Hat Inc. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are @@ -68,3 +68,32 @@ do_test () esac } foreach_plugin do_test + +# Test that --dump-plugin can be used to introspect a resulting dynamic +# thread model. +out=$({ + # sh does not yet support parallel + nbdkit --dump-plugin sh + # Here, the script further reduces things + nbdkit --dump-plugin sh - <<\EOF +case $1 in + get_size) echo 1M ;; + thread_model) echo serialize_connections ;; + *) exit 2 ;; +esac +EOF + # Here, the filter further reduces things + nbdkit --dump-plugin --filter=noparallel sh serialize=connections +} | grep thread_model) +exp="max_thread_model=serialize_all_requests +thread_model=serialize_all_requests +has_thread_model=1 +max_thread_model=serialize_all_requests +thread_model=serialize_connections +has_thread_model=1 +max_thread_model=serialize_all_requests +thread_model=serialize_connections +has_thread_model=1" +if [ "$out" != "$exp" ]; then + echo "thread_model mismatch"; exit 1 +fi -- 2.20.1
Eric Blake
2019-Aug-02 19:26 UTC
[Libguestfs] [nbdkit PATCH v2 03/17] server: Add utility functions for setting CLOEXEC and NONBLOCK
Upcoming patches will be setting FD_CLOEXEC and/or O_NONBLOCK on more files; the correct way to do this requires a read-modify-write, which is enough lines of code to be worth a helper function. I chose the semantics of close()ing the fd on (unlikely) failure, on the grounds that these functions will likely be used only immediately after the fd is created. Signed-off-by: Eric Blake <eblake@redhat.com> --- common/utils/utils.h | 2 ++ common/utils/utils.c | 46 ++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 48 insertions(+) diff --git a/common/utils/utils.h b/common/utils/utils.h index 3f10cdc0..ebd5f66b 100644 --- a/common/utils/utils.h +++ b/common/utils/utils.h @@ -36,5 +36,7 @@ extern void shell_quote (const char *str, FILE *fp); extern void uri_quote (const char *str, FILE *fp); extern int exit_status_to_nbd_error (int status, const char *cmd); +extern int set_cloexec (int fd); +extern int set_nonblock (int fd); #endif /* NBDKIT_UTILS_H */ diff --git a/common/utils/utils.c b/common/utils/utils.c index 7534a13d..37c81871 100644 --- a/common/utils/utils.c +++ b/common/utils/utils.c @@ -32,11 +32,13 @@ #include <config.h> +#include <fcntl.h> #include <stdio.h> #include <stdlib.h> #include <string.h> #include <sys/types.h> #include <sys/wait.h> +#include <unistd.h> #include <nbdkit-plugin.h> @@ -126,3 +128,47 @@ exit_status_to_nbd_error (int status, const char *cmd) return 0; } + +/* Set the FD_CLOEXEC flag on the given fd, if it is non-negative. + * On failure, close fd and return -1; on success, return fd. + */ +int +set_cloexec (int fd) { + int f; + int err; + + if (fd == -1) + return -1; + + f = fcntl (fd, F_GETFD); + if (f == -1 || fcntl (fd, F_SETFD, f | FD_CLOEXEC) == -1) { + err = errno; + nbdkit_error ("fcntl: %m"); + close (fd); + errno = err; + return -1; + } + return fd; +} + +/* Set the O_NONBLOCK flag on the given fd, if it is non-negative. + * On failure, close fd and return -1; on success, return fd. + */ +int +set_nonblock (int fd) { + int f; + int err; + + if (fd == -1) + return -1; + + f = fcntl (fd, F_GETFL); + if (f == -1 || fcntl (fd, F_SETFL, f | O_NONBLOCK) == -1) { + err = errno; + nbdkit_error ("fcntl: %m"); + close (fd); + errno = err; + return -1; + } + return fd; +} -- 2.20.1
Eric Blake
2019-Aug-02 19:26 UTC
[Libguestfs] [nbdkit PATCH v2 04/17] Revert "RHEL 5: Define O_CLOEXEC and SOCK_CLOEXEC."
This reverts commit 25206df20275aeff346d9b86adf5e9be99cc9e43, and temporarily breaks compilation on Haiku which has O_CLOEXEC but lacks SOCK_CLOEXEC [1]. An upcoming patch wants to ensure no leaked fds from the server to a child process. POSIX has required O_CLOEXEC since 2008, and that is long enough that we should start blindly relying on it. Meanwhile, POSIX doesn't yet specify full atomic interfaces everywhere (SOCK_CLOEXEC and accept4 when dealing with sockets, pipe2 when dealing with pipes, mkostemp and fopen("re") when dealing with stdio, etc.), but there has been an open bug since 2014 [2] recommending the full set of interfaces that will be mandatory in the next revision of POSIX. Most modern OS have caught up to that (RHEL 6 and FreeBSD 10 support SOCK_CLOEXEC, for example), and we're doing ourselves a disservice by using a globally-silent and leaky fallback instead of actively documenting at each site that prefers to use the new interfaces whether the use is essential or merely reducing lines of code, for the sake of systems that are behind in atomic support. [1] https://dev.haiku-os.org/ticket/15219 [2] http://austingroupbugs.net/view.php?id=411 Conflicts: filters/xz/xzfile.c - The fallback moved from plugins/xz/xzfile.c (and in fact has been dead code since commit c879d310 made it a filter) plugins/file/file.c - context changes in the meantime src/internal.h - Moved to server/internal.h plugins/split/split.c - Not in the original patch; copied in 9ad64ba5 --- server/internal.h | 8 -------- common/utils/utils.c | 1 + filters/xz/xzfile.c | 4 ---- plugins/example2/example2.c | 4 ---- plugins/file/file.c | 4 ---- plugins/split/split.c | 4 ---- plugins/streaming/streaming.c | 4 ---- 7 files changed, 1 insertion(+), 28 deletions(-) diff --git a/server/internal.h b/server/internal.h index 80ab879c..6207f0cf 100644 --- a/server/internal.h +++ b/server/internal.h @@ -50,14 +50,6 @@ #define UNIX_PATH_MAX 108 #endif -#ifndef O_CLOEXEC -#define O_CLOEXEC 0 -#endif - -#ifndef SOCK_CLOEXEC -#define SOCK_CLOEXEC 0 -#endif - #if HAVE_VALGRIND # include <valgrind.h> /* http://valgrind.org/docs/manual/faq.html#faq.unhelpful */ diff --git a/common/utils/utils.c b/common/utils/utils.c index 37c81871..f81a8433 100644 --- a/common/utils/utils.c +++ b/common/utils/utils.c @@ -36,6 +36,7 @@ #include <stdio.h> #include <stdlib.h> #include <string.h> +#include <sys/socket.h> #include <sys/types.h> #include <sys/wait.h> #include <unistd.h> diff --git a/filters/xz/xzfile.c b/filters/xz/xzfile.c index ebe7f29c..ee4af713 100644 --- a/filters/xz/xzfile.c +++ b/filters/xz/xzfile.c @@ -52,10 +52,6 @@ #include "xzfile.h" -#ifndef O_CLOEXEC -#define O_CLOEXEC 0 -#endif - #define XZ_HEADER_MAGIC "\xfd" "7zXZ\0" #define XZ_HEADER_MAGIC_LEN 6 #define XZ_FOOTER_MAGIC "YZ" diff --git a/plugins/example2/example2.c b/plugins/example2/example2.c index d0bce155..6adc100a 100644 --- a/plugins/example2/example2.c +++ b/plugins/example2/example2.c @@ -48,10 +48,6 @@ #include <nbdkit-plugin.h> -#ifndef O_CLOEXEC -#define O_CLOEXEC 0 -#endif - static char *filename = NULL; /* A debug flag which can be set on the command line using diff --git a/plugins/file/file.c b/plugins/file/file.c index 52d330f7..9df5001d 100644 --- a/plugins/file/file.c +++ b/plugins/file/file.c @@ -61,10 +61,6 @@ #include "cleanup.h" #include "isaligned.h" -#ifndef O_CLOEXEC -#define O_CLOEXEC 0 -#endif - #ifndef HAVE_FDATASYNC #define fdatasync fsync #endif diff --git a/plugins/split/split.c b/plugins/split/split.c index 1b8e69a2..68682b29 100644 --- a/plugins/split/split.c +++ b/plugins/split/split.c @@ -45,10 +45,6 @@ #include <nbdkit-plugin.h> -#ifndef O_CLOEXEC -#define O_CLOEXEC 0 -#endif - /* The files. */ static char **filenames = NULL; static size_t nr_files = 0; diff --git a/plugins/streaming/streaming.c b/plugins/streaming/streaming.c index 4ca3e765..4ab73b8b 100644 --- a/plugins/streaming/streaming.c +++ b/plugins/streaming/streaming.c @@ -43,10 +43,6 @@ #include <nbdkit-plugin.h> -#ifndef O_CLOEXEC -#define O_CLOEXEC 0 -#endif - static char *filename = NULL; static int fd = -1; -- 2.20.1
Eric Blake
2019-Aug-02 19:26 UTC
[Libguestfs] [nbdkit PATCH v2 05/17] build: Audit existing use of SOCK_CLOEXEC
Haiku unfortunately lacks SOCK_CLOEXEC, so we have to plan for a fallback at each site that uses it. In the testsuite, failure to use CLOEXEC is harmless (in particular, this fixes a minor bug in test-socket-activation where a failed socket followed by fcntl(-1) corrupts errno). Meanwhile, in the server, we still need the fd to be CLOEXEC, but at least don't have to worry about atomicity in the current use. In the process, note that since setting CLOEXEC often (but not always) needs to be atomic, it is also worth tightening our set_cloexec helper to refuse to run on capable platforms that should have used SOCK_CLOEXEC; we'll add further restrictions as more atomic CLOEXEC functions are audited. Signed-off-by: Eric Blake <eblake@redhat.com> --- common/utils/utils.c | 12 ++++++++++++ server/sockets.c | 8 ++++++++ tests/test-socket-activation.c | 12 ++++++------ tests/web-server.c | 7 ++++++- 4 files changed, 32 insertions(+), 7 deletions(-) diff --git a/common/utils/utils.c b/common/utils/utils.c index f81a8433..0548058c 100644 --- a/common/utils/utils.c +++ b/common/utils/utils.c @@ -132,9 +132,20 @@ exit_status_to_nbd_error (int status, const char *cmd) /* Set the FD_CLOEXEC flag on the given fd, if it is non-negative. * On failure, close fd and return -1; on success, return fd. + * + * Note that this function should ONLY be used on platforms that lack + * atomic CLOEXEC support during fd creation (such as Haiku in 2019); + * when using it as a fallback path, you must also consider how to + * prevent fd leaks to plugins that want to fork(). */ int set_cloexec (int fd) { +#ifdef SOCK_CLOEXEC + nbdkit_error ("prefer creating fds with CLOEXEC atomically set"); + close (fd); + errno = EBADF; + return -1; +#else int f; int err; @@ -150,6 +161,7 @@ set_cloexec (int fd) { return -1; } return fd; +#endif } /* Set the O_NONBLOCK flag on the given fd, if it is non-negative. diff --git a/server/sockets.c b/server/sockets.c index b25405cb..5adf5af5 100644 --- a/server/sockets.c +++ b/server/sockets.c @@ -54,6 +54,7 @@ #include <pthread.h> #include "internal.h" +#include "utils.h" static void set_selinux_label (void) @@ -107,7 +108,14 @@ bind_unix_socket (size_t *nr_socks) set_selinux_label (); +#ifdef SOCK_CLOEXEC sock = socket (AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0); +#else + /* Fortunately, this code is only run at startup, so there is no + * risk of the fd leaking to a plugin's fork() + */ + sock = set_cloexec (socket (AF_UNIX, SOCK_STREAM, 0)); +#endif if (sock == -1) { perror ("bind_unix_socket: socket"); exit (EXIT_FAILURE); diff --git a/tests/test-socket-activation.c b/tests/test-socket-activation.c index f98a9d62..a7f9fa14 100644 --- a/tests/test-socket-activation.c +++ b/tests/test-socket-activation.c @@ -1,5 +1,5 @@ /* nbdkit - * Copyright (C) 2017 Red Hat Inc. + * Copyright (C) 2017-2019 Red Hat Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are @@ -58,6 +58,11 @@ #include <sys/un.h> #include <sys/wait.h> +#ifndef SOCK_CLOEXEC +/* For this file, we don't care if fds are marked cloexec; leaking is okay. */ +#define SOCK_CLOEXEC 0 +#endif + #define FIRST_SOCKET_ACTIVATION_FD 3 #define NBDKIT_START_TIMEOUT 30 /* seconds */ @@ -215,12 +220,7 @@ main (int argc, char *argv[]) * not from the path), so we should be able to connect to the Unix * domain socket by its path and receive an NBD magic string. */ -#ifdef SOCK_CLOEXEC sock = socket (AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0); -#else - sock = socket (AF_UNIX, SOCK_STREAM, 0); - fcntl (sock, F_SETFD, FD_CLOEXEC); -#endif if (sock == -1) { perror ("socket"); exit (EXIT_FAILURE); diff --git a/tests/web-server.c b/tests/web-server.c index d79924ff..83f90881 100644 --- a/tests/web-server.c +++ b/tests/web-server.c @@ -1,5 +1,5 @@ /* nbdkit - * Copyright (C) 2018 Red Hat Inc. + * Copyright (C) 2018-2019 Red Hat Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are @@ -50,6 +50,11 @@ #include <pthread.h> +#ifndef SOCK_CLOEXEC +/* For this file, we don't care if fds are marked cloexec; leaking is okay. */ +#define SOCK_CLOEXEC 0 +#endif + static char tmpdir[] = "/tmp/wsXXXXXX"; static char sockpath[] = "............./sock"; static int listen_sock = -1; -- 2.20.1
Eric Blake
2019-Aug-02 19:26 UTC
[Libguestfs] [nbdkit PATCH v2 06/17] cow, cache: Better mkostemp fallback
In 2018, we added fallbacks for Haiku lacking mkostemp (one of the atomic CLOEXEC interfaces proposed for future POSIX [1]); however, that fallback has minor bugs: if mkstemp() fails, we blindly call fcntl(-1) (which probably changes the errno later reported against "mkostemp: %s: %m"); and although it is historically unlikely that any other bits will be set, F_SETFD should be used in a read-modify-write pattern rather than a blind overwrite pattern. [1] http://austingroupbugs.net/view.php?id=411 Update these fallbacks to use our newly-added set_cloexec utility function; neither place needs atomicity because they occur during .load (where we are still more-or-less single-threaded), rather than while another thread may be actively inside a plugin (such as sh) using fork(). Fixes: 60076fbcfd Fixes: b962272a56 Signed-off-by: Eric Blake <eblake@redhat.com> --- common/utils/utils.c | 5 ++++- filters/cache/blk.c | 12 +++++++++++- filters/cow/blk.c | 12 +++++++++++- 3 files changed, 26 insertions(+), 3 deletions(-) diff --git a/common/utils/utils.c b/common/utils/utils.c index 0548058c..7b63b4d4 100644 --- a/common/utils/utils.c +++ b/common/utils/utils.c @@ -140,12 +140,15 @@ exit_status_to_nbd_error (int status, const char *cmd) */ int set_cloexec (int fd) { -#ifdef SOCK_CLOEXEC +#if defined SOCK_CLOEXEC && defined HAVE_MKOSTEMP nbdkit_error ("prefer creating fds with CLOEXEC atomically set"); close (fd); errno = EBADF; return -1; #else +# if defined SOCK_CLOEXEC || defined HAVE_MKOSTEMP +# error "Unexpected: your system has incomplete atomic CLOEXEC support" +# endif int f; int err; diff --git a/filters/cache/blk.c b/filters/cache/blk.c index cf7145d8..272d176e 100644 --- a/filters/cache/blk.c +++ b/filters/cache/blk.c @@ -109,8 +109,18 @@ blk_init (void) #ifdef HAVE_MKOSTEMP fd = mkostemp (template, O_CLOEXEC); #else + /* Not atomic, but this is only invoked during .load, so the race + * won't affect any plugin actions trying to fork + */ fd = mkstemp (template); - fcntl (fd, F_SETFD, FD_CLOEXEC); + if (fd >= 0) { + fd = set_cloexec (fd); + if (fd < 0) { + int e = errno; + unlink (template); + errno = e; + } + } #endif if (fd == -1) { nbdkit_error ("mkostemp: %s: %m", tmpdir); diff --git a/filters/cow/blk.c b/filters/cow/blk.c index be43f2ff..2cae1122 100644 --- a/filters/cow/blk.c +++ b/filters/cow/blk.c @@ -114,8 +114,18 @@ blk_init (void) #ifdef HAVE_MKOSTEMP fd = mkostemp (template, O_CLOEXEC); #else + /* Not atomic, but this is only invoked during .load, so the race + * won't affect any plugin actions trying to fork + */ fd = mkstemp (template); - fcntl (fd, F_SETFD, FD_CLOEXEC); + if (fd >= 0) { + fd = set_cloexec (fd); + if (fd < 0) { + int e = errno; + unlink (template); + errno = e; + } + } #endif if (fd == -1) { nbdkit_error ("mkostemp: %s: %m", tmpdir); -- 2.20.1
Eric Blake
2019-Aug-02 19:26 UTC
[Libguestfs] [nbdkit PATCH v2 07/17] build: Audit for use of pipe2
Haiku unfortunately lacks pipe2, so we have to plan for a fallback at each site that uses it. This also makes a good time to audit all existing users of pipe, to see if they should be using pipe2. The tests fork() but don't fail because of fd leaks; and the nbd plugin doesn't fork() but was merely using pipe2 for convenience over multiple fcntl calls. However, the server's quit_fd definitely needs to be marked CLOEXEC (it's easy to use the sh plugin to show that we are currently leaking it to children), although doing so can occur without worrying about atomicity since it is called before threads begin. Finally, it's also worth updating our set_cloexec helper function to document that we still prefer atomicity where possible. Signed-off-by: Eric Blake <eblake@redhat.com> --- configure.ac | 1 + common/utils/utils.c | 4 ++-- plugins/nbd/nbd.c | 31 +++++++++++++++++++++++++++++++ server/quit.c | 18 ++++++++++++++++++ tests/test-layers.c | 4 +++- tests/test-streaming.c | 5 +++-- 6 files changed, 58 insertions(+), 5 deletions(-) diff --git a/configure.ac b/configure.ac index c6bb1b10..cabec5c8 100644 --- a/configure.ac +++ b/configure.ac @@ -196,6 +196,7 @@ AC_CHECK_FUNCS([\ fdatasync \ get_current_dir_name \ mkostemp \ + pipe2 \ posix_fadvise]) dnl Check whether printf("%m") works diff --git a/common/utils/utils.c b/common/utils/utils.c index 7b63b4d4..9ac3443b 100644 --- a/common/utils/utils.c +++ b/common/utils/utils.c @@ -140,13 +140,13 @@ exit_status_to_nbd_error (int status, const char *cmd) */ int set_cloexec (int fd) { -#if defined SOCK_CLOEXEC && defined HAVE_MKOSTEMP +#if defined SOCK_CLOEXEC && defined HAVE_MKOSTEMP && defined HAVE_PIPE2 nbdkit_error ("prefer creating fds with CLOEXEC atomically set"); close (fd); errno = EBADF; return -1; #else -# if defined SOCK_CLOEXEC || defined HAVE_MKOSTEMP +# if defined SOCK_CLOEXEC || defined HAVE_MKOSTEMP || defined HAVE_PIPE2 # error "Unexpected: your system has incomplete atomic CLOEXEC support" # endif int f; diff --git a/plugins/nbd/nbd.c b/plugins/nbd/nbd.c index e8bc7798..95d910e7 100644 --- a/plugins/nbd/nbd.c +++ b/plugins/nbd/nbd.c @@ -431,11 +431,42 @@ nbdplug_open_handle (int readonly) nbdkit_error ("malloc: %m"); return NULL; } +#ifdef HAVE_PIPE2 if (pipe2 (h->fds, O_NONBLOCK)) { + nbdkit_error ("pipe2: %m"); + free (h); + return NULL; + } +#else + /* This plugin doesn't fork, so we don't care about CLOEXEC. Our use + * of pipe2 is merely for convenience. + */ + if (pipe (h->fds)) { nbdkit_error ("pipe: %m"); free (h); return NULL; } + else { + int f; + + f = fcntl (h->fds[0], F_GETFL); + if (f == -1 || fcntl (h->fds[0], F_SETFL, f | O_NONBLOCK) == -1) { + nbdkit_error ("fcntl: %m"); + close (h->fds[0]); + close (h->fds[1]); + free (h); + return NULL; + } + f = fcntl (h->fds[1], F_GETFL); + if (f == -1 || fcntl (h->fds[1], F_SETFL, f | O_NONBLOCK) == -1) { + nbdkit_error ("fcntl: %m"); + close (h->fds[0]); + close (h->fds[1]); + free (h); + return NULL; + } + } +#endif retry: h->fd = -1; diff --git a/server/quit.c b/server/quit.c index c2ac11ef..ffd3fecb 100644 --- a/server/quit.c +++ b/server/quit.c @@ -32,6 +32,7 @@ #include <config.h> +#include <fcntl.h> #include <stdio.h> #include <stdlib.h> #include <stdarg.h> @@ -39,6 +40,7 @@ #include <unistd.h> #include "internal.h" +#include "utils.h" /* Detection of request to exit via signal. Most places in the code * can just poll quit at opportune moments, while sockets.c needs a @@ -54,10 +56,26 @@ set_up_quit_pipe (void) { int fds[2]; +#ifdef HAVE_PIPE2 + if (pipe2 (fds, O_CLOEXEC) < 0) { + perror ("pipe2"); + exit (EXIT_FAILURE); + } +#else + /* This is called early enough that no other thread will be + * fork()ing while we create this; but we must set CLOEXEC so that + * the fds don't leak into children. + */ if (pipe (fds) < 0) { perror ("pipe"); exit (EXIT_FAILURE); } + if (set_cloexec (fds[0]) == -1 || + set_cloexec (fds[1]) == -1) { + perror ("fcntl"); + exit (EXIT_FAILURE); + } +#endif quit_fd = fds[0]; write_quit_fd = fds[1]; } diff --git a/tests/test-layers.c b/tests/test-layers.c index a820ba57..6617cd73 100644 --- a/tests/test-layers.c +++ b/tests/test-layers.c @@ -101,7 +101,9 @@ main (int argc, char *argv[]) exit (77); #endif - /* Socket for communicating with nbdkit. */ + /* Socket for communicating with nbdkit. The test doesn't care about + * fd leaks, so we don't bother with CLOEXEC. + */ if (socketpair (AF_LOCAL, SOCK_STREAM, 0, sfd) == -1) { perror ("socketpair"); exit (EXIT_FAILURE); diff --git a/tests/test-streaming.c b/tests/test-streaming.c index 4aaac9cd..e8c52ee8 100644 --- a/tests/test-streaming.c +++ b/tests/test-streaming.c @@ -70,8 +70,9 @@ main (int argc, char *argv[]) NULL) == -1) exit (EXIT_FAILURE); - /* Fork to run a second process which reads from streaming.fifo - * and checks that the content is correct. + /* Fork to run a second process which reads from streaming.fifo and + * checks that the content is correct. The test doesn't care about + * fd leaks, so we don't bother with CLOEXEC. */ if (pipe (pipefd) == -1) { perror ("pipe"); -- 2.20.1
Eric Blake
2019-Aug-02 19:26 UTC
[Libguestfs] [nbdkit PATCH v2 08/17] rate: Atomically set CLOEXEC on fds
The rate filter is potentially opening fds in one thread while another thread is processing a fork() in the plugin. Although the file is not open for long, we must either use atomic CLOEXEC to avoid fd leaks, or degrade the thread model to SERIALIZE_ALL_REQUESTS. However, the leak is hard to observe (right now, it requires an out-of-tree plugin, since the sh plugin is already at SERIALIZE_ALL_REQUESTS; and the window where the file is open is already small). The easiest solution would be the use of fopen("re"), but although that is slated for future addition to POSIX [1], it is not currently available in Haiku [2]. Thankfully, Haiku has O_CLOEXEC, so we can use that. [1] http://austingroupbugs.net/view.php?id=411 [2] https://dev.haiku-os.org/ticket/15219 This also fixes a minor bug where fclose() may have corrupted the value of errno printed in nbdkit_debug after a failed getline(). Perhaps we could switch to low-level read() into a fixed buffer instead of using fdopen()/readline(), but the patch already seemed large enough. Signed-off-by: Eric Blake <eblake@redhat.com> --- filters/rate/rate.c | 20 +++++++++++++++++--- 1 file changed, 17 insertions(+), 3 deletions(-) diff --git a/filters/rate/rate.c b/filters/rate/rate.c index f8dda0b0..dbd92ad6 100644 --- a/filters/rate/rate.c +++ b/filters/rate/rate.c @@ -34,6 +34,7 @@ #include <config.h> +#include <fcntl.h> #include <stdio.h> #include <stdlib.h> #include <stdint.h> @@ -41,6 +42,7 @@ #include <string.h> #include <time.h> #include <sys/time.h> +#include <unistd.h> #include <pthread.h> @@ -195,6 +197,7 @@ rate_close (void *handle) static void maybe_adjust (const char *file, struct bucket *bucket, pthread_mutex_t *lock) { + int fd; FILE *fp; ssize_t r; size_t len = 0; @@ -204,16 +207,27 @@ maybe_adjust (const char *file, struct bucket *bucket, pthread_mutex_t *lock) if (!file) return; - fp = fopen (file, "r"); - if (fp == NULL) + /* Alas, Haiku lacks fopen("re"), so we have to spell this out the + * long way. We require atomic CLOEXEC, in case the plugin is using + * fork() in a parallel thread model. + */ + fd = open (file, O_CLOEXEC | O_RDONLY); + if (fd == -1) return; /* this is not an error */ + fp = fdopen (fd, "r"); + if (fp == NULL) { + nbdkit_debug ("fdopen: %s: %m", file); + close (fd); + return; /* unexpected, but treat it as a non-error */ + } r = getline (&line, &len, fp); - fclose (fp); if (r == -1) { nbdkit_debug ("could not read rate file: %s: %m", file); + fclose (fp); return; } + fclose (fp); if (r > 0 && line[r-1] == '\n') line[r-1] = '\0'; new_rate = nbdkit_parse_size (line); -- 2.20.1
Eric Blake
2019-Aug-02 19:26 UTC
[Libguestfs] [nbdkit PATCH v2 09/17] server: Use atomic CLOEXEC for nbdkit_read_password
Although nbdkit_read_password is primarily intended for use during .config, which is before any other thread might be trying to fork, this function is exported for generic use by plugins, and we have no guarantee that a user will not try to use it at other times when there are multiple threads and where an fd leak would be detrimental. However, although POSIX will eventually be adding fopen("re") to make this easy [1], Haiku does not support that yet [2]. So we have to do it by hand with the lower-level open(O_CLOEXEC)/fdopen. Our recently-added unit test should ensure that things still work. [1] http://austingroupbugs.net/view.php?id=411 [2] https://dev.haiku-os.org/ticket/15219 Signed-off-by: Eric Blake <eblake@redhat.com> --- server/public.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/server/public.c b/server/public.c index 4b8b1d2b..523a31f9 100644 --- a/server/public.c +++ b/server/public.c @@ -1,5 +1,5 @@ /* nbdkit - * Copyright (C) 2013-2018 Red Hat Inc. + * Copyright (C) 2013-2019 Red Hat Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are @@ -36,6 +36,7 @@ #include <config.h> +#include <fcntl.h> #include <stdio.h> #include <stdlib.h> #include <stdbool.h> @@ -241,11 +242,19 @@ nbdkit_read_password (const char *value, char **password) /* Read password from a file. */ else if (value[0] == '+') { - fp = fopen (&value[1], "r"); - if (fp == NULL) { + int fd; + + fd = open (&value[1], O_CLOEXEC | O_RDONLY); + if (fd == -1) { nbdkit_error ("open %s: %m", &value[1]); return -1; } + fp = fdopen (fd, "r"); + if (fp == NULL) { + nbdkit_error ("fdopen %s: %m", &value[1]); + close (fd); + return -1; + } r = getline (password, &n, fp); err = errno; fclose (fp); -- 2.20.1
Eric Blake
2019-Aug-02 19:26 UTC
[Libguestfs] [nbdkit PATCH v2 10/17] plugins: Add .fork_safe field
Allow a plugin field to declare whether a parallel plugin can tolerate windows where fds are not CLOEXEC, or must take precautions to avoid leaking fds if the plugin may fork. For safety reasons, the flag defaults to off, but many in-tree plugins can set it to on (most commonly because they don't fork after .config_complete; for libvirt because it is documented to clean up fds on fork so it is immune to anything we might leak; for libnbd because we don't use the API that forks). Note that I did not choose to set the new field for many of the various language bindings (it becomes a rather difficult task to prove whether the third-party language binding code is itself using atomic CLOEXEC or fd sanitization). However many of our languages are still stuck as serialized, and the lack of .fork_safe won't impact those thread model anyways. Update the testsuite to skip parallel tests that would otherwise fail when the thread model is crippled. Upcoming patches will then fix the server to audit and fix places where we currently leak fds, and then cripple the thread model only on platforms where atomic CLOEXEC is not possible. Signed-off-by: Eric Blake <eblake@redhat.com> --- I'm open to bikeshed ideas on a better name; perhaps '.tolerate_fd_leak'? The goal is the following conditional: - nbdkit doesn't leak fds: no impact to thread model - plugin doesn't care about leaked fds (either it doesn't fork, or it scrubs unknown fds after fork): no impact to thread model - plugin does not want leaked fds (including historical plugins that didn't state one way or another): cripple thread model to ensure no leaked fds --- docs/nbdkit-plugin.pod | 21 +++++++++++++++++---- include/nbdkit-plugin.h | 1 + plugins/curl/curl.c | 1 + plugins/data/data.c | 1 + plugins/example2/example2.c | 1 + plugins/example3/example3.c | 1 + plugins/ext2/ext2.c | 1 + plugins/file/file.c | 1 + plugins/floppy/floppy.c | 1 + plugins/full/full.c | 1 + plugins/guestfs/guestfs-plugin.c | 1 + plugins/gzip/gzip.c | 1 + plugins/iso/iso.c | 1 + plugins/libvirt/libvirt-plugin.c | 1 + plugins/linuxdisk/linuxdisk.c | 1 + plugins/memory/memory.c | 1 + plugins/nbd/nbd-standalone.c | 1 + plugins/nbd/nbd.c | 1 + plugins/null/null.c | 1 + plugins/partitioning/partitioning.c | 1 + plugins/pattern/pattern.c | 1 + plugins/random/random.c | 1 + plugins/sh/sh.c | 1 + plugins/split/split.c | 1 + plugins/ssh/ssh.c | 1 + plugins/streaming/streaming.c | 1 + plugins/zero/zero.c | 1 + server/plugins.c | 10 ++++++++++ tests/test-parallel-file.sh | 3 +++ tests/test-parallel-nbd.sh | 5 ++++- 30 files changed, 60 insertions(+), 5 deletions(-) diff --git a/docs/nbdkit-plugin.pod b/docs/nbdkit-plugin.pod index fe9ada87..6639de61 100644 --- a/docs/nbdkit-plugin.pod +++ b/docs/nbdkit-plugin.pod @@ -920,8 +920,8 @@ error message, and C<nbdkit_set_error> to record an appropriate error =head1 OTHER FIELDS -The plugin struct also contains an integer field used as a -boolean in C code, but unlikely to be exposed in other language +The plugin struct also contains a couple of integer fields used as +booleans in C code, but unlikely to be exposed in other language bindings: =over 4 @@ -932,6 +932,18 @@ This defaults to 0; if non-zero, nbdkit can reliably use the value of C<errno> when a callback reports failure, rather than the plugin having to call C<nbdkit_set_error>. +=item C<.fork_safe> + +This defaults to 0; if non-zero, the plugin either does not create +child processes via L<fork(2)> or similar, or else takes care to avoid +leaking a file descriptor created in one thread to a child process +created in another (either by always using atomic C<FD_CLOEXEC>, or by +closing unexpected fds between fork and exec). On platforms where +this is 0 and nbdkit cannot atomically set C<FD_CLOEXEC>, the thread +model will be restricted to +C<NBDKIT_THREAD_MODEL_SERIALIZE_ALL_REQUESTS> to avoid leaking file +descriptors. + =back =head1 THREADS @@ -942,8 +954,9 @@ C<NBDKIT_REGISTER_PLUGIN>). Additionally, a plugin may implement the C<.thread_model> callback, called right after C<.config_complete> to make a runtime decision on which thread model to use. The nbdkit server chooses the most restrictive model between the plugin's -C<THREAD_MODEL>, the C<.thread_model> if present, and any restrictions -requested by filters. +C<THREAD_MODEL>, the C<.thread_model> if present, any restrictions +requested by filters, and any platform-specific limitations implied if +C<.fork_safe> is not set. The possible settings for C<THREAD_MODEL> are defined below. diff --git a/include/nbdkit-plugin.h b/include/nbdkit-plugin.h index 632df867..75e08651 100644 --- a/include/nbdkit-plugin.h +++ b/include/nbdkit-plugin.h @@ -132,6 +132,7 @@ struct nbdkit_plugin { int (*cache) (void *handle, uint32_t count, uint64_t offset, uint32_t flags); int (*thread_model) (void); + int fork_safe; }; extern void nbdkit_set_error (int err); diff --git a/plugins/curl/curl.c b/plugins/curl/curl.c index 7e3e8c92..192bae4e 100644 --- a/plugins/curl/curl.c +++ b/plugins/curl/curl.c @@ -586,6 +586,7 @@ static struct nbdkit_plugin plugin = { .get_size = curl_get_size, .pread = curl_pread, .pwrite = curl_pwrite, + .fork_safe = 1, /* libcurl does not appear to use fork() */ }; NBDKIT_REGISTER_PLUGIN(plugin) diff --git a/plugins/data/data.c b/plugins/data/data.c index 14fb8997..6e7d4296 100644 --- a/plugins/data/data.c +++ b/plugins/data/data.c @@ -433,6 +433,7 @@ static struct nbdkit_plugin plugin = { * paths from failed system calls. */ .errno_is_preserved = 1, + .fork_safe = 1, /* no use of fork() */ }; NBDKIT_REGISTER_PLUGIN(plugin) diff --git a/plugins/example2/example2.c b/plugins/example2/example2.c index 6adc100a..0ce081c8 100644 --- a/plugins/example2/example2.c +++ b/plugins/example2/example2.c @@ -224,6 +224,7 @@ static struct nbdkit_plugin plugin = { * paths from failed system calls. */ .errno_is_preserved = 1, + .fork_safe = 1, /* no use of fork() */ }; NBDKIT_REGISTER_PLUGIN(plugin) diff --git a/plugins/example3/example3.c b/plugins/example3/example3.c index 5d2b626c..0037349c 100644 --- a/plugins/example3/example3.c +++ b/plugins/example3/example3.c @@ -232,6 +232,7 @@ static struct nbdkit_plugin plugin = { * paths from failed system calls. */ .errno_is_preserved = 1, + .fork_safe = 1, /* no use of fork() */ }; NBDKIT_REGISTER_PLUGIN(plugin) diff --git a/plugins/ext2/ext2.c b/plugins/ext2/ext2.c index 6698d99f..d6f6d792 100644 --- a/plugins/ext2/ext2.c +++ b/plugins/ext2/ext2.c @@ -358,6 +358,7 @@ static struct nbdkit_plugin plugin = { .pwrite = ext2_pwrite, .flush = ext2_flush, .errno_is_preserved = 1, + .fork_safe = 1, /* ext2fs does not appear to use fork() */ }; NBDKIT_REGISTER_PLUGIN(plugin) diff --git a/plugins/file/file.c b/plugins/file/file.c index 9df5001d..d71c2379 100644 --- a/plugins/file/file.c +++ b/plugins/file/file.c @@ -667,6 +667,7 @@ static struct nbdkit_plugin plugin = { .cache = file_cache, #endif .errno_is_preserved = 1, + .fork_safe = 1, /* no use of fork() */ }; NBDKIT_REGISTER_PLUGIN(plugin) diff --git a/plugins/floppy/floppy.c b/plugins/floppy/floppy.c index 41a23644..bcec4074 100644 --- a/plugins/floppy/floppy.c +++ b/plugins/floppy/floppy.c @@ -210,6 +210,7 @@ static struct nbdkit_plugin plugin = { .can_cache = floppy_can_cache, .pread = floppy_pread, .errno_is_preserved = 1, + .fork_safe = 1, /* no use of fork() */ }; NBDKIT_REGISTER_PLUGIN(plugin) diff --git a/plugins/full/full.c b/plugins/full/full.c index 9cfbcfcd..7d4d2ba3 100644 --- a/plugins/full/full.c +++ b/plugins/full/full.c @@ -179,6 +179,7 @@ static struct nbdkit_plugin plugin = { * paths from failed system calls. */ .errno_is_preserved = 1, + .fork_safe = 1, /* no use of fork() */ }; NBDKIT_REGISTER_PLUGIN(plugin) diff --git a/plugins/guestfs/guestfs-plugin.c b/plugins/guestfs/guestfs-plugin.c index b0061e26..200b0094 100644 --- a/plugins/guestfs/guestfs-plugin.c +++ b/plugins/guestfs/guestfs-plugin.c @@ -593,6 +593,7 @@ static struct nbdkit_plugin plugin = { .pread = plugin_guestfs_pread, .pwrite = plugin_guestfs_pwrite, .flush = plugin_guestfs_flush, + .fork_safe = 0, /* libguestfs uses fork(), unaudited for safety */ }; NBDKIT_REGISTER_PLUGIN(plugin) diff --git a/plugins/gzip/gzip.c b/plugins/gzip/gzip.c index c6baa52e..5821d468 100644 --- a/plugins/gzip/gzip.c +++ b/plugins/gzip/gzip.c @@ -222,6 +222,7 @@ static struct nbdkit_plugin plugin = { .close = gzip_close, .get_size = gzip_get_size, .pread = gzip_pread, + .fork_safe = 1, /* zlib does not appear to use fork() */ }; NBDKIT_REGISTER_PLUGIN(plugin) diff --git a/plugins/iso/iso.c b/plugins/iso/iso.c index 5634bac9..bb942f85 100644 --- a/plugins/iso/iso.c +++ b/plugins/iso/iso.c @@ -260,6 +260,7 @@ static struct nbdkit_plugin plugin = { .can_cache = iso_can_cache, .pread = iso_pread, .errno_is_preserved = 1, + .fork_safe = 1, /* no fork()s after .config_complete */ }; NBDKIT_REGISTER_PLUGIN(plugin) diff --git a/plugins/libvirt/libvirt-plugin.c b/plugins/libvirt/libvirt-plugin.c index 71cac42b..cd4e326c 100644 --- a/plugins/libvirt/libvirt-plugin.c +++ b/plugins/libvirt/libvirt-plugin.c @@ -215,6 +215,7 @@ static struct nbdkit_plugin plugin = { .close = virt_close, .get_size = virt_get_size, .pread = virt_pread, + .fork_safe = 1, /* libvirt uses fork(), but does so safely */ }; NBDKIT_REGISTER_PLUGIN(plugin) diff --git a/plugins/linuxdisk/linuxdisk.c b/plugins/linuxdisk/linuxdisk.c index 99dbc996..24e6b8b2 100644 --- a/plugins/linuxdisk/linuxdisk.c +++ b/plugins/linuxdisk/linuxdisk.c @@ -232,6 +232,7 @@ static struct nbdkit_plugin plugin = { .can_cache = linuxdisk_can_cache, .pread = linuxdisk_pread, .errno_is_preserved = 1, + .fork_safe = 1, /* no fork()s after .config_complete */ }; NBDKIT_REGISTER_PLUGIN(plugin) diff --git a/plugins/memory/memory.c b/plugins/memory/memory.c index 09162ea2..0f3fd86d 100644 --- a/plugins/memory/memory.c +++ b/plugins/memory/memory.c @@ -231,6 +231,7 @@ static struct nbdkit_plugin plugin = { * paths from failed system calls. */ .errno_is_preserved = 1, + .fork_safe = 1, /* no use of fork() */ }; NBDKIT_REGISTER_PLUGIN(plugin) diff --git a/plugins/nbd/nbd-standalone.c b/plugins/nbd/nbd-standalone.c index 1468098b..4f1224c7 100644 --- a/plugins/nbd/nbd-standalone.c +++ b/plugins/nbd/nbd-standalone.c @@ -1365,6 +1365,7 @@ static struct nbdkit_plugin plugin = { .extents = nbd_extents, .cache = nbd_cache, .errno_is_preserved = 1, + .fork_safe = 1, /* no use of fork() */ }; NBDKIT_REGISTER_PLUGIN (plugin) diff --git a/plugins/nbd/nbd.c b/plugins/nbd/nbd.c index 95d910e7..7762c9c0 100644 --- a/plugins/nbd/nbd.c +++ b/plugins/nbd/nbd.c @@ -861,6 +861,7 @@ static struct nbdkit_plugin plugin = { .extents = nbdplug_extents, .cache = nbdplug_cache, .errno_is_preserved = 1, + .fork_safe = 1, /* libnbd uses fork() but not for URIs */ }; NBDKIT_REGISTER_PLUGIN (plugin) diff --git a/plugins/null/null.c b/plugins/null/null.c index 647624ba..3e7161b7 100644 --- a/plugins/null/null.c +++ b/plugins/null/null.c @@ -178,6 +178,7 @@ static struct nbdkit_plugin plugin = { * paths from failed system calls. */ .errno_is_preserved = 1, + .fork_safe = 1, /* no use of fork() */ }; NBDKIT_REGISTER_PLUGIN(plugin) diff --git a/plugins/partitioning/partitioning.c b/plugins/partitioning/partitioning.c index 90333bf7..1046579a 100644 --- a/plugins/partitioning/partitioning.c +++ b/plugins/partitioning/partitioning.c @@ -442,6 +442,7 @@ static struct nbdkit_plugin plugin = { * paths from failed system calls. */ .errno_is_preserved = 1, + .fork_safe = 1, /* no use of fork() */ }; NBDKIT_REGISTER_PLUGIN(plugin) diff --git a/plugins/pattern/pattern.c b/plugins/pattern/pattern.c index 6b9b3a0d..97418127 100644 --- a/plugins/pattern/pattern.c +++ b/plugins/pattern/pattern.c @@ -142,6 +142,7 @@ static struct nbdkit_plugin plugin = { * paths from failed system calls. */ .errno_is_preserved = 1, + .fork_safe = 1, /* no use of fork() */ }; NBDKIT_REGISTER_PLUGIN(plugin) diff --git a/plugins/random/random.c b/plugins/random/random.c index 6377310e..ebe97e85 100644 --- a/plugins/random/random.c +++ b/plugins/random/random.c @@ -172,6 +172,7 @@ static struct nbdkit_plugin plugin = { * paths from failed system calls. */ .errno_is_preserved = 1, + .fork_safe = 1, /* no use of fork() */ }; NBDKIT_REGISTER_PLUGIN(plugin) diff --git a/plugins/sh/sh.c b/plugins/sh/sh.c index 5869d9a4..669008e9 100644 --- a/plugins/sh/sh.c +++ b/plugins/sh/sh.c @@ -960,6 +960,7 @@ static struct nbdkit_plugin plugin = { .cache = sh_cache, .errno_is_preserved = 1, + .fork_safe = 0, /* use of fork() is not yet safe from fd leaks */ }; NBDKIT_REGISTER_PLUGIN(plugin) diff --git a/plugins/split/split.c b/plugins/split/split.c index 68682b29..babbaf4a 100644 --- a/plugins/split/split.c +++ b/plugins/split/split.c @@ -339,6 +339,7 @@ static struct nbdkit_plugin plugin = { * paths from failed system calls. */ .errno_is_preserved = 1, + .fork_safe = 1, /* no use of fork() */ }; NBDKIT_REGISTER_PLUGIN(plugin) diff --git a/plugins/ssh/ssh.c b/plugins/ssh/ssh.c index d3fa2931..43f71e41 100644 --- a/plugins/ssh/ssh.c +++ b/plugins/ssh/ssh.c @@ -632,6 +632,7 @@ static struct nbdkit_plugin plugin = { .pwrite = ssh_pwrite, .can_flush = ssh_can_flush, .flush = ssh_flush, + .fork_safe = 0, /* libssh uses fork(), unaudited for safety */ }; NBDKIT_REGISTER_PLUGIN(plugin) diff --git a/plugins/streaming/streaming.c b/plugins/streaming/streaming.c index 4ab73b8b..315ff689 100644 --- a/plugins/streaming/streaming.c +++ b/plugins/streaming/streaming.c @@ -256,6 +256,7 @@ static struct nbdkit_plugin plugin = { .pwrite = streaming_pwrite, .pread = streaming_pread, .errno_is_preserved = 1, + .fork_safe = 1, /* no use of fork() */ }; NBDKIT_REGISTER_PLUGIN(plugin) diff --git a/plugins/zero/zero.c b/plugins/zero/zero.c index b240502c..27b8fe18 100644 --- a/plugins/zero/zero.c +++ b/plugins/zero/zero.c @@ -95,6 +95,7 @@ static struct nbdkit_plugin plugin = { * paths from failed system calls. */ .errno_is_preserved = 1, + .fork_safe = 1, /* no use of fork() */ }; NBDKIT_REGISTER_PLUGIN(plugin) diff --git a/server/plugins.c b/server/plugins.c index 3bb20c93..0e6c05b0 100644 --- a/server/plugins.c +++ b/server/plugins.c @@ -90,6 +90,15 @@ plugin_thread_model (struct backend *b) int thread_model = p->plugin._thread_model; int r; + /* For now, we leak fds on all platforms; once that is fixed, this + * restriction can be limited to only occur when !HAVE_ACCEPT4. + */ + if (p->plugin._thread_model > NBDKIT_THREAD_MODEL_SERIALIZE_ALL_REQUESTS && + p->plugin.fork_safe == 0) { + nbdkit_debug (".fork_safe not set, serializing to avoid fd leaks"); + thread_model = NBDKIT_THREAD_MODEL_SERIALIZE_ALL_REQUESTS; + } + if (p->plugin.thread_model) { r = p->plugin.thread_model (); if (r == -1) @@ -187,6 +196,7 @@ plugin_dump_fields (struct backend *b) plugin_dump_thread_model ("max_thread_model", p->plugin._thread_model); plugin_dump_thread_model ("thread_model", backend->thread_model (backend)); printf ("errno_is_preserved=%d\n", !!p->plugin.errno_is_preserved); + printf ("fork_safe=%d\n", !!p->plugin.fork_safe); if (p->plugin.magic_config_key) printf ("magic_config_key=%s\n", p->plugin.magic_config_key); diff --git a/tests/test-parallel-file.sh b/tests/test-parallel-file.sh index 8335dc99..20276d48 100755 --- a/tests/test-parallel-file.sh +++ b/tests/test-parallel-file.sh @@ -36,6 +36,9 @@ source ./functions.sh requires test -f file-data requires qemu-io --version +nbdkit --dump-plugin file | grep -q ^thread_model=parallel || + { echo "nbdkit lacks support for parallel requests"; exit 77; } + cleanup_fn rm -f test-parallel-file.data test-parallel-file.out # Populate file, and sanity check that qemu-io can issue parallel requests diff --git a/tests/test-parallel-nbd.sh b/tests/test-parallel-nbd.sh index 36088fa1..4e546df4 100755 --- a/tests/test-parallel-nbd.sh +++ b/tests/test-parallel-nbd.sh @@ -1,6 +1,6 @@ #!/usr/bin/env bash # nbdkit -# Copyright (C) 2017-2018 Red Hat Inc. +# Copyright (C) 2017-2019 Red Hat Inc. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are @@ -36,6 +36,9 @@ source ./functions.sh requires test -f file-data requires qemu-io --version +nbdkit --dump-plugin nbd | grep -q ^thread_model=parallel || + { echo "nbdkit lacks support for parallel requests"; exit 77; } + sock=`mktemp -u` files="test-parallel-nbd.out $sock test-parallel-nbd.data test-parallel-nbd.pid" rm -f $files -- 2.20.1
Eric Blake
2019-Aug-02 19:26 UTC
[Libguestfs] [nbdkit PATCH v2 11/17] server: Atomically set CLOEXEC on accept fds
The sh plugin makes it easy to test whether we leak fds into a plugin that fork()s as part of handling a client request: ./nbdkit -U - --filter=log --filter=stats sh - \ logfile=/dev/null statsfile=/dev/null \ --run 'qemu-io -r -f raw -c "r 0 1" $nbd' <<\EOF case $1 in get_size) echo 1m;; pread) ls -l /proc/$$/fd >/dev/tty dd iflag=count_bytes count=$3 if=/dev/zero || exit 1 ;; *) exit 2 ;; esac EOF Pre-patch, this demonstrates that we are leaking internal socket and filter's log fds, as well as internal pipe fds fixed a couple patches ago (the listing should only include fds 0, 1, 2, and whatever sh has open on the temporary script file; bash picks 255 for that). This patch deals with the server leaks, the next one with the filter leaks. In the case of accept, we either require atomicity or a mutex to ensure that it is not possible for one thread to be accepting a new client while another thread is processing .pread or similar with a fork(), so that we avoid this race: client1 client2 fd1 = accept .pread fd2 = accept fork fcntl(F_SETFD, FD_CLOEXEC) where the fd2 socket for the second client leaks into the fork for handling the first client's pread callback. If the server is running with a thread model of SERIALIZE_ALL_REQUESTS or stricter, then we already have such a lock (although we have to tweak lock_request to work with a NULL connection), but the race could bite a server in the looser SERIALIZE_REQUESTS or PARALLEL thread models. But since we just added the .fork_safe witness of whether a plugin can tolerate fd leaks, we can now easily use that as the decision of whether to stick with accept4() (not yet in POSIX, but documented [1] for several years, and present on RHEL 6 and FreeBSD 10), or cripple parallel plugins that aren't prepared for leaked fds (for platforms like Haiku that still need atomic CLOEXEC [2]). [1] http://austingroupbugs.net/view.php?id=411 [2] https://dev.haiku-os.org/ticket/15219 Signed-off-by: Eric Blake <eblake@redhat.com> --- configure.ac | 1 + server/internal.h | 6 ++---- common/utils/utils.c | 6 ++++-- server/locks.c | 4 ++-- server/plugins.c | 8 ++++++-- server/sockets.c | 32 ++++++++++++++++++++++++++++---- 6 files changed, 43 insertions(+), 14 deletions(-) diff --git a/configure.ac b/configure.ac index cabec5c8..054ad64a 100644 --- a/configure.ac +++ b/configure.ac @@ -193,6 +193,7 @@ AC_CHECK_HEADERS([\ dnl Check for functions in libc, all optional. AC_CHECK_FUNCS([\ + accept4 \ fdatasync \ get_current_dir_name \ mkostemp \ diff --git a/server/internal.h b/server/internal.h index 6207f0cf..76f2139f 100644 --- a/server/internal.h +++ b/server/internal.h @@ -312,10 +312,8 @@ extern struct backend *filter_register (struct backend *next, size_t index, extern void lock_init_thread_model (void); extern void lock_connection (void); extern void unlock_connection (void); -extern void lock_request (struct connection *conn) - __attribute__((__nonnull__ (1))); -extern void unlock_request (struct connection *conn) - __attribute__((__nonnull__ (1))); +extern void lock_request (struct connection *conn); +extern void unlock_request (struct connection *conn); extern void lock_unload (void); extern void unlock_unload (void); diff --git a/common/utils/utils.c b/common/utils/utils.c index 9ac3443b..32bc96a7 100644 --- a/common/utils/utils.c +++ b/common/utils/utils.c @@ -140,13 +140,15 @@ exit_status_to_nbd_error (int status, const char *cmd) */ int set_cloexec (int fd) { -#if defined SOCK_CLOEXEC && defined HAVE_MKOSTEMP && defined HAVE_PIPE2 +#if (defined SOCK_CLOEXEC && defined HAVE_MKOSTEMP && defined HAVE_PIPE2 && \ + defined HAVE_ACCEPT4) nbdkit_error ("prefer creating fds with CLOEXEC atomically set"); close (fd); errno = EBADF; return -1; #else -# if defined SOCK_CLOEXEC || defined HAVE_MKOSTEMP || defined HAVE_PIPE2 +# if (defined SOCK_CLOEXEC || defined HAVE_MKOSTEMP || defined HAVE_PIPE2 || \ + defined HAVE_ACCEPT4) # error "Unexpected: your system has incomplete atomic CLOEXEC support" # endif int f; diff --git a/server/locks.c b/server/locks.c index 1c50186e..739e22a1 100644 --- a/server/locks.c +++ b/server/locks.c @@ -77,7 +77,7 @@ lock_request (struct connection *conn) pthread_mutex_lock (&all_requests_lock)) abort (); - if (thread_model <= NBDKIT_THREAD_MODEL_SERIALIZE_REQUESTS && + if (conn && thread_model <= NBDKIT_THREAD_MODEL_SERIALIZE_REQUESTS && pthread_mutex_lock (&conn->request_lock)) abort (); @@ -91,7 +91,7 @@ unlock_request (struct connection *conn) if (pthread_rwlock_unlock (&unload_prevention_lock)) abort (); - if (thread_model <= NBDKIT_THREAD_MODEL_SERIALIZE_REQUESTS && + if (conn && thread_model <= NBDKIT_THREAD_MODEL_SERIALIZE_REQUESTS && pthread_mutex_unlock (&conn->request_lock)) abort (); diff --git a/server/plugins.c b/server/plugins.c index 0e6c05b0..917f0eec 100644 --- a/server/plugins.c +++ b/server/plugins.c @@ -90,14 +90,18 @@ plugin_thread_model (struct backend *b) int thread_model = p->plugin._thread_model; int r; - /* For now, we leak fds on all platforms; once that is fixed, this - * restriction can be limited to only occur when !HAVE_ACCEPT4. +#ifndef HAVE_ACCEPT4 + /* If the server is unable to atomically set CLOEXEC when accepting + * a new connection, then we must serialize to ensure that we are + * not attempting to fork() in one thread while the server is + * accepting the next client in another thread. */ if (p->plugin._thread_model > NBDKIT_THREAD_MODEL_SERIALIZE_ALL_REQUESTS && p->plugin.fork_safe == 0) { nbdkit_debug (".fork_safe not set, serializing to avoid fd leaks"); thread_model = NBDKIT_THREAD_MODEL_SERIALIZE_ALL_REQUESTS; } +#endif if (p->plugin.thread_model) { r = p->plugin.thread_model (); diff --git a/server/sockets.c b/server/sockets.c index 5adf5af5..70d999ee 100644 --- a/server/sockets.c +++ b/server/sockets.c @@ -1,5 +1,5 @@ /* nbdkit - * Copyright (C) 2013-2018 Red Hat Inc. + * Copyright (C) 2013-2019 Red Hat Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are @@ -183,7 +183,14 @@ bind_tcpip_socket (size_t *nr_socks) set_selinux_label (); - sock = socket (a->ai_family, a->ai_socktype, a->ai_protocol); +#ifdef SOCK_CLOEXEC + sock = socket (a->ai_family, a->ai_socktype | SOCK_CLOEXEC, a->ai_protocol); +#else + /* Fortunately, this code is only run at startup, so there is no + * risk of the fd leaking to a plugin's fork() + */ + sock = set_cloexec (socket (a->ai_family, a->ai_socktype, a->ai_protocol)); +#endif if (sock == -1) { perror ("bind_tcpip_socket: socket"); exit (EXIT_FAILURE); @@ -294,8 +301,25 @@ accept_connection (int listen_sock) thread_data->instance_num = instance_num++; thread_data->addrlen = sizeof thread_data->addr; again: - thread_data->sock = accept (listen_sock, - &thread_data->addr, &thread_data->addrlen); +#ifdef HAVE_ACCEPT4 + thread_data->sock = accept4 (listen_sock, + &thread_data->addr, &thread_data->addrlen, + SOCK_CLOEXEC); +#else + /* If we are fully parallel, then this function could be accepting + * connections in one thread while another thread could be in a + * plugin trying to fork. But thanks to .fork_safe, we either know + * that the plugin can tolerate the fd leak, or we've restricted + * things to serialize_all_requests so that we don't perform the + * accept until the plugin is not running. Therefore, not being + * atomic is okay. + */ + lock_request (NULL); + thread_data->sock = set_cloexec (accept (listen_sock, + &thread_data->addr, + &thread_data->addrlen)); + unlock_request (NULL); +#endif if (thread_data->sock == -1) { if (errno == EINTR || errno == EAGAIN) goto again; -- 2.20.1
Eric Blake
2019-Aug-02 19:26 UTC
[Libguestfs] [nbdkit PATCH v2 12/17] filters: Set CLOEXEC on files opened during .config
Any file descriptor left opened after .config must eventually be marked CLOEXEC; otherwise, if our plugin fork()s (such as the sh plugin), we leak an fd into that plugin (observed with the example in the previous patch). If atomicity matters, the options are fopen("we") (sadly, Haiku lacks this), or open(O_CLOEXEC)/fdopen("w"). If atomicity does not matter, you can also get by with fopen("w")/fcntl(F_SETFD). But since Haiku at least has O_CLOEXEC, and that version is less verbose, we don't have to worry about the longer fcntl variant. Signed-off-by: Eric Blake <eblake@redhat.com> --- filters/log/log.c | 22 ++++++++++++++++++++-- filters/stats/stats.c | 21 +++++++++++++++++++-- 2 files changed, 39 insertions(+), 4 deletions(-) diff --git a/filters/log/log.c b/filters/log/log.c index 9c0f35cd..7cf741e1 100644 --- a/filters/log/log.c +++ b/filters/log/log.c @@ -42,6 +42,8 @@ #include <pthread.h> #include <sys/time.h> #include <assert.h> +#include <fcntl.h> +#include <unistd.h> #include <nbdkit-filter.h> @@ -88,13 +90,29 @@ log_config (nbdkit_next_config *next, void *nxdata, static int log_config_complete (nbdkit_next_config_complete *next, void *nxdata) { + int fd; + if (!logfilename) { nbdkit_error ("missing logfile= parameter for the log filter"); return -1; } - logfile = fopen (logfilename, append ? "a" : "w"); + /* Using fopen("ae"/"we") would be more convenient, but as Haiku + * still lacks that, use this instead. Atomicity is not essential + * here since .config completes before threads that might fork, if + * we have to later add yet another fallback to fcntl(fileno()) for + * systems without O_CLOEXEC. + */ + fd = open (logfilename, + O_CLOEXEC | O_WRONLY | O_CREAT | (append ? O_APPEND : O_TRUNC), + 0666); + if (fd < 0) { + nbdkit_error ("open: %s: %m", logfilename); + return -1; + } + logfile = fdopen (fd, append ? "a" : "w"); if (!logfile) { - nbdkit_error ("fopen: %m"); + nbdkit_error ("fdopen: %s: %m", logfilename); + close (fd); return -1; } diff --git a/filters/stats/stats.c b/filters/stats/stats.c index 9631cb39..d8ff02bc 100644 --- a/filters/stats/stats.c +++ b/filters/stats/stats.c @@ -39,6 +39,8 @@ #include <inttypes.h> #include <string.h> #include <sys/time.h> +#include <fcntl.h> +#include <unistd.h> #include <pthread.h> @@ -139,14 +141,29 @@ stats_config (nbdkit_next_config *next, void *nxdata, static int stats_config_complete (nbdkit_next_config_complete *next, void *nxdata) { + int fd; + if (filename == NULL) { nbdkit_error ("stats filter requires statsfile parameter"); return -1; } - fp = fopen (filename, append ? "a" : "w"); + /* Using fopen("ae"/"we") would be more convenient, but as Haiku + * still lacks that, use this instead. Atomicity is not essential + * here since .config completes before threads that might fork, if + * we have to later add yet another fallback to fcntl(fileno()) for + * systems without O_CLOEXEC. + */ + fd = open (filename, + O_CLOEXEC | O_WRONLY | O_CREAT | (append ? O_APPEND : O_TRUNC), + 0666); + if (fd < 0) { + nbdkit_error ("open: %s: %m", filename); + return -1; + } + fp = fdopen (fd, append ? "a" : "w"); if (fp == NULL) { - nbdkit_error ("%s: %m", filename); + nbdkit_error ("fdopen: %s: %m", filename); return -1; } -- 2.20.1
Eric Blake
2019-Aug-02 19:26 UTC
[Libguestfs] [nbdkit PATCH v2 13/17] python: Use CLOEXEC on script
I am unable to quickly determine whether the mere act of loading a python file can cause enough code to execute as to trigger the Python interpreter into attempting a fork. So for safety's sake, let's set CLOEXEC on the file before handing it to python rather than risking an accidental fd leak. Since Haiku lacks fopen("re"), and I already decided that set_cloexec(fileno(fp)) is undesirable, this has to use the atomic open/fdopen sequence, even though it does not actually need atomicity. Signed-off-by: Eric Blake <eblake@redhat.com> --- plugins/python/python.c | 18 ++++++++++++++++-- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/plugins/python/python.c b/plugins/python/python.c index c05e1d6c..71a982a3 100644 --- a/plugins/python/python.c +++ b/plugins/python/python.c @@ -38,6 +38,7 @@ #include <config.h> +#include <fcntl.h> #include <stdio.h> #include <stdlib.h> #include <stdarg.h> @@ -295,6 +296,7 @@ py_dump_plugin (void) static int py_config (const char *key, const char *value) { + int fd; FILE *fp; PyObject *modname; PyObject *fn; @@ -309,10 +311,22 @@ py_config (const char *key, const char *value) } script = value; - /* Load the Python script. */ - fp = fopen (script, "r"); + /* Load the Python script. Mark the file CLOEXEC, in case the act + * of loading the script invokes code that in turn fork()s. + * However, we can't rely on fopen("re"), so do it by hand. This + * does not have to be atomic, because there are no threads during + * .config before the python interpreter is running, but it's + * easier to use open/fdopen than fopen/fcntl(fileno). + */ + fd = open (script, O_CLOEXEC | O_RDONLY); + if (fd == -1) { + nbdkit_error ("%s: cannot open file: %m", script); + return -1; + } + fp = fdopen (fd, "r"); if (!fp) { nbdkit_error ("%s: cannot open file: %m", script); + close (fd); return -1; } -- 2.20.1
Eric Blake
2019-Aug-02 19:26 UTC
[Libguestfs] [nbdkit PATCH v2 14/17] sh: Use pipe2 with CLOEXEC when possible
Technically, as long as our thread model is SERIALIZE_ALL_REQUESTS, we don't have to be very careful about atomic CLOEXEC on any of the pipes we create for communication with the child. However, the next patch wants to promote sh plugin to parallel when possible, which requires the use of pipe2 to avoid fd leaks. Also, add an assert to ensure that we avoid dup2(n, n) (which would fail to clear CLOEXEC) and close(0-2) (otherwise, the logic for which fds to dup and close becomes a lot more tedious). Signed-off-by: Eric Blake <eblake@redhat.com> --- plugins/sh/call.c | 34 +++++++++++++++++++++++++++++++++- plugins/sh/sh.c | 5 ++++- 2 files changed, 37 insertions(+), 2 deletions(-) diff --git a/plugins/sh/call.c b/plugins/sh/call.c index 9b8b48e2..bb80f642 100644 --- a/plugins/sh/call.c +++ b/plugins/sh/call.c @@ -1,5 +1,5 @@ /* nbdkit - * Copyright (C) 2018 Red Hat Inc. + * Copyright (C) 2018-2019 Red Hat Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are @@ -32,6 +32,8 @@ #include <config.h> +#include <assert.h> +#include <fcntl.h> #include <stdio.h> #include <stdlib.h> #include <inttypes.h> @@ -94,6 +96,27 @@ call3 (const char *wbuf, size_t wbuflen, /* sent to stdin */ *rbuflen = *ebuflen = 0; rbufalloc = ebufalloc = 0; +#ifdef HAVE_PIPE2 + if (pipe2 (in_fd, O_CLOEXEC) == -1) { + nbdkit_error ("%s: pipe2: %m", script); + goto error; + } + if (pipe2 (out_fd, O_CLOEXEC) == -1) { + nbdkit_error ("%s: pipe2: %m", script); + goto error; + } + if (pipe2 (err_fd, O_CLOEXEC) == -1) { + nbdkit_error ("%s: pipe2: %m", script); + goto error; + } +#else + /* Without pipe2, we leave .fork_safe at 0 which forces + * NBDKIT_THREAD_MODEL_SERIALIZE_ALL_REQUESTS, this in turn ensures + * no other thread will be trying to fork, and thus we can skip + * worrying about CLOEXEC races. Adding full parallel support on + * this system would require a loop after fork to close all + * unexpected fds. + */ if (pipe (in_fd) == -1) { nbdkit_error ("%s: pipe: %m", script); goto error; @@ -106,6 +129,15 @@ call3 (const char *wbuf, size_t wbuflen, /* sent to stdin */ nbdkit_error ("%s: pipe: %m", script); goto error; } +#endif + + /* Ensure that stdin/out/err of the current process were not empty + * before we started creating pipes (otherwise, the close and dup2 + * calls below become more complex to juggle fds around correctly). + */ + assert (in_fd[0] > STDERR_FILENO && in_fd[1] > STDERR_FILENO && + out_fd[0] > STDERR_FILENO && out_fd[1] > STDERR_FILENO && + err_fd[0] > STDERR_FILENO && err_fd[1] > STDERR_FILENO); pid = fork (); if (pid == -1) { diff --git a/plugins/sh/sh.c b/plugins/sh/sh.c index 669008e9..79c9a4ac 100644 --- a/plugins/sh/sh.c +++ b/plugins/sh/sh.c @@ -263,6 +263,7 @@ sh_config_complete (void) } } +/* See also .fork_safe and the comments in call.c:call3() */ #define THREAD_MODEL NBDKIT_THREAD_MODEL_SERIALIZE_ALL_REQUESTS static int @@ -960,7 +961,9 @@ static struct nbdkit_plugin plugin = { .cache = sh_cache, .errno_is_preserved = 1, - .fork_safe = 0, /* use of fork() is not yet safe from fd leaks */ +#ifdef HAVE_PIPE2 + .fork_safe = 1, /* use of fork() is only safe with atomic CLOEXEC */ +#endif }; NBDKIT_REGISTER_PLUGIN(plugin) -- 2.20.1
Eric Blake
2019-Aug-02 19:26 UTC
[Libguestfs] [nbdkit PATCH v2 15/17] sh: Enable parallel thread model, when possible
When we first created the sh plugin, we copied the thread model used in most other language bindings of SERIALIZE_ALL_REQUESTS, because it was easier to reason about and there was no way for a script to override our choice. However, we've since added support for scripts to request a tighter model when needed, and we've also just fixed remaining bugs about fds inadvertently leaked into the shell script (at least, when pipe2 and accept4 are supported). And the shell itself is sufficiently expressive (even if not necessarily the most efficient) for two scripts to come up with ways to enforce mutual exclusion (for example, a script can use the success or failure of 'mkdir' on a coordinated filename to learn if it is first past the gate, or set up FIFOs for interprocess communication). Thus, it is time to promote the sh plugin to fully-parallel, when supported by the underlying system. (Thanks to .fork_safe, Haiku remains serialized so as to avoid one thread fork()ing while another may leak an fd). The new test copies heavily from test-parallel-file.sh, and also checks for some of the leaks plugged in recent previous patches. Signed-off-by: Eric Blake <eblake@redhat.com> --- plugins/sh/nbdkit-sh-plugin.pod | 11 ++-- plugins/sh/sh.c | 2 +- tests/Makefile.am | 2 + tests/test-dump-plugin.sh | 11 ++-- tests/test-parallel-sh.sh | 108 ++++++++++++++++++++++++++++++++ 5 files changed, 122 insertions(+), 12 deletions(-) create mode 100755 tests/test-parallel-sh.sh diff --git a/plugins/sh/nbdkit-sh-plugin.pod b/plugins/sh/nbdkit-sh-plugin.pod index 41b9e7c7..7c1781cd 100644 --- a/plugins/sh/nbdkit-sh-plugin.pod +++ b/plugins/sh/nbdkit-sh-plugin.pod @@ -189,12 +189,11 @@ one of C<"serialize_connections">, C<"serialize_all_requests">, C<"serialize_requests">, or C<"parallel">. This method is I<not> required; if omitted, then the plugin will be -executed under the default sh thread model (currently -C<"serialize_all_requests">, which implies this callback only makes a -difference with an output of C<"serialize_connections">; look for -'max_thread_model' in C<nbdkit --dump-plugin sh>). If an error -occurs, the script should output an error message and exit with status -C<1>; unrecognized output is ignored. +executed under the default sh thread model (either C<"parallel"> or +C<"serialize_all_requests">, based on whether the platform supports +L<pipe2(2)>; look for 'max_thread_model' in C<nbdkit --dump-plugin +sh>). If an error occurs, the script should output an error message +and exit with status C<1>; unrecognized output is ignored. =item C<open> diff --git a/plugins/sh/sh.c b/plugins/sh/sh.c index 79c9a4ac..6c7954fe 100644 --- a/plugins/sh/sh.c +++ b/plugins/sh/sh.c @@ -264,7 +264,7 @@ sh_config_complete (void) } /* See also .fork_safe and the comments in call.c:call3() */ -#define THREAD_MODEL NBDKIT_THREAD_MODEL_SERIALIZE_ALL_REQUESTS +#define THREAD_MODEL NBDKIT_THREAD_MODEL_PARALLEL static int sh_thread_model (void) diff --git a/tests/Makefile.am b/tests/Makefile.am index 8b8a6557..3d78e7a2 100644 --- a/tests/Makefile.am +++ b/tests/Makefile.am @@ -91,6 +91,7 @@ EXTRA_DIST = \ test-offset-extents.sh \ test-parallel-file.sh \ test-parallel-nbd.sh \ + test-parallel-sh.sh \ test-partition1.sh \ test-partition2.sh \ test-partitioning1.sh \ @@ -342,6 +343,7 @@ file-data: generate-file-data.sh TESTS += \ test-parallel-file.sh \ test-parallel-nbd.sh \ + test-parallel-sh.sh \ $(NULL) # Most in-depth tests need libguestfs, since that is a convenient way to diff --git a/tests/test-dump-plugin.sh b/tests/test-dump-plugin.sh index 982d5062..8f02f659 100755 --- a/tests/test-dump-plugin.sh +++ b/tests/test-dump-plugin.sh @@ -72,8 +72,9 @@ foreach_plugin do_test # Test that --dump-plugin can be used to introspect a resulting dynamic # thread model. out=$({ - # sh does not yet support parallel - nbdkit --dump-plugin sh + # Here, thread_model depends on atomic CLOEXEC support + nbdkit --dump-plugin sh | \ + sed 's/^thread_model=parallel/thread_model=serialize_all_requests/' # Here, the script further reduces things nbdkit --dump-plugin sh - <<\EOF case $1 in @@ -85,13 +86,13 @@ EOF # Here, the filter further reduces things nbdkit --dump-plugin --filter=noparallel sh serialize=connections } | grep thread_model) -exp="max_thread_model=serialize_all_requests +exp="max_thread_model=parallel thread_model=serialize_all_requests has_thread_model=1 -max_thread_model=serialize_all_requests +max_thread_model=parallel thread_model=serialize_connections has_thread_model=1 -max_thread_model=serialize_all_requests +max_thread_model=parallel thread_model=serialize_connections has_thread_model=1" if [ "$out" != "$exp" ]; then diff --git a/tests/test-parallel-sh.sh b/tests/test-parallel-sh.sh new file mode 100755 index 00000000..21bc010a --- /dev/null +++ b/tests/test-parallel-sh.sh @@ -0,0 +1,108 @@ +#!/usr/bin/env bash +# nbdkit +# Copyright (C) 2017-2019 Red Hat Inc. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions are +# met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# +# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in the +# documentation and/or other materials provided with the distribution. +# +# * Neither the name of Red Hat nor the names of its contributors may be +# used to endorse or promote products derived from this software without +# specific prior written permission. +# +# THIS SOFTWARE IS PROVIDED BY RED HAT AND CONTRIBUTORS ''AS IS'' AND +# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, +# THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A +# PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL RED HAT OR +# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF +# USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND +# ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT +# OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +# SUCH DAMAGE. + +source ./functions.sh + +# Check file-data was created by Makefile and qemu-io exists. +requires test -f file-data +requires qemu-io --version + +nbdkit --dump-plugin sh | grep -q ^thread_model=parallel || + { echo "nbdkit lacks support for parallel requests"; exit 77; } + +cleanup_fn rm -f test-parallel-sh.data test-parallel-sh.out test-parallel-sh.script + +# Populate file, and sanity check that qemu-io can issue parallel requests +printf '%1024s' . > test-parallel-sh.data +qemu-io -f raw -c "aio_write -P 1 0 512" -c "aio_write -P 2 512 512" \ + -c aio_flush test-parallel-sh.data || + { echo "'qemu-io' can't drive parallel requests"; exit 77; } + +# Set up the sh plugin to delay both reads and writes (for a good +# chance that parallel requests are in flight), and with writes longer +# than reads (to more easily detect if out-of-order completion +# happens). This test may have spurious failures under heavy loads on +# the test machine, where tuning the delays may help. + +cat > test-parallel-sh.script <<EOF +#!/usr/bin/env bash +f=test-parallel-sh.data +if ! test -f \$f; then + echo "can't locate test-parallel-sh.data" >&2; exit 5 +fi +case \$1 in + get_size) stat -L -c %s \$f || exit 1 ;; + pread) dd iflag=skip_bytes,count_bytes skip=\$4 count=\$3 if=\$f || exit 1 ;; + pwrite) dd oflag=seek_bytes conv=notrunc seek=\$4 of=\$f || exit 1 ;; + can_write) ;; + *) exit 2 ;; +esac +exit 0 +EOF +chmod +x test-parallel-sh.script + +# With --threads=1, the write should complete first because it was issued first +nbdkit -v -t 1 -U - --filter=delay sh test-parallel-sh.script \ + wdelay=2 rdelay=1 --run 'qemu-io -f raw -c "aio_write -P 2 512 512" \ + -c "aio_read -P 1 0 512" -c aio_flush $nbd' | + tee test-parallel-sh.out +if test "$(grep '512/512' test-parallel-sh.out)" != \ +"wrote 512/512 bytes at offset 512 +read 512/512 bytes at offset 0"; then + exit 1 +fi + +# With default --threads, the faster read should complete first +nbdkit -v -U - --filter=delay sh test-parallel-sh.script \ + wdelay=2 rdelay=1 --run 'qemu-io -f raw -c "aio_write -P 2 512 512" \ + -c "aio_read -P 1 0 512" -c aio_flush $nbd' | + tee test-parallel-sh.out +if test "$(grep '512/512' test-parallel-sh.out)" != \ +"read 512/512 bytes at offset 0 +wrote 512/512 bytes at offset 512"; then + exit 1 +fi + +# With --filter=noparallel, the write should complete first because it was +# issued first. Also test that the log filter doesn't leak an fd +nbdkit -v -U - --filter=noparallel --filter=log --filter=delay \ + sh test-parallel-sh.script logfile=/dev/null \ + wdelay=2 rdelay=1 --run 'qemu-io -f raw -c "aio_write -P 2 512 512" \ + -c "aio_read -P 1 0 512" -c aio_flush $nbd' | + tee test-parallel-sh.out +if test "$(grep '512/512' test-parallel-sh.out)" != \ +"wrote 512/512 bytes at offset 512 +read 512/512 bytes at offset 0"; then + exit 1 +fi + +exit 0 -- 2.20.1
Eric Blake
2019-Aug-02 19:26 UTC
[Libguestfs] [nbdkit PATCH v2 16/17] sh: Test for fd leaks
Various patches before this one plugged fd leaks of files opened within nbdkit that the shell script need not see. It's time to add testsuite coverage to ensure we don't regress. However, a user may intentionally want to expose an open fd to nbdkit, which in turn remains available all the way through to the shell script, so nbdkit does not close unrecognized inherited fds at startup. This makes the test more interesting - if you run 'exec 17>/dev/null && make check', then there will be one more inherited fd in the environment of both nbdkit and the shell script, but this is not an nbdkit leak. Furthermore, checking how many fds are open in a shell is tricky: many actions in the shell (such as a glob or pipeline) increase the number of temporarily-open file descriptors; I settled on creating a subshell, then using ls to check the files still open in $$ (the subshell prevents the parent from using any more fds, all without changing $$). This addition is rather fragile - if it causes testsuite failures in some environments, we may end up reverting it. Signed-off-by: Eric Blake <eblake@redhat.com> --- tests/test-parallel-sh.sh | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/tests/test-parallel-sh.sh b/tests/test-parallel-sh.sh index 21bc010a..ddbdf90c 100755 --- a/tests/test-parallel-sh.sh +++ b/tests/test-parallel-sh.sh @@ -53,12 +53,35 @@ qemu-io -f raw -c "aio_write -P 1 0 512" -c "aio_write -P 2 512 512" \ # happens). This test may have spurious failures under heavy loads on # the test machine, where tuning the delays may help. +# Also test for leaked fds when possible: nbdkit does not close fds it +# inherited, but other than 0, 1, 2, and the fd associated with the +# script, the child shell should not see any new fds not also present +# in nbdkit's parent environment. When testing for the count of open +# fds, use ls in a subshell (rather than a glob directly within the +# shell under test), to avoid yet another fd open on the /proc/self/fd +# directory. + +curr_fds+if test -d /proc/$$/fd; then + curr_fds=$(/usr/bin/env bash -c '(ls /proc/$$/fd)' | wc -w) +fi +echo "using curr_fds=$curr_fds" + cat > test-parallel-sh.script <<EOF #!/usr/bin/env bash f=test-parallel-sh.data if ! test -f \$f; then echo "can't locate test-parallel-sh.data" >&2; exit 5 fi +if test -d /proc/\$\$/fd; then + ( + if test \$( ls /proc/\$\$/fd | wc -w ) -ne \$(($curr_fds + 1)); then + echo "there seem to be leaked fds, curr_fds=$curr_fds" >&2 + ls -l /proc/\$\$/fd >&2 + exit 1 + fi + ) || exit 5 +fi case \$1 in get_size) stat -L -c %s \$f || exit 1 ;; pread) dd iflag=skip_bytes,count_bytes skip=\$4 count=\$3 if=\$f || exit 1 ;; -- 2.20.1
Eric Blake
2019-Aug-02 19:26 UTC
[Libguestfs] [nbdkit PATCH v2 17/17] DO NOT APPLY temp debug
This is the sort of thing I did on Linux to mimic what I expect to happen on Haiku, such as testing that the dynamic thread model gets crippled but that there are still no fd leaks --- common/utils/utils.c | 4 ++-- plugins/nbd/nbd.c | 2 +- server/plugins.c | 2 +- server/sockets.c | 2 +- 4 files changed, 5 insertions(+), 5 deletions(-) diff --git a/common/utils/utils.c b/common/utils/utils.c index 32bc96a7..dd021a8f 100644 --- a/common/utils/utils.c +++ b/common/utils/utils.c @@ -141,14 +141,14 @@ exit_status_to_nbd_error (int status, const char *cmd) int set_cloexec (int fd) { #if (defined SOCK_CLOEXEC && defined HAVE_MKOSTEMP && defined HAVE_PIPE2 && \ - defined HAVE_ACCEPT4) + defined HAVE_ACCEPT5) nbdkit_error ("prefer creating fds with CLOEXEC atomically set"); close (fd); errno = EBADF; return -1; #else # if (defined SOCK_CLOEXEC || defined HAVE_MKOSTEMP || defined HAVE_PIPE2 || \ - defined HAVE_ACCEPT4) + defined HAVE_ACCEPT5) && 0 # error "Unexpected: your system has incomplete atomic CLOEXEC support" # endif int f; diff --git a/plugins/nbd/nbd.c b/plugins/nbd/nbd.c index 7762c9c0..8ec13ec0 100644 --- a/plugins/nbd/nbd.c +++ b/plugins/nbd/nbd.c @@ -861,7 +861,7 @@ static struct nbdkit_plugin plugin = { .extents = nbdplug_extents, .cache = nbdplug_cache, .errno_is_preserved = 1, - .fork_safe = 1, /* libnbd uses fork() but not for URIs */ + .fork_safe = 0, /* libnbd uses fork() but not for URIs */ }; NBDKIT_REGISTER_PLUGIN (plugin) diff --git a/server/plugins.c b/server/plugins.c index 917f0eec..8556f18f 100644 --- a/server/plugins.c +++ b/server/plugins.c @@ -90,7 +90,7 @@ plugin_thread_model (struct backend *b) int thread_model = p->plugin._thread_model; int r; -#ifndef HAVE_ACCEPT4 +#ifndef HAVE_ACCEPT5 /* If the server is unable to atomically set CLOEXEC when accepting * a new connection, then we must serialize to ensure that we are * not attempting to fork() in one thread while the server is diff --git a/server/sockets.c b/server/sockets.c index 70d999ee..c90f9238 100644 --- a/server/sockets.c +++ b/server/sockets.c @@ -301,7 +301,7 @@ accept_connection (int listen_sock) thread_data->instance_num = instance_num++; thread_data->addrlen = sizeof thread_data->addr; again: -#ifdef HAVE_ACCEPT4 +#ifdef HAVE_ACCEPT5 thread_data->sock = accept4 (listen_sock, &thread_data->addr, &thread_data->addrlen, SOCK_CLOEXEC); -- 2.20.1
Eric Blake
2019-Aug-02 19:42 UTC
Re: [Libguestfs] [nbdkit PATCH v2 04/17] Revert "RHEL 5: Define O_CLOEXEC and SOCK_CLOEXEC."
On 8/2/19 2:26 PM, Eric Blake wrote:> This reverts commit 25206df20275aeff346d9b86adf5e9be99cc9e43, and > temporarily breaks compilation on Haiku which has O_CLOEXEC but lacks > SOCK_CLOEXEC [1]. >> +++ b/common/utils/utils.c > @@ -36,6 +36,7 @@ > #include <stdio.h> > #include <stdlib.h> > #include <string.h> > +#include <sys/socket.h> > #include <sys/types.h> > #include <sys/wait.h> > #include <unistd.h>Looks like I squashed a hunk into the wrong patch; I think this one belongs in 5/17. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Richard W.M. Jones
2019-Aug-02 19:52 UTC
Re: [Libguestfs] [nbdkit PATCH v2 10/17] plugins: Add .fork_safe field
On Fri, Aug 02, 2019 at 02:26:11PM -0500, Eric Blake wrote:> Allow a plugin field to declare whether a parallel plugin can tolerate > windows where fds are not CLOEXEC, or must take precautions to avoid > leaking fds if the plugin may fork. For safety reasons, the flag > defaults to off, but many in-tree plugins can set it to on (most > commonly because they don't fork after .config_complete; for libvirt > because it is documented to clean up fds on fork so it is immune to > anything we might leak; for libnbd because we don't use the API that > forks). Note that I did not choose to set the new field for many of > the various language bindings (it becomes a rather difficult task to > prove whether the third-party language binding code is itself using > atomic CLOEXEC or fd sanitization). However many of our languages are > still stuck as serialized, and the lack of .fork_safe won't impact > those thread model anyways. > > Update the testsuite to skip parallel tests that would otherwise fail > when the thread model is crippled. > > Upcoming patches will then fix the server to audit and fix places > where we currently leak fds, and then cripple the thread model only on > platforms where atomic CLOEXEC is not possible.My worry about this patch is we're adding a new plugin flag which we'll have to support forever, but IIUC it's only needed on one platform (ie. Haiku) which really ought to get fixed. In future we'll end up in a situation where we have this flag but it's no longer needed. How about instead of this we simply restrict Haiku to the fully serialized mode. Sucks for them, but they can fix it by adding atomic CLOEXEC features ...> + .fork_safe = 0, /* libguestfs uses fork(), unaudited for safety */libguestfs does use fork and should be safe - we're pretty careful about using CLOEXEC, accept4, etc everywhere. (Also libguestfs doesn't run on Haiku and architecturally that's unlikely to ever happen). Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-df lists disk usage of guests without needing to install any software inside the virtual machine. Supports Linux and Windows. http://people.redhat.com/~rjones/virt-df/
Richard W.M. Jones
2019-Aug-02 19:55 UTC
Re: [Libguestfs] [nbdkit PATCH v2 15/17] sh: Enable parallel thread model, when possible
On Fri, Aug 02, 2019 at 02:26:16PM -0500, Eric Blake wrote:> When we first created the sh plugin, we copied the thread model used > in most other language bindings of SERIALIZE_ALL_REQUESTS, because it > was easier to reason about and there was no way for a script to > override our choice. However, we've since added support for scripts > to request a tighter model when needed, and we've also just fixed > remaining bugs about fds inadvertently leaked into the shell script > (at least, when pipe2 and accept4 are supported). And the shell > itself is sufficiently expressive (even if not necessarily the most > efficient) for two scripts to come up with ways to enforce mutual > exclusion (for example, a script can use the success or failure of > 'mkdir' on a coordinated filename to learn if it is first past the > gate, or set up FIFOs for interprocess communication). > > Thus, it is time to promote the sh plugin to fully-parallel, when > supported by the underlying system. (Thanks to .fork_safe, Haiku > remains serialized so as to avoid one thread fork()ing while another > may leak an fd). > > The new test copies heavily from test-parallel-file.sh, and also > checks for some of the leaks plugged in recent previous patches. > > Signed-off-by: Eric Blake <eblake@redhat.com> > --- > plugins/sh/nbdkit-sh-plugin.pod | 11 ++-- > plugins/sh/sh.c | 2 +- > tests/Makefile.am | 2 + > tests/test-dump-plugin.sh | 11 ++-- > tests/test-parallel-sh.sh | 108 ++++++++++++++++++++++++++++++++ > 5 files changed, 122 insertions(+), 12 deletions(-) > create mode 100755 tests/test-parallel-sh.sh > > diff --git a/plugins/sh/nbdkit-sh-plugin.pod b/plugins/sh/nbdkit-sh-plugin.pod > index 41b9e7c7..7c1781cd 100644 > --- a/plugins/sh/nbdkit-sh-plugin.pod > +++ b/plugins/sh/nbdkit-sh-plugin.pod > @@ -189,12 +189,11 @@ one of C<"serialize_connections">, C<"serialize_all_requests">, > C<"serialize_requests">, or C<"parallel">. > > This method is I<not> required; if omitted, then the plugin will be > -executed under the default sh thread model (currently > -C<"serialize_all_requests">, which implies this callback only makes a > -difference with an output of C<"serialize_connections">; look for > -'max_thread_model' in C<nbdkit --dump-plugin sh>). If an error > -occurs, the script should output an error message and exit with status > -C<1>; unrecognized output is ignored. > +executed under the default sh thread model (either C<"parallel"> or > +C<"serialize_all_requests">, based on whether the platform supports > +L<pipe2(2)>; look for 'max_thread_model' in C<nbdkit --dump-plugin > +sh>). If an error occurs, the script should output an error message > +and exit with status C<1>; unrecognized output is ignored.I'm not sure we should change the default (although by all means change our existing shell plugins so they explicitly set the thread model to parallel). One reason I think this is because we've advertised that people can use handles to create a temporary directory for saving state, and that's likely to break if the thread model changes. Rich.> =item C<open> > > diff --git a/plugins/sh/sh.c b/plugins/sh/sh.c > index 79c9a4ac..6c7954fe 100644 > --- a/plugins/sh/sh.c > +++ b/plugins/sh/sh.c > @@ -264,7 +264,7 @@ sh_config_complete (void) > } > > /* See also .fork_safe and the comments in call.c:call3() */ > -#define THREAD_MODEL NBDKIT_THREAD_MODEL_SERIALIZE_ALL_REQUESTS > +#define THREAD_MODEL NBDKIT_THREAD_MODEL_PARALLEL > > static int > sh_thread_model (void) > diff --git a/tests/Makefile.am b/tests/Makefile.am > index 8b8a6557..3d78e7a2 100644 > --- a/tests/Makefile.am > +++ b/tests/Makefile.am > @@ -91,6 +91,7 @@ EXTRA_DIST = \ > test-offset-extents.sh \ > test-parallel-file.sh \ > test-parallel-nbd.sh \ > + test-parallel-sh.sh \ > test-partition1.sh \ > test-partition2.sh \ > test-partitioning1.sh \ > @@ -342,6 +343,7 @@ file-data: generate-file-data.sh > TESTS += \ > test-parallel-file.sh \ > test-parallel-nbd.sh \ > + test-parallel-sh.sh \ > $(NULL) > > # Most in-depth tests need libguestfs, since that is a convenient way to > diff --git a/tests/test-dump-plugin.sh b/tests/test-dump-plugin.sh > index 982d5062..8f02f659 100755 > --- a/tests/test-dump-plugin.sh > +++ b/tests/test-dump-plugin.sh > @@ -72,8 +72,9 @@ foreach_plugin do_test > # Test that --dump-plugin can be used to introspect a resulting dynamic > # thread model. > out=$({ > - # sh does not yet support parallel > - nbdkit --dump-plugin sh > + # Here, thread_model depends on atomic CLOEXEC support > + nbdkit --dump-plugin sh | \ > + sed 's/^thread_model=parallel/thread_model=serialize_all_requests/' > # Here, the script further reduces things > nbdkit --dump-plugin sh - <<\EOF > case $1 in > @@ -85,13 +86,13 @@ EOF > # Here, the filter further reduces things > nbdkit --dump-plugin --filter=noparallel sh serialize=connections > } | grep thread_model) > -exp="max_thread_model=serialize_all_requests > +exp="max_thread_model=parallel > thread_model=serialize_all_requests > has_thread_model=1 > -max_thread_model=serialize_all_requests > +max_thread_model=parallel > thread_model=serialize_connections > has_thread_model=1 > -max_thread_model=serialize_all_requests > +max_thread_model=parallel > thread_model=serialize_connections > has_thread_model=1" > if [ "$out" != "$exp" ]; then > diff --git a/tests/test-parallel-sh.sh b/tests/test-parallel-sh.sh > new file mode 100755 > index 00000000..21bc010a > --- /dev/null > +++ b/tests/test-parallel-sh.sh > @@ -0,0 +1,108 @@ > +#!/usr/bin/env bash > +# nbdkit > +# Copyright (C) 2017-2019 Red Hat Inc. > +# > +# Redistribution and use in source and binary forms, with or without > +# modification, are permitted provided that the following conditions are > +# met: > +# > +# * Redistributions of source code must retain the above copyright > +# notice, this list of conditions and the following disclaimer. > +# > +# * Redistributions in binary form must reproduce the above copyright > +# notice, this list of conditions and the following disclaimer in the > +# documentation and/or other materials provided with the distribution. > +# > +# * Neither the name of Red Hat nor the names of its contributors may be > +# used to endorse or promote products derived from this software without > +# specific prior written permission. > +# > +# THIS SOFTWARE IS PROVIDED BY RED HAT AND CONTRIBUTORS ''AS IS'' AND > +# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, > +# THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A > +# PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL RED HAT OR > +# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, > +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT > +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF > +# USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND > +# ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, > +# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT > +# OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF > +# SUCH DAMAGE. > + > +source ./functions.sh > + > +# Check file-data was created by Makefile and qemu-io exists. > +requires test -f file-data > +requires qemu-io --version > + > +nbdkit --dump-plugin sh | grep -q ^thread_model=parallel || > + { echo "nbdkit lacks support for parallel requests"; exit 77; } > + > +cleanup_fn rm -f test-parallel-sh.data test-parallel-sh.out test-parallel-sh.script > + > +# Populate file, and sanity check that qemu-io can issue parallel requests > +printf '%1024s' . > test-parallel-sh.data > +qemu-io -f raw -c "aio_write -P 1 0 512" -c "aio_write -P 2 512 512" \ > + -c aio_flush test-parallel-sh.data || > + { echo "'qemu-io' can't drive parallel requests"; exit 77; } > + > +# Set up the sh plugin to delay both reads and writes (for a good > +# chance that parallel requests are in flight), and with writes longer > +# than reads (to more easily detect if out-of-order completion > +# happens). This test may have spurious failures under heavy loads on > +# the test machine, where tuning the delays may help. > + > +cat > test-parallel-sh.script <<EOF > +#!/usr/bin/env bash > +f=test-parallel-sh.data > +if ! test -f \$f; then > + echo "can't locate test-parallel-sh.data" >&2; exit 5 > +fi > +case \$1 in > + get_size) stat -L -c %s \$f || exit 1 ;; > + pread) dd iflag=skip_bytes,count_bytes skip=\$4 count=\$3 if=\$f || exit 1 ;; > + pwrite) dd oflag=seek_bytes conv=notrunc seek=\$4 of=\$f || exit 1 ;; > + can_write) ;; > + *) exit 2 ;; > +esac > +exit 0 > +EOF > +chmod +x test-parallel-sh.script > + > +# With --threads=1, the write should complete first because it was issued first > +nbdkit -v -t 1 -U - --filter=delay sh test-parallel-sh.script \ > + wdelay=2 rdelay=1 --run 'qemu-io -f raw -c "aio_write -P 2 512 512" \ > + -c "aio_read -P 1 0 512" -c aio_flush $nbd' | > + tee test-parallel-sh.out > +if test "$(grep '512/512' test-parallel-sh.out)" != \ > +"wrote 512/512 bytes at offset 512 > +read 512/512 bytes at offset 0"; then > + exit 1 > +fi > + > +# With default --threads, the faster read should complete first > +nbdkit -v -U - --filter=delay sh test-parallel-sh.script \ > + wdelay=2 rdelay=1 --run 'qemu-io -f raw -c "aio_write -P 2 512 512" \ > + -c "aio_read -P 1 0 512" -c aio_flush $nbd' | > + tee test-parallel-sh.out > +if test "$(grep '512/512' test-parallel-sh.out)" != \ > +"read 512/512 bytes at offset 0 > +wrote 512/512 bytes at offset 512"; then > + exit 1 > +fi > + > +# With --filter=noparallel, the write should complete first because it was > +# issued first. Also test that the log filter doesn't leak an fd > +nbdkit -v -U - --filter=noparallel --filter=log --filter=delay \ > + sh test-parallel-sh.script logfile=/dev/null \ > + wdelay=2 rdelay=1 --run 'qemu-io -f raw -c "aio_write -P 2 512 512" \ > + -c "aio_read -P 1 0 512" -c aio_flush $nbd' | > + tee test-parallel-sh.out > +if test "$(grep '512/512' test-parallel-sh.out)" != \ > +"wrote 512/512 bytes at offset 512 > +read 512/512 bytes at offset 0"; then > + exit 1 > +fi > + > +exit 0 > -- > 2.20.1 > > _______________________________________________ > Libguestfs mailing list > Libguestfs@redhat.com > https://www.redhat.com/mailman/listinfo/libguestfs-- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-df lists disk usage of guests without needing to install any software inside the virtual machine. Supports Linux and Windows. http://people.redhat.com/~rjones/virt-df/
Richard W.M. Jones
2019-Aug-02 19:56 UTC
Re: [Libguestfs] [nbdkit PATCH v2 16/17] sh: Test for fd leaks
ACK 1-16 except for the two I noted which I think need further discussion. Thanks, Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-p2v converts physical machines to virtual machines. Boot with a live CD or over the network (PXE) and turn machines into KVM guests. http://libguestfs.org/virt-v2v
Eric Blake
2019-Aug-02 19:59 UTC
Re: [Libguestfs] [nbdkit PATCH v2 07/17] build: Audit for use of pipe2
On 8/2/19 2:26 PM, Eric Blake wrote:> Haiku unfortunately lacks pipe2, so we have to plan for a fallback at > each site that uses it. This also makes a good time to audit all > existing users of pipe, to see if they should be using pipe2. The > tests fork() but don't fail because of fd leaks; and the nbd plugin > doesn't fork() but was merely using pipe2 for convenience over > multiple fcntl calls. However, the server's quit_fd definitely needs > to be marked CLOEXEC (it's easy to use the sh plugin to show that we > are currently leaking it to children), although doing so can occur > without worrying about atomicity since it is called before threads > begin. Finally, it's also worth updating our set_cloexec helper > function to document that we still prefer atomicity where possible. > > Signed-off-by: Eric Blake <eblake@redhat.com> > ---> +++ b/plugins/nbd/nbd.c > @@ -431,11 +431,42 @@ nbdplug_open_handle (int readonly) > nbdkit_error ("malloc: %m"); > return NULL; > } > +#ifdef HAVE_PIPE2 > if (pipe2 (h->fds, O_NONBLOCK)) { > + nbdkit_error ("pipe2: %m"); > + free (h); > + return NULL; > + } > +#else > + /* This plugin doesn't fork, so we don't care about CLOEXEC. Our use > + * of pipe2 is merely for convenience. > + */ > + if (pipe (h->fds)) { > nbdkit_error ("pipe: %m"); > free (h); > return NULL; > } > + else { > + int f; > + > + f = fcntl (h->fds[0], F_GETFL); > + if (f == -1 || fcntl (h->fds[0], F_SETFL, f | O_NONBLOCK) == -1) { > + nbdkit_error ("fcntl: %m");Shoot - I wrote up set_nonblock earlier in the series, then forgot to use it here. Consider this squashed in: diff --git i/plugins/nbd/nbd.c w/plugins/nbd/nbd.c index 95d910e7..f11e54d5 100644 --- i/plugins/nbd/nbd.c +++ w/plugins/nbd/nbd.c @@ -54,6 +54,7 @@ #include <nbdkit-plugin.h> #include "byte-swapping.h" #include "cleanup.h" +#include "utils.h" /* The per-transaction details */ struct transaction { @@ -446,25 +447,15 @@ nbdplug_open_handle (int readonly) free (h); return NULL; } - else { - int f; - - f = fcntl (h->fds[0], F_GETFL); - if (f == -1 || fcntl (h->fds[0], F_SETFL, f | O_NONBLOCK) == -1) { - nbdkit_error ("fcntl: %m"); - close (h->fds[0]); - close (h->fds[1]); - free (h); - return NULL; - } - f = fcntl (h->fds[1], F_GETFL); - if (f == -1 || fcntl (h->fds[1], F_SETFL, f | O_NONBLOCK) == -1) { - nbdkit_error ("fcntl: %m"); - close (h->fds[0]); - close (h->fds[1]); - free (h); - return NULL; - } + if (set_nonblock (h->fds[0]) == -1) { + close (h->fds[1]); + free (h); + return NULL; + } + if (set_nonblock (h->fds[1]) == -1) { + close (h->fds[0]); + free (h); + return NULL; } #endif -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Richard W.M. Jones
2019-Aug-27 11:47 UTC
[Libguestfs] [PATCH nbdkit] sh: Remove assert and replace with smarter file descriptor duplication. (was: Re: [nbdkit PATCH v2 14/17] sh: Use pipe2 with CLOEXEC when possible)
On Fri, Aug 02, 2019 at 02:26:15PM -0500, Eric Blake wrote:> + /* Ensure that stdin/out/err of the current process were not empty > + * before we started creating pipes (otherwise, the close and dup2 > + * calls below become more complex to juggle fds around correctly). > + */ > + assert (in_fd[0] > STDERR_FILENO && in_fd[1] > STDERR_FILENO && > + out_fd[0] > STDERR_FILENO && out_fd[1] > STDERR_FILENO && > + err_fd[0] > STDERR_FILENO && err_fd[1] > STDERR_FILENO);This assert is now being thrown whenever we use nbdkit + sh plugin + nbdkit -s option. It's easy to demonstrate using a libnbd one-liner: $ nbdsh -c 'h.connect_command (["nbdkit", "sh", "/tmp/min.sh", "-s", "--exit-with-parent"])' \ -c 'print ("%r" % h.get_size())' 1048576 nbdkit: call.c:155: call3: Assertion `in_fd[0] > STDERR_FILENO && in_fd[1] > STDERR_FILENO && out_fd[0] > STDERR_FILENO && out_fd[1] > STDERR_FILENO && err_fd[0] > STDERR_FILENO && err_fd[1] > STDERR_FILENO' failed. (This is triggered frequently by tests in the libnbd test suite which is where I first observed it.) It happens during the shutdown path where there is one final call to sh_close in the plugin: #0 0x00007fc5d929c615 in raise () from /lib64/libc.so.6 #1 0x00007fc5d92858d9 in abort () from /lib64/libc.so.6 #2 0x00007fc5d92857a9 in __assert_fail_base.cold () from /lib64/libc.so.6 #3 0x00007fc5d9294a56 in __assert_fail () from /lib64/libc.so.6 #4 0x00007fc5d969738c in call3.constprop () from /usr/lib64/nbdkit/plugins/nbdkit-sh-plugin.so #5 0x00007fc5d9697493 in call () from /usr/lib64/nbdkit/plugins/nbdkit-sh-plugin.so #6 0x00007fc5d969833c in sh_close () from /usr/lib64/nbdkit/plugins/nbdkit-sh-plugin.so #7 0x0000561694bcf8e8 in plugin_close (b=<optimized out>, conn=0x56169df9afe0) at plugins.c:305 #8 0x0000561694bca649 in free_connection (conn=0x56169df9afe0) at connections.c:377 #9 _handle_single_connection (sockout=<optimized out>, sockin=<optimized out>) at connections.c:267 #10 handle_single_connection (sockin=<optimized out>, sockout=<optimized out>) at connections.c:277 #11 0x0000561694bc9583 in start_serving () at main.c:856 #12 main (argc=<optimized out>, argv=0x7ffd09440178) at main.c:651 This is after conn->close has been called and stdin/stdout (which were connected to the client socket) have actually been closed. We can easily paper over this by ignoring the assert on the close path, but I foresee two problems: (1) I don't believe the assert is generally correct. While it's not ideal for nbdkit to be run with stdin/out/err closed (they obviously should be connected to /dev/null) it's also not impossible for that to happen. We don't cope well with this situation. (2) The assert is protecting us from some minimal checks in this code, but I think the right answer is to just add those checks. https://github.com/libguestfs/nbdkit/blob/79b26ee95034d9c90913da3b5db596503c5d03d9/plugins/sh/call.c#L166 Anyway, a possible patch for this is attached, but I'm not very confident it is correct. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com libguestfs lets you edit virtual machines. Supports shell scripting, bindings from many languages. http://libguestfs.org
Possibly Parallel Threads
- Re: [nbdkit PATCH v2 15/17] sh: Enable parallel thread model, when possible
- [nbdkit PATCH 3/4] tests: Don't let test-parallel-* hang on nbdkit bug
- [PATCH nbdkit 1/2] tests: Remove QEMU_IO / HAVE_QEMU_IO.
- [nbdkit PATCH] tests: Make parallel tests work at 512-byte granularity
- [PATCH nbdkit] tests: Remove QEMU_IO / HAVE_QEMU_IO.