Richard W.M. Jones
2019-Jan-04 22:08 UTC
[Libguestfs] [PATCH nbdkit 0/7] server: Implement NBD_FLAG_CAN_MULTI_CONN.
First thing to say is that I need to do a *lot* more testing on this, so this is just an early peek. In particular, although it passed ‘make check && make check-valgrind’ I have *not* tested it against a multi-conn-aware client such as the Linux kernel >= 4.9. This implements NBD_FLAG_CAN_MULTI_CONN, described in the protocol doc as: "NBD_FLAG_CAN_MULTI_CONN: Indicates that the server operates entirely without cache, or that the cache it uses is shared among all connections to the given device. In particular, if this flag is present, then the effects of NBD_CMD_FLUSH and NBD_CMD_FLAG_FUA MUST be visible across all connections when the server sends its reply to that command to the client. In the absence of this flag, clients SHOULD NOT multiplex their commands over more than one connection to the export." This is necessary to support the Linux nbd client -C option. The only plugin which sets the flag so far is file. The ocaml, sh and nbd plugins allow the flag to be controlled or passed through. Notable also is that the blocksize filter has to filter out this flag, because I'm not convinced that the bounce buffer is safe. However I believe the other filters *are* safe (although not really certain about the fua filter, and the new cache filter is tricky too). My feeling is that we should set the flag unconditionally for all readonly connections, but I can think of nasty corner cases where it might not work. We should most probably set it in all plugins that are readonly (eg. nbdkit-pattern-plugin). Rich.
Richard W.M. Jones
2019-Jan-04 22:08 UTC
[Libguestfs] [PATCH nbdkit 1/7] server: Implement NBD_FLAG_CAN_MULTI_CONN.
>From the protocol document:"NBD_FLAG_CAN_MULTI_CONN: Indicates that the server operates entirely without cache, or that the cache it uses is shared among all connections to the given device. In particular, if this flag is present, then the effects of NBD_CMD_FLUSH and NBD_CMD_FLAG_FUA MUST be visible across all connections when the server sends its reply to that command to the client. In the absence of this flag, clients SHOULD NOT multiplex their commands over more than one connection to the export." This is implemented in nbdkit by a boolean function which plugins may implement if they believe they satisfy the requirements above. It can also be passed through or modified by filters. It defaults to false, which is the safe default. --- docs/nbdkit-filter.pod | 4 ++++ docs/nbdkit-plugin.pod | 20 ++++++++++++++++++++ docs/nbdkit-protocol.pod | 4 ++++ include/nbdkit-filter.h | 3 +++ include/nbdkit-plugin.h | 2 ++ server/internal.h | 1 + server/protocol.h | 1 + server/connections.c | 9 +++++++++ server/filters.c | 24 ++++++++++++++++++++++++ server/plugins.c | 17 +++++++++++++++++ tests/test-layers-filter.c | 10 ++++++++++ tests/test-layers-plugin.c | 8 ++++++++ tests/test-layers.c | 6 ++++++ 13 files changed, 109 insertions(+) diff --git a/docs/nbdkit-filter.pod b/docs/nbdkit-filter.pod index 1e1bf5d..4f27adb 100644 --- a/docs/nbdkit-filter.pod +++ b/docs/nbdkit-filter.pod @@ -354,6 +354,8 @@ calls. =head2 C<.can_fua> +=head2 C<.can_multi_conn> + int (*can_write) (struct nbdkit_next_ops *next_ops, void *nxdata, void *handle); int (*can_flush) (struct nbdkit_next_ops *next_ops, void *nxdata, @@ -367,6 +369,8 @@ calls. void *handle); int (*can_fua) (struct nbdkit_next_ops *next_ops, void *nxdata, void *handle); + int (*can_multi_conn) (struct nbdkit_next_ops *next_ops, void *nxdata, + void *handle); These intercept the corresponding plugin methods, and control feature bits advertised to the client. diff --git a/docs/nbdkit-plugin.pod b/docs/nbdkit-plugin.pod index cfa14b0..c72ea0d 100644 --- a/docs/nbdkit-plugin.pod +++ b/docs/nbdkit-plugin.pod @@ -586,6 +586,26 @@ handle FUA requests. If omitted, nbdkit checks whether C<.flush> exists, and behaves as if this function returns C<NBDKIT_FUA_NONE> or C<NBDKIT_FUA_EMULATE> as appropriate. +=head2 C<.can_multi_conn> + + int can_multi_conn (void *handle); + +This is called during the option negotiation phase to find out if the +plugin can handle multiple connections from a single client. + +Specifically it means that either the plugin does not cache requests +at all. Or if it does cache then the effects of a C<.flush> request +or setting C<NBDKIT_FLAG_FUA> on a request must be visible across all +connections to the plugin before the plugin replies to that request. + +If you use the Linux NBD client with the I<-C> option then this flag +is checked. + +If there is an error, C<.can_multi_conn> should call C<nbdkit_error> +with an error message and return C<-1>. + +This callback is not required. If omitted, then we return false. + =head2 C<.pread> int pread (void *handle, void *buf, uint32_t count, uint64_t offset, diff --git a/docs/nbdkit-protocol.pod b/docs/nbdkit-protocol.pod index d5ba00f..db7fcf9 100644 --- a/docs/nbdkit-protocol.pod +++ b/docs/nbdkit-protocol.pod @@ -116,6 +116,10 @@ Supported in nbdkit E<ge> 1.9.3. This protocol enhancement allows a client to inspect details about the export without actually connecting. +=item C<NBD_FLAG_CAN_MULTI_CONN> + +Supported in nbdkit E<ge> 1.9.10. + =item Structured Replies I<Not supported>. diff --git a/include/nbdkit-filter.h b/include/nbdkit-filter.h index 931d923..71c06c8 100644 --- a/include/nbdkit-filter.h +++ b/include/nbdkit-filter.h @@ -59,6 +59,7 @@ struct nbdkit_next_ops { int (*can_trim) (void *nxdata); int (*can_zero) (void *nxdata); int (*can_fua) (void *nxdata); + int (*can_multi_conn) (void *nxdata); int (*pread) (void *nxdata, void *buf, uint32_t count, uint64_t offset, uint32_t flags, int *err); @@ -121,6 +122,8 @@ struct nbdkit_filter { void *handle); int (*can_fua) (struct nbdkit_next_ops *next_ops, void *nxdata, void *handle); + int (*can_multi_conn) (struct nbdkit_next_ops *next_ops, void *nxdata, + void *handle); int (*pread) (struct nbdkit_next_ops *next_ops, void *nxdata, void *handle, void *buf, uint32_t count, uint64_t offset, diff --git a/include/nbdkit-plugin.h b/include/nbdkit-plugin.h index 31aa6c7..d43b2f5 100644 --- a/include/nbdkit-plugin.h +++ b/include/nbdkit-plugin.h @@ -123,6 +123,8 @@ struct nbdkit_plugin { #endif const char *magic_config_key; + + int (*can_multi_conn) (void *handle); }; extern void nbdkit_set_error (int err); diff --git a/server/internal.h b/server/internal.h index e5c91a9..1f237a5 100644 --- a/server/internal.h +++ b/server/internal.h @@ -212,6 +212,7 @@ struct backend { int (*can_trim) (struct backend *, struct connection *conn); int (*can_zero) (struct backend *, struct connection *conn); int (*can_fua) (struct backend *, struct connection *conn); + int (*can_multi_conn) (struct backend *, struct connection *conn); int (*pread) (struct backend *, struct connection *conn, void *buf, uint32_t count, uint64_t offset, uint32_t flags, int *err); diff --git a/server/protocol.h b/server/protocol.h index 6709ddc..003fc45 100644 --- a/server/protocol.h +++ b/server/protocol.h @@ -94,6 +94,7 @@ extern const char *name_of_nbd_flag (int); #define NBD_FLAG_ROTATIONAL (1 << 4) #define NBD_FLAG_SEND_TRIM (1 << 5) #define NBD_FLAG_SEND_WRITE_ZEROES (1 << 6) +#define NBD_FLAG_CAN_MULTI_CONN (1 << 8) /* NBD options (new style handshake only). */ extern const char *name_of_nbd_opt (int); diff --git a/server/connections.c b/server/connections.c index 0a89315..51d4912 100644 --- a/server/connections.c +++ b/server/connections.c @@ -84,6 +84,7 @@ struct connection { bool can_trim; bool can_zero; bool can_fua; + bool can_multi_conn; bool using_tls; int sockin, sockout; @@ -463,6 +464,14 @@ compute_eflags (struct connection *conn, uint16_t *flags) conn->is_rotational = true; } + fl = backend->can_multi_conn (backend, conn); + if (fl == -1) + return -1; + if (fl) { + eflags |= NBD_FLAG_CAN_MULTI_CONN; + conn->can_multi_conn = true; + } + *flags = eflags; return 0; } diff --git a/server/filters.c b/server/filters.c index d02e7fb..5b7abc4 100644 --- a/server/filters.c +++ b/server/filters.c @@ -316,6 +316,13 @@ next_can_fua (void *nxdata) return b_conn->b->can_fua (b_conn->b, b_conn->conn); } +static int +next_can_multi_conn (void *nxdata) +{ + struct b_conn *b_conn = nxdata; + return b_conn->b->can_multi_conn (b_conn->b, b_conn->conn); +} + static int next_pread (void *nxdata, void *buf, uint32_t count, uint64_t offset, uint32_t flags, int *err) @@ -365,6 +372,7 @@ static struct nbdkit_next_ops next_ops = { .can_trim = next_can_trim, .can_zero = next_can_zero, .can_fua = next_can_fua, + .can_multi_conn = next_can_multi_conn, .pread = next_pread, .pwrite = next_pwrite, .flush = next_flush, @@ -518,6 +526,21 @@ filter_can_fua (struct backend *b, struct connection *conn) return f->backend.next->can_fua (f->backend.next, conn); } +static int +filter_can_multi_conn (struct backend *b, struct connection *conn) +{ + struct backend_filter *f = container_of (b, struct backend_filter, backend); + void *handle = connection_get_handle (conn, f->backend.i); + struct b_conn nxdata = { .b = f->backend.next, .conn = conn }; + + debug ("%s: can_multi_conn", f->name); + + if (f->filter.can_multi_conn) + return f->filter.can_multi_conn (&next_ops, &nxdata, handle); + else + return f->backend.next->can_multi_conn (f->backend.next, conn); +} + static int filter_pread (struct backend *b, struct connection *conn, void *buf, uint32_t count, uint64_t offset, @@ -645,6 +668,7 @@ static struct backend filter_functions = { .can_trim = filter_can_trim, .can_zero = filter_can_zero, .can_fua = filter_can_fua, + .can_multi_conn = filter_can_multi_conn, .pread = filter_pread, .pwrite = filter_pwrite, .flush = filter_flush, diff --git a/server/plugins.c b/server/plugins.c index cb0f411..bbde578 100644 --- a/server/plugins.c +++ b/server/plugins.c @@ -197,6 +197,7 @@ plugin_dump_fields (struct backend *b) HAS (flush); HAS (trim); HAS (zero); + HAS (can_multi_conn); #undef HAS /* Custom fields. */ @@ -414,6 +415,21 @@ plugin_can_fua (struct backend *b, struct connection *conn) return NBDKIT_FUA_NONE; } +static int +plugin_can_multi_conn (struct backend *b, struct connection *conn) +{ + struct backend_plugin *p = container_of (b, struct backend_plugin, backend); + + assert (connection_get_handle (conn, 0)); + + debug ("can_multi_conn"); + + if (p->plugin.can_multi_conn) + return p->plugin.can_multi_conn (connection_get_handle (conn, 0)); + else + return 0; /* assume false */ +} + /* Plugins and filters can call this to set the true errno, in cases * where !errno_is_preserved. */ @@ -656,6 +672,7 @@ static struct backend plugin_functions = { .can_trim = plugin_can_trim, .can_zero = plugin_can_zero, .can_fua = plugin_can_fua, + .can_multi_conn = plugin_can_multi_conn, .pread = plugin_pread, .pwrite = plugin_pwrite, .flush = plugin_flush, diff --git a/tests/test-layers-filter.c b/tests/test-layers-filter.c index c5a96e7..6da6ee6 100644 --- a/tests/test-layers-filter.c +++ b/tests/test-layers-filter.c @@ -167,6 +167,15 @@ test_layers_filter_can_fua (struct nbdkit_next_ops *next_ops, void *nxdata, return next_ops->can_fua (nxdata); } +static int +test_layers_filter_can_multi_conn (struct nbdkit_next_ops *next_ops, + void *nxdata, + void *handle) +{ + DEBUG_FUNCTION; + return next_ops->can_multi_conn (nxdata); +} + static int test_layers_filter_pread (struct nbdkit_next_ops *next_ops, void *nxdata, void *handle, void *buf, @@ -233,6 +242,7 @@ static struct nbdkit_filter filter = { .can_trim = test_layers_filter_can_trim, .can_zero = test_layers_filter_can_zero, .can_fua = test_layers_filter_can_fua, + .can_multi_conn = test_layers_filter_can_multi_conn, .pread = test_layers_filter_pread, .pwrite = test_layers_filter_pwrite, .flush = test_layers_filter_flush, diff --git a/tests/test-layers-plugin.c b/tests/test-layers-plugin.c index 3befcc1..042ecc3 100644 --- a/tests/test-layers-plugin.c +++ b/tests/test-layers-plugin.c @@ -137,6 +137,13 @@ test_layers_plugin_can_fua (void *handle) return NBDKIT_FUA_NATIVE; } +static int +test_layers_plugin_can_multi_conn (void *handle) +{ + DEBUG_FUNCTION; + return 1; +} + static int test_layers_plugin_pread (void *handle, void *buf, uint32_t count, uint64_t offset, @@ -196,6 +203,7 @@ static struct nbdkit_plugin plugin = { .can_trim = test_layers_plugin_can_trim, .can_zero = test_layers_plugin_can_zero, .can_fua = test_layers_plugin_can_fua, + .can_multi_conn = test_layers_plugin_can_multi_conn, .pread = test_layers_plugin_pread, .pwrite = test_layers_plugin_pwrite, .flush = test_layers_plugin_flush, diff --git a/tests/test-layers.c b/tests/test-layers.c index 261b9c6..c8d15d2 100644 --- a/tests/test-layers.c +++ b/tests/test-layers.c @@ -349,6 +349,12 @@ main (int argc, char *argv[]) "filter1: test_layers_filter_is_rotational", "test_layers_plugin_is_rotational", NULL); + log_verify_seen_in_order + ("filter3: test_layers_filter_can_multi_conn", + "filter2: test_layers_filter_can_multi_conn", + "filter1: test_layers_filter_can_multi_conn", + "test_layers_plugin_can_multi_conn", + NULL); fprintf (stderr, "%s: protocol connected\n", program_name); -- 2.19.2
Richard W.M. Jones
2019-Jan-04 22:08 UTC
[Libguestfs] [PATCH nbdkit 2/7] blocksize: Implement filtering of .can_multi_conn (setting it to false).
After examining each filter and where it appears that the filter implements a cache and does not properly consider consistency across clients for flush requests, force .can_multi_conn to return false. In fact I believe only one filter (blocksize) needs to be updated and all the other ones are safe. --- filters/blocksize/blocksize.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/filters/blocksize/blocksize.c b/filters/blocksize/blocksize.c index 34f58c9..9e0e713 100644 --- a/filters/blocksize/blocksize.c +++ b/filters/blocksize/blocksize.c @@ -164,6 +164,17 @@ blocksize_get_size (struct nbdkit_next_ops *next_ops, void *nxdata, return size == -1 ? size : size & ~(minblock - 1); } +static int +blocksize_can_multi_conn (struct nbdkit_next_ops *next_ops, void *nxdata, + void *handle) +{ + /* Although we are serializing all requests anyway so this likely + * doesn't make a difference, return false because the bounce buffer + * is not consistent for flush. + */ + return 0; +} + static int blocksize_pread (struct nbdkit_next_ops *next_ops, void *nxdata, void *handle, void *b, uint32_t count, uint64_t offs, @@ -370,6 +381,7 @@ static struct nbdkit_filter filter = { .config_help = blocksize_config_help, .prepare = blocksize_prepare, .get_size = blocksize_get_size, + .can_multi_conn = blocksize_can_multi_conn, .pread = blocksize_pread, .pwrite = blocksize_pwrite, .trim = blocksize_trim, -- 2.19.2
Richard W.M. Jones
2019-Jan-04 22:08 UTC
[Libguestfs] [PATCH nbdkit 3/7] nbd: Pass NBD_FLAG_CAN_MULTI_CONN through from clients.
--- plugins/nbd/nbd.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/plugins/nbd/nbd.c b/plugins/nbd/nbd.c index 0443788..674f4a4 100644 --- a/plugins/nbd/nbd.c +++ b/plugins/nbd/nbd.c @@ -644,6 +644,14 @@ nbd_can_fua (void *handle) return h->flags & NBD_FLAG_SEND_FUA ? NBDKIT_FUA_NATIVE : NBDKIT_FUA_NONE; } +static int +nbd_can_multi_conn (void *handle) +{ + struct handle *h = handle; + + return h->flags & NBD_FLAG_CAN_MULTI_CONN; +} + /* Read data from the file. */ static int nbd_pread (void *handle, void *buf, uint32_t count, uint64_t offset, @@ -732,6 +740,7 @@ static struct nbdkit_plugin plugin = { .can_trim = nbd_can_trim, .can_zero = nbd_can_zero, .can_fua = nbd_can_fua, + .can_multi_conn = nbd_can_multi_conn, .pread = nbd_pread, .pwrite = nbd_pwrite, .zero = nbd_zero, -- 2.19.2
Richard W.M. Jones
2019-Jan-04 22:08 UTC
[Libguestfs] [PATCH nbdkit 4/7] ocaml: Implement .can_multi_conn callback for OCaml plugins.
--- plugins/ocaml/ocaml.c | 25 +++++++++++++++++++++++++ plugins/ocaml/NBDKit.ml | 10 +++++++++- plugins/ocaml/NBDKit.mli | 2 ++ 3 files changed, 36 insertions(+), 1 deletion(-) diff --git a/plugins/ocaml/ocaml.c b/plugins/ocaml/ocaml.c index 7dd5e4e..a89ad08 100644 --- a/plugins/ocaml/ocaml.c +++ b/plugins/ocaml/ocaml.c @@ -111,6 +111,8 @@ static value flush_fn; static value trim_fn; static value zero_fn; +static value can_multi_conn_fn; + /* Wrapper functions that translate calls from C (ie. nbdkit) to OCaml. */ static void @@ -167,6 +169,8 @@ unload_wrapper (void) REMOVE (flush); REMOVE (trim); REMOVE (zero); + + REMOVE (can_multi_conn); } static int @@ -550,6 +554,25 @@ zero_wrapper (void *h, uint32_t count, uint64_t offset, uint32_t flags) CAMLreturnT (int, 0); } +static int +can_multi_conn_wrapper (void *h) +{ + CAMLparam0 (); + CAMLlocal1 (rv); + + caml_leave_blocking_section (); + + rv = caml_callback_exn (can_multi_conn_fn, *(value *) h); + if (Is_exception_result (rv)) { + nbdkit_error ("%s", caml_format_exception (Extract_exception (rv))); + caml_enter_blocking_section (); + CAMLreturnT (int, -1); + } + + caml_enter_blocking_section (); + CAMLreturnT (int, Bool_val (rv)); +} + value ocaml_nbdkit_set_thread_model (value modelv) { @@ -629,6 +652,8 @@ SET(flush) SET(trim) SET(zero) +SET(can_multi_conn) + /* NB: noalloc function. */ value ocaml_nbdkit_set_error (value nv) diff --git a/plugins/ocaml/NBDKit.ml b/plugins/ocaml/NBDKit.ml index 1937bc2..fdeecd3 100644 --- a/plugins/ocaml/NBDKit.ml +++ b/plugins/ocaml/NBDKit.ml @@ -71,6 +71,8 @@ type 'a plugin = { flush : ('a -> flags -> unit) option; trim : ('a -> int32 -> int64 -> flags -> unit) option; zero : ('a -> int32 -> int64 -> flags -> unit) option; + + can_multi_conn : ('a -> bool) option; } let default_callbacks = { @@ -106,6 +108,8 @@ let default_callbacks = { flush = None; trim = None; zero = None; + + can_multi_conn = None; } type thread_model @@ -149,6 +153,8 @@ external set_flush : ('a -> flags -> unit) -> unit = "ocaml_nbdkit_set_flush" external set_trim : ('a -> int32 -> int64 -> flags -> unit) -> unit = "ocaml_nbdkit_set_trim" external set_zero : ('a -> int32 -> int64 -> flags -> unit) -> unit = "ocaml_nbdkit_set_zero" +external set_can_multi_conn : ('a -> bool) -> unit = "ocaml_nbdkit_set_can_multi_conn" + let may f = function None -> () | Some a -> f a let register_plugin thread_model plugin @@ -203,7 +209,9 @@ let register_plugin thread_model plugin may set_pwrite plugin.pwrite; may set_flush plugin.flush; may set_trim plugin.trim; - may set_zero plugin.zero + may set_zero plugin.zero; + + may set_can_multi_conn plugin.can_multi_conn external _set_error : int -> unit = "ocaml_nbdkit_set_error" "noalloc" diff --git a/plugins/ocaml/NBDKit.mli b/plugins/ocaml/NBDKit.mli index c0f9248..ec888ce 100644 --- a/plugins/ocaml/NBDKit.mli +++ b/plugins/ocaml/NBDKit.mli @@ -73,6 +73,8 @@ type 'a plugin = { flush : ('a -> flags -> unit) option; trim : ('a -> int32 -> int64 -> flags -> unit) option; zero : ('a -> int32 -> int64 -> flags -> unit) option; + + can_multi_conn : ('a -> bool) option; } (** The plugin fields and callbacks. ['a] is the handle type. *) -- 2.19.2
Richard W.M. Jones
2019-Jan-04 22:08 UTC
[Libguestfs] [PATCH nbdkit 5/7] sh: Implement can_multi_conn callback for shell script plugins.
--- plugins/sh/sh.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/plugins/sh/sh.c b/plugins/sh/sh.c index 67750e5..49ade78 100644 --- a/plugins/sh/sh.c +++ b/plugins/sh/sh.c @@ -579,6 +579,12 @@ sh_can_fua (void *handle) } } +static int +sh_can_multi_conn (void *handle) +{ + return boolean_method (handle, "can_multi_conn"); +} + static int sh_flush (void *handle, uint32_t flags) { @@ -699,6 +705,7 @@ static struct nbdkit_plugin plugin = { .can_trim = sh_can_trim, .can_zero = sh_can_zero, .can_fua = sh_can_fua, + .can_multi_conn = sh_can_multi_conn, .pread = sh_pread, .pwrite = sh_pwrite, -- 2.19.2
Richard W.M. Jones
2019-Jan-04 22:08 UTC
[Libguestfs] [PATCH nbdkit 6/7] file: Implement NBDKIT_API_VERSION 2.
Upgrade this plugin to version 2 of the API. The main change is implementing FUA support. --- plugins/file/file.c | 62 ++++++++++++++++++++++++++++++--------------- 1 file changed, 42 insertions(+), 20 deletions(-) diff --git a/plugins/file/file.c b/plugins/file/file.c index 2a8c9b5..e564787 100644 --- a/plugins/file/file.c +++ b/plugins/file/file.c @@ -52,6 +52,8 @@ #include <linux/fs.h> /* For BLKZEROOUT */ #endif +#define NBDKIT_API_VERSION 2 + #include <nbdkit-plugin.h> #include "isaligned.h" @@ -263,9 +265,30 @@ file_can_trim (void *handle) #endif } +static int +file_can_fua (void *handle) +{ + return NBDKIT_FUA_NATIVE; +} + +/* Flush the file to disk. */ +static int +file_flush (void *handle, uint32_t flags) +{ + struct handle *h = handle; + + if (fdatasync (h->fd) == -1) { + nbdkit_error ("fdatasync: %m"); + return -1; + } + + return 0; +} + /* Read data from the file. */ static int -file_pread (void *handle, void *buf, uint32_t count, uint64_t offset) +file_pread (void *handle, void *buf, uint32_t count, uint64_t offset, + uint32_t flags) { struct handle *h = handle; @@ -289,7 +312,8 @@ file_pread (void *handle, void *buf, uint32_t count, uint64_t offset) /* Write data to the file. */ static int -file_pwrite (void *handle, const void *buf, uint32_t count, uint64_t offset) +file_pwrite (void *handle, const void *buf, uint32_t count, uint64_t offset, + uint32_t flags) { struct handle *h = handle; @@ -304,6 +328,9 @@ file_pwrite (void *handle, const void *buf, uint32_t count, uint64_t offset) offset += r; } + if (flags & NBDKIT_FLAG_FUA && file_flush (handle, 0) == -1) + return -1; + return 0; } @@ -323,20 +350,20 @@ do_fallocate(int fd, int mode, off_t offset, off_t len) /* Write zeroes to the file. */ static int -file_zero (void *handle, uint32_t count, uint64_t offset, int may_trim) +file_zero (void *handle, uint32_t count, uint64_t offset, uint32_t flags) { struct handle *h = handle; int r; #ifdef FALLOC_FL_PUNCH_HOLE - if (h->can_punch_hole && may_trim) { + if (h->can_punch_hole && (flags & NBDKIT_FLAG_MAY_TRIM)) { r = do_fallocate (h->fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, offset, count); if (r == 0) { if (file_debug_zero) nbdkit_debug ("h->can_punch_hole && may_trim: " "zero succeeded using fallocate"); - return 0; + goto out; } if (errno != EOPNOTSUPP) { @@ -355,7 +382,7 @@ file_zero (void *handle, uint32_t count, uint64_t offset, int may_trim) if (file_debug_zero) nbdkit_debug ("h->can_zero-range: " "zero succeeded using fallocate"); - return 0; + goto out; } if (errno != EOPNOTSUPP) { @@ -380,7 +407,7 @@ file_zero (void *handle, uint32_t count, uint64_t offset, int may_trim) if (file_debug_zero) nbdkit_debug ("h->can_punch_hole && h->can_fallocate: " "zero succeeded using fallocate"); - return 0; + goto out; } if (errno != EOPNOTSUPP) { @@ -410,7 +437,7 @@ file_zero (void *handle, uint32_t count, uint64_t offset, int may_trim) if (file_debug_zero) nbdkit_debug ("h->can_zeroout && IS_ALIGNED: " "zero succeeded using BLKZEROOUT"); - return 0; + goto out; } if (errno != ENOTTY) { @@ -427,25 +454,16 @@ file_zero (void *handle, uint32_t count, uint64_t offset, int may_trim) nbdkit_debug ("zero falling back to writing"); errno = EOPNOTSUPP; return -1; -} -/* Flush the file to disk. */ -static int -file_flush (void *handle) -{ - struct handle *h = handle; - - if (fdatasync (h->fd) == -1) { - nbdkit_error ("fdatasync: %m"); + out: + if (flags & NBDKIT_FLAG_FUA && file_flush (handle, 0) == -1) return -1; - } - return 0; } /* Punch a hole in the file. */ static int -file_trim (void *handle, uint32_t count, uint64_t offset) +file_trim (void *handle, uint32_t count, uint64_t offset, uint32_t flags) { #ifdef FALLOC_FL_PUNCH_HOLE struct handle *h = handle; @@ -470,6 +488,9 @@ file_trim (void *handle, uint32_t count, uint64_t offset) } #endif + if (flags & NBDKIT_FLAG_FUA && file_flush (handle, 0) == -1) + return -1; + return 0; } @@ -487,6 +508,7 @@ static struct nbdkit_plugin plugin = { .close = file_close, .get_size = file_get_size, .can_trim = file_can_trim, + .can_fua = file_can_fua, .pread = file_pread, .pwrite = file_pwrite, .flush = file_flush, -- 2.19.2
Richard W.M. Jones
2019-Jan-04 22:08 UTC
[Libguestfs] [PATCH nbdkit 7/7] file: Return NBD_FLAG_CAN_MULTI_CONN for the file plugin.
This allows multiple connections from a single client, and should be safe assuming flush/FUA has been implemented correctly in the previous commit. --- plugins/file/file.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/plugins/file/file.c b/plugins/file/file.c index e564787..25ed83a 100644 --- a/plugins/file/file.c +++ b/plugins/file/file.c @@ -250,6 +250,13 @@ file_get_size (void *handle) } } +/* Allow multiple parallel connections from a single client. */ +static int +file_can_multi_conn (void *handle) +{ + return 1; +} + static int file_can_trim (void *handle) { @@ -507,6 +514,7 @@ static struct nbdkit_plugin plugin = { .open = file_open, .close = file_close, .get_size = file_get_size, + .can_multi_conn = file_can_multi_conn, .can_trim = file_can_trim, .can_fua = file_can_fua, .pread = file_pread, -- 2.19.2
Eric Blake
2019-Jan-04 23:26 UTC
Re: [Libguestfs] [PATCH nbdkit 0/7] server: Implement NBD_FLAG_CAN_MULTI_CONN.
On 1/4/19 4:08 PM, Richard W.M. Jones wrote:> First thing to say is that I need to do a *lot* more testing on this, > so this is just an early peek. In particular, although it passed > ‘make check && make check-valgrind’ I have *not* tested it against a > multi-conn-aware client such as the Linux kernel >= 4.9. > > This implements NBD_FLAG_CAN_MULTI_CONN, described in the protocol doc > as: > > "NBD_FLAG_CAN_MULTI_CONN: Indicates that the server operates > entirely without cache, or that the cache it uses is shared among > all connections to the given device. In particular, if this flag is > present, then the effects of NBD_CMD_FLUSH and NBD_CMD_FLAG_FUA MUST > be visible across all connections when the server sends its reply to > that command to the client. In the absence of this flag, clients > SHOULD NOT multiplex their commands over more than one connection to > the export." > > This is necessary to support the Linux nbd client -C option.Sounds promising. And reminds me that I have not even tried to implement this option for qemu yet, but probably should.> > The only plugin which sets the flag so far is file. The ocaml, sh and > nbd plugins allow the flag to be controlled or passed through. > > Notable also is that the blocksize filter has to filter out this flag, > because I'm not convinced that the bounce buffer is safe. However I > believe the other filters *are* safe (although not really certain > about the fua filter, and the new cache filter is tricky too). > > My feeling is that we should set the flag unconditionally for all > readonly connections, but I can think of nasty corner cases where it > might not work. We should most probably set it in all plugins that > are readonly (eg. nbdkit-pattern-plugin).Should the callback include a bool readonly parameter that lets the filter/plugin know for sure whether it is answering the question on a read-only connection vs. on a read-write connection, as the answer may differ between those two? Should we automatically set the bit for any plugin that has fully-serialized operation but does not provide the callback? That is, if a plugin uses SERIALIZE_CONNECTIONS, you can never have multiple clients at the same time anyways (so the flag is trivially ignorable - a client can try to set up a second connection for speed, but it won't help). A plugin that uses SERIALIZE_ALL_REQUESTS is also trivially supported - any flush request will complete (regardless of which connected client made it) prior to any other client being able to make progress on a request that would need to see the results of the flush. Only when we get to SERIALIZE_REQUESTS or PARALLEL can we have the situation where one client's request can outpace the flush operation in another client. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Richard W.M. Jones
2019-Jan-05 07:42 UTC
Re: [Libguestfs] [PATCH nbdkit 0/7] server: Implement NBD_FLAG_CAN_MULTI_CONN.
On Fri, Jan 04, 2019 at 05:26:07PM -0600, Eric Blake wrote:> On 1/4/19 4:08 PM, Richard W.M. Jones wrote: > > First thing to say is that I need to do a *lot* more testing on this, > > so this is just an early peek. In particular, although it passed > > ‘make check && make check-valgrind’ I have *not* tested it against a > > multi-conn-aware client such as the Linux kernel >= 4.9. > > > > This implements NBD_FLAG_CAN_MULTI_CONN, described in the protocol doc > > as: > > > > "NBD_FLAG_CAN_MULTI_CONN: Indicates that the server operates > > entirely without cache, or that the cache it uses is shared among > > all connections to the given device. In particular, if this flag is > > present, then the effects of NBD_CMD_FLUSH and NBD_CMD_FLAG_FUA MUST > > be visible across all connections when the server sends its reply to > > that command to the client. In the absence of this flag, clients > > SHOULD NOT multiplex their commands over more than one connection to > > the export." > > > > This is necessary to support the Linux nbd client -C option. > > Sounds promising. And reminds me that I have not even tried to implement > this option for qemu yet, but probably should.I've yet to do testing, but it seems to be necessary if we're going to do scaling with the Linux client.> > The only plugin which sets the flag so far is file. The ocaml, sh and > > nbd plugins allow the flag to be controlled or passed through. > > > > Notable also is that the blocksize filter has to filter out this flag, > > because I'm not convinced that the bounce buffer is safe. However I > > believe the other filters *are* safe (although not really certain > > about the fua filter, and the new cache filter is tricky too). > > > > My feeling is that we should set the flag unconditionally for all > > readonly connections, but I can think of nasty corner cases where it > > might not work. We should most probably set it in all plugins that > > are readonly (eg. nbdkit-pattern-plugin). > > Should the callback include a bool readonly parameter that lets the > filter/plugin know for sure whether it is answering the question on a > read-only connection vs. on a read-write connection, as the answer may > differ between those two?Yes, I think this is a good idea. Will try that for v2.> Should we automatically set the bit for any plugin that has > fully-serialized operation but does not provide the callback? That is, > if a plugin uses SERIALIZE_CONNECTIONS, you can never have multiple > clients at the same time anyways (so the flag is trivially ignorable - a > client can try to set up a second connection for speed, but it won't > help). A plugin that uses SERIALIZE_ALL_REQUESTS is also trivially > supported - any flush request will complete (regardless of which > connected client made it) prior to any other client being able to make > progress on a request that would need to see the results of the flush. > Only when we get to SERIALIZE_REQUESTS or PARALLEL can we have the > situation where one client's request can outpace the flush operation in > another client.I believe the answer here is _no_: Imagine a plugin which serves different data on different connections, ie. complete different data, such as a "roulette" plugin which selects a different disk for each handle. Such a plugin is of course insane, and it's not something that any of _our_ plugins do, but it's something which is possible. Currently a roulette plugin works with well-behaved clients because they will only open a single connection and thus will always see a single view of the virtual disk. However if this plugin advertized NBD_FLAG_CAN_MULTI_CONN then clients could connect multiple times, and would see completely different data on each connection, which would break those clients horribly. Note this argument is not affected by the thread model or the readonly flag. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-top is 'top' for virtual machines. Tiny program with many powerful monitoring features, net stats, disk stats, logging, etc. http://people.redhat.com/~rjones/virt-top
Eric Blake
2019-May-15 18:31 UTC
Re: [Libguestfs] [PATCH nbdkit 1/7] server: Implement NBD_FLAG_CAN_MULTI_CONN.
On 1/4/19 4:08 PM, Richard W.M. Jones wrote:>>From the protocol document: > > "NBD_FLAG_CAN_MULTI_CONN: Indicates that the server operates > entirely without cache, or that the cache it uses is shared among > all connections to the given device. In particular, if this flag is > present, then the effects of NBD_CMD_FLUSH and NBD_CMD_FLAG_FUA MUST > be visible across all connections when the server sends its reply to > that command to the client. In the absence of this flag, clients > SHOULD NOT multiplex their commands over more than one connection to > the export." > > This is implemented in nbdkit by a boolean function which plugins may > implement if they believe they satisfy the requirements above. It can > also be passed through or modified by filters. It defaults to false, > which is the safe default. > --- > docs/nbdkit-filter.pod | 4 ++++ > docs/nbdkit-plugin.pod | 20 ++++++++++++++++++++ > docs/nbdkit-protocol.pod | 4 ++++ > include/nbdkit-filter.h | 3 +++ > include/nbdkit-plugin.h | 2 ++ > server/internal.h | 1 + > server/protocol.h | 1 + > server/connections.c | 9 +++++++++ > server/filters.c | 24 ++++++++++++++++++++++++ > server/plugins.c | 17 +++++++++++++++++ > tests/test-layers-filter.c | 10 ++++++++++ > tests/test-layers-plugin.c | 8 ++++++++ > tests/test-layers.c | 6 ++++++ > 13 files changed, 109 insertions(+)> +++ b/server/connections.c > @@ -84,6 +84,7 @@ struct connection { > bool can_trim; > bool can_zero; > bool can_fua; > + bool can_multi_conn; > bool using_tls; > > int sockin, sockout; > @@ -463,6 +464,14 @@ compute_eflags (struct connection *conn, uint16_t *flags) > conn->is_rotational = true; > } > > + fl = backend->can_multi_conn (backend, conn); > + if (fl == -1) > + return -1; > + if (fl) { > + eflags |= NBD_FLAG_CAN_MULTI_CONN; > + conn->can_multi_conn = true; > + }One more tweak: we should never advertise multi-conn if we are serializing connections (a client attempting a second connection will block on that connection, which defeats the point). I'm applying this followup: +++ b/server/protocol-handshake.c @@ -101,12 +101,16 @@ protocol_compute_eflags (struct connection *conn, uint16_t *flags) conn->is_rotational = true; } - fl = backend->can_multi_conn (backend, conn); - if (fl == -1) - return -1; - if (fl) { - eflags |= NBD_FLAG_CAN_MULTI_CONN; - conn->can_multi_conn = true; + /* multi-conn is useless if parallel connections are not allowed */ + if (backend->thread_model (backend) > + NBDKIT_THREAD_MODEL_SERIALIZE_CONNECTIONS) { + fl = backend->can_multi_conn (backend, conn); + if (fl == -1) + return -1; + if (fl) { + eflags |= NBD_FLAG_CAN_MULTI_CONN; + conn->can_multi_conn = true; + } } fl = backend->can_cache (backend, conn); -- 2.20.1 -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Apparently Analagous Threads
- [PATCH nbdkit v2 01/11] server: Implement NBD_FLAG_CAN_MULTI_CONN.
- [PATCH nbdkit] server: Implement extents/can_extents calls for plugins and filters.
- [PATCH nbdkit] server: Implement extents/can_extents calls for plugins and filters.
- [PATCH nbdkit] server: Implement extents/can_extents calls for plugins and filters.
- [nbdkit PATCH 0/9] can_FOO caching, more filter validation