Eric Blake
2019-May-16 03:57 UTC
[Libguestfs] [nbdkit PATCH v2 00/24] implement NBD_CMD_CACHE
Since v1:
- rework .can_cache to be tri-state, with default of no advertisement
  (ripple effect through other patches)
- add a lot more patches in order to round out filter support

And in the meantime, Rich pushed NBD_CMD_CACHE support into libnbd, so
in theory we now have a way to test cache commands through the entire
stack.

Eric Blake (24):
  server: Internal hooks for implementing NBD_CMD_CACHE
  plugins: Add .cache callback
  file, split: Implement .cache with posix_fadvise
  plugins: Implement .pread emulation cache
  plugins: Implement no-op .cache for in-memory plugins
  nbd: Implement NBD_CMD_CACHE passthrough
  sh: Implement .cache script callback
  ocaml: Implement .cache script callback
  plugins: Document lack of .cache in language bindings
  filters: Add .cache callback
  test-layers: Test .cache usage
  test-eflags: Test .can_cache support
  blocksize: Implement .cache rounding
  cache: Implement .cache
  cow: Implement .cache
  delay: Implement .cache
  error: Implement .cache
  log: Implement .cache
  offset, partition: Implement .cache
  readahead, xz: Implement .can_cache via emulation
  stats: Implement .cache
  truncate: Implement .cache
  filters: Pass through .can_cache for remaining filters
  nocache: Implement new filter

 docs/nbdkit-filter.pod | 42 +++++--
 docs/nbdkit-plugin.pod | 82 +++++++++++++
 docs/nbdkit-protocol.pod | 8 ++
 filters/cache/nbdkit-cache-filter.pod | 6 +-
 filters/cow/nbdkit-cow-filter.pod | 15 ++-
 filters/delay/nbdkit-delay-filter.pod | 9 +-
 filters/error/nbdkit-error-filter.pod | 8 +-
 filters/fua/nbdkit-fua-filter.pod | 5 +-
 filters/log/nbdkit-log-filter.pod | 10 +-
 filters/nocache/nbdkit-nocache-filter.pod | 69 +++++++++++
 filters/noextents/nbdkit-noextents-filter.pod | 2 +
 filters/nozero/nbdkit-nozero-filter.pod | 2 +
 plugins/lua/nbdkit-lua-plugin.pod | 3 +-
 plugins/perl/nbdkit-perl-plugin.pod | 2 +-
 plugins/python/nbdkit-python-plugin.pod | 2 +-
 plugins/ruby/nbdkit-ruby-plugin.pod | 2 +-
 plugins/sh/nbdkit-sh-plugin.pod | 27 ++++-
 plugins/tcl/nbdkit-tcl-plugin.pod | 3 +-
 configure.ac | 5 +-
 common/protocol/protocol.h | 2 +
 filters/cache/blk.h | 10 +-
 filters/cow/blk.h | 16 ++-
 include/nbdkit-common.h | 4 +
 include/nbdkit-filter.h | 8 ++
 include/nbdkit-plugin.h | 2 +
 server/internal.h | 5 +
 filters/blocksize/blocksize.c | 29 +++++
 filters/cache/blk.c | 50 +++++++-
 filters/cache/cache.c | 53 ++++++++
 filters/cow/blk.c | 41 ++++++-
 filters/cow/cow.c | 90 ++++++++++++++
 filters/delay/delay.c | 27 ++++-
 filters/error/error.c | 32 ++++-
 filters/log/log.c | 29 ++++-
 filters/nocache/nocache.c | 113 ++++++++++++++++++
 filters/offset/offset.c | 12 +-
 filters/partition/partition.c | 12 ++
 filters/readahead/readahead.c | 13 ++
 filters/stats/stats.c | 23 ++++
 filters/truncate/truncate.c | 24 ++++
 filters/xz/xz.c | 15 ++-
 plugins/data/data.c | 11 ++
 plugins/ext2/ext2.c | 10 +-
 plugins/file/file.c | 37 ++++++
 plugins/floppy/floppy.c | 11 +-
 plugins/full/full.c | 13 +-
 plugins/iso/iso.c | 10 +-
 plugins/linuxdisk/linuxdisk.c | 9 ++
 plugins/memory/memory.c | 11 ++
 plugins/nbd/nbd.c | 24 ++++
 plugins/null/null.c | 13 +-
 plugins/ocaml/ocaml.c | 51 ++++++++
 plugins/partitioning/partitioning.c | 11 +-
 plugins/pattern/pattern.c | 13 +-
 plugins/random/random.c | 13 +-
 plugins/sh/sh.c | 81 +++++++++++++
 plugins/split/split.c | 52 +++++++-
 plugins/zero/zero.c | 13 +-
 server/filters.c | 59 ++++++++-
 server/plugins.c | 44 +++++++
 server/protocol-handshake.c | 9 ++
 server/protocol.c | 26 ++++
 tests/test-layers-filter.c | 22 +++-
 tests/test-layers-plugin.c | 17 +++
 tests/test-layers.c | 36 ++++++
 filters/nocache/Makefile.am | 61 ++++++++++
 plugins/ocaml/NBDKit.ml | 16 ++-
 plugins/ocaml/NBDKit.mli | 5 +
 plugins/sh/example.sh | 7 ++
 tests/test-eflags.sh | 36 +++++-
 70 files changed, 1570 insertions(+), 63 deletions(-)
 create mode 100644 filters/nocache/nbdkit-nocache-filter.pod
 create mode 100644 filters/nocache/nocache.c
 create mode 100644 filters/nocache/Makefile.am

--
2.20.1
Eric Blake
2019-May-16 03:57 UTC
[Libguestfs] [nbdkit PATCH v2 01/24] server: Internal hooks for implementing NBD_CMD_CACHE
The NBD spec documents NBD_CMD_CACHE as an optional extension, although it is rather vague on what semantics are required (future spec additions may add flags to the cache command, where the use of a flag requires specific caching behavior or an error if that behavior is not possible). Unlike NBD_CMD_WRITE_ZEROES, we do not want to default to an emulation (calling .pread and ignoring the results works for some cases like local file systems, but actually penalizes other cases like network access); still, the code for doing a .pread and ignoring the result is common enough to warrant having .can_cache return a tri-state (similar to .can_fua).

This patch wires up the backend for the new entry points as well as the emulation handling, although a client cannot observe any difference until later patches actually expose the new callbacks for filters and plugins.

Note that the NBD spec states that some older clients call NBD_CMD_CACHE without first checking whether NBD_FLAG_SEND_CACHE is set; we choose to flag cache requests from such clients as invalid.

Also, for bisection reasons, this patch treats any use of a filter as a forced .can_cache of NBDKIT_CACHE_NONE rather than passing through the plugin's state. Once all affected filters are patched to handle cache requests correctly, a later patch will switch the filter default to passthrough for the sake of the remaining filters.
Signed-off-by: Eric Blake <eblake@redhat.com> --- docs/nbdkit-protocol.pod | 8 ++++++++ common/protocol/protocol.h | 2 ++ include/nbdkit-common.h | 4 ++++ server/internal.h | 5 +++++ server/filters.c | 33 ++++++++++++++++++++++++++++++++- server/plugins.c | 34 ++++++++++++++++++++++++++++++++++ server/protocol-handshake.c | 9 +++++++++ server/protocol.c | 26 ++++++++++++++++++++++++++ 8 files changed, 120 insertions(+), 1 deletion(-) diff --git a/docs/nbdkit-protocol.pod b/docs/nbdkit-protocol.pod index f706cfd..ad470bd 100644 --- a/docs/nbdkit-protocol.pod +++ b/docs/nbdkit-protocol.pod @@ -152,6 +152,14 @@ when structured replies are in effect. However, the flag is a no-op until we extend the plugin API to allow a fragmented read in the first place. +=item C<NBD_CMD_CACHE> + +Supported in nbdkit E<ge> 1.13.4. + +This protocol extension allows a client to inform the server about +intent to access a portion of the export, to allow the server an +opportunity to cache things appropriately. + =item Resize Extension I<Not supported>. diff --git a/common/protocol/protocol.h b/common/protocol/protocol.h index c27104c..e938643 100644 --- a/common/protocol/protocol.h +++ b/common/protocol/protocol.h @@ -95,6 +95,7 @@ extern const char *name_of_nbd_flag (int); #define NBD_FLAG_SEND_WRITE_ZEROES (1 << 6) #define NBD_FLAG_SEND_DF (1 << 7) #define NBD_FLAG_CAN_MULTI_CONN (1 << 8) +#define NBD_FLAG_SEND_CACHE (1 << 10) /* NBD options (new style handshake only). */ extern const char *name_of_nbd_opt (int); @@ -217,6 +218,7 @@ extern const char *name_of_nbd_cmd (int); #define NBD_CMD_DISC 2 /* Disconnect. 
*/ #define NBD_CMD_FLUSH 3 #define NBD_CMD_TRIM 4 +#define NBD_CMD_CACHE 5 #define NBD_CMD_WRITE_ZEROES 6 #define NBD_CMD_BLOCK_STATUS 7 diff --git a/include/nbdkit-common.h b/include/nbdkit-common.h index 636a789..5257d99 100644 --- a/include/nbdkit-common.h +++ b/include/nbdkit-common.h @@ -65,6 +65,10 @@ extern "C" { #define NBDKIT_FUA_EMULATE 1 #define NBDKIT_FUA_NATIVE 2 +#define NBDKIT_CACHE_NONE 0 +#define NBDKIT_CACHE_EMULATE 1 +#define NBDKIT_CACHE_NATIVE 2 + #define NBDKIT_EXTENT_HOLE (1<<0) /* Same as NBD_STATE_HOLE */ #define NBDKIT_EXTENT_ZERO (1<<1) /* Same as NBD_STATE_ZERO */ diff --git a/server/internal.h b/server/internal.h index 67fccfc..2ee5e23 100644 --- a/server/internal.h +++ b/server/internal.h @@ -170,6 +170,8 @@ struct connection { bool can_zero; bool can_fua; bool can_multi_conn; + bool can_cache; + bool emulate_cache; bool can_extents; bool using_tls; bool structured_replies; @@ -276,6 +278,7 @@ struct backend { int (*can_extents) (struct backend *, struct connection *conn); int (*can_fua) (struct backend *, struct connection *conn); int (*can_multi_conn) (struct backend *, struct connection *conn); + int (*can_cache) (struct backend *, struct connection *conn); int (*pread) (struct backend *, struct connection *conn, void *buf, uint32_t count, uint64_t offset, uint32_t flags, int *err); @@ -290,6 +293,8 @@ struct backend { int (*extents) (struct backend *, struct connection *conn, uint32_t count, uint64_t offset, uint32_t flags, struct nbdkit_extents *extents, int *err); + int (*cache) (struct backend *, struct connection *conn, uint32_t count, + uint64_t offset, uint32_t flags, int *err); }; /* plugins.c */ diff --git a/server/filters.c b/server/filters.c index b73e74f..e456fbf 100644 --- a/server/filters.c +++ b/server/filters.c @@ -1,5 +1,5 @@ /* nbdkit - * Copyright (C) 2013-2018 Red Hat Inc. + * Copyright (C) 2013-2019 Red Hat Inc. 
* * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are @@ -573,6 +573,18 @@ filter_can_multi_conn (struct backend *b, struct connection *conn) return f->backend.next->can_multi_conn (f->backend.next, conn); } +static int +filter_can_cache (struct backend *b, struct connection *conn) +{ + struct backend_filter *f = container_of (b, struct backend_filter, backend); + + debug ("%s: can_cache", f->name); + + /* FIXME: Default to f->backend.next->can_cache, once all filters + have been audited */ + return NBDKIT_CACHE_NONE; +} + static int filter_pread (struct backend *b, struct connection *conn, void *buf, uint32_t count, uint64_t offset, @@ -702,6 +714,23 @@ filter_extents (struct backend *b, struct connection *conn, extents, err); } +static int +filter_cache (struct backend *b, struct connection *conn, + uint32_t count, uint64_t offset, + uint32_t flags, int *err) +{ + struct backend_filter *f = container_of (b, struct backend_filter, backend); + + assert (flags == 0); + + debug ("%s: cache count=%" PRIu32 " offset=%" PRIu64 " flags=0x%" PRIx32, + f->name, count, offset, flags); + + /* FIXME: Allow filter to rewrite request */ + return f->backend.next->cache (f->backend.next, conn, + count, offset, flags, err); +} + static struct backend filter_functions = { .free = filter_free, .thread_model = filter_thread_model, @@ -726,12 +755,14 @@ static struct backend filter_functions = { .can_extents = filter_can_extents, .can_fua = filter_can_fua, .can_multi_conn = filter_can_multi_conn, + .can_cache = filter_can_cache, .pread = filter_pread, .pwrite = filter_pwrite, .flush = filter_flush, .trim = filter_trim, .zero = filter_zero, .extents = filter_extents, + .cache = filter_cache, }; /* Register and load a filter. 
*/ diff --git a/server/plugins.c b/server/plugins.c index e26d133..cb9a50c 100644 --- a/server/plugins.c +++ b/server/plugins.c @@ -448,6 +448,17 @@ plugin_can_multi_conn (struct backend *b, struct connection *conn) return 0; /* assume false */ } +static int +plugin_can_cache (struct backend *b, struct connection *conn) +{ + assert (connection_get_handle (conn, 0)); + + debug ("can_cache"); + + /* FIXME: return default based on plugin->cache */ + return NBDKIT_CACHE_NONE; +} + /* Plugins and filters can call this to set the true errno, in cases * where !errno_is_preserved. */ @@ -693,6 +704,27 @@ plugin_extents (struct backend *b, struct connection *conn, return r; } +static int +plugin_cache (struct backend *b, struct connection *conn, + uint32_t count, uint64_t offset, uint32_t flags, + int *err) +{ + struct backend_plugin *p = container_of (b, struct backend_plugin, backend); + int r = -1; + + assert (connection_get_handle (conn, 0)); + assert (!flags); + + debug ("cache count=%" PRIu32 " offset=%" PRIu64, count, offset); + + /* FIXME: assert plugin->cache and call it */ + assert (false); + + if (r == -1) + *err = get_error (p); + return r; +} + static struct backend plugin_functions = { .free = plugin_free, .thread_model = plugin_thread_model, @@ -717,12 +749,14 @@ static struct backend plugin_functions = { .can_extents = plugin_can_extents, .can_fua = plugin_can_fua, .can_multi_conn = plugin_can_multi_conn, + .can_cache = plugin_can_cache, .pread = plugin_pread, .pwrite = plugin_pwrite, .flush = plugin_flush, .trim = plugin_trim, .zero = plugin_zero, .extents = plugin_extents, + .cache = plugin_cache, }; /* Register and load a plugin. 
*/ diff --git a/server/protocol-handshake.c b/server/protocol-handshake.c index d8cde77..0f3bd28 100644 --- a/server/protocol-handshake.c +++ b/server/protocol-handshake.c @@ -113,6 +113,15 @@ protocol_compute_eflags (struct connection *conn, uint16_t *flags) } } + fl = backend->can_cache (backend, conn); + if (fl == -1) + return -1; + if (fl) { + eflags |= NBD_FLAG_SEND_CACHE; + conn->can_cache = true; + conn->emulate_cache = fl == NBDKIT_CACHE_EMULATE; + } + /* The result of this is not returned to callers here (or at any * time during the handshake). However it makes sense to do it once * per connection and store the result in the handle anyway. This diff --git a/server/protocol.c b/server/protocol.c index 01d4c71..792b1ac 100644 --- a/server/protocol.c +++ b/server/protocol.c @@ -76,6 +76,7 @@ validate_request (struct connection *conn, /* Validate cmd, offset, count. */ switch (cmd) { case NBD_CMD_READ: + case NBD_CMD_CACHE: case NBD_CMD_WRITE: case NBD_CMD_TRIM: case NBD_CMD_WRITE_ZEROES: @@ -180,6 +181,14 @@ validate_request (struct connection *conn, return false; } + /* Cache allowed? */ + if (!conn->can_cache && cmd == NBD_CMD_CACHE) { + nbdkit_error ("invalid request: %s: cache operation not supported", + name_of_nbd_cmd (cmd)); + *error = EINVAL; + return false; + } + /* Block status allowed? */ if (cmd == NBD_CMD_BLOCK_STATUS) { if (!conn->structured_replies) { @@ -254,6 +263,23 @@ handle_request (struct connection *conn, return err; break; + case NBD_CMD_CACHE: + if (conn->emulate_cache) { + static char buf[MAX_REQUEST_SIZE]; /* data sink, never read */ + + while (count) { + uint32_t limit = MIN (count, sizeof buf); + if (backend->pread (backend, conn, buf, limit, offset, flags, + &err) == -1) + return err; + count -= limit; + offset += limit; + } + } + else if (backend->cache (backend, conn, count, offset, 0, &err) == -1) + return err; + break; + case NBD_CMD_WRITE_ZEROES: if (!(flags & NBD_CMD_FLAG_NO_HOLE)) f |= NBDKIT_FLAG_MAY_TRIM; -- 2.20.1
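The emulation path described in the commit message (read the range through .pread and throw the data away) can be sketched in isolation as follows. This is only an illustration under stated assumptions: `pread_fn`, `emulate_cache`, `SINK_SIZE`, `struct span` and `record_pread` are hypothetical names that do not appear in nbdkit, and the sink here is far smaller than the server's MAX_REQUEST_SIZE buffer. The point it shows is that a chunked loop has to advance the offset along with the remaining count, otherwise every chunk re-reads the start of the range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a backend's .pread entry point; this is
 * not nbdkit's real backend type. */
typedef int (*pread_fn) (void *opaque, void *buf, uint32_t count,
                         uint64_t offset);

/* Illustrative sink size only; the patch uses MAX_REQUEST_SIZE. */
#define SINK_SIZE 4096

/* Emulate a cache request by reading the range in chunks and
 * discarding the data.  Returns 0 on success, -1 if any read fails. */
static int
emulate_cache (pread_fn pread_cb, void *opaque,
               uint32_t count, uint64_t offset)
{
  static char sink[SINK_SIZE];  /* data sink, never read back */

  while (count > 0) {
    uint32_t limit = count < sizeof sink ? count : (uint32_t) sizeof sink;

    if (pread_cb (opaque, sink, limit, offset) == -1)
      return -1;
    count -= limit;
    offset += limit;            /* move on to the next chunk */
  }
  return 0;
}

/* Hypothetical helper for experiments: records which bytes were read. */
struct span {
  uint64_t first;               /* offset of the first read */
  uint64_t end;                 /* one past the last byte read */
  unsigned calls;               /* number of reads issued */
};

static int
record_pread (void *opaque, void *buf, uint32_t count, uint64_t offset)
{
  struct span *s = opaque;

  (void) buf;
  assert (count > 0);
  if (s->calls == 0)
    s->first = offset;
  s->end = offset + count;
  s->calls++;
  return 0;
}
```

Passing `record_pread` with a zeroed `struct span` to `emulate_cache` is one way to check that the reads cover the requested range exactly once.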
Eric Blake
2019-May-16 03:57 UTC
[Libguestfs] [nbdkit PATCH v2 02/24] plugins: Add .cache callback
Make it possible for plugins to advertise caching support, and to implement their own caching algorithm. There are no universally common cache algorithms (calling .pread and ignoring the buffer works great for local files, but actually penalizes remote network access), so the code intentionally defaults .can_cache to NONE if .cache is missing. However, as both .pread (files) and a silent no-op (for plugins serving data that only resides in memory already) are common implementations, we allow .can_cache to return success even when .cache is missing (which is different from many of the other callbacks, where .can_FOO should not return true unless .FOO is present). On the other hand, if only .cache is present, defaulting .can_cache to NATIVE makes sense. Thus, .can_cache is a tri-state similar to .can_fua. Signed-off-by: Eric Blake <eblake@redhat.com> --- docs/nbdkit-plugin.pod | 82 +++++++++++++++++++++++++++++++++++++++++ include/nbdkit-plugin.h | 2 + server/plugins.c | 18 +++++++-- 3 files changed, 98 insertions(+), 4 deletions(-) diff --git a/docs/nbdkit-plugin.pod b/docs/nbdkit-plugin.pod index 318be4c..7f83234 100644 --- a/docs/nbdkit-plugin.pod +++ b/docs/nbdkit-plugin.pod @@ -222,6 +222,37 @@ landed in persistent storage. =back +The following defines are valid as successful return values for +C<.can_cache>: + +=over 4 + +=item C<NBDKIT_CACHE_NONE> + +The server does not advertise caching support, and rejects any +client-requested caching. Any C<.cache> callback is ignored. + +=item C<NBDKIT_CACHE_EMULATE> + +The nbdkit server advertises cache support to the client, where the +client may request that the server cache a region of the export to +potentially speed up future read and/or write operations on that +region. The nbdkit server implements the caching by calling C<.pread> +and ignoring the results. This option exists to ease the +implementation of a common form of caching; any C<.cache> callback is +ignored. 
+ +=item C<NBDKIT_CACHE_NATIVE> + +The nbdkit server advertises cache support to the client, where the +client may request that the server cache a region of the export to +potentially speed up future read and/or write operations on that +region. The nbdkit server calls the C<.cache> callback to perform the +caching; if that callback is missing, the client's cache request +succeeds without doing anything. + +=back + =head1 ERROR HANDLING If there is an error in the plugin, the plugin should call @@ -620,6 +651,29 @@ with an error message and return C<-1>. This callback is not required. If omitted, then we return false. +=head2 C<.can_cache> + + int can_cache (void *handle); + +This is called during the option negotiation phase to find out if the +plugin supports a cache operation. The nature of the caching is +unspecified (including whether there are limits on how much can be +cached at once, and whether writes to a cached region have +write-through or write-back semantics), but the command exists to let +clients issue a hint to the server that they will be accessing that +region of the export. + +If this returns C<NBDKIT_CACHE_NONE>, cache support is not advertised +to the guest; if this returns C<NBDKIT_CACHE_EMULATE>, caching is +emulated by the server calling C<.pread> and ignoring the results; if +this returns C<NBDKIT_CACHE_NATIVE>, then the C<.cache> callback will +be used. If there is an error, C<.can_cache> should call +C<nbdkit_error> with an error message and return C<-1>. + +This callback is not required. If omitted, then we return +C<NBDKIT_CACHE_NONE> if the C<.cache> callback is missing, or +C<NBDKIT_CACHE_NATIVE> if it is defined. + =head2 C<.pread> int pread (void *handle, void *buf, uint32_t count, uint64_t offset, @@ -814,6 +868,34 @@ C<nbdkit_extent_add> returns C<0> on success or C<-1> on failure. On failure C<nbdkit_error> and/or C<nbdkit_set_error> has already been called. C<errno> will be set to a suitable value. 
+=head2 C<.cache> + + int cache (void *handle, uint32_t count, uint64_t offset, uint32_t flags); + +During the data serving phase, this callback is used to give the +plugin a hint that the client intends to make further accesses to the +given region of the export. The nature of caching is not specified +further by the NBD specification (for example, a server may place +limits on how much may be cached at once, and there is no way to +control if writes to a cached area have write-through or write-back +semantics). In fact, the cache command can always fail and still be +compliant, and success might not guarantee a performance gain. If +this callback is omitted, then the results of C<.can_cache> determine +whether nbdkit will reject cache requests, treat them as instant +success, or emulate caching by calling C<.pread> over the same region +and ignoring the results. + +This function will not be called if C<.can_cache> did not return +C<NBDKIT_CACHE_NATIVE>. The parameter C<flags> exists in case of +future NBD protocol extensions; at this time, it will be 0 on input. A +plugin must fail this function if C<flags> includes an unrecognized +flag, as that may indicate a requirement that the plugin must comply +with a specific caching semantic. + +If there is an error, C<.cache> should call C<nbdkit_error> with an +error message, and C<nbdkit_set_error> to record an appropriate error +(unless C<errno> is sufficient), then return C<-1>. 
+ =head1 THREADS Each nbdkit plugin must declare its thread safety model by defining diff --git a/include/nbdkit-plugin.h b/include/nbdkit-plugin.h index 54b4ce2..e9b1808 100644 --- a/include/nbdkit-plugin.h +++ b/include/nbdkit-plugin.h @@ -128,6 +128,8 @@ struct nbdkit_plugin { int (*can_extents) (void *handle); int (*extents) (void *handle, uint32_t count, uint64_t offset, uint32_t flags, struct nbdkit_extents *extents); + int (*can_cache) (void *handle); + int (*cache) (void *handle, uint32_t count, uint64_t offset, uint32_t flags); }; extern void nbdkit_set_error (int err); diff --git a/server/plugins.c b/server/plugins.c index cb9a50c..f293e6a 100644 --- a/server/plugins.c +++ b/server/plugins.c @@ -201,6 +201,8 @@ plugin_dump_fields (struct backend *b) HAS (can_multi_conn); HAS (can_extents); HAS (extents); + HAS (can_cache); + HAS (cache); #undef HAS /* Custom fields. */ @@ -451,11 +453,16 @@ plugin_can_multi_conn (struct backend *b, struct connection *conn) static int plugin_can_cache (struct backend *b, struct connection *conn) { + struct backend_plugin *p = container_of (b, struct backend_plugin, backend); + assert (connection_get_handle (conn, 0)); debug ("can_cache"); - /* FIXME: return default based on plugin->cache */ + if (p->plugin.can_cache) + return p->plugin.can_cache (connection_get_handle (conn, 0)); + if (p->plugin.cache) + return NBDKIT_CACHE_NATIVE; return NBDKIT_CACHE_NONE; } @@ -710,16 +717,19 @@ plugin_cache (struct backend *b, struct connection *conn, int *err) { struct backend_plugin *p = container_of (b, struct backend_plugin, backend); - int r = -1; + int r; assert (connection_get_handle (conn, 0)); assert (!flags); debug ("cache count=%" PRIu32 " offset=%" PRIu64, count, offset); - /* FIXME: assert plugin->cache and call it */ - assert (false); + /* A plugin may advertise caching but not provide .cache; in that + * case, caching is explicitly a no-op. 
*/ + if (!p->plugin.cache) + return 0; + r = p->plugin.cache (connection_get_handle (conn, 0), count, offset, flags); if (r == -1) *err = get_error (p); return r; -- 2.20.1
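The default rules in this patch are easy to lose track of, so here is a small standalone sketch of them. It is not the server code: `struct mini_plugin`, `resolve_can_cache`, `dispatch_cache` and the two example callbacks are hypothetical names made up for illustration, and only the three NBDKIT_CACHE_* values are taken from the series. It shows the two behaviours the commit message calls out: a missing .can_cache defaults from the presence of .cache, and a plugin may advertise NATIVE without .cache to get an explicit no-op.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as added to include/nbdkit-common.h by this series. */
#define NBDKIT_CACHE_NONE    0
#define NBDKIT_CACHE_EMULATE 1
#define NBDKIT_CACHE_NATIVE  2

/* Hypothetical, trimmed-down stand-in for struct nbdkit_plugin. */
struct mini_plugin {
  int (*can_cache) (void *handle);
  int (*cache) (void *handle, uint32_t count, uint64_t offset,
                uint32_t flags);
};

/* Resolve the tri-state: an explicit .can_cache wins; otherwise the
 * presence of .cache selects NATIVE, and its absence selects NONE. */
static int
resolve_can_cache (const struct mini_plugin *p, void *handle)
{
  if (p->can_cache)
    return p->can_cache (handle);
  return p->cache ? NBDKIT_CACHE_NATIVE : NBDKIT_CACHE_NONE;
}

/* Dispatch a cache request after NATIVE was negotiated.  A plugin that
 * advertised caching without .cache gets an explicit no-op. */
static int
dispatch_cache (const struct mini_plugin *p, void *handle,
                uint32_t count, uint64_t offset)
{
  if (!p->cache)
    return 0;
  return p->cache (handle, count, offset, 0);
}

/* Example callbacks, for illustration only. */
static int
always_native (void *handle)
{
  (void) handle;
  return NBDKIT_CACHE_NATIVE;
}

static int
counting_cache (void *handle, uint32_t count, uint64_t offset,
                uint32_t flags)
{
  unsigned *calls = handle;

  (void) count;
  (void) offset;
  if (flags != 0)
    return -1;                  /* unknown flags must be rejected */
  (*calls)++;
  return 0;
}
```

An in-memory plugin corresponds to `{ always_native, NULL }` here, while a plugin that only defines .cache corresponds to `{ NULL, counting_cache }`.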
Eric Blake
2019-May-16 03:57 UTC
[Libguestfs] [nbdkit PATCH v2 03/24] file, split: Implement .cache with posix_fadvise
Since NBD_CMD_CACHE is already advisory, let's use an advisory kernel interface to implement it ;) Even when posix_fadvise() is not present, it is likely that nbdkit's fallback to .pread will actually have a similar benefit in populating the filesystem cache, since we aren't using O_DIRECT to avoid that cache, so always define .can_cache with one of the two positive results. Signed-off-by: Eric Blake <eblake@redhat.com> --- configure.ac | 3 ++- plugins/file/file.c | 37 ++++++++++++++++++++++++++++++ plugins/split/split.c | 52 ++++++++++++++++++++++++++++++++++++++++++- 3 files changed, 90 insertions(+), 2 deletions(-) diff --git a/configure.ac b/configure.ac index 58031f3..06124c5 100644 --- a/configure.ac +++ b/configure.ac @@ -195,7 +195,8 @@ dnl Check for functions in libc, all optional. AC_CHECK_FUNCS([\ fdatasync \ get_current_dir_name \ - mkostemp]) + mkostemp \ + posix_fadvise]) dnl Check whether printf("%m") works AC_CACHE_CHECK([whether the printf family supports %m], diff --git a/plugins/file/file.c b/plugins/file/file.c index f0ac23b..4d4bcba 100644 --- a/plugins/file/file.c +++ b/plugins/file/file.c @@ -294,6 +294,20 @@ file_can_fua (void *handle) return NBDKIT_FUA_NATIVE; } +static int +file_can_cache (void *handle) +{ + /* Prefer posix_fadvise(), but letting nbdkit call .pread on our + * behalf also tends to work well for the local file system + * cache. + */ +#if HAVE_POSIX_FADVISE + return NBDKIT_CACHE_NATIVE; +#else + return NBDKIT_CACHE_EMULATE; +#endif +} + /* Flush the file to disk. */ static int file_flush (void *handle, uint32_t flags) @@ -608,6 +622,25 @@ file_extents (void *handle, uint32_t count, uint64_t offset, } #endif /* SEEK_HOLE */ +#if HAVE_POSIX_FADVISE +/* Caching. 
*/ +static int +file_cache (void *handle, uint32_t count, uint64_t offset, uint32_t flags) +{ + struct handle *h = handle; + int r; + + /* Cache is advisory, we don't care if this fails */ + r = posix_fadvise (h->fd, offset, count, POSIX_FADV_WILLNEED); + if (r) { + errno = r; + nbdkit_error ("posix_fadvise: %m"); + return -1; + } + return 0; +} +#endif /* HAVE_POSIX_FADVISE */ + static struct nbdkit_plugin plugin = { .name = "file", .longname = "nbdkit file plugin", @@ -624,6 +657,7 @@ static struct nbdkit_plugin plugin = { .can_multi_conn = file_can_multi_conn, .can_trim = file_can_trim, .can_fua = file_can_fua, + .can_cache = file_can_cache, .pread = file_pread, .pwrite = file_pwrite, .flush = file_flush, @@ -632,6 +666,9 @@ static struct nbdkit_plugin plugin = { #ifdef SEEK_HOLE .can_extents = file_can_extents, .extents = file_extents, +#endif +#if HAVE_POSIX_FADVISE + .cache = file_cache, #endif .errno_is_preserved = 1, }; diff --git a/plugins/split/split.c b/plugins/split/split.c index cf2b2c7..1b8e69a 100644 --- a/plugins/split/split.c +++ b/plugins/split/split.c @@ -1,5 +1,5 @@ /* nbdkit - * Copyright (C) 2017-2018 Red Hat Inc. + * Copyright (C) 2017-2019 Red Hat Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are @@ -195,6 +195,20 @@ split_get_size (void *handle) return (int64_t) h->size; } +static int +split_can_cache (void *handle) +{ + /* Prefer posix_fadvise(), but letting nbdkit call .pread on our + * behalf also tends to work well for the local file system + * cache. + */ +#if HAVE_POSIX_FADVISE + return NBDKIT_CACHE_NATIVE; +#else + return NBDKIT_CACHE_EMULATE; +#endif +} + /* Helper function to map the offset to the correct file. */ static int compare_offset (const void *offsetp, const void *filep) @@ -277,6 +291,38 @@ split_pwrite (void *handle, const void *buf, uint32_t count, uint64_t offset) return 0; } +#if HAVE_POSIX_FADVISE +/* Caching. 
*/ +static int +split_cache (void *handle, uint32_t count, uint64_t offset, uint32_t flags) +{ + struct handle *h = handle; + + /* Cache is advisory, we don't care if this fails */ + while (count > 0) { + struct file *file = get_file (h, offset); + uint64_t foffs = offset - file->offset; + uint64_t max; + int r; + + max = file->size - foffs; + if (max > count) + max = count; + + r = posix_fadvise (file->fd, foffs, max, POSIX_FADV_WILLNEED); + if (r) { + errno = r; + nbdkit_error ("posix_fadvise: %m"); + return -1; + } + count -= max; + offset += max; + } + + return 0; +} +#endif /* HAVE_POSIX_FADVISE */ + static struct nbdkit_plugin plugin = { .name = "split", .version = PACKAGE_VERSION, @@ -287,8 +333,12 @@ static struct nbdkit_plugin plugin = { .open = split_open, .close = split_close, .get_size = split_get_size, + .can_cache = split_can_cache, .pread = split_pread, .pwrite = split_pwrite, +#if HAVE_POSIX_FADVISE + .cache = split_cache, +#endif /* In this plugin, errno is preserved properly along error return * paths from failed system calls. */ -- 2.20.1
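For a plugin that concatenates several files, a cache request has to be cut at file boundaries and each piece expressed as an offset relative to its own file. The following is a rough standalone sketch of that bookkeeping, assuming the files are sorted and contiguous from offset 0; `struct piece`, `advise_fn`, `for_each_piece`, `struct advice_log` and `record_advice` are hypothetical names, not part of the split plugin, and a linear search replaces the plugin's lookup helper for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one backing file in a concatenated
 * export; files are assumed sorted and contiguous from offset 0. */
struct piece {
  uint64_t start;               /* offset of this file in the export */
  uint64_t size;                /* length of this file in bytes */
};

/* Called once per file touched; file_offset is relative to that file. */
typedef int (*advise_fn) (void *opaque, size_t index,
                          uint64_t file_offset, uint64_t length);

static int
for_each_piece (const struct piece *pieces, size_t n,
                uint64_t offset, uint32_t count,
                advise_fn advise, void *opaque)
{
  size_t i = 0;

  while (count > 0) {
    uint64_t foffs, max;

    /* Linear search for brevity; skip files that end before offset. */
    while (i < n && offset >= pieces[i].start + pieces[i].size)
      i++;
    if (i == n)
      return -1;                /* range runs past the end of the export */

    foffs = offset - pieces[i].start;   /* offset within this file */
    max = pieces[i].size - foffs;       /* bytes left in this file */
    if (max > count)
      max = count;

    if (advise (opaque, i, foffs, max) == -1)
      return -1;
    count -= (uint32_t) max;    /* advance by the bytes just handled */
    offset += max;
  }
  return 0;
}

/* Hypothetical recorder used to inspect the pieces that were visited. */
struct advice_log {
  size_t n;
  size_t index[4];
  uint64_t foffs[4];
  uint64_t len[4];
};

static int
record_advice (void *opaque, size_t index, uint64_t file_offset,
               uint64_t length)
{
  struct advice_log *log = opaque;

  if (log->n >= 4)
    return -1;
  log->index[log->n] = index;
  log->foffs[log->n] = file_offset;
  log->len[log->n] = length;
  log->n++;
  return 0;
}
```

In a real plugin the callback would presumably wrap posix_fadvise (fd, file_offset, length, POSIX_FADV_WILLNEED). Note that posix_fadvise reports failure by returning an error number rather than by setting errno, so the loop must advance by the chunk length and not by the call's return value.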
Eric Blake
2019-May-16 03:57 UTC
[Libguestfs] [nbdkit PATCH v2 04/24] plugins: Implement .pread emulation cache
For our plugins which are reading from one or more local files, calling .pread is likely to populate the kernel's file cache to our advantage; these plugins are complicated enough that there is nothing better like posix_fadvise() that we can try. Implementing .can_cache is sufficient to let nbdkit do the desired work on our behalf. Full list of plugins changed: ext2, floppy, iso, linuxdisk, partitioning Note that the tar plugin would likewise probably benefit from .pread treatment; but for that, we'd first have to wire up .can_cache to the perl language binding. Signed-off-by: Eric Blake <eblake@redhat.com> --- plugins/ext2/ext2.c | 10 +++++++++- plugins/floppy/floppy.c | 11 ++++++++++- plugins/iso/iso.c | 10 +++++++++- plugins/linuxdisk/linuxdisk.c | 9 +++++++++ plugins/partitioning/partitioning.c | 11 ++++++++++- 5 files changed, 47 insertions(+), 4 deletions(-) diff --git a/plugins/ext2/ext2.c b/plugins/ext2/ext2.c index 17f88fe..6698d99 100644 --- a/plugins/ext2/ext2.c +++ b/plugins/ext2/ext2.c @@ -1,5 +1,5 @@ /* nbdkit - * Copyright (C) 2017-2018 Red Hat Inc. + * Copyright (C) 2017-2019 Red Hat Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are @@ -212,6 +212,13 @@ ext2_can_fua (void *handle) return NBDKIT_FUA_NATIVE; } +static int +ext2_can_cache (void *handle) +{ + /* Let nbdkit call pread to populate the file system cache. */ + return NBDKIT_CACHE_EMULATE; +} + /* It might be possible to relax this, but it's complicated. 
* * It's desirable for ‘nbdkit -r’ to behave the same way as @@ -345,6 +352,7 @@ static struct nbdkit_plugin plugin = { .open = ext2_open, .close = ext2_close, .can_fua = ext2_can_fua, + .can_cache = ext2_can_cache, .get_size = ext2_get_size, .pread = ext2_pread, .pwrite = ext2_pwrite, diff --git a/plugins/floppy/floppy.c b/plugins/floppy/floppy.c index ebdea5b..41a2364 100644 --- a/plugins/floppy/floppy.c +++ b/plugins/floppy/floppy.c @@ -1,5 +1,5 @@ /* nbdkit - * Copyright (C) 2018 Red Hat Inc. + * Copyright (C) 2018-2019 Red Hat Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are @@ -128,6 +128,14 @@ floppy_can_multi_conn (void *handle) return 1; } +/* Cache. */ +static int +floppy_can_cache (void *handle) +{ + /* Let nbdkit call pread to populate the file system cache. */ + return NBDKIT_CACHE_EMULATE; +} + /* Read data from the file. */ static int floppy_pread (void *handle, void *buf, uint32_t count, uint64_t offset) @@ -199,6 +207,7 @@ static struct nbdkit_plugin plugin = { .open = floppy_open, .get_size = floppy_get_size, .can_multi_conn = floppy_can_multi_conn, + .can_cache = floppy_can_cache, .pread = floppy_pread, .errno_is_preserved = 1, }; diff --git a/plugins/iso/iso.c b/plugins/iso/iso.c index 586f1f9..4728ff3 100644 --- a/plugins/iso/iso.c +++ b/plugins/iso/iso.c @@ -1,5 +1,5 @@ /* nbdkit - * Copyright (C) 2018 Red Hat Inc. + * Copyright (C) 2018-2019 Red Hat Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are @@ -215,6 +215,13 @@ iso_can_multi_conn (void *handle) return 1; } +static int +iso_can_cache (void *handle) +{ + /* Let nbdkit call pread to populate the file system cache. */ + return NBDKIT_CACHE_EMULATE; +} + /* Read data from the file. 
*/ static int iso_pread (void *handle, void *buf, uint32_t count, uint64_t offset) @@ -249,6 +256,7 @@ static struct nbdkit_plugin plugin = { .open = iso_open, .get_size = iso_get_size, .can_multi_conn = iso_can_multi_conn, + .can_cache = iso_can_cache, .pread = iso_pread, .errno_is_preserved = 1, }; diff --git a/plugins/linuxdisk/linuxdisk.c b/plugins/linuxdisk/linuxdisk.c index 1ba7114..99dbc99 100644 --- a/plugins/linuxdisk/linuxdisk.c +++ b/plugins/linuxdisk/linuxdisk.c @@ -159,6 +159,14 @@ linuxdisk_can_multi_conn (void *handle) return 1; } +/* Cache. */ +static int +linuxdisk_can_cache (void *handle) +{ + /* Let nbdkit call pread to populate the file system cache. */ + return NBDKIT_CACHE_EMULATE; +} + /* Read data from the virtual disk. */ static int linuxdisk_pread (void *handle, void *buf, uint32_t count, uint64_t offset, @@ -221,6 +229,7 @@ static struct nbdkit_plugin plugin = { .open = linuxdisk_open, .get_size = linuxdisk_get_size, .can_multi_conn = linuxdisk_can_multi_conn, + .can_cache = linuxdisk_can_cache, .pread = linuxdisk_pread, .errno_is_preserved = 1, }; diff --git a/plugins/partitioning/partitioning.c b/plugins/partitioning/partitioning.c index 630c6d2..90333bf 100644 --- a/plugins/partitioning/partitioning.c +++ b/plugins/partitioning/partitioning.c @@ -1,5 +1,5 @@ /* nbdkit - * Copyright (C) 2018 Red Hat Inc. + * Copyright (C) 2018-2019 Red Hat Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are @@ -297,6 +297,14 @@ partitioning_can_multi_conn (void *handle) return 1; } +/* Cache. */ +static int +partitioning_can_cache (void *handle) +{ + /* Let nbdkit call pread to populate the file system cache. */ + return NBDKIT_CACHE_EMULATE; +} + /* Read data. 
*/ static int partitioning_pread (void *handle, void *buf, uint32_t count, uint64_t offset) @@ -426,6 +434,7 @@ static struct nbdkit_plugin plugin = { .open = partitioning_open, .get_size = partitioning_get_size, .can_multi_conn = partitioning_can_multi_conn, + .can_cache = partitioning_can_cache, .pread = partitioning_pread, .pwrite = partitioning_pwrite, .flush = partitioning_flush, -- 2.20.1
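To illustrate what returning NBDKIT_CACHE_EMULATE asks of the server, here is a minimal sketch of a cache request serviced by reading the range into a scratch buffer and discarding the data. This is not nbdkit's actual server code: emulate_cache, demo_pread, struct demo_handle, and the 64 KiB chunk size are hypothetical names and values chosen only for the example.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical chunk size; the real server may use a different limit. */
#define DEMO_CHUNK 65536

/* Hypothetical stand-in for a plugin's .pread callback. */
typedef int (*demo_pread_fn) (void *handle, void *buf,
                              uint32_t count, uint64_t offset);

/* Hypothetical handle that only records how it was called. */
struct demo_handle {
  uint64_t bytes_read;
  unsigned calls;
};

/* Stub pread: pretend to read, and record the request. */
static int
demo_pread (void *handle, void *buf, uint32_t count, uint64_t offset)
{
  struct demo_handle *h = handle;

  (void) buf;
  (void) offset;
  h->bytes_read += count;
  h->calls++;
  return 0;
}

/* Emulate a cache request by reading the range in chunks and
 * throwing the data away.  The useful side effect is that lower
 * layers (such as the kernel page cache) have now seen the data.
 * The static scratch buffer keeps the sketch short; it is not
 * thread-safe.
 */
static int
emulate_cache (void *handle, demo_pread_fn pread_fn,
               uint32_t count, uint64_t offset)
{
  static char scratch[DEMO_CHUNK];

  while (count > 0) {
    uint32_t n = count < DEMO_CHUNK ? count : DEMO_CHUNK;

    if (pread_fn (handle, scratch, n, offset) == -1)
      return -1;
    count -= n;
    offset += n;
  }
  return 0;
}
```

With the stub above, a 150000-byte request is split into three reads, and the plugin itself needs no code beyond its existing .pread.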
Eric Blake
2019-May-16 03:57 UTC
[Libguestfs] [nbdkit PATCH v2 05/24] plugins: Implement no-op .cache for in-memory plugins
For our plugins which have no backing file but generate everything on the fly or store things in memory, there's really nothing to speed up, and also no reason why we can't advertise caching to the client. Implement .can_cache but not .cache to get nbdkit to do the work on our behalf. Full list of plugins changed: data, full, memory, null, pattern, random, zero Signed-off-by: Eric Blake <eblake@redhat.com> --- plugins/data/data.c | 11 +++++++++++ plugins/full/full.c | 13 ++++++++++++- plugins/memory/memory.c | 11 +++++++++++ plugins/null/null.c | 13 ++++++++++++- plugins/pattern/pattern.c | 13 ++++++++++++- plugins/random/random.c | 13 ++++++++++++- plugins/zero/zero.c | 13 ++++++++++++- 7 files changed, 82 insertions(+), 5 deletions(-) diff --git a/plugins/data/data.c b/plugins/data/data.c index 55380c6..b0e08cb 100644 --- a/plugins/data/data.c +++ b/plugins/data/data.c @@ -333,6 +333,16 @@ data_can_multi_conn (void *handle) return 1; } +/* Cache. */ +static int +data_can_cache (void *handle) +{ + /* Everything is already in memory, returning this without + * implementing .cache lets nbdkit do the correct no-op. + */ + return NBDKIT_CACHE_NATIVE; +} + /* Read data. */ static int data_pread (void *handle, void *buf, uint32_t count, uint64_t offset) @@ -389,6 +399,7 @@ static struct nbdkit_plugin plugin = { .open = data_open, .get_size = data_get_size, .can_multi_conn = data_can_multi_conn, + .can_cache = data_can_cache, .pread = data_pread, .pwrite = data_pwrite, .zero = data_zero, diff --git a/plugins/full/full.c b/plugins/full/full.c index 7661856..65b8259 100644 --- a/plugins/full/full.c +++ b/plugins/full/full.c @@ -1,5 +1,5 @@ /* nbdkit - * Copyright (C) 2017-2018 Red Hat Inc. + * Copyright (C) 2017-2019 Red Hat Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are @@ -94,6 +94,16 @@ full_get_size (void *handle) return size; } +/* Cache. 
*/ +static int +full_can_cache (void *handle) +{ + /* Everything is already in memory, returning this without + * implementing .cache lets nbdkit do the correct no-op. + */ + return NBDKIT_CACHE_NATIVE; +} + /* Read data. */ static int full_pread (void *handle, void *buf, uint32_t count, uint64_t offset, @@ -151,6 +161,7 @@ static struct nbdkit_plugin plugin = { .magic_config_key = "size", .open = full_open, .get_size = full_get_size, + .can_cache = full_can_cache, .pread = full_pread, .pwrite = full_pwrite, .zero = full_zero, diff --git a/plugins/memory/memory.c b/plugins/memory/memory.c index 90fa99e..234d414 100644 --- a/plugins/memory/memory.c +++ b/plugins/memory/memory.c @@ -128,6 +128,16 @@ memory_can_multi_conn (void *handle) return 1; } +/* Cache. */ +static int +memory_can_cache (void *handle) +{ + /* Everything is already in memory, returning this without + * implementing .cache lets nbdkit do the correct no-op. + */ + return NBDKIT_CACHE_NATIVE; +} + /* Read data. */ static int memory_pread (void *handle, void *buf, uint32_t count, uint64_t offset) @@ -184,6 +194,7 @@ static struct nbdkit_plugin plugin = { .open = memory_open, .get_size = memory_get_size, .can_multi_conn = memory_can_multi_conn, + .can_cache = memory_can_cache, .pread = memory_pread, .pwrite = memory_pwrite, .zero = memory_zero, diff --git a/plugins/null/null.c b/plugins/null/null.c index 518b63b..5af7ab9 100644 --- a/plugins/null/null.c +++ b/plugins/null/null.c @@ -1,5 +1,5 @@ /* nbdkit - * Copyright (C) 2017-2018 Red Hat Inc. + * Copyright (C) 2017-2019 Red Hat Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are @@ -83,6 +83,16 @@ null_get_size (void *handle) return size; } +/* Cache. */ +static int +null_can_cache (void *handle) +{ + /* Everything is already in memory, returning this without + * implementing .cache lets nbdkit do the correct no-op. 
+ */ + return NBDKIT_CACHE_NATIVE; +} + /* Read data. */ static int null_pread (void *handle, void *buf, uint32_t count, uint64_t offset, @@ -148,6 +158,7 @@ static struct nbdkit_plugin plugin = { .magic_config_key = "size", .open = null_open, .get_size = null_get_size, + .can_cache = null_can_cache, .pread = null_pread, .pwrite = null_pwrite, .zero = null_zero, diff --git a/plugins/pattern/pattern.c b/plugins/pattern/pattern.c index 115bd96..6b9b3a0 100644 --- a/plugins/pattern/pattern.c +++ b/plugins/pattern/pattern.c @@ -1,5 +1,5 @@ /* nbdkit - * Copyright (C) 2017-2018 Red Hat Inc. + * Copyright (C) 2017-2019 Red Hat Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are @@ -94,6 +94,16 @@ pattern_can_multi_conn (void *handle) return 1; } +/* Cache. */ +static int +pattern_can_cache (void *handle) +{ + /* Everything is already in memory, returning this without + * implementing .cache lets nbdkit do the correct no-op. + */ + return NBDKIT_CACHE_NATIVE; +} + /* Read data. */ static int pattern_pread (void *handle, void *buf, uint32_t count, uint64_t offset, @@ -126,6 +136,7 @@ static struct nbdkit_plugin plugin = { .open = pattern_open, .get_size = pattern_get_size, .can_multi_conn = pattern_can_multi_conn, + .can_cache = pattern_can_cache, .pread = pattern_pread, /* In this plugin, errno is preserved properly along error return * paths from failed system calls. diff --git a/plugins/random/random.c b/plugins/random/random.c index 7fb42c8..6377310 100644 --- a/plugins/random/random.c +++ b/plugins/random/random.c @@ -1,5 +1,5 @@ /* nbdkit - * Copyright (C) 2017-2018 Red Hat Inc. + * Copyright (C) 2017-2019 Red Hat Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are @@ -113,6 +113,16 @@ random_can_multi_conn (void *handle) return 1; } +/* Cache. 
*/ +static int +random_can_cache (void *handle) +{ + /* Everything is already in memory, returning this without + * implementing .cache lets nbdkit do the correct no-op. + */ + return NBDKIT_CACHE_NATIVE; +} + /* Read data. */ static int random_pread (void *handle, void *buf, uint32_t count, uint64_t offset, @@ -156,6 +166,7 @@ static struct nbdkit_plugin plugin = { .open = random_open, .get_size = random_get_size, .can_multi_conn = random_can_multi_conn, + .can_cache = random_can_cache, .pread = random_pread, /* In this plugin, errno is preserved properly along error return * paths from failed system calls. diff --git a/plugins/zero/zero.c b/plugins/zero/zero.c index 49ce08e..12dcd6a 100644 --- a/plugins/zero/zero.c +++ b/plugins/zero/zero.c @@ -1,5 +1,5 @@ /* nbdkit - * Copyright (C) 2017-2018 Red Hat Inc. + * Copyright (C) 2017-2019 Red Hat Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are @@ -67,6 +67,16 @@ zero_get_size (void *handle) return 0; } +/* Cache. */ +static int +zero_can_cache (void *handle) +{ + /* Everything is already in memory, returning this without + * implementing .cache lets nbdkit do the correct no-op. + */ + return NBDKIT_CACHE_NATIVE; +} + /* Ideally the read plugin would be optional. */ static int zero_pread (void *handle, void *buf, uint32_t count, uint64_t offset, @@ -82,6 +92,7 @@ static struct nbdkit_plugin plugin = { .config = zero_config, .open = zero_open, .get_size = zero_get_size, + .can_cache = zero_can_cache, .pread = zero_pread, /* In this plugin, errno is preserved properly along error return * paths from failed system calls. -- 2.20.1
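As a rough sketch of why "native without .cache" amounts to a successful no-op, the dispatch on the tri-state could look something like the following. All names here (demo_plugin, demo_dispatch_cache, DEMO_CACHE_*, the stub callbacks) are hypothetical, the enum values are not claimed to match the real NBDKIT_CACHE_* constants, and the handling of the "none" case is an assumption rather than a description of the server.

```c
#include <stdint.h>
#include <stddef.h>
#include <errno.h>

/* Hypothetical mirror of the tri-state. */
enum demo_cache_mode {
  DEMO_CACHE_NONE,
  DEMO_CACHE_EMULATE,
  DEMO_CACHE_NATIVE
};

/* Hypothetical, cut-down plugin table. */
struct demo_plugin {
  enum demo_cache_mode (*can_cache) (void *handle);
  int (*cache) (void *handle, uint32_t count, uint64_t offset); /* may be NULL */
  int (*pread) (void *handle, void *buf, uint32_t count, uint64_t offset);
};

/* Stubs for the example: the handle is an int counting pread calls. */
static enum demo_cache_mode
demo_mem_can_cache (void *handle)
{
  (void) handle;
  return DEMO_CACHE_NATIVE;
}

static enum demo_cache_mode
demo_file_can_cache (void *handle)
{
  (void) handle;
  return DEMO_CACHE_EMULATE;
}

static int
demo_count_pread (void *handle, void *buf, uint32_t count, uint64_t offset)
{
  (void) buf;
  (void) count;
  (void) offset;
  (*(int *) handle)++;
  return 0;
}

static int
demo_dispatch_cache (const struct demo_plugin *p, void *handle,
                     uint32_t count, uint64_t offset)
{
  switch (p->can_cache (handle)) {
  case DEMO_CACHE_NATIVE:
    if (p->cache)
      return p->cache (handle, count, offset);
    return 0;                   /* no .cache callback: succeed, do nothing */

  case DEMO_CACHE_EMULATE: {
    char buf[4096];

    /* Read and discard the requested range. */
    while (count > 0) {
      uint32_t n = count < sizeof buf ? count : (uint32_t) sizeof buf;

      if (p->pread (handle, buf, n, offset) == -1)
        return -1;
      count -= n;
      offset += n;
    }
    return 0;
  }

  case DEMO_CACHE_NONE:
  default:
    /* Assumption: caching was never advertised, so reject the request. */
    errno = EINVAL;
    return -1;
  }
}
```

An in-memory plugin built from demo_mem_can_cache and a NULL cache pointer never has its pread invoked, which matches the intent of this patch.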
Eric Blake
2019-May-16 03:57 UTC
[Libguestfs] [nbdkit PATCH v2 06/24] nbd: Implement NBD_CMD_CACHE passthrough
In the nbd plugin, we expose caching to the client only if the server exposed it to us. Signed-off-by: Eric Blake <eblake@redhat.com> --- plugins/nbd/nbd.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/plugins/nbd/nbd.c b/plugins/nbd/nbd.c index 821f256..df7d366 100644 --- a/plugins/nbd/nbd.c +++ b/plugins/nbd/nbd.c @@ -1151,6 +1151,16 @@ nbd_can_multi_conn (void *handle) return h->flags & NBD_FLAG_CAN_MULTI_CONN; } +static int +nbd_can_cache (void *handle) +{ + struct handle *h = handle; + + if (h->flags & NBD_FLAG_SEND_CACHE) + return NBDKIT_CACHE_NATIVE; + return NBDKIT_CACHE_NONE; +} + static int nbd_can_extents (void *handle) { @@ -1245,6 +1255,18 @@ nbd_extents (void *handle, uint32_t count, uint64_t offset, return c < 0 ? c : nbd_reply (h, c); } +/* Cache a portion of the file. */ +static int +nbd_cache (void *handle, uint32_t count, uint64_t offset, uint32_t flags) +{ + struct handle *h = handle; + int c; + + assert (!flags); + c = nbd_request (h, 0, NBD_CMD_CACHE, offset, count); + return c < 0 ? c : nbd_reply (h, c); +} + static struct nbdkit_plugin plugin = { .name = "nbd", .longname = "nbdkit nbd plugin", @@ -1264,12 +1286,14 @@ static struct nbdkit_plugin plugin = { .can_fua = nbd_can_fua, .can_multi_conn = nbd_can_multi_conn, .can_extents = nbd_can_extents, + .can_cache = nbd_can_cache, .pread = nbd_pread, .pwrite = nbd_pwrite, .zero = nbd_zero, .flush = nbd_flush, .trim = nbd_trim, .extents = nbd_extents, + .cache = nbd_cache, .errno_is_preserved = 1, }; -- 2.20.1
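For readers less familiar with the wire format, here is a minimal sketch of the two protocol facts this patch relies on: the remote server advertises NBD_FLAG_SEND_CACHE in its transmission flags, and a cache request is an ordinary 28-byte NBD request whose type is NBD_CMD_CACHE. The DEMO_* constants reflect my reading of the NBD protocol document, and the helper names are hypothetical; this is not the code behind nbd_request() in the plugin.

```c
#include <stdint.h>
#include <stddef.h>

/* Values as I understand them from the NBD protocol document. */
#define DEMO_REQUEST_MAGIC   0x25609513u
#define DEMO_CMD_CACHE       5
#define DEMO_FLAG_SEND_CACHE (1u << 10)

/* Store the low nbytes of v at p in network (big-endian) order. */
static void
demo_put_be (unsigned char *p, uint64_t v, size_t nbytes)
{
  size_t i;

  for (i = 0; i < nbytes; i++)
    p[i] = (unsigned char) (v >> (8 * (nbytes - 1 - i)));
}

/* Only advertise caching if the remote server did. */
static int
demo_server_can_cache (uint16_t eflags)
{
  return (eflags & DEMO_FLAG_SEND_CACHE) != 0;
}

/* Lay out a simple (non-structured) request header:
 * magic(4) flags(2) type(2) cookie(8) offset(8) length(4).
 */
static void
demo_build_cache_request (unsigned char buf[28], uint64_t cookie,
                          uint64_t offset, uint32_t count)
{
  demo_put_be (buf + 0, DEMO_REQUEST_MAGIC, 4);
  demo_put_be (buf + 4, 0, 2);               /* no command flags */
  demo_put_be (buf + 6, DEMO_CMD_CACHE, 2);
  demo_put_be (buf + 8, cookie, 8);
  demo_put_be (buf + 16, offset, 8);
  demo_put_be (buf + 24, count, 4);
}
```

Like a flush or trim request, a cache request carries no payload in either direction, so the reply is just the usual status header.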
Eric Blake
2019-May-16 03:57 UTC
[Libguestfs] [nbdkit PATCH v2 07/24] sh: Implement .cache script callback
It's easy to expose new callbacks to sh plugins, by borrowing tri-state code from can_fua. It's possible that nbdkit emulate will actually work well (in our example.sh script, the kernel caching a pread from one dd invocation may indeed speed up the next access), but for the sake of the example, I demonstrated advertising a no-op handler. The shell plugin, coupled with Rich's work on libnbd as a client-side library for actually exercising calls to NBD_CMD_CACHE, will be a useful way to prove that cache commands even make it through the stack. (Remember, qemu 3.0 was released with a fatally flawed NBD_CMD_CACHE server implementation, because there were no open source clients at the time that could actually send the command to test the server with). Signed-off-by: Eric Blake <eblake@redhat.com> --- plugins/sh/nbdkit-sh-plugin.pod | 27 ++++++++--- plugins/sh/sh.c | 81 +++++++++++++++++++++++++++++++++ plugins/sh/example.sh | 7 +++ 3 files changed, 109 insertions(+), 6 deletions(-) diff --git a/plugins/sh/nbdkit-sh-plugin.pod b/plugins/sh/nbdkit-sh-plugin.pod index 8af88b4..39b99a2 100644 --- a/plugins/sh/nbdkit-sh-plugin.pod +++ b/plugins/sh/nbdkit-sh-plugin.pod @@ -220,7 +220,7 @@ This method is required. Unlike in other languages, you B<must> provide the C<can_*> methods otherwise they are assumed to all return false and your C<pwrite>, -C<flush>, C<trim>, C<zero> and C<extents> methods will never be +C<flush>, C<trim>, C<zero>, and C<extents> methods will never be called. The reason for this is obscure: In other languages we can detect if (eg) a C<pwrite> method is defined and synthesize an appropriate response if no actual C<can_write> method is defined. @@ -243,13 +243,20 @@ The script should exit with code C<0> for true or code C<3> for false. =item C<can_fua> +=item C<can_cache> + /path/to/script can_fua <handle> + /path/to/script can_cache <handle> -This controls Forced Unit Access (FUA) behaviour of the core server. 
+These control Forced Unit Access (FUA) and caching behaviour of the +core server. -Unlike the other C<can_*> callbacks, this one is I<not> a boolean. It -must print either "none", "emulate" or "native" to stdout. The -meaning of these is described in L<nbdkit-plugin(3)>. +Unlike the other C<can_*> callbacks, these two are I<not> a boolean. +They must print either "none", "emulate" or "native" to stdout. The +meaning of these is described in L<nbdkit-plugin(3)>. Furthermore, +you B<must> provide a C<can_cache> method if you desire the C<cache> +callback to be utilized, similar to the reasoning behind requiring +C<can_write> to utilize C<pwrite>. =item C<can_multi_conn> @@ -334,6 +341,14 @@ Unlike in other languages, if you provide an C<extents> method you B<must> also provide a C<can_extents> method which exits with code C<0> (true). +=item C<cache> + + /path/to/script cache <handle> <count> <offset> + +Unlike in other languages, if you provide a C<cache> method you +B<must> also provide a C<can_cache> method which prints "native" and +exits with code C<0> (true). + =back =head2 Missing callbacks @@ -365,4 +380,4 @@ Richard W.M. Jones =head1 COPYRIGHT -Copyright (C) 2018 Red Hat Inc. +Copyright (C) 2018-2019 Red Hat Inc. diff --git a/plugins/sh/sh.c b/plugins/sh/sh.c index a5beb57..862be21 100644 --- a/plugins/sh/sh.c +++ b/plugins/sh/sh.c @@ -578,6 +578,53 @@ sh_can_multi_conn (void *handle) return boolean_method (handle, "can_multi_conn"); } +/* Not a boolean method, the method prints "none", "emulate" or "native". 
*/ +static int +sh_can_cache (void *handle) +{ + char *h = handle; + const char *args[] = { script, "can_cache", h, NULL }; + CLEANUP_FREE char *s = NULL; + size_t slen; + int r; + + switch (call_read (&s, &slen, args)) { + case OK: + if (slen > 0 && s[slen-1] == '\n') + s[slen-1] = '\0'; + if (strcasecmp (s, "none") == 0) + r = NBDKIT_CACHE_NONE; + else if (strcasecmp (s, "emulate") == 0) + r = NBDKIT_CACHE_EMULATE; + else if (strcasecmp (s, "native") == 0) + r = NBDKIT_CACHE_NATIVE; + else { + nbdkit_error ("%s: could not parse output from can_cache method: %s", + script, s); + r = -1; + } + return r; + + case MISSING: + /* NBDKIT_CACHE_EMULATE means that nbdkit will call .pread. However + * we cannot know if that fallback would be efficient, so the safest + * default is to return NBDKIT_CACHE_NONE. + */ + return NBDKIT_CACHE_NONE; + + case ERROR: + return -1; + + case RET_FALSE: + nbdkit_error ("%s: %s method returned unexpected code (3/false)", + script, "can_cache"); + errno = EIO; + return -1; + + default: abort (); + } +} + static int sh_flush (void *handle, uint32_t flags) { @@ -782,6 +829,38 @@ sh_extents (void *handle, uint32_t count, uint64_t offset, uint32_t flags, } } +static int +sh_cache (void *handle, uint32_t count, uint64_t offset, uint32_t flags) +{ + char *h = handle; + char cbuf[32], obuf[32]; + const char *args[] = { script, "cache", h, cbuf, obuf, NULL }; + + snprintf (cbuf, sizeof cbuf, "%" PRIu32, count); + snprintf (obuf, sizeof obuf, "%" PRIu64, offset); + assert (!flags); + + switch (call (args)) { + case OK: + return 0; + + case MISSING: + /* Ignore lack of cache callback. 
*/ + return 0; + + case ERROR: + return -1; + + case RET_FALSE: + nbdkit_error ("%s: %s method returned unexpected code (3/false)", + script, "cache"); + errno = EIO; + return -1; + + default: abort (); + } +} + #define sh_config_help \ "script=<FILENAME> (required) The shell script to run.\n" \ "[other arguments may be used by the plugin that you load]" @@ -812,6 +891,7 @@ static struct nbdkit_plugin plugin = { .can_extents = sh_can_extents, .can_fua = sh_can_fua, .can_multi_conn = sh_can_multi_conn, + .can_cache = sh_can_cache, .pread = sh_pread, .pwrite = sh_pwrite, @@ -819,6 +899,7 @@ static struct nbdkit_plugin plugin = { .trim = sh_trim, .zero = sh_zero, .extents = sh_extents, + .cache = sh_cache, .errno_is_preserved = 1, }; diff --git a/plugins/sh/example.sh b/plugins/sh/example.sh index 63228c8..60c46ce 100755 --- a/plugins/sh/example.sh +++ b/plugins/sh/example.sh @@ -133,6 +133,13 @@ case "$1" in fallocate --help >/dev/null 2>&1 || exit 3 ;; + can_cache) + # Caching is not advertised to the client unless can_cache prints + # a tri-state value. Here, we choose for caching to be a no-op, + # by omitting counterpart handling for 'cache'. + echo native + ;; + *) # Unknown methods must exit with code 2. exit 2 -- 2.20.1
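The parsing step in sh_can_cache can be exercised on its own. The following is a minimal standalone sketch of the same idea (strip one trailing newline, then compare case-insensitively against the three keywords); demo_parse_cache_mode, demo_word_is, and the DEMO_CACHE_* enum are hypothetical names, and unlike the plugin this version does not modify the script output in place.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Hypothetical mirror of the tri-state printed by the script. */
enum demo_cache_mode {
  DEMO_CACHE_NONE,
  DEMO_CACHE_EMULATE,
  DEMO_CACHE_NATIVE
};

/* True if the first len bytes of s are exactly word, ignoring case. */
static int
demo_word_is (const char *s, size_t len, const char *word)
{
  return strlen (word) == len && strncasecmp (s, word, len) == 0;
}

/* Parse the output of a can_cache script method.  Returns a
 * DEMO_CACHE_* value, or -1 if the output is not recognized.
 */
static int
demo_parse_cache_mode (const char *s)
{
  size_t len = strlen (s);

  /* Scripts normally use echo, so ignore a single trailing newline. */
  if (len > 0 && s[len-1] == '\n')
    len--;

  if (demo_word_is (s, len, "none"))
    return DEMO_CACHE_NONE;
  if (demo_word_is (s, len, "emulate"))
    return DEMO_CACHE_EMULATE;
  if (demo_word_is (s, len, "native"))
    return DEMO_CACHE_NATIVE;
  return -1;
}
```

Anything other than the three keywords, including empty output, is treated as a parse failure, which corresponds to the nbdkit_error path in the patch.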
Eric Blake
2019-May-16 03:57 UTC
[Libguestfs] [nbdkit PATCH v2 08/24] ocaml: Implement .cache script callback
This was a bit harder than sh, but still a lot of copy-and-paste. Signed-off-by: Eric Blake <eblake@redhat.com> --- Note: I'm not sure how to actually test this beyond compilation. --- plugins/ocaml/ocaml.c | 51 ++++++++++++++++++++++++++++++++++++++++ plugins/ocaml/NBDKit.ml | 16 ++++++++++++- plugins/ocaml/NBDKit.mli | 5 ++++ 3 files changed, 71 insertions(+), 1 deletion(-) diff --git a/plugins/ocaml/ocaml.c b/plugins/ocaml/ocaml.c index 4447d7f..f664a7f 100644 --- a/plugins/ocaml/ocaml.c +++ b/plugins/ocaml/ocaml.c @@ -128,6 +128,9 @@ static value can_multi_conn_fn; static value can_extents_fn; static value extents_fn; +static value can_cache_fn; +static value cache_fn; + /*----------------------------------------------------------------------*/ /* Wrapper functions that translate calls from C (ie. nbdkit) to OCaml. */ @@ -638,6 +641,48 @@ extents_wrapper (void *h, uint32_t count, uint64_t offset, uint32_t flags, CAMLreturnT (int, 0); } +static int +can_cache_wrapper (void *h) +{ + CAMLparam0 (); + CAMLlocal1 (rv); + + caml_leave_blocking_section (); + + rv = caml_callback_exn (can_cache_fn, *(value *) h); + if (Is_exception_result (rv)) { + nbdkit_error ("%s", caml_format_exception (Extract_exception (rv))); + caml_enter_blocking_section (); + CAMLreturnT (int, -1); + } + + caml_enter_blocking_section (); + CAMLreturnT (int, Int_val (rv)); +} + +static int +cache_wrapper (void *h, uint32_t count, uint64_t offset, uint32_t flags) +{ + CAMLparam0 (); + CAMLlocal4 (rv, countv, offsetv, flagsv); + + caml_leave_blocking_section (); + + countv = caml_copy_int32 (count); + offsetv = caml_copy_int64 (offset); + flagsv = Val_flags (flags); + + value args[] = { *(value *) h, countv, offsetv, flagsv }; + rv = caml_callbackN_exn (cache_fn, sizeof args / sizeof args[0], args); + if (Is_exception_result (rv)) { + nbdkit_error ("%s", caml_format_exception (Extract_exception (rv))); + CAMLreturnT (int, -1); + } + + caml_enter_blocking_section (); + CAMLreturnT (int, 0); +} + 
/*----------------------------------------------------------------------*/ /* set_* functions called from OCaml code at load time to initialize * fields in the plugin struct. @@ -727,6 +772,9 @@ SET(can_multi_conn) SET(can_extents) SET(extents) +SET(can_cache) +SET(cache) + #undef SET static void @@ -766,6 +814,9 @@ remove_roots (void) REMOVE (can_extents); REMOVE (extents); + REMOVE (can_cache); + REMOVE (cache); + #undef REMOVE } diff --git a/plugins/ocaml/NBDKit.ml b/plugins/ocaml/NBDKit.ml index 7aca8c8..02aa200 100644 --- a/plugins/ocaml/NBDKit.ml +++ b/plugins/ocaml/NBDKit.ml @@ -37,6 +37,8 @@ and flag = May_trim | FUA | Req_one type fua_flag = FuaNone | FuaEmulate | FuaNative +type cache_flag = CacheNone | CacheEmulate | CacheNop + type extent = { offset : int64; length : int64; @@ -82,6 +84,9 @@ type 'a plugin = { can_extents : ('a -> bool) option; extents : ('a -> int32 -> int64 -> flags -> extent list) option; + + can_cache : ('a -> cache_flag) option; + cache : ('a -> int32 -> int64 -> flags -> unit) option; } let default_callbacks = { @@ -122,6 +127,9 @@ let default_callbacks = { can_extents = None; extents = None; + + can_cache = None; + cache = None; } type thread_model @@ -170,6 +178,9 @@ external set_can_multi_conn : ('a -> bool) -> unit = "ocaml_nbdkit_set_can_multi external set_can_extents : ('a -> bool) -> unit = "ocaml_nbdkit_set_can_extents" external set_extents : ('a -> int32 -> int64 -> flags -> extent list) -> unit = "ocaml_nbdkit_set_extents" +external set_can_cache : ('a -> cache_flag) -> unit = "ocaml_nbdkit_set_can_cache" +external set_cache : ('a -> int32 -> int64 -> flags -> unit) -> unit = "ocaml_nbdkit_set_cache" + let may f = function None -> () | Some a -> f a let register_plugin thread_model plugin @@ -229,7 +240,10 @@ let register_plugin thread_model plugin may set_can_multi_conn plugin.can_multi_conn; may set_can_extents plugin.can_extents; - may set_extents plugin.extents + may set_extents plugin.extents; + + may set_can_cache 
plugin.can_cache; + may set_cache plugin.cache external _set_error : int -> unit = "ocaml_nbdkit_set_error" "noalloc" diff --git a/plugins/ocaml/NBDKit.mli b/plugins/ocaml/NBDKit.mli index da110fe..bab8f7f 100644 --- a/plugins/ocaml/NBDKit.mli +++ b/plugins/ocaml/NBDKit.mli @@ -40,6 +40,8 @@ and flag = May_trim | FUA | Req_one type fua_flag = FuaNone | FuaEmulate | FuaNative +type cache_flag = CacheNone | CacheEmulate | CacheNop + type extent = { offset : int64; length : int64; @@ -86,6 +88,9 @@ type 'a plugin = { can_extents : ('a -> bool) option; extents : ('a -> int32 -> int64 -> flags -> extent list) option; + + can_cache : ('a -> cache_flag) option; + cache : ('a -> int32 -> int64 -> flags -> unit) option; } (** The plugin fields and callbacks. ['a] is the handle type. *) -- 2.20.1
Eric Blake
2019-May-16 03:57 UTC
[Libguestfs] [nbdkit PATCH v2 09/24] plugins: Document lack of .cache in language bindings
lua, perl, python, ruby, and tcl need more work before they can support .cache callbacks (for that matter, we still haven't implemented generic v2 interface support there, starting with can_fua). Signed-off-by: Eric Blake <eblake@redhat.com> --- The rust bindings are at least at version 2, and don't mention missing bindings in the .pod page. However, they do not yet have extents support, which should be added before doing the obvious copy-and-paste to enable cache support. --- plugins/lua/nbdkit-lua-plugin.pod | 3 ++- plugins/perl/nbdkit-perl-plugin.pod | 2 +- plugins/python/nbdkit-python-plugin.pod | 2 +- plugins/ruby/nbdkit-ruby-plugin.pod | 2 +- plugins/tcl/nbdkit-tcl-plugin.pod | 3 ++- 5 files changed, 7 insertions(+), 5 deletions(-) diff --git a/plugins/lua/nbdkit-lua-plugin.pod b/plugins/lua/nbdkit-lua-plugin.pod index 99883e2..cde13f9 100644 --- a/plugins/lua/nbdkit-lua-plugin.pod +++ b/plugins/lua/nbdkit-lua-plugin.pod @@ -259,7 +259,8 @@ partial, your function should call C<error>. =over 4 =item Missing: C<load>, C<unload>, C<name>, C<version>, C<longname>, -C<description>, C<config_help>, C<can_zero>, C<can_fua> +C<description>, C<config_help>, C<can_zero>, C<can_fua>, C<can_cache>, +C<cache> These are not yet supported. diff --git a/plugins/perl/nbdkit-perl-plugin.pod b/plugins/perl/nbdkit-perl-plugin.pod index 9165795..1da7ddc 100644 --- a/plugins/perl/nbdkit-perl-plugin.pod +++ b/plugins/perl/nbdkit-perl-plugin.pod @@ -338,7 +338,7 @@ These are not needed because you can just use regular Perl C<BEGIN> and C<END> constructs. =item Missing: C<name>, C<version>, C<longname>, C<description>, -C<config_help> +C<config_help>, C<can_fua>, C<can_cache>, C<cache> These are not yet supported. 
diff --git a/plugins/python/nbdkit-python-plugin.pod b/plugins/python/nbdkit-python-plugin.pod index b422796..3d234d7 100644 --- a/plugins/python/nbdkit-python-plugin.pod +++ b/plugins/python/nbdkit-python-plugin.pod @@ -265,7 +265,7 @@ These are not needed because you can just use ordinary Python constructs. =item Missing: C<name>, C<version>, C<longname>, C<description>, -C<config_help> +C<config_help>, C<can_fua>, C<can_cache>, C<cache> These are not yet supported. diff --git a/plugins/ruby/nbdkit-ruby-plugin.pod b/plugins/ruby/nbdkit-ruby-plugin.pod index 18934cc..4161794 100644 --- a/plugins/ruby/nbdkit-ruby-plugin.pod +++ b/plugins/ruby/nbdkit-ruby-plugin.pod @@ -272,7 +272,7 @@ These are not needed because you can just use ordinary Ruby constructs. =item Missing: C<name>, C<version>, C<longname>, C<description>, -C<config_help> +C<config_help>, C<can_fua>, C<can_cache>, C<cache> These are not yet supported. diff --git a/plugins/tcl/nbdkit-tcl-plugin.pod b/plugins/tcl/nbdkit-tcl-plugin.pod index 3c6b8f7..4151815 100644 --- a/plugins/tcl/nbdkit-tcl-plugin.pod +++ b/plugins/tcl/nbdkit-tcl-plugin.pod @@ -265,7 +265,8 @@ partial, your function should call C<error>. =over 4 =item Missing: C<load>, C<unload>, C<name>, C<version>, C<longname>, -C<description>, C<config_help>, C<can_zero>, C<can_fua> +C<description>, C<config_help>, C<can_zero>, C<can_fua>, C<can_cache>, +C<cache> These are not yet supported. -- 2.20.1
Eric Blake
2019-May-16 03:58 UTC
[Libguestfs] [nbdkit PATCH v2 10/24] filters: Add .cache callback
Make it possible for filters to adjust the behavior for NBD_CMD_CACHE. To avoid any 'git bisect' breakage, this patch leaves .can_cache as NBDKIT_CACHE_NONE if a filter does not provide an override, rather than passthrough to the plugin; it will be flipped later after all necessary filters have been patched first. Signed-off-by: Eric Blake <eblake@redhat.com> --- docs/nbdkit-filter.pod | 42 ++++++++++++++++++++++++++++++++++------- include/nbdkit-filter.h | 8 ++++++++ server/filters.c | 37 ++++++++++++++++++++++++++++++++---- 3 files changed, 76 insertions(+), 11 deletions(-) diff --git a/docs/nbdkit-filter.pod b/docs/nbdkit-filter.pod index 6aeaa7b..857f241 100644 --- a/docs/nbdkit-filter.pod +++ b/docs/nbdkit-filter.pod @@ -356,6 +356,8 @@ calls. =head2 C<.can_multi_conn> +=head2 C<.can_cache> + int (*can_write) (struct nbdkit_next_ops *next_ops, void *nxdata, void *handle); int (*can_flush) (struct nbdkit_next_ops *next_ops, void *nxdata, @@ -373,6 +375,8 @@ calls. void *handle); int (*can_multi_conn) (struct nbdkit_next_ops *next_ops, void *nxdata, void *handle); + int (*can_cache) (struct nbdkit_next_ops *next_ops, void *nxdata, + void *handle); These intercept the corresponding plugin methods, and control feature bits advertised to the client. @@ -385,12 +389,14 @@ the plugin's own C<.can_zero> callback returned false, because nbdkit implements a fallback to C<.pwrite> at the plugin layer. Remember that most of the feature check functions return merely a -boolean success value, while C<.can_fua> has three success values. -The difference between values may affect choices made in the filter: -when splitting a write request that requested FUA from the client, if -C<next_ops-E<gt>can_fua> returns C<NBDKIT_FUA_NATIVE>, then the filter -should pass the FUA flag on to each sub-request; while if it is known -that FUA is emulated by a flush because of a return of +boolean success value, while C<.can_fua> and C<.can_cache> have three +success values. 
+ +The difference between C<.can_fua> values may affect choices made in +the filter: when splitting a write request that requested FUA from the +client, if C<next_ops-E<gt>can_fua> returns C<NBDKIT_FUA_NATIVE>, then +the filter should pass the FUA flag on to each sub-request; while if +it is known that FUA is emulated by a flush because of a return of C<NBDKIT_FUA_EMULATE>, it is more efficient to only flush once after all sub-requests have completed (often by passing C<NBDKIT_FLAG_FUA> on to only the final sub-request, or by dropping the flag and ending @@ -597,6 +603,28 @@ Returns the number of extents in the list. Returns a copy of the C<i>'th extent. +=head2 C<.cache> + + int (*cache) (struct nbdkit_next_ops *next_ops, void *nxdata, + void *handle, uint32_t count, uint64_t offset, + uint32_t flags, int *err); + +This intercepts the plugin C<.cache> method and can be used to modify +cache requests. + +This function will not be called if C<.can_cache> returned +C<NBDKIT_CACHE_NONE> or C<NBDKIT_CACHE_EMULATE>; in turn, the filter +should not call C<next_ops-E<gt>cache> unless +C<next_ops-E<gt>can_cache> returned C<NBDKIT_CACHE_NATIVE>. + +The parameter C<flags> exists in case of future NBD protocol +extensions; at this time, it will be 0 on input, and the filter should +not pass any flags to C<next_ops-E<gt>cache>. + +If there is an error, C<.cache> should call C<nbdkit_error> with an +error message B<and> return -1 with C<err> set to the positive errno +value to return to the client. + =head1 ERROR HANDLING If there is an error in the filter itself, the filter should call @@ -708,4 +736,4 @@ Richard W.M. Jones =head1 COPYRIGHT -Copyright (C) 2013-2018 Red Hat Inc. +Copyright (C) 2013-2019 Red Hat Inc. 
diff --git a/include/nbdkit-filter.h b/include/nbdkit-filter.h index 9b6cd6e..5893dd8 100644 --- a/include/nbdkit-filter.h +++ b/include/nbdkit-filter.h @@ -74,6 +74,7 @@ struct nbdkit_next_ops { int (*can_extents) (void *nxdata); int (*can_fua) (void *nxdata); int (*can_multi_conn) (void *nxdata); + int (*can_cache) (void *nxdata); int (*pread) (void *nxdata, void *buf, uint32_t count, uint64_t offset, uint32_t flags, int *err); @@ -87,6 +88,8 @@ struct nbdkit_next_ops { int *err); int (*extents) (void *nxdata, uint32_t count, uint64_t offset, uint32_t flags, struct nbdkit_extents *extents, int *err); + int (*cache) (void *nxdata, uint32_t count, uint64_t offset, uint32_t flags, + int *err); }; struct nbdkit_filter { @@ -142,6 +145,8 @@ struct nbdkit_filter { void *handle); int (*can_multi_conn) (struct nbdkit_next_ops *next_ops, void *nxdata, void *handle); + int (*can_cache) (struct nbdkit_next_ops *next_ops, void *nxdata, + void *handle); int (*pread) (struct nbdkit_next_ops *next_ops, void *nxdata, void *handle, void *buf, uint32_t count, uint64_t offset, @@ -161,6 +166,9 @@ struct nbdkit_filter { int (*extents) (struct nbdkit_next_ops *next_ops, void *nxdata, void *handle, uint32_t count, uint64_t offset, uint32_t flags, struct nbdkit_extents *extents, int *err); + int (*cache) (struct nbdkit_next_ops *next_ops, void *nxdata, + void *handle, uint32_t count, uint64_t offset, uint32_t flags, + int *err); }; #define NBDKIT_REGISTER_FILTER(filter) \ diff --git a/server/filters.c b/server/filters.c index e456fbf..430d515 100644 --- a/server/filters.c +++ b/server/filters.c @@ -329,6 +329,13 @@ next_can_multi_conn (void *nxdata) return b_conn->b->can_multi_conn (b_conn->b, b_conn->conn); } +static int +next_can_cache (void *nxdata) +{ + struct b_conn *b_conn = nxdata; + return b_conn->b->can_cache (b_conn->b, b_conn->conn); +} + static int next_pread (void *nxdata, void *buf, uint32_t count, uint64_t offset, uint32_t flags, int *err) @@ -379,6 +386,15 @@ 
next_extents (void *nxdata, uint32_t count, uint64_t offset, uint32_t flags, extents, err); } +static int +next_cache (void *nxdata, uint32_t count, uint64_t offset, + uint32_t flags, int *err) +{ + struct b_conn *b_conn = nxdata; + return b_conn->b->cache (b_conn->b, b_conn->conn, count, offset, flags, + err); +} + static struct nbdkit_next_ops next_ops = { .get_size = next_get_size, .can_write = next_can_write, @@ -389,12 +405,14 @@ static struct nbdkit_next_ops next_ops = { .can_extents = next_can_extents, .can_fua = next_can_fua, .can_multi_conn = next_can_multi_conn, + .can_cache = next_can_cache, .pread = next_pread, .pwrite = next_pwrite, .flush = next_flush, .trim = next_trim, .zero = next_zero, .extents = next_extents, + .cache = next_cache, }; static int @@ -577,12 +595,18 @@ static int filter_can_cache (struct backend *b, struct connection *conn) { struct backend_filter *f = container_of (b, struct backend_filter, backend); + void *handle = connection_get_handle (conn, f->backend.i); + struct b_conn nxdata = { .b = f->backend.next, .conn = conn }; debug ("%s: can_cache", f->name); + if (f->filter.can_cache) + return f->filter.can_cache (&next_ops, &nxdata, handle); /* FIXME: Default to f->backend.next->can_cache, once all filters have been audited */ - return NBDKIT_CACHE_NONE; + else + return NBDKIT_CACHE_NONE; + return f->backend.next->can_cache (f->backend.next, conn); } static int @@ -720,15 +744,20 @@ filter_cache (struct backend *b, struct connection *conn, uint32_t flags, int *err) { struct backend_filter *f = container_of (b, struct backend_filter, backend); + void *handle = connection_get_handle (conn, f->backend.i); + struct b_conn nxdata = { .b = f->backend.next, .conn = conn }; assert (flags == 0); debug ("%s: cache count=%" PRIu32 " offset=%" PRIu64 " flags=0x%" PRIx32, f->name, count, offset, flags); - /* FIXME: Allow filter to rewrite request */ - return f->backend.next->cache (f->backend.next, conn, - count, offset, flags, err); + if 
(f->filter.cache) + return f->filter.cache (&next_ops, &nxdata, handle, + count, offset, flags, err); + else + return f->backend.next->cache (f->backend.next, conn, + count, offset, flags, err); } static struct backend filter_functions = { -- 2.20.1
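The dispatch rule that filter_cache gains in this patch (call the filter's own .cache if it registered one, otherwise hand the request to the next backend) can be sketched in isolation. This is only an illustration: struct layer, layer_cache, counting_cache, plugin_cache and demo_chain are hypothetical stand-ins, not nbdkit's struct backend or its real callbacks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one layer of the backend chain. */
struct layer;
typedef int (*cache_fn) (struct layer *self, uint32_t count,
                         uint64_t offset, int *err);

struct layer {
  cache_fn cache;       /* NULL means "no .cache callback registered" */
  struct layer *next;   /* next layer toward the plugin, NULL past the plugin */
  unsigned calls;       /* how many times this layer's callback ran */
};

/* Same shape as the new filter_cache(): prefer the layer's own callback,
 * otherwise forward the request unchanged to the next layer. */
static int
layer_cache (struct layer *l, uint32_t count, uint64_t offset, int *err)
{
  if (l == NULL)
    return 0;
  if (l->cache)
    return l->cache (l, count, offset, err);
  return layer_cache (l->next, count, offset, err);
}

/* A filter callback that records the visit and then continues downward. */
static int
counting_cache (struct layer *self, uint32_t count, uint64_t offset, int *err)
{
  self->calls++;
  return layer_cache (self->next, count, offset, err);
}

/* A plugin callback at the bottom of the chain: record and succeed. */
static int
plugin_cache (struct layer *self, uint32_t count, uint64_t offset, int *err)
{
  (void) count;
  (void) offset;
  (void) err;
  self->calls++;
  return 0;
}

/* Build filter2 -> filter1 (no callback) -> plugin, send one request,
 * and return how many callbacks actually ran. */
static unsigned
demo_chain (void)
{
  struct layer plugin = { plugin_cache, NULL, 0 };
  struct layer filter1 = { NULL, &plugin, 0 };
  struct layer filter2 = { counting_cache, &filter1, 0 };
  int err = 0;

  if (layer_cache (&filter2, 512, 0, &err) == -1)
    return 0;
  return filter2.calls + filter1.calls + plugin.calls;
}
```

In this sketch the layer without a callback is skipped transparently, which is the behaviour the else branch of filter_cache provides for filters that do not implement .cache.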
Eric Blake
2019-May-16 03:58 UTC
[Libguestfs] [nbdkit PATCH v2 11/24] test-layers: Test .cache usage
This includes hacking the test client to send NBD_CMD_CACHE to the server, and verifying that all layers are visited in the correct order. Signed-off-by: Eric Blake <eblake@redhat.com> --- tests/test-layers-filter.c | 22 +++++++++++++++++++++- tests/test-layers-plugin.c | 17 +++++++++++++++++ tests/test-layers.c | 36 ++++++++++++++++++++++++++++++++++++ 3 files changed, 74 insertions(+), 1 deletion(-) diff --git a/tests/test-layers-filter.c b/tests/test-layers-filter.c index a8f4723..bd063bd 100644 --- a/tests/test-layers-filter.c +++ b/tests/test-layers-filter.c @@ -1,5 +1,5 @@ /* nbdkit - * Copyright (C) 2018 Red Hat Inc. + * Copyright (C) 2018-2019 Red Hat Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are @@ -184,6 +184,15 @@ test_layers_filter_can_extents (struct nbdkit_next_ops *next_ops, return next_ops->can_extents (nxdata); } +static int +test_layers_filter_can_cache (struct nbdkit_next_ops *next_ops, + void *nxdata, + void *handle) +{ + DEBUG_FUNCTION; + return next_ops->can_cache (nxdata); +} + static int test_layers_filter_pread (struct nbdkit_next_ops *next_ops, void *nxdata, void *handle, void *buf, @@ -241,6 +250,15 @@ test_layers_filter_extents (struct nbdkit_next_ops *next_ops, void *nxdata, return next_ops->extents (nxdata, count, offset, flags, extents, err); } +static int +test_layers_filter_cache (struct nbdkit_next_ops *next_ops, void *nxdata, + void *handle, uint32_t count, uint64_t offset, + uint32_t flags, int *err) +{ + DEBUG_FUNCTION; + return next_ops->cache (nxdata, count, offset, flags, err); +} + static struct nbdkit_filter filter = { .name = "testlayers" layer, .version = PACKAGE_VERSION, @@ -262,12 +280,14 @@ static struct nbdkit_filter filter = { .can_fua = test_layers_filter_can_fua, .can_multi_conn = test_layers_filter_can_multi_conn, .can_extents = test_layers_filter_can_extents, + .can_cache = test_layers_filter_can_cache, .pread = 
test_layers_filter_pread, .pwrite = test_layers_filter_pwrite, .flush = test_layers_filter_flush, .trim = test_layers_filter_trim, .zero = test_layers_filter_zero, .extents = test_layers_filter_extents, + .cache = test_layers_filter_cache, }; NBDKIT_REGISTER_FILTER(filter) diff --git a/tests/test-layers-plugin.c b/tests/test-layers-plugin.c index f9b2014..e9ffd3b 100644 --- a/tests/test-layers-plugin.c +++ b/tests/test-layers-plugin.c @@ -143,6 +143,13 @@ test_layers_plugin_can_multi_conn (void *handle) return 1; } +static int +test_layers_plugin_can_cache (void *handle) +{ + DEBUG_FUNCTION; + return NBDKIT_CACHE_NATIVE; +} + static int test_layers_plugin_can_extents (void *handle) { @@ -201,6 +208,14 @@ test_layers_plugin_extents (void *handle, return nbdkit_add_extent (extents, offset, count, 0); } +static int +test_layers_plugin_cache (void *handle, + uint32_t count, uint64_t offset, uint32_t flags) +{ + DEBUG_FUNCTION; + return 0; +} + static struct nbdkit_plugin plugin = { .name = "testlayersplugin", .version = PACKAGE_VERSION, @@ -220,12 +235,14 @@ static struct nbdkit_plugin plugin = { .can_fua = test_layers_plugin_can_fua, .can_multi_conn = test_layers_plugin_can_multi_conn, .can_extents = test_layers_plugin_can_extents, + .can_cache = test_layers_plugin_can_cache, .pread = test_layers_plugin_pread, .pwrite = test_layers_plugin_pwrite, .flush = test_layers_plugin_flush, .trim = test_layers_plugin_trim, .zero = test_layers_plugin_zero, .extents = test_layers_plugin_extents, + .cache = test_layers_plugin_cache, /* In this plugin, errno is preserved properly along error return * paths from failed system calls. 
*/ diff --git a/tests/test-layers.c b/tests/test-layers.c index 627e4ec..a820ba5 100644 --- a/tests/test-layers.c +++ b/tests/test-layers.c @@ -361,6 +361,12 @@ main (int argc, char *argv[]) "filter1: test_layers_filter_can_extents", "test_layers_plugin_can_extents", NULL); + log_verify_seen_in_order + ("filter3: test_layers_filter_can_cache", + "filter2: test_layers_filter_can_cache", + "filter1: test_layers_filter_can_cache", + "test_layers_plugin_can_cache", + NULL); fprintf (stderr, "%s: protocol connected\n", program_name); @@ -526,6 +532,36 @@ main (int argc, char *argv[]) "test_layers_plugin_zero", NULL); + request.type = htobe16 (NBD_CMD_CACHE); + request.offset = htobe64 (0); + request.count = htobe32 (512); + request.flags = htobe16 (0); + if (send (sock, &request, sizeof request, 0) != sizeof request) { + perror ("send: NBD_CMD_CACHE"); + exit (EXIT_FAILURE); + } + if (recv (sock, &reply, sizeof reply, MSG_WAITALL) != sizeof reply) { + perror ("recv: NBD_CMD_CACHE"); + exit (EXIT_FAILURE); + } + if (reply.error != NBD_SUCCESS) { + fprintf (stderr, "%s: NBD_CMD_CACHE failed with %d\n", + program_name, reply.error); + exit (EXIT_FAILURE); + } + + sleep (1); + log_verify_seen_in_order + ("testlayersfilter3: cache count=512 offset=0 flags=0x0", + "filter3: test_layers_filter_cache", + "testlayersfilter2: cache count=512 offset=0 flags=0x0", + "filter2: test_layers_filter_cache", + "testlayersfilter1: cache count=512 offset=0 flags=0x0", + "filter1: test_layers_filter_cache", + "testlayersplugin: debug: cache count=512 offset=0", + "test_layers_plugin_cache", + NULL); + /* XXX We should test NBD_CMD_BLOCK_STATUS here. However it * requires that we negotiate structured replies and base:allocation * in the handshake, and the format of the reply is more complex -- 2.20.1
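The test client above fills a request structure with htobe16/htobe32/htobe64 before sending it. As a rough, portable sketch of what ends up on the wire, the following packs an NBD_CMD_CACHE request byte by byte. The field layout (magic, flags, type, handle, offset, length, 28 bytes in total) and the command number 5 follow the NBD protocol document as I understand it; the helper names put_be and encode_cache_request and the SKETCH_ constants are made up for this example and are not part of nbdkit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values taken from the NBD protocol document; names are local to
 * this sketch. */
#define SKETCH_NBD_REQUEST_MAGIC 0x25609513u
#define SKETCH_NBD_CMD_CACHE     5

/* Store the low nbytes of v at p in big-endian (network) order. */
static void
put_be (uint8_t *p, uint64_t v, size_t nbytes)
{
  size_t i;

  for (i = 0; i < nbytes; i++)
    p[i] = (uint8_t) (v >> (8 * (nbytes - 1 - i)));
}

/* Encode a simple (non-extended) cache request into a 28 byte buffer and
 * return the number of bytes written. */
static size_t
encode_cache_request (uint8_t buf[28], uint64_t handle,
                      uint64_t offset, uint32_t count)
{
  put_be (buf + 0, SKETCH_NBD_REQUEST_MAGIC, 4);  /* request magic */
  put_be (buf + 4, 0, 2);                         /* command flags: none */
  put_be (buf + 6, SKETCH_NBD_CMD_CACHE, 2);      /* command type */
  put_be (buf + 8, handle, 8);                    /* opaque client cookie */
  put_be (buf + 16, offset, 8);                   /* start of range */
  put_be (buf + 24, count, 4);                    /* length of range */
  return 28;
}
```

Calling encode_cache_request (buf, 1, 0, 512) produces the same range as the test's request (offset 0, count 512); the handle value 1 is arbitrary.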
Eric Blake
2019-May-16 03:58 UTC
[Libguestfs] [nbdkit PATCH v2 12/24] test-eflags: Test .can_cache support
The sh bindings make this one easy to test.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 tests/test-eflags.sh | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/tests/test-eflags.sh b/tests/test-eflags.sh
index 6a90b87..eaaaae0 100755
--- a/tests/test-eflags.sh
+++ b/tests/test-eflags.sh
@@ -1,6 +1,6 @@
 #!/usr/bin/env bash
 # nbdkit
-# Copyright (C) 2018 Red Hat Inc.
+# Copyright (C) 2018-2019 Red Hat Inc.
 #
 # Redistribution and use in source and binary forms, with or without
 # modification, are permitted provided that the following conditions are
@@ -286,3 +286,18 @@ EOF
 
 [ $eflags -eq $(( HAS_FLAGS|READ_ONLY|SEND_DF|CAN_MULTI_CONN )) ] ||
     fail "expected HAS_FLAGS|READ_ONLY|SEND_DF|CAN_MULTI_CONN"
+
+#----------------------------------------------------------------------
+# -r
+# can_cache=true
+
+do_nbdkit -r <<'EOF'
+case "$1" in
+     get_size) echo 1M ;;
+     can_cache) echo "emulate" ;;
+     *) exit 2 ;;
+esac
+EOF
+
+[ $eflags -eq $(( HAS_FLAGS|READ_ONLY|SEND_DF|SEND_CACHE )) ] ||
+    fail "expected HAS_FLAGS|READ_ONLY|SEND_DF|SEND_CACHE"
-- 
2.20.1
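For reference, the value the new test expects in $eflags can be worked out by hand. The sketch below assumes the transmission flag bit positions from the NBD protocol document (HAS_FLAGS bit 0, READ_ONLY bit 1, SEND_DF bit 7, SEND_CACHE bit 10); the enum and function names are local to this example rather than the ones the test script defines.

```c
#include <assert.h>
#include <stdint.h>

/* Transmission flag bits as listed in the NBD protocol document.
 * Only the four flags this test case expects are shown. */
enum {
  SKETCH_HAS_FLAGS  = 1 << 0,
  SKETCH_READ_ONLY  = 1 << 1,
  SKETCH_SEND_DF    = 1 << 7,
  SKETCH_SEND_CACHE = 1 << 10
};

/* Expected flags for a read-only export whose can_cache is not "none". */
static uint16_t
expected_eflags (void)
{
  return SKETCH_HAS_FLAGS | SKETCH_READ_ONLY |
         SKETCH_SEND_DF | SKETCH_SEND_CACHE;
}

/* True if the server advertised NBD_CMD_CACHE support. */
static int
advertises_cache (uint16_t eflags)
{
  return (eflags & SKETCH_SEND_CACHE) != 0;
}
```

With those bit positions the expected value is 1 + 2 + 128 + 1024 = 1155 (0x483).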
Eric Blake
2019-May-16 03:58 UTC
[Libguestfs] [nbdkit PATCH v2 13/24] blocksize: Implement .cache rounding
Rely on .can_cache passthrough to imply that our .cache won't be called unless the plugin also has .cache. [Technically, that won't happen until a later patch flips the default in filters.c]. Round the cache request out, to cache the same range as would otherwise be passed to the plugin's .pread if we had instead manually set .can_cache to NBDKIT_CACHE_EMULATE. Oddly enough, a client can submit an unaligned request for just under 4G of caching where our rounding would overflow a 32-bit integer, so our rounding has to use a 64-bit temporary. Signed-off-by: Eric Blake <eblake@redhat.com> --- filters/blocksize/blocksize.c | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/filters/blocksize/blocksize.c b/filters/blocksize/blocksize.c index ba5d9e7..0978887 100644 --- a/filters/blocksize/blocksize.c +++ b/filters/blocksize/blocksize.c @@ -368,6 +368,34 @@ blocksize_extents (struct nbdkit_next_ops *next_ops, void *nxdata, flags, extents, err); } +static int +blocksize_cache (struct nbdkit_next_ops *next_ops, void *nxdata, + void *handle, uint32_t count, uint64_t offs, uint32_t flags, + int *err) +{ + uint32_t limit; + uint64_t remaining = count; /* Rounding out could exceed 32 bits */ + + /* Unaligned head */ + limit = offs & (minblock - 1); + remaining += limit; + offs -= limit; + + /* Unaligned tail */ + remaining = ROUND_UP (remaining, minblock); + + /* Aligned body */ + while (remaining) { + limit = MIN (maxdata, remaining); + if (next_ops->cache (nxdata, limit, offs, flags, err) == -1) + return -1; + offs += limit; + remaining -= limit; + } + + return 0; +} + static struct nbdkit_filter filter = { .name = "blocksize", .longname = "nbdkit blocksize filter", @@ -382,6 +410,7 @@ static struct nbdkit_filter filter = { .trim = blocksize_trim, .zero = blocksize_zero, .extents = blocksize_extents, + .cache = blocksize_cache, }; NBDKIT_REGISTER_FILTER(filter) -- 2.20.1
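The overflow mentioned in the commit message is easy to reproduce on paper. The following sketch performs the same rounding out as blocksize_cache, but as a standalone helper; struct span and round_out are names invented here, and minblock is assumed to be a power of two as the filter requires.

```c
#include <assert.h>
#include <stdint.h>

/* Result of rounding a request out to block boundaries. */
struct span {
  uint64_t offset;   /* aligned start */
  uint64_t length;   /* aligned length, may exceed UINT32_MAX */
};

/* Round [offs, offs+count) out to multiples of minblock (a power of 2).
 * The length is held in 64 bits because head + count can reach 2^32. */
static struct span
round_out (uint64_t offs, uint32_t count, uint32_t minblock)
{
  struct span s;
  uint64_t head = offs & (minblock - 1);          /* unaligned head */
  uint64_t remaining = (uint64_t) count + head;

  /* Unaligned tail: round up to the next multiple of minblock. */
  remaining = (remaining + minblock - 1) & ~((uint64_t) minblock - 1);

  s.offset = offs - head;
  s.length = remaining;
  return s;
}
```

For example, offset 1 with count 0xffffffff and a 512 byte minimum block rounds out to offset 0 and length 0x100000000, which no longer fits in a uint32_t; offset 1000 with count 100 rounds out to offset 512 and length 1024.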
Eric Blake
2019-May-16 03:58 UTC
[Libguestfs] [nbdkit PATCH v2 14/24] cache: Implement .cache
The whole point of the cache filter is to avoid visiting the plugin more than once for a range of data; as such, passing cache requests through to the plugin is wrong, and a client request for caching a range of the file obviously means we want the data locally. Our filter defaults to cache_on_read=false, where we normally only cache data that has been written but not yet flushed; but it can be presumed that an explicit cache request should always pull data locally, regardless of the cache_on_read setting. And even when cache_on_read=true, we can implement caching more efficiently than discarding the buffer of a naive pread. Signed-off-by: Eric Blake <eblake@redhat.com> --- filters/cache/nbdkit-cache-filter.pod | 6 +-- filters/cache/blk.h | 10 ++++- filters/cache/blk.c | 50 ++++++++++++++++++++++++- filters/cache/cache.c | 53 +++++++++++++++++++++++++++ 4 files changed, 113 insertions(+), 6 deletions(-) diff --git a/filters/cache/nbdkit-cache-filter.pod b/filters/cache/nbdkit-cache-filter.pod index 5993831..8b50231 100644 --- a/filters/cache/nbdkit-cache-filter.pod +++ b/filters/cache/nbdkit-cache-filter.pod @@ -70,9 +70,9 @@ Limit the size of the cache to C<SIZE>. See L</CACHE MAXIMUM SIZE> below. =item B<cache-on-read=true> -Cache read requests as well as write requests. Any time a block is -read from the plugin, it is saved in the cache (if there is sufficient -space) so the same data can be served more quickly later. +Cache read requests as well as write and cache requests. Any time a +block is read from the plugin, it is saved in the cache (if there is +sufficient space) so the same data can be served more quickly later. Note that if the underlying data served by the plugin can be modified by some other means (eg. something else can write to a file which is diff --git a/filters/cache/blk.h b/filters/cache/blk.h index 974a118..0d84f74 100644 --- a/filters/cache/blk.h +++ b/filters/cache/blk.h @@ -1,5 +1,5 @@ /* nbdkit - * Copyright (C) 2018 Red Hat Inc. 
+ * Copyright (C) 2018-2019 Red Hat Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are @@ -49,11 +49,17 @@ extern void blk_free (void); /* Allocate or resize the cache file and bitmap. */ extern int blk_set_size (uint64_t new_size); -/* Read a single block from the cache or plugin. */ +/* Read a single block from the cache or plugin. If cache_on_read is set, + * also ensure it is cached. */ extern int blk_read (struct nbdkit_next_ops *next_ops, void *nxdata, uint64_t blknum, uint8_t *block, int *err) __attribute__((__nonnull__ (1, 4, 5))); +/* If a single block is not cached, copy it from the plugin. */ +extern int blk_cache (struct nbdkit_next_ops *next_ops, void *nxdata, + uint64_t blknum, uint8_t *block, int *err) + __attribute__((__nonnull__ (1, 4, 5))); + /* Write to the cache and the plugin. */ extern int blk_writethrough (struct nbdkit_next_ops *next_ops, void *nxdata, uint64_t blknum, const uint8_t *block, diff --git a/filters/cache/blk.c b/filters/cache/blk.c index acbed61..cf7145d 100644 --- a/filters/cache/blk.c +++ b/filters/cache/blk.c @@ -1,5 +1,5 @@ /* nbdkit - * Copyright (C) 2018 Red Hat Inc. + * Copyright (C) 2018-2019 Red Hat Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are @@ -217,6 +217,54 @@ blk_read (struct nbdkit_next_ops *next_ops, void *nxdata, } } +int +blk_cache (struct nbdkit_next_ops *next_ops, void *nxdata, + uint64_t blknum, uint8_t *block, int *err) +{ + off_t offset = blknum * blksize; + enum bm_entry state = bitmap_get_blk (&bm, blknum, BLOCK_NOT_CACHED); + + reclaim (fd, &bm); + + nbdkit_debug ("cache: blk_cache block %" PRIu64 " (offset %" PRIu64 ") is %s", + blknum, (uint64_t) offset, + state == BLOCK_NOT_CACHED ? "not cached" : + state == BLOCK_CLEAN ? "clean" : + state == BLOCK_DIRTY ? 
"dirty" : + "unknown"); + + if (state == BLOCK_NOT_CACHED) { + off_t offset = blknum * blksize; + + /* Read underlying plugin, copy to cache regardless of cache-on-read. */ + if (next_ops->pread (nxdata, block, blksize, offset, 0, err) == -1) + return -1; + + nbdkit_debug ("cache: cache block %" PRIu64 " (offset %" PRIu64 ")", + blknum, (uint64_t) offset); + + if (pwrite (fd, block, blksize, offset) == -1) { + *err = errno; + nbdkit_error ("pwrite: %m"); + return -1; + } + bitmap_set_blk (&bm, blknum, BLOCK_CLEAN); + lru_set_recently_accessed (blknum); + } + else { +#if HAVE_POSIX_FADVISE + int r = posix_fadvise (fd, offset, blksize, POSIX_FADV_WILLNEED); + if (r) { + errno = r; + nbdkit_error ("posix_fadvise: %m"); + return -1; + } +#endif + lru_set_recently_accessed (blknum); + } + return 0; +} + int blk_writethrough (struct nbdkit_next_ops *next_ops, void *nxdata, uint64_t blknum, const uint8_t *block, uint32_t flags, diff --git a/filters/cache/cache.c b/filters/cache/cache.c index e215cac..2d2f39d 100644 --- a/filters/cache/cache.c +++ b/filters/cache/cache.c @@ -230,6 +230,13 @@ cache_prepare (struct nbdkit_next_ops *next_ops, void *nxdata, return 0; } +/* Override the plugin's .can_cache, because we are caching here instead */ +static int +cache_can_cache (struct nbdkit_next_ops *next_ops, void *nxdata, void *handle) +{ + return NBDKIT_CACHE_NATIVE; +} + /* Read data. */ static int cache_pread (struct nbdkit_next_ops *next_ops, void *nxdata, @@ -548,6 +555,50 @@ flush_dirty_block (uint64_t blknum, void *datav) return 0; /* continue scanning and flushing. */ } +/* Cache data. 
*/ +static int +cache_cache (struct nbdkit_next_ops *next_ops, void *nxdata, + void *handle, uint32_t count, uint64_t offset, + uint32_t flags, int *err) +{ + CLEANUP_FREE uint8_t *block = NULL; + uint64_t blknum, blkoffs; + int r; + uint64_t remaining = count; /* Rounding out could exceed 32 bits */ + + assert (!flags); + block = malloc (blksize); + if (block == NULL) { + *err = errno; + nbdkit_error ("malloc: %m"); + return -1; + } + + blknum = offset / blksize; /* block number */ + blkoffs = offset % blksize; /* offset within the block */ + + /* Unaligned head */ + remaining += blkoffs; + offset -= blkoffs; + + /* Unaligned tail */ + remaining = ROUND_UP (remaining, blksize); + + /* Aligned body */ + while (remaining) { + ACQUIRE_LOCK_FOR_CURRENT_SCOPE (&lock); + r = blk_cache (next_ops, nxdata, blknum, block, err); + if (r == -1) + return -1; + + remaining -= blksize; + offset += blksize; + blknum++; + } + + return 0; +} + static struct nbdkit_filter filter = { .name = "cache", .longname = "nbdkit caching filter", @@ -558,10 +609,12 @@ static struct nbdkit_filter filter = { .config_complete = cache_config_complete, .prepare = cache_prepare, .get_size = cache_get_size, + .can_cache = cache_can_cache, .pread = cache_pread, .pwrite = cache_pwrite, .zero = cache_zero, .flush = cache_flush, + .cache = cache_cache, }; NBDKIT_REGISTER_FILTER(filter) -- 2.20.1
Eric Blake
2019-May-16 03:58 UTC
[Libguestfs] [nbdkit PATCH v2 15/24] cow: Implement .cache
Our default cow caching behavior is that if we have not yet overwritten a portion of the image, we should pass the cache request on to the underlying plugin (to make the upcoming reads from the plugin possibly faster). But if we HAVE copied something locally, we can use posix_fadvise (if available) to tell the kernel that we have an upcoming reuse of that area of our local disk, and don't need to bother the plugin. This is the default because it keeps the COW image as thin as possible. However, another sane behavior is comparable to the 'cache' filter's 'cache-on-read' parameter: a user may want to force portions of the COW overlay to become populated, using something more efficient in network traffic than NBD_CMD_READ followed by NBD_CMD_WRITE of unchanged data. Hence, this patch also adds a 'cow-on-cache' parameter to opt in to the second behavior. Signed-off-by: Eric Blake <eblake@redhat.com> --- filters/cow/nbdkit-cow-filter.pod | 15 ++++-- filters/cow/blk.h | 16 +++++- filters/cow/blk.c | 41 +++++++++++++- filters/cow/cow.c | 90 +++++++++++++++++++++++++++++++ 4 files changed, 157 insertions(+), 5 deletions(-) diff --git a/filters/cow/nbdkit-cow-filter.pod b/filters/cow/nbdkit-cow-filter.pod index 448f48c..ae8c5e1 100644 --- a/filters/cow/nbdkit-cow-filter.pod +++ b/filters/cow/nbdkit-cow-filter.pod @@ -57,9 +57,18 @@ serve the same data to each client. =head1 PARAMETERS -There are no parameters specific to nbdkit-cow-filter. Any parameters -are passed through to and processed by the underlying plugin in the -normal way. +=over 4 + +=item B<cow-on-cache=true> + +Treat a client cache request as a shortcut for copying unmodified data +from the plugin to the overlay, rather than the default of passing +cache requests on to the plugin. 
This parameter defaults to false +(which leaves the overlay as small as possible), but setting it can be +useful for converting cache commands into a form of copy-on-read +behavior, in addition to the filter's normal copy-on-write semantics. + +=back =head1 EXAMPLES diff --git a/filters/cow/blk.h b/filters/cow/blk.h index 429bb53..1c1d922 100644 --- a/filters/cow/blk.h +++ b/filters/cow/blk.h @@ -1,5 +1,5 @@ /* nbdkit - * Copyright (C) 2018 Red Hat Inc. + * Copyright (C) 2018-2019 Red Hat Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are @@ -59,6 +59,20 @@ extern int blk_read (struct nbdkit_next_ops *next_ops, void *nxdata, uint64_t blknum, uint8_t *block, int *err) __attribute__((__nonnull__ (1, 4, 5))); +/* Cache mode for blocks not already in overlay */ +enum cache_mode { + BLK_CACHE_IGNORE, /* Do nothing */ + BLK_CACHE_PASSTHROUGH, /* Make cache request to plugin */ + BLK_CACHE_READ, /* Make ignored read request to plugin */ + BLK_CACHE_COW, /* Make read request to plugin, and write to overlay */ +}; + +/* Cache a single block from the plugin. */ +extern int blk_cache (struct nbdkit_next_ops *next_ops, void *nxdata, + uint64_t blknum, uint8_t *block, enum cache_mode, + int *err) + __attribute__((__nonnull__ (1, 4, 6))); + /* Write a single block. */ extern int blk_write (uint64_t blknum, const uint8_t *block, int *err) __attribute__((__nonnull__ (2, 3))); diff --git a/filters/cow/blk.c b/filters/cow/blk.c index 9c99aee..be43f2f 100644 --- a/filters/cow/blk.c +++ b/filters/cow/blk.c @@ -1,5 +1,5 @@ /* nbdkit - * Copyright (C) 2018 Red Hat Inc. + * Copyright (C) 2018-2019 Red Hat Inc. 
* * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are @@ -190,6 +190,45 @@ blk_read (struct nbdkit_next_ops *next_ops, void *nxdata, } } +int +blk_cache (struct nbdkit_next_ops *next_ops, void *nxdata, + uint64_t blknum, uint8_t *block, enum cache_mode mode, int *err) +{ + off_t offset = blknum * BLKSIZE; + bool allocated = blk_is_allocated (blknum); + + nbdkit_debug ("cow: blk_cache block %" PRIu64 " (offset %" PRIu64 ") is %s", + blknum, (uint64_t) offset, + !allocated ? "a hole" : "allocated"); + + if (allocated) { +#if HAVE_POSIX_FADVISE + int r = posix_fadvise (fd, offset, BLKSIZE, POSIX_FADV_WILLNEED); + if (r) { + errno = r; + nbdkit_error ("posix_fadvise: %m"); + return -1; + } +#endif + return 0; + } + if (mode == BLK_CACHE_IGNORE) + return 0; + if (mode == BLK_CACHE_PASSTHROUGH) + return next_ops->cache (nxdata, BLKSIZE, offset, 0, err); + if (next_ops->pread (nxdata, block, BLKSIZE, offset, 0, err) == -1) + return -1; + if (mode == BLK_CACHE_COW) { + if (pwrite (fd, block, BLKSIZE, offset) == -1) { + *err = errno; + nbdkit_error ("pwrite: %m"); + return -1; + } + blk_set_allocated (blknum); + } + return 0; +} + int blk_write (uint64_t blknum, const uint8_t *block, int *err) { diff --git a/filters/cow/cow.c b/filters/cow/cow.c index aa1348b..006007e 100644 --- a/filters/cow/cow.c +++ b/filters/cow/cow.c @@ -58,6 +58,8 @@ */ static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER; +bool cow_on_cache; + static void cow_load (void) { @@ -71,6 +73,24 @@ cow_unload (void) blk_free (); } +static int +cow_config (nbdkit_next_config *next, void *nxdata, + const char *key, const char *value) +{ + if (strcmp (key, "cow-on-cache") == 0) { + int r; + + r = nbdkit_parse_bool (value); + if (r == -1) + return -1; + cow_on_cache = r; + return 0; + } + else { + return next (nxdata, key, value); + } +} + static void * cow_open (nbdkit_next_open *next, void *nxdata, int readonly) { @@ 
-152,6 +172,12 @@ cow_can_fua (struct nbdkit_next_ops *next_ops, void *nxdata, void *handle) return NBDKIT_FUA_EMULATE; } +static int +cow_can_cache (struct nbdkit_next_ops *next_ops, void *nxdata, void *handle) +{ + return NBDKIT_CACHE_NATIVE; +} + static int cow_flush (struct nbdkit_next_ops *next_ops, void *nxdata, void *handle, uint32_t flags, int *err); /* Read data. */ @@ -391,6 +417,67 @@ cow_flush (struct nbdkit_next_ops *next_ops, void *nxdata, void *handle, return r; } +static int +cow_cache (struct nbdkit_next_ops *next_ops, void *nxdata, + void *handle, uint32_t count, uint64_t offset, + uint32_t flags, int *err) +{ + CLEANUP_FREE uint8_t *block = NULL; + uint64_t blknum, blkoffs; + int r; + uint64_t remaining = count; /* Rounding out could exceed 32 bits */ + enum cache_mode mode; /* XXX Cache this per connection? */ + + switch (next_ops->can_cache (nxdata)) { + case NBDKIT_CACHE_NONE: + mode = BLK_CACHE_IGNORE; + break; + case NBDKIT_CACHE_EMULATE: + mode = BLK_CACHE_READ; + break; + case NBDKIT_CACHE_NATIVE: + mode = BLK_CACHE_PASSTHROUGH; + break; + default: + *err = EINVAL; + return -1; + } + if (cow_on_cache) + mode = BLK_CACHE_COW; + + assert (!flags); + block = malloc (BLKSIZE); + if (block == NULL) { + *err = errno; + nbdkit_error ("malloc: %m"); + return -1; + } + + blknum = offset / BLKSIZE; /* block number */ + blkoffs = offset % BLKSIZE; /* offset within the block */ + + /* Unaligned head */ + remaining += blkoffs; + offset -= blkoffs; + + /* Unaligned tail */ + remaining = ROUND_UP (remaining, BLKSIZE); + + /* Aligned body */ + while (remaining) { + ACQUIRE_LOCK_FOR_CURRENT_SCOPE (&lock); + r = blk_cache (next_ops, nxdata, blknum, block, mode, err); + if (r == -1) + return -1; + + remaining -= BLKSIZE; + offset += BLKSIZE; + blknum++; + } + + return 0; +} + static struct nbdkit_filter filter = { .name = "cow", .longname = "nbdkit copy-on-write (COW) filter", @@ -398,6 +485,7 @@ static struct nbdkit_filter filter = { .load = cow_load, .unload 
= cow_unload, .open = cow_open, + .config = cow_config, .prepare = cow_prepare, .get_size = cow_get_size, .can_write = cow_can_write, @@ -405,10 +493,12 @@ static struct nbdkit_filter filter = { .can_trim = cow_can_trim, .can_extents = cow_can_extents, .can_fua = cow_can_fua, + .can_cache = cow_can_cache, .pread = cow_pread, .pwrite = cow_pwrite, .zero = cow_zero, .flush = cow_flush, + .cache = cow_cache, }; NBDKIT_REGISTER_FILTER(filter) -- 2.20.1
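The mode selection at the top of cow_cache can be read as a small decision table: the plugin's advertised cache support picks a default, and cow-on-cache overrides it. The sketch below restates that logic in isolation. The SKETCH_CACHE_* values are local stand-ins for NBDKIT_CACHE_NONE, NBDKIT_CACHE_EMULATE and NBDKIT_CACHE_NATIVE, and choose_mode is a hypothetical helper, not a function from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for the tri-state .can_cache result. */
enum { SKETCH_CACHE_NONE, SKETCH_CACHE_EMULATE, SKETCH_CACHE_NATIVE };

/* Same four modes as the enum added to filters/cow/blk.h. */
enum cache_mode {
  BLK_CACHE_IGNORE,       /* do nothing */
  BLK_CACHE_PASSTHROUGH,  /* forward the cache request to the plugin */
  BLK_CACHE_READ,         /* read from the plugin and discard the data */
  BLK_CACHE_COW           /* read from the plugin, write to the overlay */
};

/* Pick the mode for blocks not yet in the overlay.
 * Returns 0 and stores the mode, or -1 for an unknown can_cache value. */
static int
choose_mode (int plugin_can_cache, bool cow_on_cache, enum cache_mode *mode)
{
  switch (plugin_can_cache) {
  case SKETCH_CACHE_NONE:    *mode = BLK_CACHE_IGNORE;      break;
  case SKETCH_CACHE_EMULATE: *mode = BLK_CACHE_READ;        break;
  case SKETCH_CACHE_NATIVE:  *mode = BLK_CACHE_PASSTHROUGH; break;
  default:
    return -1;
  }
  if (cow_on_cache)          /* user opted in: always populate the overlay */
    *mode = BLK_CACHE_COW;
  return 0;
}
```

Note that, as in the patch, an unknown can_cache value is rejected before the cow-on-cache override is considered.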
Eric Blake
2019-May-16 03:58 UTC
[Libguestfs] [nbdkit PATCH v2 16/24] delay: Implement .cache
In the delay filter, all we need to do is pass through cache operations, but with an added delay (well, technically, this isn't enabled until a later patch flips the default for .can_cache to pass-through). Signed-off-by: Eric Blake <eblake@redhat.com> --- filters/delay/nbdkit-delay-filter.pod | 9 ++++++++- filters/delay/delay.c | 27 ++++++++++++++++++++++++++- 2 files changed, 34 insertions(+), 2 deletions(-) diff --git a/filters/delay/nbdkit-delay-filter.pod b/filters/delay/nbdkit-delay-filter.pod index 2e2ac74..730cea4 100644 --- a/filters/delay/nbdkit-delay-filter.pod +++ b/filters/delay/nbdkit-delay-filter.pod @@ -11,7 +11,7 @@ nbdkit-delay-filter - nbdkit delay filter nbdkit --filter=delay plugin [plugin-args ...] delay-read=(SECS|NNms) delay-write=(SECS|NNms) delay-zero=(SECS|NNms) delay-trim=(SECS|NNms) - delay-extents=(SECS|NNms) + delay-extents=(SECS|NNms) delay-cache=(SECS|NNms) =head1 DESCRIPTION @@ -71,6 +71,13 @@ Delay trim/discard operations by C<SECS> seconds or C<NN> milliseconds. Delay block status (extents) operations by C<SECS> seconds or C<NN> milliseconds. +=item B<delay-cache=>SECS + +=item B<delay-cache=>NNB<ms> + +Delay advisory cache operations by C<SECS> seconds or C<NN> +milliseconds. + =item B<wdelay=>SECS =item B<wdelay=>NNB<ms> diff --git a/filters/delay/delay.c b/filters/delay/delay.c index 862af93..486a24e 100644 --- a/filters/delay/delay.c +++ b/filters/delay/delay.c @@ -1,5 +1,5 @@ /* nbdkit - * Copyright (C) 2018 Red Hat Inc. + * Copyright (C) 2018-2019 Red Hat Inc. 
* * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are @@ -47,6 +47,7 @@ static int delay_write_ms = 0; /* write delay (milliseconds) */ static int delay_zero_ms = 0; /* zero delay (milliseconds) */ static int delay_trim_ms = 0; /* trim delay (milliseconds) */ static int delay_extents_ms = 0;/* extents delay (milliseconds) */ +static int delay_cache_ms = 0; /* cache delay (milliseconds) */ static int parse_delay (const char *key, const char *value) @@ -116,6 +117,12 @@ extents_delay (void) delay (delay_extents_ms); } +static void +cache_delay (void) +{ + delay (delay_cache_ms); +} + /* Called for each key=value passed on the command line. */ static int delay_config (nbdkit_next_config *next, void *nxdata, @@ -167,6 +174,12 @@ delay_config (nbdkit_next_config *next, void *nxdata, return -1; return 0; } + else if (strcmp (key, "delay-cache") == 0) { + delay_cache_ms = parse_delay (key, value); + if (delay_cache_ms == -1) + return -1; + return 0; + } else return next (nxdata, key, value); } @@ -178,6 +191,7 @@ delay_config (nbdkit_next_config *next, void *nxdata, "delay-zero=<NN>[ms] Zero delay in seconds/milliseconds.\n" \ "delay-trim=<NN>[ms] Trim delay in seconds/milliseconds.\n" \ "delay-extents=<NN>[ms] Extents delay in seconds/milliseconds.\n" \ + "delay-cache=<NN>[ms] Cache delay in seconds/milliseconds.\n" \ "wdelay=<NN>[ms] Write, zero and trim delay in secs/msecs." /* Read data. */ @@ -231,6 +245,16 @@ delay_extents (struct nbdkit_next_ops *next_ops, void *nxdata, return next_ops->extents (nxdata, count, offset, flags, extents, err); } +/* Cache. 
*/ +static int +delay_cache (struct nbdkit_next_ops *next_ops, void *nxdata, + void *handle, uint32_t count, uint64_t offset, uint32_t flags, + int *err) +{ + cache_delay (); + return next_ops->cache (nxdata, count, offset, flags, err); +} + static struct nbdkit_filter filter = { .name = "delay", .longname = "nbdkit delay filter", @@ -242,6 +266,7 @@ static struct nbdkit_filter filter = { .zero = delay_zero, .trim = delay_trim, .extents = delay_extents, + .cache = delay_cache, }; NBDKIT_REGISTER_FILTER(filter) -- 2.20.1
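The new delay-cache parameter accepts the same two spellings as the other delays: a plain number of seconds or a number suffixed with ms. The body of parse_delay is not part of this patch, so the following is only one plausible way to parse that syntax into milliseconds; parse_delay_ms is a hypothetical function and should not be read as the filter's actual parser.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Parse "SECS" or "NNms" into milliseconds.
 * Returns the delay, or -1 if the value is malformed, negative,
 * or too large to represent. */
static int
parse_delay_ms (const char *value)
{
  char *end;
  long n;

  errno = 0;
  n = strtol (value, &end, 10);
  if (errno != 0 || end == value || n < 0)
    return -1;

  if (strcmp (end, "ms") == 0)          /* already in milliseconds */
    return n > INT_MAX ? -1 : (int) n;

  if (*end != '\0')                     /* unknown suffix */
    return -1;

  if (n > INT_MAX / 1000)               /* seconds -> ms would overflow */
    return -1;
  return (int) n * 1000;
}
```

Under these assumptions delay-cache=2 and delay-cache=2000ms describe the same delay.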
Eric Blake
2019-May-16 03:58 UTC
[Libguestfs] [nbdkit PATCH v2 17/24] error: Implement .cache
In the error filter, all we need to do is copy-and-paste existing per-command errors to one more class of commands (well, technically, this isn't enabled until a later patch flips the default for .can_cache to pass-through). Signed-off-by: Eric Blake <eblake@redhat.com> --- filters/error/nbdkit-error-filter.pod | 8 ++++++- filters/error/error.c | 32 ++++++++++++++++++++++++--- 2 files changed, 36 insertions(+), 4 deletions(-) diff --git a/filters/error/nbdkit-error-filter.pod b/filters/error/nbdkit-error-filter.pod index 16ef184..eb738a3 100644 --- a/filters/error/nbdkit-error-filter.pod +++ b/filters/error/nbdkit-error-filter.pod @@ -13,6 +13,7 @@ nbdkit-error-filter - inject errors for testing clients [error-trim=...] [error-trim-rate=...] [error-trim-file=...] [error-zero=...] [error-zero-rate=...] [error-zero-file=...] [error-extents=...] [error-extents-rate=...] [error-extents-file=...] + [error-cache=...] [error-cache-rate=...] [error-cache-file=...] =head1 DESCRIPTION @@ -30,7 +31,7 @@ Inject a low rate of errors randomly into the connection: nbdkit --filter=error file disk.img error-rate=1% -Reading, trimming and extents (block status) requests will be +Reading, trimming, cache and extents (block status) requests will be successful, but all writes and zeroing will return "No space left on device": @@ -112,6 +113,11 @@ settings to NBD zero requests. Same as C<error>, C<error-rate> and C<error-file> but only apply the settings to NBD block status requests to read extents. +=item B<error-cache>, B<error-cache-rate>, B<error-cache-file>. + +Same as C<error>, C<error-rate> and C<error-file> but only apply the +settings to NBD cache requests. + =back =head1 NOTES diff --git a/filters/error/error.c b/filters/error/error.c index 8932292..aba6213 100644 --- a/filters/error/error.c +++ b/filters/error/error.c @@ -1,5 +1,5 @@ /* nbdkit - * Copyright (C) 2018 Red Hat Inc. + * Copyright (C) 2018-2019 Red Hat Inc. 
* * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are @@ -64,6 +64,7 @@ static struct error_settings pwrite_settings = ERROR_DEFAULT; static struct error_settings trim_settings = ERROR_DEFAULT; static struct error_settings zero_settings = ERROR_DEFAULT; static struct error_settings extents_settings = ERROR_DEFAULT; +static struct error_settings cache_settings = ERROR_DEFAULT; /* Random state. * This must only be accessed when holding the lock (except for load). @@ -85,6 +86,7 @@ error_unload (void) free (trim_settings.file); free (zero_settings.file); free (extents_settings.file); + free (cache_settings.file); } static const struct { const char *name; int error; } errors[] = { @@ -165,7 +167,7 @@ error_config (nbdkit_next_config *next, void *nxdata, return -1; pread_settings.error = pwrite_settings.error = trim_settings.error = zero_settings.error = - extents_settings.error = i; + extents_settings.error = cache_settings.error = i; return 0; } else if (strcmp (key, "error-pread") == 0) @@ -178,13 +180,15 @@ error_config (nbdkit_next_config *next, void *nxdata, return parse_error (key, value, &zero_settings.error); else if (strcmp (key, "error-extents") == 0) return parse_error (key, value, &extents_settings.error); + else if (strcmp (key, "error-cache") == 0) + return parse_error (key, value, &cache_settings.error); else if (strcmp (key, "error-rate") == 0) { if (parse_error_rate (key, value, &d) == -1) return -1; pread_settings.rate = pwrite_settings.rate = trim_settings.rate = zero_settings.rate = - extents_settings.rate = d; + extents_settings.rate = cache_settings.rate = d; return 0; } else if (strcmp (key, "error-pread-rate") == 0) @@ -197,6 +201,8 @@ error_config (nbdkit_next_config *next, void *nxdata, return parse_error_rate (key, value, &zero_settings.rate); else if (strcmp (key, "error-extents-rate") == 0) return parse_error_rate (key, value, &extents_settings.rate); + else 
if (strcmp (key, "error-cache-rate") == 0) + return parse_error_rate (key, value, &cache_settings.rate); /* NB: We are using nbdkit_absolute_path here because the trigger * file probably doesn't exist yet. @@ -212,6 +218,8 @@ error_config (nbdkit_next_config *next, void *nxdata, zero_settings.file = nbdkit_absolute_path (value); free (extents_settings.file); extents_settings.file = nbdkit_absolute_path (value); + free (cache_settings.file); + cache_settings.file = nbdkit_absolute_path (value); return 0; } else if (strcmp (key, "error-pread-file") == 0) { @@ -239,6 +247,11 @@ error_config (nbdkit_next_config *next, void *nxdata, extents_settings.file = nbdkit_absolute_path (value); return 0; } + else if (strcmp (key, "error-cache-file") == 0) { + free (cache_settings.file); + cache_settings.file = nbdkit_absolute_path (value); + return 0; + } else return next (nxdata, key, value); @@ -349,6 +362,18 @@ error_extents (struct nbdkit_next_ops *next_ops, void *nxdata, return next_ops->extents (nxdata, count, offset, flags, extents, err); } +/* Cache. */ +static int +error_cache (struct nbdkit_next_ops *next_ops, void *nxdata, + void *handle, uint32_t count, uint64_t offset, + uint32_t flags, int *err) +{ + if (random_error (&cache_settings, "cache", err)) + return -1; + + return next_ops->cache (nxdata, count, offset, flags, err); +} + static struct nbdkit_filter filter = { .name = "error", .longname = "nbdkit error filter", @@ -362,6 +387,7 @@ static struct nbdkit_filter filter = { .trim = error_trim, .zero = error_zero, .extents = error_extents, + .cache = error_cache, }; NBDKIT_REGISTER_FILTER(filter) -- 2.20.1
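error_cache relies on random_error to decide whether a given cache request should fail, but that helper's body is outside this patch. To illustrate how a per-command error code and rate could combine, here is a deterministic sketch in which the random draw is passed in by the caller. struct sketch_settings and maybe_inject are invented for this example, and the trigger-file check that error-cache-file implies is deliberately left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, reduced version of the per-command settings. */
struct sketch_settings {
  int error;     /* errno value to inject, e.g. ENOSPC */
  double rate;   /* probability in [0, 1] */
};

/* Decide whether to inject an error for one request.
 * draw is a caller-supplied random number in [0, 1).
 * On injection, stores the error in *err and returns true. */
static bool
maybe_inject (const struct sketch_settings *s, double draw, int *err)
{
  if (s->rate <= 0)                  /* injection disabled */
    return false;
  if (s->rate < 1 && draw >= s->rate)
    return false;                    /* this request is let through */
  *err = s->error;
  return true;
}
```

With a rate of 0.5, a draw of 0.25 injects the configured error while a draw of 0.75 lets the request continue to next_ops->cache.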
Eric Blake
2019-May-16 03:58 UTC
[Libguestfs] [nbdkit PATCH v2 18/24] log: Implement .cache
In the log filter, all we need to do is copy-and-paste existing
per-command logging to add the initial value of advertised cache
support, as well as logging actual cache commands (well, technically,
this isn't enabled until a later patch flips the default for
.can_cache to pass-through).

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 filters/log/nbdkit-log-filter.pod | 10 +++++-----
 filters/log/log.c                 | 29 +++++++++++++++++++++++++----
 2 files changed, 30 insertions(+), 9 deletions(-)

diff --git a/filters/log/nbdkit-log-filter.pod b/filters/log/nbdkit-log-filter.pod
index 973024b..9e102bc 100644
--- a/filters/log/nbdkit-log-filter.pod
+++ b/filters/log/nbdkit-log-filter.pod
@@ -46,11 +46,11 @@ the impact of the caching.
 This filter writes to the file specified by the C<logfile=FILE>
 parameter.  All lines include a timestamp, a connection counter, then
 details about the command.  The following actions are logged: Connect,
-Read, Write, Zero, Trim, Extents, Flush, and Disconnect.  Except for
-Connect and Disconnect, an event is logged across two lines for call
-and return value, to allow tracking duration and tracing any parallel
-execution, using id for correlation (incremented per action on the
-connection).
+Read, Write, Zero, Trim, Extents, Cache, Flush, and Disconnect.
+Except for Connect and Disconnect, an event is logged across two lines
+for call and return value, to allow tracking duration and tracing any
+parallel execution, using id for correlation (incremented per action
+on the connection).

 An example logging session of a client that performs a single
 successful read is:
diff --git a/filters/log/log.c b/filters/log/log.c
index 466160e..133e352 100644
--- a/filters/log/log.c
+++ b/filters/log/log.c
@@ -1,5 +1,5 @@
 /* nbdkit
- * Copyright (C) 2018 Red Hat Inc.
+ * Copyright (C) 2018-2019 Red Hat Inc.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions are
@@ -243,13 +243,15 @@ log_prepare (struct nbdkit_next_ops *next_ops, void *nxdata, void *handle)
   int z = next_ops->can_zero (nxdata);
   int F = next_ops->can_fua (nxdata);
   int e = next_ops->can_extents (nxdata);
+  int c = next_ops->can_cache (nxdata);

-  if (size < 0 || w < 0 || f < 0 || r < 0 || t < 0 || z < 0 || F < 0 || e < 0)
+  if (size < 0 || w < 0 || f < 0 || r < 0 || t < 0 || z < 0 || F < 0 ||
+      e < 0 || c < 0)
     return -1;

   output (h, "Connect", 0, "size=0x%" PRIx64 " write=%d flush=%d "
-          "rotational=%d trim=%d zero=%d fua=%d extents=%d",
-          size, w, f, r, t, z, F, e);
+          "rotational=%d trim=%d zero=%d fua=%d extents=%d cache=%d",
+          size, w, f, r, t, z, F, e, c);
   return 0;
 }

@@ -396,6 +398,24 @@ log_extents (struct nbdkit_next_ops *next_ops, void *nxdata,
   return r;
 }

+/* Cache data. */
+static int
+log_cache (struct nbdkit_next_ops *next_ops, void *nxdata,
+           void *handle, uint32_t count, uint64_t offs, uint32_t flags,
+           int *err)
+{
+  struct handle *h = handle;
+  uint64_t id = get_id (h);
+  int r;
+
+  assert (!flags);
+  output (h, "Cache", id, "offset=0x%" PRIx64 " count=0x%x ...",
+          offs, count);
+  r = next_ops->cache (nxdata, count, offs, flags, err);
+  output_return (h, "...Cache", id, r, err);
+  return r;
+}
+
 static struct nbdkit_filter filter = {
   .name = "log",
   .longname = "nbdkit log filter",
@@ -414,6 +434,7 @@ static struct nbdkit_filter filter = {
   .trim = log_trim,
   .zero = log_zero,
   .extents = log_extents,
+  .cache = log_cache,
 };

 NBDKIT_REGISTER_FILTER(filter)
-- 
2.20.1
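The two-line logging scheme described in the log filter documentation (one line when the call starts, one when it returns, tied together by an id) can be sketched as a pair of formatting helpers. The exact line layout below is an assumption made for illustration, not the filter's real output format, and `sketch_format_call` and `sketch_format_return` are hypothetical names.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Format the "call" line for a cache request into buf. */
static int
sketch_format_call (char *buf, size_t len, uint64_t id,
                    uint64_t offs, uint32_t count)
{
  assert (buf != NULL && len > 0);
  return snprintf (buf, len,
                   "Cache id=%" PRIu64 " offset=0x%" PRIx64 " count=0x%x ...",
                   id, offs, (unsigned) count);
}

/* Format the matching "return" line; r is the handler's result. */
static int
sketch_format_return (char *buf, size_t len, uint64_t id, int r)
{
  assert (buf != NULL && len > 0);
  return snprintf (buf, len, "...Cache id=%" PRIu64 " return=%d", id, r);
}
```

Because both lines carry the same id, a reader of the log can pair them even when other requests on the connection are logged in between.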
Eric Blake
2019-May-16 03:58 UTC
[Libguestfs] [nbdkit PATCH v2 19/24] offset, partition: Implement .cache
In the offset and partition filters, all we need to do is adjust cache
requests to the correct offset (well, technically, this isn't enabled
until a later patch flips the default for .can_cache to pass-through).

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 filters/offset/offset.c       | 12 +++++++++++-
 filters/partition/partition.c | 12 ++++++++++++
 2 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/filters/offset/offset.c b/filters/offset/offset.c
index 633a1c7..fe07d28 100644
--- a/filters/offset/offset.c
+++ b/filters/offset/offset.c
@@ -1,5 +1,5 @@
 /* nbdkit
- * Copyright (C) 2018 Red Hat Inc.
+ * Copyright (C) 2018-2019 Red Hat Inc.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions are
@@ -164,6 +164,15 @@ offset_extents (struct nbdkit_next_ops *next_ops, void *nxdata,
   return 0;
 }

+/* Cache data. */
+static int
+offset_cache (struct nbdkit_next_ops *next_ops, void *nxdata,
+              void *handle, uint32_t count, uint64_t offs, uint32_t flags,
+              int *err)
+{
+  return next_ops->cache (nxdata, count, offs + offset, flags, err);
+}
+
 static struct nbdkit_filter filter = {
   .name = "offset",
   .longname = "nbdkit offset filter",
@@ -177,6 +186,7 @@ static struct nbdkit_filter filter = {
   .trim = offset_trim,
   .zero = offset_zero,
   .extents = offset_extents,
+  .cache = offset_cache,
 };

 NBDKIT_REGISTER_FILTER(filter)
diff --git a/filters/partition/partition.c b/filters/partition/partition.c
index a635df8..ee2cc77 100644
--- a/filters/partition/partition.c
+++ b/filters/partition/partition.c
@@ -254,6 +254,17 @@ partition_extents (struct nbdkit_next_ops *next_ops, void *nxdata,
   return 0;
 }

+/* Cache data. */
+static int
+partition_cache (struct nbdkit_next_ops *next_ops, void *nxdata,
+                 void *handle, uint32_t count, uint64_t offs, uint32_t flags,
+                 int *err)
+{
+  struct handle *h = handle;
+
+  return next_ops->cache (nxdata, count, offs + h->offset, flags, err);
+}
+
 static struct nbdkit_filter filter = {
   .name = "partition",
   .longname = "nbdkit partition filter",
@@ -270,6 +281,7 @@ static struct nbdkit_filter filter = {
   .trim = partition_trim,
   .zero = partition_zero,
   .extents = partition_extents,
+  .cache = partition_cache,
 };

 NBDKIT_REGISTER_FILTER(filter)
-- 
2.20.1
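Both filters do the same thing to a cache request: add a base offset and pass it on. A minimal sketch of that translation follows, with a hypothetical next layer that records what it was asked to cache. `sketch_request`, `sketch_next_cache` and `sketch_shifted_cache` are illustrative names, and the overflow assertion is an addition of this sketch, not something the patch does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record of what the next layer was asked to cache. */
struct sketch_request {
  uint32_t count;
  uint64_t offset;
};

/* Hypothetical next layer: remember the request and succeed. */
static int
sketch_next_cache (struct sketch_request *seen,
                   uint32_t count, uint64_t offs)
{
  seen->count = count;
  seen->offset = offs;
  return 0;
}

/* Translate a client-relative offset into a plugin-relative one.
 * 'base' plays the role of the offset filter's global 'offset' or the
 * partition filter's per-handle 'h->offset'.
 */
static int
sketch_shifted_cache (struct sketch_request *seen, uint64_t base,
                      uint32_t count, uint64_t offs)
{
  assert (offs <= UINT64_MAX - base);   /* sketch-only overflow guard */
  return sketch_next_cache (seen, count, offs + base);
}
```

The count is unchanged; only the starting position moves, which is why the real callbacks are one-liners.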
Eric Blake
2019-May-16 03:58 UTC
[Libguestfs] [nbdkit PATCH v2 20/24] readahead, xz: Implement .can_cache via emulation
The readahead and xz filters serve as a limited cache, so we want to
advertise caching regardless of what the underlying plugin provides.
But it turns out that we are best serviced by relying on nbdkit's
ability to emulate caching by .pread requests.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 filters/readahead/readahead.c | 13 +++++++++++++
 filters/xz/xz.c               | 15 ++++++++++++++-
 2 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/filters/readahead/readahead.c b/filters/readahead/readahead.c
index dc27bae..95bda0e 100644
--- a/filters/readahead/readahead.c
+++ b/filters/readahead/readahead.c
@@ -109,6 +109,18 @@ readahead_get_size (struct nbdkit_next_ops *next_ops, void *nxdata,
   return r;
 }

+/* Cache */
+static int
+readahead_can_cache (struct nbdkit_next_ops *next_ops, void *nxdata,
+                     void *handle)
+{
+  /* We are already operating as a cache regardless of the plugin's
+   * underlying .can_cache, but it's easiest to just rely on nbdkit's
+   * behavior of calling .pread for caching.
+   */
+  return NBDKIT_CACHE_EMULATE;
+}
+
 /* Read data. */

 static int
@@ -241,6 +253,7 @@ static struct nbdkit_filter filter = {
   .unload = readahead_unload,
   .prepare = readahead_prepare,
   .get_size = readahead_get_size,
+  .can_cache = readahead_can_cache,
   .pread = readahead_pread,
   .pwrite = readahead_pwrite,
   .trim = readahead_trim,
diff --git a/filters/xz/xz.c b/filters/xz/xz.c
index 366000b..8ada294 100644
--- a/filters/xz/xz.c
+++ b/filters/xz/xz.c
@@ -1,5 +1,5 @@
 /* nbdkit
- * Copyright (C) 2013-2018 Red Hat Inc.
+ * Copyright (C) 2013-2019 Red Hat Inc.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions are
@@ -194,6 +194,18 @@ xz_can_extents (struct nbdkit_next_ops *next_ops, void *nxdata,
   return 0;
 }

+/* Cache */
+static int
+xz_can_cache (struct nbdkit_next_ops *next_ops, void *nxdata,
+              void *handle)
+{
+  /* We are already operating as a cache regardless of the plugin's
+   * underlying .can_cache, but it's easiest to just rely on nbdkit's
+   * behavior of calling .pread for caching.
+   */
+  return NBDKIT_CACHE_EMULATE;
+}
+
 /* Read data from the file. */
 static int
 xz_pread (struct nbdkit_next_ops *next_ops, void *nxdata,
@@ -248,6 +260,7 @@ static struct nbdkit_filter filter = {
   .get_size = xz_get_size,
   .can_write = xz_can_write,
   .can_extents = xz_can_extents,
+  .can_cache = xz_can_cache,
   .pread = xz_pread,
 };

-- 
2.20.1
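The idea behind returning the emulate state is that a cache request is served by reading the range through the layer and throwing the data away, so a layer that caches on read gets populated as a side effect. The following is a rough sketch of such an emulation loop, not the server's actual implementation: the enum, the 64 KiB scratch size, and the counting `sketch_pread` callback are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tri-state mirroring none / emulate / native. */
enum sketch_cache_mode { SKETCH_NONE, SKETCH_EMULATE, SKETCH_NATIVE };

#define SKETCH_BUF_SIZE 65536u

static unsigned sketch_pread_calls;   /* how many reads were issued */
static uint64_t sketch_pread_bytes;   /* how many bytes were read */

/* Hypothetical read callback: count the request and succeed. */
static int
sketch_pread (void *buf, uint32_t count, uint64_t offset)
{
  (void) buf; (void) offset;
  sketch_pread_calls++;
  sketch_pread_bytes += count;
  return 0;
}

/* Serve a cache request by reading the range in chunks and discarding
 * the data.  The static scratch buffer keeps the sketch short but is
 * not safe for concurrent callers.
 */
static int
sketch_emulate_cache (uint32_t count, uint64_t offset)
{
  static char scratch[SKETCH_BUF_SIZE];

  while (count > 0) {
    uint32_t n = count < SKETCH_BUF_SIZE ? count : SKETCH_BUF_SIZE;

    if (sketch_pread (scratch, n, offset) == -1)
      return -1;
    count -= n;
    offset += n;
  }
  return 0;
}

/* Dispatch on the advertised mode. */
static int
sketch_handle_cache (enum sketch_cache_mode mode,
                     uint32_t count, uint64_t offset)
{
  assert (mode != SKETCH_NONE);   /* never advertised, so never requested */
  if (mode == SKETCH_EMULATE)
    return sketch_emulate_cache (count, offset);
  return 0;   /* SKETCH_NATIVE: a real layer would call its .cache here */
}
```

A 150000-byte request with this chunk size turns into three reads, two full chunks and one partial.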
Eric Blake
2019-May-16 03:58 UTC
[Libguestfs] [nbdkit PATCH v2 21/24] stats: Implement .cache
In the stats filter, all we need to do is copy-and-paste existing
per-command stats to one more class of commands (well, technically,
this isn't enabled until a later patch flips the default for
.can_cache to pass-through).

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 filters/stats/stats.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/filters/stats/stats.c b/filters/stats/stats.c
index 785cce6..037fc61 100644
--- a/filters/stats/stats.c
+++ b/filters/stats/stats.c
@@ -61,6 +61,7 @@ static uint64_t pwrite_ops, pwrite_bytes;
 static uint64_t trim_ops, trim_bytes;
 static uint64_t zero_ops, zero_bytes;
 static uint64_t extents_ops, extents_bytes;
+static uint64_t cache_ops, cache_bytes;

 static inline double
 calc_bps (uint64_t bytes, int64_t usecs)
@@ -88,6 +89,9 @@ print_stats (int64_t usecs)
   if (extents_ops > 0)
     fprintf (fp, "extents: %" PRIu64 " ops, %" PRIu64 " bytes, %g bits/s\n",
              extents_ops, extents_bytes, calc_bps (extents_bytes, usecs));
+  if (cache_ops > 0)
+    fprintf (fp, "cache: %" PRIu64 " ops, %" PRIu64 " bytes, %g bits/s\n",
+             cache_ops, cache_bytes, calc_bps (cache_bytes, usecs));
   fflush (fp);
 }

@@ -246,6 +250,24 @@ stats_extents (struct nbdkit_next_ops *next_ops, void *nxdata,
   return r;
 }

+/* Cache. */
+static int
+stats_cache (struct nbdkit_next_ops *next_ops, void *nxdata,
+             void *handle,
+             uint32_t count, uint64_t offset, uint32_t flags,
+             int *err)
+{
+  int r;
+
+  r = next_ops->cache (nxdata, count, offset, flags, err);
+  if (r == 0) {
+    ACQUIRE_LOCK_FOR_CURRENT_SCOPE (&lock);
+    cache_ops++;
+    cache_bytes += count;
+  }
+  return r;
+}
+
 static struct nbdkit_filter filter = {
   .name = "stats",
   .longname = "nbdkit stats filter",
@@ -258,6 +280,7 @@ static struct nbdkit_filter filter = {
   .trim = stats_trim,
   .zero = stats_zero,
   .extents = stats_extents,
+  .cache = stats_cache,
 };

 NBDKIT_REGISTER_FILTER(filter)
-- 
2.20.1
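The accounting added for cache commands is the same as for the other commands: count only successful requests, then report a rate in bits per second. A minimal single-threaded sketch follows. The names are hypothetical, the lock that the real filter takes is omitted, and `sketch_bps` is a guess at what a helper like `calc_bps` computes, since its body is not shown in the patch.

```c
#include <assert.h>
#include <stdint.h>

static uint64_t sketch_cache_ops, sketch_cache_bytes;

/* Record one cache request of 'count' bytes whose result was 'r'. */
static void
sketch_record_cache (int r, uint32_t count)
{
  if (r == 0) {           /* failed requests are not counted */
    sketch_cache_ops++;
    sketch_cache_bytes += count;
  }
}

/* Bits per second for 'bytes' handled in 'usecs' microseconds. */
static double
sketch_bps (uint64_t bytes, int64_t usecs)
{
  assert (usecs > 0);
  return 8.0 * (double) bytes / (double) usecs * 1000000.0;
}
```

For example, one million bytes in one second (1000000 microseconds) works out to eight million bits per second.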
Eric Blake
2019-May-16 03:58 UTC
[Libguestfs] [nbdkit PATCH v2 22/24] truncate: Implement .cache
In the truncate filter, all we need to do is copy-and-paste existing
per-command count rewriting (well, technically, this isn't enabled
until a later patch flips the default for .can_cache to pass-through).

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 filters/truncate/truncate.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/filters/truncate/truncate.c b/filters/truncate/truncate.c
index 6408c35..38b1cd9 100644
--- a/filters/truncate/truncate.c
+++ b/filters/truncate/truncate.c
@@ -354,6 +354,29 @@ truncate_extents (struct nbdkit_next_ops *next_ops, void *nxdata,
   return 0;
 }

+/* Cache. */
+static int
+truncate_cache (struct nbdkit_next_ops *next_ops, void *nxdata,
+                void *handle, uint32_t count, uint64_t offset,
+                uint32_t flags, int *err)
+{
+  int r;
+  uint32_t n;
+  struct handle *h = handle;
+
+  if (offset < h->real_size) {
+    if (offset + count <= h->real_size)
+      n = count;
+    else
+      n = h->real_size - offset;
+    r = next_ops->cache (nxdata, n, offset, flags, err);
+    if (r == -1)
+      return -1;
+  }
+
+  return 0;
+}
+
 static struct nbdkit_filter filter = {
   .name = "truncate",
   .longname = "nbdkit truncate filter",
@@ -369,6 +392,7 @@ static struct nbdkit_filter filter = {
   .trim = truncate_trim,
   .zero = truncate_zero,
   .extents = truncate_extents,
+  .cache = truncate_cache,
 };

 NBDKIT_REGISTER_FILTER(filter)
-- 
2.20.1
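The count rewriting in `truncate_cache` boils down to clamping the request to the part that lies inside the underlying plugin's real size; anything past that point exists only in the filter and needs no forwarding. Here is a standalone sketch of that clamp, with `sketch_clamp_count` as a hypothetical helper name. It compares against `real_size - offset` instead of computing `offset + count`, which is a choice of this sketch to sidestep any overflow concern.

```c
#include <assert.h>
#include <stdint.h>

/* How many bytes of [offset, offset+count) lie below real_size.
 * A result of 0 means nothing needs to be forwarded.
 */
static uint32_t
sketch_clamp_count (uint64_t offset, uint32_t count, uint64_t real_size)
{
  if (offset >= real_size)
    return 0;
  if (count <= real_size - offset)
    return count;
  /* Safe narrowing: the difference is smaller than count here. */
  return (uint32_t) (real_size - offset);
}
```

With a real size of 120 bytes, a 50-byte request at offset 0 is forwarded whole, one at offset 100 is cut to 20 bytes, and one at offset 120 or beyond forwards nothing.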
Eric Blake
2019-May-16 03:58 UTC
[Libguestfs] [nbdkit PATCH v2 23/24] filters: Pass through .can_cache for remaining filters
The previous patches finished accounting for all filters that must
modify behavior of caching. All remaining filters can pass through
.can_cache to the plugin.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 server/filters.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/server/filters.c b/server/filters.c
index 430d515..3bd91fe 100644
--- a/server/filters.c
+++ b/server/filters.c
@@ -602,11 +602,8 @@ filter_can_cache (struct backend *b, struct connection *conn)

   if (f->filter.can_cache)
     return f->filter.can_cache (&next_ops, &nxdata, handle);
-  /* FIXME: Default to f->backend.next->can_cache, once all filters
-     have been audited */
   else
-    return NBDKIT_CACHE_NONE;
-  return f->backend.next->can_cache (f->backend.next, conn);
+    return f->backend.next->can_cache (f->backend.next, conn);
 }

 static int
-- 
2.20.1
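After this change, a filter that does not define `.can_cache` simply defers to whatever sits beneath it. A toy model of that resolution over a chain of layers might look like the following. The `sketch_layer` structure and callbacks are hypothetical and far simpler than the server's backend objects.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tri-state mirroring none / emulate / native. */
enum sketch_cache_mode { SKETCH_NONE, SKETCH_EMULATE, SKETCH_NATIVE };

/* Hypothetical layer: a filter or, at the end of the chain, the plugin. */
struct sketch_layer {
  int (*can_cache) (void);          /* NULL if the layer has no opinion */
  const struct sketch_layer *next;  /* NULL for the plugin */
};

static int sketch_emulate (void) { return SKETCH_EMULATE; }
static int sketch_native (void) { return SKETCH_NATIVE; }

/* Ask the outermost layer; layers without a callback defer inward. */
static int
sketch_can_cache (const struct sketch_layer *l)
{
  if (l->can_cache)
    return l->can_cache ();
  if (l->next)
    return sketch_can_cache (l->next);
  return SKETCH_NONE;   /* plugin without a callback: nothing advertised */
}
```

A filter with no callback stacked on a plugin that answers native reports native, while putting a filter that answers emulate on top of both changes the result to emulate.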
Eric Blake
2019-May-16 03:58 UTC
[Libguestfs] [nbdkit PATCH v2 24/24] nocache: Implement new filter
Similar to the existing fua, nozero and noextents filters, add a filter to make it easy to override the basic caching functionality, in part to facilitate timing tests of whether a plugin's cache implementation is worthwhile. A worthwhile test to add to the testsuite would connect the log filter both before and after the nocache filter, to prove how caching requests are altered. However, until we use libnbd in the nbdkit testsuite, there isn't really a clean way to trigger NBD_CMD_CACHE calls to test this (qemu-nbd can't do it). But I can at least exploit test-eflags to use 'qemu-nbd --list' to demonstrate that the advertised flag changes, with a little tweak to allow the test to run filters with arguments. Signed-off-by: Eric Blake <eblake@redhat.com> --- filters/fua/nbdkit-fua-filter.pod | 5 +- filters/nocache/nbdkit-nocache-filter.pod | 69 +++++++++++ filters/noextents/nbdkit-noextents-filter.pod | 2 + filters/nozero/nbdkit-nozero-filter.pod | 2 + configure.ac | 2 + filters/nocache/nocache.c | 113 ++++++++++++++++++ filters/nocache/Makefile.am | 61 ++++++++++ tests/test-eflags.sh | 21 +++- 8 files changed, 272 insertions(+), 3 deletions(-) create mode 100644 filters/nocache/nbdkit-nocache-filter.pod create mode 100644 filters/nocache/nocache.c create mode 100644 filters/nocache/Makefile.am diff --git a/filters/fua/nbdkit-fua-filter.pod b/filters/fua/nbdkit-fua-filter.pod index bd08a8c..b76917b 100644 --- a/filters/fua/nbdkit-fua-filter.pod +++ b/filters/fua/nbdkit-fua-filter.pod @@ -66,7 +66,10 @@ L<nbdkit(1)>, L<nbdkit-file-plugin(1)>, L<nbdkit-filter(3)>, L<nbdkit-blocksize-filter(1)>, -L<nbdkit-log-filter(1)>. +L<nbdkit-log-filter(1)>, +L<nbdkit-nocache-filter(1)>, +L<nbdkit-noextents-filter(1)>, +L<nbdkit-nozero-filter(1)>. 
=head1 AUTHORS diff --git a/filters/nocache/nbdkit-nocache-filter.pod b/filters/nocache/nbdkit-nocache-filter.pod new file mode 100644 index 0000000..0f43433 --- /dev/null +++ b/filters/nocache/nbdkit-nocache-filter.pod @@ -0,0 +1,69 @@ +=head1 NAME + +nbdkit-nocache-filter - nbdkit nocache filter + +=head1 SYNOPSIS + + nbdkit --filter=nocache plugin [cachemode=MODE] [plugin-args...] + +=head1 DESCRIPTION + +C<nbdkit-nocache-filter> is a filter that intentionally disables +efficient handling of advisory client cache requests across the NBD +protocol. It is mainly useful for evaluating timing differences to +determine the impact of caching requests. + +Note that the effects of this filter (in crippling handling of client +cache requests) is somewhat orthogonal from that of the +L<nbdkit-cache-filter(1)> (adding local caching of client read/write +requests); the two filters can be run together to experiment with +timings. + +=head1 PARAMETERS + +=over 4 + +=item B<cachemode=none|emulate|nop> + +Optional, controls which mode the filter will use. Mode B<none> +(default) means that cache support is not advertised to the +client. Mode B<emulate> means that cache support is emulated by the +filter using the plugin's C<pread> callback, regardless of whether the +plugin itself implemented the C<cache> callback. Mode B<nop> means +that cache requests are always accepted and immediately ignored, +rather than having any actual impact. 
+ +=back + +=head1 EXAMPLES + +Serve the file F<disk.img>, but prevent C<NBD_CMD_CACHE> requests +altogether, to get a baseline timing of behavior when the client is +unable to make cache requests: + + nbdkit --filter=nocache file disk.img + +Serve the file F<disk.img>, but with cache requests silently ignored, +rather than being forwarded on to the file plugin (which attempts to +use L<posix_fadvise(3)>), to compare against the timings without the +filter and determine whether the file plugin caching was worthwhile: + + nbdkit --filter=nocache file disk.img cachemode=nop + +=head1 SEE ALSO + +L<nbdkit(1)>, +L<nbdkit-file-plugin(1)>, +L<nbdkit-filter(3)>, +L<nbdkit-cache-filter(1)>, +L<nbdkit-fua-filter(1)>, +L<nbdkit-noextents-filter(1)>, +L<nbdkit-nozero-filter(1)>. + +=head1 AUTHORS + +Eric Blake + +=head1 COPYRIGHT + +Copyright (C) 2019 Red Hat Inc. diff --git a/filters/noextents/nbdkit-noextents-filter.pod b/filters/noextents/nbdkit-noextents-filter.pod index 46f6bdb..24519a0 100644 --- a/filters/noextents/nbdkit-noextents-filter.pod +++ b/filters/noextents/nbdkit-noextents-filter.pod @@ -27,6 +27,8 @@ plugin in the normal way. L<nbdkit(1)>, L<nbdkit-filter(3)>, +L<nbdkit-fua-filter(1)>, +L<nbdkit-nocache-filter(1)>, L<nbdkit-nozero-filter(1)>, L<nbdkit-file-plugin(1)>. diff --git a/filters/nozero/nbdkit-nozero-filter.pod b/filters/nozero/nbdkit-nozero-filter.pod index 8e694bb..d94dd8d 100644 --- a/filters/nozero/nbdkit-nozero-filter.pod +++ b/filters/nozero/nbdkit-nozero-filter.pod @@ -52,6 +52,8 @@ the data to be written explicitly rather than punching any holes: L<nbdkit(1)>, L<nbdkit-file-plugin(1)>, L<nbdkit-filter(3)>, +L<nbdkit-fua-filter(1)>, +L<nbdkit-nocache-filter(1)>, L<nbdkit-noextents-filter(1)>. 
=head1 AUTHORS diff --git a/configure.ac b/configure.ac index 06124c5..2163930 100644 --- a/configure.ac +++ b/configure.ac @@ -831,6 +831,7 @@ filters="\ error \ fua \ log \ + nocache \ noextents \ nozero \ offset \ @@ -906,6 +907,7 @@ AC_CONFIG_FILES([Makefile filters/error/Makefile filters/fua/Makefile filters/log/Makefile + filters/nocache/Makefile filters/noextents/Makefile filters/nozero/Makefile filters/offset/Makefile diff --git a/filters/nocache/nocache.c b/filters/nocache/nocache.c new file mode 100644 index 0000000..abb042e --- /dev/null +++ b/filters/nocache/nocache.c @@ -0,0 +1,113 @@ +/* nbdkit + * Copyright (C) 2018-2019 Red Hat Inc. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are + * met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * * Neither the name of Red Hat nor the names of its contributors may be + * used to endorse or promote products derived from this software without + * specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY RED HAT AND CONTRIBUTORS ''AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, + * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A + * PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL RED HAT OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF + * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, + * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT + * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <config.h> + +#include <stdio.h> +#include <stdlib.h> +#include <stdint.h> +#include <string.h> +#include <stdbool.h> +#include <assert.h> + +#include <nbdkit-filter.h> + +#include "minmax.h" + +#define THREAD_MODEL NBDKIT_THREAD_MODEL_PARALLEL + +static enum CacheMode { + NONE, + EMULATE, + NOP, +} cachemode; + +static int +nocache_config (nbdkit_next_config *next, void *nxdata, + const char *key, const char *value) +{ + if (strcmp (key, "cachemode") == 0) { + if (strcmp (value, "emulate") == 0) + cachemode = EMULATE; + else if (strcmp (value, "nop") == 0 || + strcmp (value, "no-op") == 0) + cachemode = NOP; + else if (strcmp (value, "none") != 0) { + nbdkit_error ("unknown cachemode '%s'", value); + return -1; + } + return 0; + } + return next (nxdata, key, value); +} + +#define nocache_config_help \ + "cachemode=<MODE> Either 'none' (default), 'emulate', or 'nop'.\n" \ + +/* Advertise desired FLAG_SEND_CACHE mode. 
*/ +static int +nocache_can_cache (struct nbdkit_next_ops *next_ops, void *nxdata, + void *handle) +{ + switch (cachemode) { + case NONE: + return NBDKIT_CACHE_NONE; + case EMULATE: + return NBDKIT_CACHE_EMULATE; + case NOP: + return NBDKIT_CACHE_NATIVE; + } + abort (); +} + +static int +nocache_cache (struct nbdkit_next_ops *next_ops, void *nxdata, + void *handle, uint32_t count, uint64_t offs, uint32_t flags, + int *err) +{ + assert (cachemode == NOP); + assert (!flags); + + return 0; +} + +static struct nbdkit_filter filter = { + .name = "nocache", + .longname = "nbdkit nocache filter", + .version = PACKAGE_VERSION, + .config = nocache_config, + .config_help = nocache_config_help, + .can_cache = nocache_can_cache, + .cache = nocache_cache, +}; + +NBDKIT_REGISTER_FILTER(filter) diff --git a/filters/nocache/Makefile.am b/filters/nocache/Makefile.am new file mode 100644 index 0000000..aa83937 --- /dev/null +++ b/filters/nocache/Makefile.am @@ -0,0 +1,61 @@ +# nbdkit +# Copyright (C) 2018-2019 Red Hat Inc. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions are +# met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# +# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in the +# documentation and/or other materials provided with the distribution. +# +# * Neither the name of Red Hat nor the names of its contributors may be +# used to endorse or promote products derived from this software without +# specific prior written permission. +# +# THIS SOFTWARE IS PROVIDED BY RED HAT AND CONTRIBUTORS ''AS IS'' AND +# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, +# THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A +# PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL RED HAT OR +# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF +# USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND +# ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT +# OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +# SUCH DAMAGE. + +include $(top_srcdir)/common-rules.mk + +EXTRA_DIST = nbdkit-nocache-filter.pod + +filter_LTLIBRARIES = nbdkit-nocache-filter.la + +nbdkit_nocache_filter_la_SOURCES = \ + nocache.c \ + $(top_srcdir)/include/nbdkit-filter.h + +nbdkit_nocache_filter_la_CPPFLAGS = \ + -I$(top_srcdir)/include \ + -I$(top_srcdir)/common/include +nbdkit_nocache_filter_la_CFLAGS = \ + $(WARNINGS_CFLAGS) +nbdkit_nocache_filter_la_LDFLAGS = \ + -module -avoid-version -shared \ + -Wl,--version-script=$(top_srcdir)/filters/filters.syms + +if HAVE_POD + +man_MANS = nbdkit-nocache-filter.1 +CLEANFILES += $(man_MANS) + +nbdkit-nocache-filter.1: nbdkit-nocache-filter.pod + $(PODWRAPPER) --section=1 --man $@ \ + --html $(top_builddir)/html/$@.html \ + $< + +endif HAVE_POD diff --git a/tests/test-eflags.sh b/tests/test-eflags.sh index eaaaae0..14a0099 100755 --- a/tests/test-eflags.sh +++ b/tests/test-eflags.sh @@ -51,6 +51,7 @@ if ! 
qemu-nbd --help | grep -sq -- --list; then
 fi

 files="eflags.out"
+late_args=
 rm -f $files
 cleanup_fn rm -f $files

@@ -70,7 +71,7 @@ SEND_CACHE=$(( 1 << 10 ))

 do_nbdkit ()
 {
-    nbdkit -v -U - "$@" sh - --run 'qemu-nbd --list -k $unixsocket' |
+    nbdkit -v -U - "$@" sh - $late_args --run 'qemu-nbd --list -k $unixsocket' |
         grep -E "flags: 0x" | grep -Eoi '0x[a-f0-9]+' > eflags.out
     echo -n eflags=; cat eflags.out

@@ -289,7 +290,7 @@ EOF

 #----------------------------------------------------------------------
 # -r
-# can_cache=true
+# can_cache=native

 do_nbdkit -r <<'EOF'
 case "$1" in
@@ -301,3 +302,19 @@ EOF

 [ $eflags -eq $(( HAS_FLAGS|READ_ONLY|SEND_DF|SEND_CACHE )) ] ||
     fail "expected HAS_FLAGS|READ_ONLY|SEND_DF|SEND_CACHE"
+
+#----------------------------------------------------------------------
+# -r
+# --filter=nocache cachemode=none
+# can_cache=native
+
+late_args="cachemode=none" do_nbdkit -r --filter=nocache <<'EOF'
+case "$1" in
+     get_size) echo 1M ;;
+     can_cache) echo "emulate" ;;
+     *) exit 2 ;;
+esac
+EOF
+
+[ $eflags -eq $(( HAS_FLAGS|READ_ONLY|SEND_DF )) ] ||
+    fail "expected HAS_FLAGS|READ_ONLY|SEND_DF"
-- 
2.20.1
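The new test case checks that `cachemode=none` removes the cache bit from the flags that `qemu-nbd --list` reports. As a rough illustration of what each mode should produce for a read-only export, here is a small sketch. `sketch_expected_eflags` is a hypothetical helper, the bit positions are those of the NBD transmission flags (the script itself shows only `SEND_CACHE` as `1 << 10`), and the `no-op` spelling that `nocache_config` also accepts is left out.

```c
#include <assert.h>
#include <string.h>

/* Transmission flag bits, as defined by the NBD protocol. */
#define SKETCH_HAS_FLAGS   (1u << 0)
#define SKETCH_READ_ONLY   (1u << 1)
#define SKETCH_SEND_DF     (1u << 7)
#define SKETCH_SEND_CACHE  (1u << 10)

/* Expected flags for a read-only export, given the cachemode string
 * passed to the nocache filter.  Returns 0 for an unknown mode.
 */
static unsigned
sketch_expected_eflags (const char *cachemode)
{
  unsigned flags = SKETCH_HAS_FLAGS | SKETCH_READ_ONLY | SKETCH_SEND_DF;

  if (strcmp (cachemode, "none") == 0)
    return flags;                       /* cache support not advertised */
  if (strcmp (cachemode, "emulate") == 0 || strcmp (cachemode, "nop") == 0)
    return flags | SKETCH_SEND_CACHE;   /* advertised in both modes */
  return 0;
}
```

The `nop` and `emulate` modes are indistinguishable at the flag level; they differ only in what the server does once a cache request arrives.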
Richard W.M. Jones
2019-May-17 17:12 UTC
Re: [Libguestfs] [nbdkit PATCH v2 07/24] sh: Implement .cache script callback
On Wed, May 15, 2019 at 10:57:57PM -0500, Eric Blake wrote:
> It's easy to expose new callbacks to sh plugins, by borrowing
> tri-state code from can_fua. It's possible that nbdkit emulate will
> actually work well (in our example.sh script, the kernel caching a
> pread from one dd invocation may indeed speed up the next access), but
> for the sake of the example, I demonstrated advertising a no-op
> handler.
>
> The shell plugin, coupled with Rich's work on libnbd as a client-side
> library for actually exercising calls to NBD_CMD_CACHE, will be a
> useful way to prove that cache commands even make it through the
> stack. (Remember, qemu 3.0 was released with a fatally flawed
> NBD_CMD_CACHE server implementation, because there were no open source
> clients at the time that could actually send the command to test the
> server with).
>
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
>  plugins/sh/nbdkit-sh-plugin.pod | 27 ++++++++---
>  plugins/sh/sh.c                 | 81 +++++++++++++++++++++++++++++++++
>  plugins/sh/example.sh           |  7 +++
>  3 files changed, 109 insertions(+), 6 deletions(-)
>
> diff --git a/plugins/sh/nbdkit-sh-plugin.pod b/plugins/sh/nbdkit-sh-plugin.pod
> index 8af88b4..39b99a2 100644
> --- a/plugins/sh/nbdkit-sh-plugin.pod
> +++ b/plugins/sh/nbdkit-sh-plugin.pod
> @@ -220,7 +220,7 @@ This method is required.
>
>  Unlike in other languages, you B<must> provide the C<can_*> methods
>  otherwise they are assumed to all return false and your C<pwrite>,
> -C<flush>, C<trim>, C<zero> and C<extents> methods will never be
> +C<flush>, C<trim>, C<zero>, and C<extents> methods will never be

I wonder if you meant to add C<cache> to this sentence?

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines. Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v
Richard W.M. Jones
2019-May-17 17:16 UTC
Re: [Libguestfs] [nbdkit PATCH v2 08/24] ocaml: Implement .cache script callback
On Wed, May 15, 2019 at 10:57:58PM -0500, Eric Blake wrote:
> +static int
> +can_cache_wrapper (void *h)
> +{
> +  CAMLparam0 ();
> +  CAMLlocal1 (rv);
> +
> +  caml_leave_blocking_section ();
> +
> +  rv = caml_callback_exn (can_cache_fn, *(value *) h);
> +  if (Is_exception_result (rv)) {
> +    nbdkit_error ("%s", caml_format_exception (Extract_exception (rv)));
> +    caml_enter_blocking_section ();
> +    CAMLreturnT (int, -1);
> +  }
> +
> +  caml_enter_blocking_section ();
> +  CAMLreturnT (int, Int_val (rv));

The not very obvious implicit assumption here is that the order of the
Cache* flags in the OCaml code is the same as the numbering of the
NBDKIT_CACHE_* flags in the C code, which it is, so we should be good.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine. Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/
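The assumption Rich points out works because OCaml represents constant constructors as consecutive integers starting at 0 in declaration order, so `Int_val (rv)` yields the constructor's position. One way to make that dependency explicit on the C side is to map the index through a checked function instead of returning it directly. The sketch below uses hypothetical `SKETCH_CACHE_*` constants and a hypothetical helper; it is not part of the patch.

```c
#include <assert.h>

/* Hypothetical C-side constants, numbered like the tri-state. */
enum sketch_cache_mode {
  SKETCH_CACHE_NONE = 0,
  SKETCH_CACHE_EMULATE = 1,
  SKETCH_CACHE_NATIVE = 2,
};

/* Validate a constructor index received from a binding before using
 * it as the C result.  Returns -1 for anything out of range.
 */
static int
sketch_mode_from_index (int index)
{
  switch (index) {
  case 0: return SKETCH_CACHE_NONE;
  case 1: return SKETCH_CACHE_EMULATE;
  case 2: return SKETCH_CACHE_NATIVE;
  default: return -1;
  }
}
```

If a constructor were ever added or reordered on the OCaml side, a mapping like this would need updating in one obvious place, rather than silently returning a shifted value.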
Richard W.M. Jones
2019-May-17 17:17 UTC
Re: [Libguestfs] [nbdkit PATCH v2 24/24] nocache: Implement new filter
Yeah this series looks generally fine to me. Please push it and if
there are any problems we can tidy them up later.

Thanks!

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html