Eric Blake
2023-May-25 13:00 UTC
[Libguestfs] [libnbd PATCH v3 00/22] NBD 64-bit extensions (libnbd portion)
v2 was here: https://listman.redhat.com/archives/libguestfs/2022-November/030292.html For v3, there are now approved specs to code to: https://listman.redhat.com/archives/libguestfs/2023-April/031251.html visible on an upstream branch at https://github.com/NetworkBlockDevice/nbd/blob/extension-ext-header/doc/proto.md (upstream leaves extensions in a branch state until there are multiple implementations that comply with that extension) as well as interoperability with counterpart qemu patches: https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg03607.html also available at https://repo.or.cz/qemu/ericb.git/shortlog/refs/tags/exthdr-v3 I'm still working on adding extended header support into nbdkit, which will involve adding a v3 protocol for plugins; that work is probably much further out. Changes since v2: 001/22:[0002] [FC] 'block_status: Refactor array storage' 002/22:[0016] [FC] 'internal: Refactor layout of replies in sbuf' 003/22:[0002] [FC] 'protocol: Add definitions for extended headers' 004/22:[0006] [FC] 'states: Prepare to send 64-bit requests' 005/22:[0013] [FC] 'states: Prepare to receive 64-bit replies' 006/22:[0007] [FC] 'states: Break deadlock if server goofs on extended replies' 007/22:[0021] [FC] 'generator: Add struct nbd_extent in prep for 64-bit extents' 008/22:[----] [-C] 'block_status: Track 64-bit extents internally' 009/22:[0012] [FC] 'block_status: Accept 64-bit extents during block status' 010/22:[0063] [FC] 'api: Add [aio_]nbd_block_status_64' 011/22:[0006] [FC] 'api: Add several functions for controlling extended headers' 012/22:[----] [--] 'copy: Update nbdcopy to use 64-bit block status' 013/22:[0004] [FC] 'dump: Update nbddump to use 64-bit block status' 014/22:[----] [--] 'info: Expose extended-headers support through nbdinfo' 015/22:[0006] [FC] 'info: Update nbdinfo --map to use 64-bit block status' 016/22:[----] [-C] 'examples: Update copy-libev to use 64-bit block status' 017/22:[0002] [FC] 'ocaml: Add example for 64-bit extents' 018/22:[0004] [FC] 'generator: Actually request extended headers' 019/22:[0008] [FC] 'api: Add nbd_[aio_]opt_extended_headers()' 020/22:[0004] [FC] 'interop: Add test of 64-bit block status' 021/22:[0011] [FC] 'api: Add nbd_can_block_status_payload()' 022/22:[0004] [FC] 'api: Add nbd_[aio_]block_status_filter()' The changes are mostly fallout from rebasing on top of style cleanups that Laszlo has been working on, but also deal with some changes with the spec compared to how it was proposed in v2 (for example, upstream decided that for NBD_CMD_BLOCK_STATUS, the server should always use 64-bit replies, rather than making the client support both 32- and 64-bit replies). Eric Blake (22): block_status: Refactor array storage internal: Refactor layout of replies in sbuf protocol: Add definitions for extended headers states: Prepare to send 64-bit requests states: Prepare to receive 64-bit replies states: Break deadlock if server goofs on extended replies generator: Add struct nbd_extent in prep for 64-bit extents block_status: Track 64-bit extents internally block_status: Accept 64-bit extents during block status api: Add [aio_]nbd_block_status_64 api: Add several functions for controlling extended headers copy: Update nbdcopy to use 64-bit block status dump: Update nbddump to use 64-bit block status info: Expose extended-headers support through nbdinfo info: Update nbdinfo --map to use 64-bit block status examples: Update copy-libev to use 64-bit block status ocaml: Add example for 64-bit extents generator: Actually request extended headers api: Add nbd_[aio_]opt_extended_headers() interop: Add test of 64-bit block status api: Add nbd_can_block_status_payload() api: Add nbd_[aio_]block_status_filter() docs/libnbd.pod | 18 +- info/nbdinfo.pod | 21 +- sh/nbdsh.pod | 2 +- lib/internal.h | 41 +- lib/nbd-protocol.h | 112 +++- generator/API.mli | 1 + generator/API.ml | 534 +++++++++++++++--- generator/C.ml | 24 +- generator/GoLang.ml | 24 + generator/Makefile.am | 1 + generator/OCaml.ml | 23 +- generator/Python.ml | 20 +- generator/state_machine.ml | 50 +- generator/states-issue-command.c | 33 +- .../states-newstyle-opt-extended-headers.c | 110 ++++ generator/states-newstyle-opt-starttls.c | 7 +- .../states-newstyle-opt-structured-reply.c | 3 +- generator/states-newstyle.c | 3 + generator/states-reply-simple.c | 4 +- generator/states-reply-structured.c | 253 ++++++--- generator/states-reply.c | 69 ++- lib/aio.c | 7 +- lib/flags.c | 12 + lib/handle.c | 25 +- lib/opt.c | 44 ++ lib/rw.c | 250 +++++++- python/t/110-defaults.py | 1 + python/t/120-set-non-defaults.py | 2 + python/t/465-block-status-64.py | 56 ++ ocaml/examples/Makefile.am | 1 + ocaml/examples/extents64.ml | 42 ++ ocaml/helpers.c | 20 + ocaml/nbd-c.h | 1 + ocaml/tests/Makefile.am | 1 + ocaml/tests/test_110_defaults.ml | 2 + ocaml/tests/test_120_set_non_defaults.ml | 3 + ocaml/tests/test_465_block_status_64.ml | 58 ++ tests/Makefile.am | 4 + tests/meta-base-allocation.c | 104 +++- tests/pwrite-extended.c | 112 ++++ examples/copy-libev.c | 21 +- examples/server-flags.c | 7 +- interop/Makefile.am | 18 + interop/block-status-payload.c | 241 ++++++++ interop/block-status-payload.sh | 80 +++ interop/large-status.c | 186 ++++++ interop/large-status.sh | 49 ++ interop/opt-extended-headers.c | 153 +++++ interop/opt-extended-headers.sh | 29 + .gitignore | 4 + copy/nbd-ops.c | 22 +- dump/dump.c | 27 +- fuzzing/libnbd-fuzz-wrapper.c | 20 +- golang/Makefile.am | 1 + golang/handle.go | 6 + golang/libnbd_110_defaults_test.go | 8 + golang/libnbd_120_set_non_defaults_test.go | 12 + golang/libnbd_465_block_status_64_test.go | 119 ++++ info/can.c | 14 + info/info-can.sh | 30 + info/info-packets.sh | 17 +- info/main.c | 7 +- info/map.c | 65 ++- info/show.c | 9 +- 64 files changed, 2892 insertions(+), 351 deletions(-) create mode 100644 generator/states-newstyle-opt-extended-headers.c create mode 100644 python/t/465-block-status-64.py create mode 100644 ocaml/examples/extents64.ml create mode 100644 ocaml/tests/test_465_block_status_64.ml create mode 100644 tests/pwrite-extended.c create mode 100644 interop/block-status-payload.c create mode 100755 interop/block-status-payload.sh create mode 100644 interop/large-status.c create mode 100755 interop/large-status.sh create mode 100644 interop/opt-extended-headers.c create mode 100755 interop/opt-extended-headers.sh create mode 100644 golang/libnbd_465_block_status_64_test.go base-commit: 17d0bc6a71bbc67f3d7707d129bec4acf7e7a382 -- 2.40.1
Eric Blake
2023-May-25 13:00 UTC
[Libguestfs] [libnbd PATCH v3 01/22] block_status: Refactor array storage
For 32-bit block status, we were able to cheat and use an array with an odd number of elements, with array[0] holding the context id, and passing &array[1] to the user's callback. But once we have 64-bit extents, we can no longer abuse array element 0 like that, for two reasons: 64-bit extents contain uint64_t which might not be alignment-compatible with an array of uint32_t on all architectures, and the new NBD_REPLY_TYPE_BLOCK_STATUS_EXT adds an additional count field before the array. Split out a new state STRUCTURED_REPLY.BS_HEADER to receive the context id (and eventually the new count field for 64-bit replies) separately from the extents array, and add another structured_reply type in the payload section for tracking it. No behavioral change, other than the rare possibility of landing in the new state. Signed-off-by: Eric Blake <eblake at redhat.com> --- lib/internal.h | 1 + lib/nbd-protocol.h | 17 +++++---- generator/state_machine.ml | 9 ++++- generator/states-reply-structured.c | 56 ++++++++++++++++++++--------- 4 files changed, 60 insertions(+), 23 deletions(-) diff --git a/lib/internal.h b/lib/internal.h index b155681d..25eeea34 100644 --- a/lib/internal.h +++ b/lib/internal.h @@ -242,6 +242,7 @@ struct nbd_handle { union { struct nbd_structured_reply_offset_data offset_data; struct nbd_structured_reply_offset_hole offset_hole; + struct nbd_structured_reply_block_status_hdr bs_hdr; struct { struct nbd_structured_reply_error error; char msg[NBD_MAX_STRING]; /* Common to all error types */ diff --git a/lib/nbd-protocol.h b/lib/nbd-protocol.h index 0217891e..f4640a98 100644 --- a/lib/nbd-protocol.h +++ b/lib/nbd-protocol.h @@ -182,12 +182,6 @@ struct nbd_fixed_new_option_reply_meta_context { /* followed by a string */ } NBD_ATTRIBUTE_PACKED; -/* NBD_REPLY_TYPE_BLOCK_STATUS block descriptor. */ -struct nbd_block_descriptor { - uint32_t length; /* length of block */ - uint32_t status_flags; /* block type (hole etc) */ -} NBD_ATTRIBUTE_PACKED; - /* Request (client -> server). */ struct nbd_request { uint32_t magic; /* NBD_REQUEST_MAGIC. */ @@ -224,6 +218,17 @@ struct nbd_structured_reply_offset_hole { uint32_t length; /* Length of hole. */ } NBD_ATTRIBUTE_PACKED; +/* NBD_REPLY_TYPE_BLOCK_STATUS block descriptor. */ +struct nbd_block_descriptor { + uint32_t length; /* length of block */ + uint32_t status_flags; /* block type (hole etc) */ +} NBD_ATTRIBUTE_PACKED; + +struct nbd_structured_reply_block_status_hdr { + uint32_t context_id; /* metadata context ID */ + /* followed by array of nbd_block_descriptor extents */ +} NBD_ATTRIBUTE_PACKED; + struct nbd_structured_reply_error { uint32_t error; /* NBD_E* error number */ uint16_t len; /* Length of human readable error. */ diff --git a/generator/state_machine.ml b/generator/state_machine.ml index 126159b9..1f0d00b0 100644 --- a/generator/state_machine.ml +++ b/generator/state_machine.ml @@ -871,10 +871,17 @@ and external_events = []; }; + State { + default_state with + name = "RECV_BS_HEADER"; + comment = "Receive header of structured reply block-status payload"; + external_events = []; + }; + State { default_state with name = "RECV_BS_ENTRIES"; - comment = "Receive a structured reply block-status payload"; + comment = "Receive entries array of structured reply block-status payload"; external_events = []; }; diff --git a/generator/states-reply-structured.c b/generator/states-reply-structured.c index 5aca7262..96182222 100644 --- a/generator/states-reply-structured.c +++ b/generator/states-reply-structured.c @@ -126,19 +126,10 @@ REPLY.STRUCTURED_REPLY.CHECK: length < 12 || ((length-4) & 7) != 0) goto resync; assert (CALLBACK_IS_NOT_NULL (cmd->cb.fn.extent)); - /* We read the context ID followed by all the entries into a - * single array and deal with it at the end. - */ - free (h->bs_entries); - h->bs_entries = malloc (length); - if (h->bs_entries == NULL) { - SET_NEXT_STATE (%.DEAD); - set_error (errno, "malloc"); - break; - } - h->rbuf = h->bs_entries; - h->rlen = length; - SET_NEXT_STATE (%RECV_BS_ENTRIES); + /* Start by reading the context ID. */ + h->rbuf = &h->sbuf.sr.payload.bs_hdr; + h->rlen = sizeof h->sbuf.sr.payload.bs_hdr; + SET_NEXT_STATE (%RECV_BS_HEADER); break; default: @@ -424,9 +415,41 @@ REPLY.STRUCTURED_REPLY.RECV_OFFSET_HOLE: } return 0; + REPLY.STRUCTURED_REPLY.RECV_BS_HEADER: + struct command *cmd = h->reply_cmd; + uint32_t length; + + switch (recv_into_rbuf (h)) { + case -1: SET_NEXT_STATE (%.DEAD); return 0; + case 1: + save_reply_state (h); + SET_NEXT_STATE (%.READY); + return 0; + case 0: + length = be32toh (h->sbuf.sr.structured_reply.length); + + assert (cmd); /* guaranteed by CHECK */ + assert (cmd->type == NBD_CMD_BLOCK_STATUS); + assert (length >= 12); + length -= sizeof h->sbuf.sr.payload.bs_hdr; + + free (h->bs_entries); + h->bs_entries = malloc (length); + if (h->bs_entries == NULL) { + SET_NEXT_STATE (%.DEAD); + set_error (errno, "malloc"); + return 0; + } + h->rbuf = h->bs_entries; + h->rlen = length; + SET_NEXT_STATE (%RECV_BS_ENTRIES); + } + return 0; + REPLY.STRUCTURED_REPLY.RECV_BS_ENTRIES: struct command *cmd = h->reply_cmd; uint32_t length; + uint32_t count; size_t i; uint32_t context_id; @@ -445,15 +468,16 @@ REPLY.STRUCTURED_REPLY.RECV_BS_ENTRIES: assert (h->bs_entries); assert (length >= 12); assert (h->meta_valid); + count = (length - sizeof h->sbuf.sr.payload.bs_hdr) / sizeof *h->bs_entries; /* Need to byte-swap the entries returned, but apart from that we * don't validate them. */ - for (i = 0; i < length/4; ++i) + for (i = 0; i < count; ++i) h->bs_entries[i] = be32toh (h->bs_entries[i]); /* Look up the context ID. */ - context_id = h->bs_entries[0]; + context_id = be32toh (h->sbuf.sr.payload.bs_hdr.context_id); for (i = 0; i < h->meta_contexts.len; ++i) if (context_id == h->meta_contexts.ptr[i].context_id) break; @@ -464,7 +488,7 @@ REPLY.STRUCTURED_REPLY.RECV_BS_ENTRIES: if (CALL_CALLBACK (cmd->cb.fn.extent, h->meta_contexts.ptr[i].name, cmd->offset, - &h->bs_entries[1], (length-4) / 4, + h->bs_entries, count, &error) == -1) if (cmd->error == 0) cmd->error = error ? error : EPROTO; -- 2.40.1
Eric Blake
2023-May-25 13:00 UTC
[Libguestfs] [libnbd PATCH v3 02/22] internal: Refactor layout of replies in sbuf
In order to more easily add a third reply type with an even larger header, but where the payload will look the same for both structured and extended replies, it is nicer if simple and structured replies are nested inside the same layer of sbuf.reply.hdr. While at it, note that while .or and .sr are structs declared within the overall sbuf union, we never read into both halves of those structs at the same time, so it does not matter if their two halves are consecutive. Dropping the packed notation on those structs means the compiler can align .payload more naturally, which may slightly improve performance on some platforms, even if it makes the overall union a few bytes larger due to padding. Visually, this patch changes the layout from: offset simple structured +------------------------------------------------------------+ | union sbuf | | +---------------------+------------------------------+ | | | struct simple_reply | union sr | | | | +-----------------+ | +--------------------------+ | | | | | | | | struct structured_reply | | | | | | | | | +----------------------+ | | | | 0 | | uint32_t magic | | | | uint32_t magic | | | | | 4 | | uint32_t error | | | | uint16_t flags | | | | | 6 | | | | | | uint16_t type | | | | | 8 | | uint64_t handle | | | | uint64_t handle | | | | | | +-----------------+ | | | | | | | | 16 | [padding] | | | uint32_t length | | | | | | | | +----------------------+ | | | | | | | union payload | | | | | | | +-----------+----------+ | | | | 20 | | | | ... | ... | | | | | | | | +-----------+----------+ | | | | | | +--------------------------+ | | | +---------------------+------------------------------+ | +------------------------------------------------------------+ to: offset simple structured +-------------------------------------------------------------+ | union sbuf | | +-----------------------------------------------------+ | | | struct reply | | | | +-------------------------------------------------+ | | | | | union hdr | | | | | | +--------------------+------------------------+ | | | | | | | struct simple | struct structured | | | | | | | | +----------------+ | +--------------------+ | | | | | 0 | | | | uint32_t magic | | | uint32_t magic | | | | | | 4 | | | | uint32_t error | | | uint16_t flags | | | | | | 6 | | | | | | | uint16_t type | | | | | | 8 | | | | uint64_t handle| | | uint64_t handle | | | | | | | | | +----------------+ | | | | | | | | 16 | | | [padding] | | uint32_t length | | | | | | | | | | +--------------------+ | | | | | 20 | | | | [padding] | | | | | | | +--------------------+------------------------+ | | | | | | union payload | | | | | | +--------------------+------------------------+ | | | | 24 | | | ... | ... | | | | | | | +--------------------+------------------------+ | | | | | +-------------------------------------------------+ | | | +-----------------------------------------------------+ | +-------------------------------------------------------------+ Technically, whether the payload union offset moves to byte 24 (with 20-23 now padding) or stays at 20 depends on compiler ABI; but many systems prefer that any struct with a uint64_t provide 8-byte alignment to its containing union. The commit is largely mechanical, and there should be no semantic change. Signed-off-by: Eric Blake <eblake at redhat.com> --- lib/internal.h | 12 ++-- generator/states-reply-simple.c | 4 +- generator/states-reply-structured.c | 103 ++++++++++++++-------------- generator/states-reply.c | 14 ++-- 4 files changed, 68 insertions(+), 65 deletions(-) diff --git a/lib/internal.h b/lib/internal.h index 25eeea34..c71980ef 100644 --- a/lib/internal.h +++ b/lib/internal.h @@ -231,14 +231,16 @@ struct nbd_handle { struct { struct nbd_fixed_new_option_reply_meta_context context; char str[NBD_MAX_STRING]; - } __attribute__ ((packed)) context; + } __attribute__ ((packed)) context; char err_msg[NBD_MAX_STRING]; } payload; - } __attribute__ ((packed)) or; + } or; struct nbd_export_name_option_reply export_name_reply; - struct nbd_simple_reply simple_reply; struct { - struct nbd_structured_reply structured_reply; + union { + struct nbd_simple_reply simple; + struct nbd_structured_reply structured; + } hdr; union { struct nbd_structured_reply_offset_data offset_data; struct nbd_structured_reply_offset_hole offset_hole; @@ -249,7 +251,7 @@ struct nbd_handle { uint64_t offset; /* Only used for NBD_REPLY_TYPE_ERROR_OFFSET */ } __attribute__ ((packed)) error; } payload; - } __attribute__ ((packed)) sr; + } reply; uint16_t gflags; uint32_t cflags; uint32_t len; diff --git a/generator/states-reply-simple.c b/generator/states-reply-simple.c index 8fd9f62a..e6f1ee23 100644 --- a/generator/states-reply-simple.c +++ b/generator/states-reply-simple.c @@ -23,7 +23,7 @@ REPLY.SIMPLE_REPLY.START: struct command *cmd = h->reply_cmd; uint32_t error; - error = be32toh (h->sbuf.simple_reply.error); + error = be32toh (h->sbuf.reply.hdr.simple.error); if (cmd == NULL) { /* Unexpected reply. If error was set or we have structured @@ -39,7 +39,7 @@ REPLY.SIMPLE_REPLY.START: if (error || h->structured_replies) SET_NEXT_STATE (%^FINISH_COMMAND); else { - uint64_t cookie = be64toh (h->sbuf.simple_reply.handle); + uint64_t cookie = be64toh (h->sbuf.reply.hdr.simple.handle); SET_NEXT_STATE (%.DEAD); set_error (EPROTO, "no matching cookie %" PRIu64 " found for server reply, " diff --git a/generator/states-reply-structured.c b/generator/states-reply-structured.c index 96182222..6f96945a 100644 --- a/generator/states-reply-structured.c +++ b/generator/states-reply-structured.c @@ -49,9 +49,9 @@ REPLY.STRUCTURED_REPLY.START: * so read the remaining part. */ h->rbuf = &h->sbuf; - h->rbuf = (char *)h->rbuf + sizeof h->sbuf.simple_reply; - h->rlen = sizeof h->sbuf.sr.structured_reply; - h->rlen -= sizeof h->sbuf.simple_reply; + h->rbuf = (char *)h->rbuf + sizeof h->sbuf.reply.hdr.simple; + h->rlen = sizeof h->sbuf.reply.hdr.structured; + h->rlen -= sizeof h->sbuf.reply.hdr.simple; SET_NEXT_STATE (%RECV_REMAINING); return 0; @@ -71,9 +71,9 @@ REPLY.STRUCTURED_REPLY.CHECK: uint16_t flags, type; uint32_t length; - flags = be16toh (h->sbuf.sr.structured_reply.flags); - type = be16toh (h->sbuf.sr.structured_reply.type); - length = be32toh (h->sbuf.sr.structured_reply.length); + flags = be16toh (h->sbuf.reply.hdr.structured.flags); + type = be16toh (h->sbuf.reply.hdr.structured.type); + length = be32toh (h->sbuf.reply.hdr.structured.length); /* Reject a server that replies with too much information, but don't * reject a single structured reply to NBD_CMD_READ on the largest @@ -82,7 +82,7 @@ REPLY.STRUCTURED_REPLY.CHECK: * oversized reply is going to take long enough to resync that it is * not worth keeping the connection alive. */ - if (length > MAX_REQUEST_SIZE + sizeof h->sbuf.sr.payload.offset_data) { + if (length > MAX_REQUEST_SIZE + sizeof h->sbuf.reply.payload.offset_data) { set_error (0, "invalid server reply length %" PRIu32, length); SET_NEXT_STATE (%.DEAD); return 0; @@ -105,19 +105,19 @@ REPLY.STRUCTURED_REPLY.CHECK: * them as an extension, so we use < instead of <=. */ if (cmd->type != NBD_CMD_READ || - length < sizeof h->sbuf.sr.payload.offset_data) + length < sizeof h->sbuf.reply.payload.offset_data) goto resync; - h->rbuf = &h->sbuf.sr.payload.offset_data; - h->rlen = sizeof h->sbuf.sr.payload.offset_data; + h->rbuf = &h->sbuf.reply.payload.offset_data; + h->rlen = sizeof h->sbuf.reply.payload.offset_data; SET_NEXT_STATE (%RECV_OFFSET_DATA); break; case NBD_REPLY_TYPE_OFFSET_HOLE: if (cmd->type != NBD_CMD_READ || - length != sizeof h->sbuf.sr.payload.offset_hole) + length != sizeof h->sbuf.reply.payload.offset_hole) goto resync; - h->rbuf = &h->sbuf.sr.payload.offset_hole; - h->rlen = sizeof h->sbuf.sr.payload.offset_hole; + h->rbuf = &h->sbuf.reply.payload.offset_hole; + h->rlen = sizeof h->sbuf.reply.payload.offset_hole; SET_NEXT_STATE (%RECV_OFFSET_HOLE); break; @@ -127,8 +127,8 @@ REPLY.STRUCTURED_REPLY.CHECK: goto resync; assert (CALLBACK_IS_NOT_NULL (cmd->cb.fn.extent)); /* Start by reading the context ID. */ - h->rbuf = &h->sbuf.sr.payload.bs_hdr; - h->rlen = sizeof h->sbuf.sr.payload.bs_hdr; + h->rbuf = &h->sbuf.reply.payload.bs_hdr; + h->rlen = sizeof h->sbuf.reply.payload.bs_hdr; SET_NEXT_STATE (%RECV_BS_HEADER); break; @@ -139,10 +139,10 @@ REPLY.STRUCTURED_REPLY.CHECK: * compliant, will favor the wire error over EPROTO during more * length checks in RECV_ERROR_MESSAGE and RECV_ERROR_TAIL. */ - if (length < sizeof h->sbuf.sr.payload.error.error.error) + if (length < sizeof h->sbuf.reply.payload.error.error.error) goto resync; - h->rbuf = &h->sbuf.sr.payload.error.error; - h->rlen = MIN (length, sizeof h->sbuf.sr.payload.error.error); + h->rbuf = &h->sbuf.reply.payload.error.error; + h->rlen = MIN (length, sizeof h->sbuf.reply.payload.error.error); SET_NEXT_STATE (%RECV_ERROR); } else @@ -168,19 +168,19 @@ REPLY.STRUCTURED_REPLY.RECV_ERROR: SET_NEXT_STATE (%.READY); return 0; case 0: - length = be32toh (h->sbuf.sr.structured_reply.length); - assert (length >= sizeof h->sbuf.sr.payload.error.error.error); + length = be32toh (h->sbuf.reply.hdr.structured.length); + assert (length >= sizeof h->sbuf.reply.payload.error.error.error); assert (cmd); - if (length < sizeof h->sbuf.sr.payload.error.error) + if (length < sizeof h->sbuf.reply.payload.error.error) goto resync; - msglen = be16toh (h->sbuf.sr.payload.error.error.len); - if (msglen > length - sizeof h->sbuf.sr.payload.error.error || - msglen > sizeof h->sbuf.sr.payload.error.msg) + msglen = be16toh (h->sbuf.reply.payload.error.error.len); + if (msglen > length - sizeof h->sbuf.reply.payload.error.error || + msglen > sizeof h->sbuf.reply.payload.error.msg) goto resync; - h->rbuf = h->sbuf.sr.payload.error.msg; + h->rbuf = h->sbuf.reply.payload.error.msg; h->rlen = msglen; SET_NEXT_STATE (%RECV_ERROR_MESSAGE); } @@ -188,11 +188,11 @@ REPLY.STRUCTURED_REPLY.RECV_ERROR: resync: /* Favor the error packet's errno over RESYNC's EPROTO. */ - error = be32toh (h->sbuf.sr.payload.error.error.error); + error = be32toh (h->sbuf.reply.payload.error.error.error); if (cmd->error == 0) cmd->error = nbd_internal_errno_of_nbd_error (error); h->rbuf = NULL; - h->rlen = length - MIN (length, sizeof h->sbuf.sr.payload.error.error); + h->rlen = length - MIN (length, sizeof h->sbuf.reply.payload.error.error); SET_NEXT_STATE (%RESYNC); return 0; @@ -207,15 +207,15 @@ REPLY.STRUCTURED_REPLY.RECV_ERROR_MESSAGE: SET_NEXT_STATE (%.READY); return 0; case 0: - length = be32toh (h->sbuf.sr.structured_reply.length); - msglen = be16toh (h->sbuf.sr.payload.error.error.len); - type = be16toh (h->sbuf.sr.structured_reply.type); + length = be32toh (h->sbuf.reply.hdr.structured.length); + msglen = be16toh (h->sbuf.reply.payload.error.error.len); + type = be16toh (h->sbuf.reply.hdr.structured.type); - length -= sizeof h->sbuf.sr.payload.error.error + msglen; + length -= sizeof h->sbuf.reply.payload.error.error + msglen; if (msglen) debug (h, "structured error server message: %.*s", (int)msglen, - h->sbuf.sr.payload.error.msg); + h->sbuf.reply.payload.error.msg); /* Special case two specific errors; silently ignore tail for all others */ h->rbuf = NULL; @@ -227,11 +227,11 @@ REPLY.STRUCTURED_REPLY.RECV_ERROR_MESSAGE: "the server may have a bug"); break; case NBD_REPLY_TYPE_ERROR_OFFSET: - if (length != sizeof h->sbuf.sr.payload.error.offset) + if (length != sizeof h->sbuf.reply.payload.error.offset) debug (h, "unable to safely extract error offset, " "the server may have a bug"); else - h->rbuf = &h->sbuf.sr.payload.error.offset; + h->rbuf = &h->sbuf.reply.payload.error.offset; break; } SET_NEXT_STATE (%RECV_ERROR_TAIL); @@ -250,8 +250,8 @@ REPLY.STRUCTURED_REPLY.RECV_ERROR_TAIL: SET_NEXT_STATE (%.READY); return 0; case 0: - error = be32toh (h->sbuf.sr.payload.error.error.error); - type = be16toh (h->sbuf.sr.structured_reply.type); + error = be32toh (h->sbuf.reply.payload.error.error.error); + type = be16toh (h->sbuf.reply.hdr.structured.type); assert (cmd); /* guaranteed by CHECK */ @@ -266,7 +266,7 @@ REPLY.STRUCTURED_REPLY.RECV_ERROR_TAIL: * user callback if present. Ignore the offset if it was bogus. */ if (type == NBD_REPLY_TYPE_ERROR_OFFSET && h->rbuf) { - uint64_t offset = be64toh (h->sbuf.sr.payload.error.offset); + uint64_t offset = be64toh (h->sbuf.reply.payload.error.offset); if (structured_reply_in_bounds (offset, 0, cmd) && cmd->type == NBD_CMD_READ && CALLBACK_IS_NOT_NULL (cmd->cb.fn.chunk)) { @@ -307,8 +307,8 @@ REPLY.STRUCTURED_REPLY.RECV_OFFSET_DATA: SET_NEXT_STATE (%.READY); return 0; case 0: - length = be32toh (h->sbuf.sr.structured_reply.length); - offset = be64toh (h->sbuf.sr.payload.offset_data.offset); + length = be32toh (h->sbuf.reply.hdr.structured.length); + offset = be64toh (h->sbuf.reply.payload.offset_data.offset); assert (cmd); /* guaranteed by CHECK */ @@ -346,8 +346,8 @@ REPLY.STRUCTURED_REPLY.RECV_OFFSET_DATA_DATA: SET_NEXT_STATE (%.READY); return 0; case 0: - length = be32toh (h->sbuf.sr.structured_reply.length); - offset = be64toh (h->sbuf.sr.payload.offset_data.offset); + length = be32toh (h->sbuf.reply.hdr.structured.length); + offset = be64toh (h->sbuf.reply.payload.offset_data.offset); assert (cmd); /* guaranteed by CHECK */ if (CALLBACK_IS_NOT_NULL (cmd->cb.fn.chunk)) { @@ -377,8 +377,8 @@ REPLY.STRUCTURED_REPLY.RECV_OFFSET_HOLE: SET_NEXT_STATE (%.READY); return 0; case 0: - offset = be64toh (h->sbuf.sr.payload.offset_hole.offset); - length = be32toh (h->sbuf.sr.payload.offset_hole.length); + offset = be64toh (h->sbuf.reply.payload.offset_hole.offset); + length = be32toh (h->sbuf.reply.payload.offset_hole.length); assert (cmd); /* guaranteed by CHECK */ @@ -426,12 +426,12 @@ REPLY.STRUCTURED_REPLY.RECV_BS_HEADER: SET_NEXT_STATE (%.READY); return 0; case 0: - length = be32toh (h->sbuf.sr.structured_reply.length); + length = be32toh (h->sbuf.reply.hdr.structured.length); assert (cmd); /* guaranteed by CHECK */ assert (cmd->type == NBD_CMD_BLOCK_STATUS); assert (length >= 12); - length -= sizeof h->sbuf.sr.payload.bs_hdr; + length -= sizeof h->sbuf.reply.payload.bs_hdr; free (h->bs_entries); h->bs_entries = malloc (length); @@ -460,7 +460,7 @@ REPLY.STRUCTURED_REPLY.RECV_BS_ENTRIES: SET_NEXT_STATE (%.READY); return 0; case 0: - length = be32toh (h->sbuf.sr.structured_reply.length); + length = be32toh (h->sbuf.reply.hdr.structured.length); assert (cmd); /* guaranteed by CHECK */ assert (cmd->type == NBD_CMD_BLOCK_STATUS); @@ -468,7 +468,8 @@ REPLY.STRUCTURED_REPLY.RECV_BS_ENTRIES: assert (h->bs_entries); assert (length >= 12); assert (h->meta_valid); - count = (length - sizeof h->sbuf.sr.payload.bs_hdr) / sizeof *h->bs_entries; + count = (length - sizeof h->sbuf.reply.payload.bs_hdr) / + sizeof *h->bs_entries; /* Need to byte-swap the entries returned, but apart from that we * don't validate them. @@ -477,7 +478,7 @@ REPLY.STRUCTURED_REPLY.RECV_BS_ENTRIES: h->bs_entries[i] = be32toh (h->bs_entries[i]); /* Look up the context ID. */ - context_id = be32toh (h->sbuf.sr.payload.bs_hdr.context_id); + context_id = be32toh (h->sbuf.reply.payload.bs_hdr.context_id); for (i = 0; i < h->meta_contexts.len; ++i) if (context_id == h->meta_contexts.ptr[i].context_id) break; @@ -524,8 +525,8 @@ REPLY.STRUCTURED_REPLY.RESYNC: SET_NEXT_STATE (%^FINISH_COMMAND); return 0; } - type = be16toh (h->sbuf.sr.structured_reply.type); - length = be32toh (h->sbuf.sr.structured_reply.length); + type = be16toh (h->sbuf.reply.hdr.structured.type); + length = be32toh (h->sbuf.reply.hdr.structured.length); debug (h, "unexpected reply type %u or payload length %" PRIu32 " for cookie %" PRIu64 " and command %" PRIu32 ", this is probably a server bug", @@ -539,7 +540,7 @@ REPLY.STRUCTURED_REPLY.RESYNC: REPLY.STRUCTURED_REPLY.FINISH: uint16_t flags; - flags = be16toh (h->sbuf.sr.structured_reply.flags); + flags = be16toh (h->sbuf.reply.hdr.structured.flags); if (flags & NBD_REPLY_FLAG_DONE) { SET_NEXT_STATE (%^FINISH_COMMAND); } diff --git a/generator/states-reply.c b/generator/states-reply.c index f7888154..31e4bd2f 100644 --- a/generator/states-reply.c +++ b/generator/states-reply.c @@ -69,8 +69,8 @@ REPLY.START: assert (h->reply_cmd == NULL); assert (h->rlen == 0); - h->rbuf = &h->sbuf; - h->rlen = sizeof h->sbuf.simple_reply; + h->rbuf = &h->sbuf.reply.hdr; + h->rlen = sizeof h->sbuf.reply.hdr.simple; r = h->sock->ops->recv (h, h->sock, h->rbuf, h->rlen); if (r == -1) { @@ -115,7 +115,7 @@ REPLY.CHECK_SIMPLE_OR_STRUCTURED_REPLY: uint32_t magic; uint64_t cookie; - magic = be32toh (h->sbuf.simple_reply.magic); + magic = be32toh (h->sbuf.reply.hdr.simple.magic); if (magic == NBD_SIMPLE_REPLY_MAGIC) { SET_NEXT_STATE (%SIMPLE_REPLY.START); } @@ -127,8 +127,8 @@ REPLY.CHECK_SIMPLE_OR_STRUCTURED_REPLY: SET_NEXT_STATE (%.DEAD); set_error (0, "invalid reply magic 0x%" PRIx32, magic); #if 0 /* uncomment to see desynchronized data */ - nbd_internal_hexdump (&h->sbuf.simple_reply, - sizeof (h->sbuf.simple_reply), + nbd_internal_hexdump (&h->sbuf.reply.hdr.simple, + sizeof (h->sbuf.reply.hdr.simple), stderr); #endif return 0; @@ -138,7 +138,7 @@ REPLY.CHECK_SIMPLE_OR_STRUCTURED_REPLY: * handle (our cookie) is stored at the same offset. */ h->chunks_received++; - cookie = be64toh (h->sbuf.simple_reply.handle); + cookie = be64toh (h->sbuf.reply.hdr.simple.handle); /* Find the command amongst the commands in flight. If the server sends * a reply for an unknown cookie, FINISH will diagnose that later. */ @@ -157,7 +157,7 @@ REPLY.FINISH_COMMAND: /* NB: This works for both simple and structured replies because the * handle (our cookie) is stored at the same offset. */ - cookie = be64toh (h->sbuf.simple_reply.handle); + cookie = be64toh (h->sbuf.reply.hdr.simple.handle); /* Find the command amongst the commands in flight. */ for (cmd = h->cmds_in_flight, prev_cmd = NULL; cmd != NULL; -- 2.40.1
Eric Blake
2023-May-25 13:00 UTC
[Libguestfs] [libnbd PATCH v3 03/22] protocol: Add definitions for extended headers
Add the magic numbers and new structs necessary to implement the NBD protocol extension of extended headers providing 64-bit lengths. This corresponds to upstream nbd commits 36abf47d and a9384e2f on the extension-ext-header branch[1] (commit e6f3b94a for NBD_FLAG_BLOCK_STATUS_PAYLOAD is saved for a later patch). [1] https://github.com/NetworkBlockDevice/nbd/blob/extension-ext-header/doc/proto.md Signed-off-by: Eric Blake <eblake at redhat.com> --- lib/nbd-protocol.h | 66 ++++++++++++++++++++++++++++++++++++---------- 1 file changed, 52 insertions(+), 14 deletions(-) diff --git a/lib/nbd-protocol.h b/lib/nbd-protocol.h index f4640a98..b6fa9b8a 100644 --- a/lib/nbd-protocol.h +++ b/lib/nbd-protocol.h @@ -124,6 +124,7 @@ struct nbd_fixed_new_option_reply { #define NBD_OPT_STRUCTURED_REPLY 8 #define NBD_OPT_LIST_META_CONTEXT 9 #define NBD_OPT_SET_META_CONTEXT 10 +#define NBD_OPT_EXTENDED_HEADERS 11 #define NBD_REP_ERR(val) (0x80000000 | (val)) #define NBD_REP_IS_ERR(val) (!!((val) & 0x80000000)) @@ -141,6 +142,7 @@ struct nbd_fixed_new_option_reply { #define NBD_REP_ERR_SHUTDOWN NBD_REP_ERR (7) #define NBD_REP_ERR_BLOCK_SIZE_REQD NBD_REP_ERR (8) #define NBD_REP_ERR_TOO_BIG NBD_REP_ERR (9) +#define NBD_REP_ERR_EXT_HEADER_REQD NBD_REP_ERR (10) #define NBD_INFO_EXPORT 0 #define NBD_INFO_NAME 1 @@ -182,16 +184,26 @@ struct nbd_fixed_new_option_reply_meta_context { /* followed by a string */ } NBD_ATTRIBUTE_PACKED; -/* Request (client -> server). */ +/* Compact request (client -> server). */ struct nbd_request { uint32_t magic; /* NBD_REQUEST_MAGIC. */ - uint16_t flags; /* Request flags. */ - uint16_t type; /* Request type. */ + uint16_t flags; /* Request flags: NBD_CMD_FLAG_*. */ + uint16_t type; /* Request type: NBD_CMD_*. */ uint64_t handle; /* Opaque handle. */ uint64_t offset; /* Request offset. */ uint32_t count; /* Request length. */ } NBD_ATTRIBUTE_PACKED; +/* Extended request (client -> server). */ +struct nbd_request_ext { + uint32_t magic; /* NBD_EXTENDED_REQUEST_MAGIC. */ + uint16_t flags; /* Request flags: NBD_CMD_FLAG_*. */ + uint16_t type; /* Request type: NBD_CMD_*. */ + uint64_t handle; /* Opaque handle. */ + uint64_t offset; /* Request offset. */ + uint64_t count; /* Request effect or payload length. */ +} NBD_ATTRIBUTE_PACKED; + /* Simple reply (server -> client). */ struct nbd_simple_reply { uint32_t magic; /* NBD_SIMPLE_REPLY_MAGIC. */ @@ -208,6 +220,16 @@ struct nbd_structured_reply { uint32_t length; /* Length of payload which follows. */ } NBD_ATTRIBUTE_PACKED; +/* Extended reply (server -> client). */ +struct nbd_extended_reply { + uint32_t magic; /* NBD_EXTENDED_REPLY_MAGIC. */ + uint16_t flags; /* NBD_REPLY_FLAG_* */ + uint16_t type; /* NBD_REPLY_TYPE_* */ + uint64_t handle; /* Opaque handle. */ + uint64_t offset; /* Client's offset. */ + uint64_t length; /* Length of payload which follows. */ +} NBD_ATTRIBUTE_PACKED; + struct nbd_structured_reply_offset_data { uint64_t offset; /* offset */ /* Followed by data. */ @@ -224,11 +246,23 @@ struct nbd_block_descriptor { uint32_t status_flags; /* block type (hole etc) */ } NBD_ATTRIBUTE_PACKED; +/* NBD_REPLY_TYPE_BLOCK_STATUS_EXT block descriptor. */ +struct nbd_block_descriptor_ext { + uint64_t length; /* length of block */ + uint64_t status_flags; /* block type (hole etc) */ +} NBD_ATTRIBUTE_PACKED; + struct nbd_structured_reply_block_status_hdr { uint32_t context_id; /* metadata context ID */ /* followed by array of nbd_block_descriptor extents */ } NBD_ATTRIBUTE_PACKED; +struct nbd_structured_reply_block_status_ext_hdr { + uint32_t context_id; /* metadata context ID */ + uint32_t count; /* length of following array */ + /* followed by array of nbd_block_descriptor_ext extents */ +} NBD_ATTRIBUTE_PACKED; + struct nbd_structured_reply_error { uint32_t error; /* NBD_E* error number */ uint16_t len; /* Length of human readable error. */ @@ -236,8 +270,10 @@ struct nbd_structured_reply_error { } NBD_ATTRIBUTE_PACKED; #define NBD_REQUEST_MAGIC 0x25609513 +#define NBD_EXTENDED_REQUEST_MAGIC 0x21e41c71 #define NBD_SIMPLE_REPLY_MAGIC 0x67446698 #define NBD_STRUCTURED_REPLY_MAGIC 0x668e33ef +#define NBD_EXTENDED_REPLY_MAGIC 0x6e8a278c /* Structured reply flags. */ #define NBD_REPLY_FLAG_DONE (1<<0) @@ -246,12 +282,13 @@ struct nbd_structured_reply_error { #define NBD_REPLY_TYPE_IS_ERR(val) (!!((val) & (1<<15))) /* Structured reply types. */ -#define NBD_REPLY_TYPE_NONE 0 -#define NBD_REPLY_TYPE_OFFSET_DATA 1 -#define NBD_REPLY_TYPE_OFFSET_HOLE 2 -#define NBD_REPLY_TYPE_BLOCK_STATUS 5 -#define NBD_REPLY_TYPE_ERROR NBD_REPLY_TYPE_ERR (1) -#define NBD_REPLY_TYPE_ERROR_OFFSET NBD_REPLY_TYPE_ERR (2) +#define NBD_REPLY_TYPE_NONE 0 +#define NBD_REPLY_TYPE_OFFSET_DATA 1 +#define NBD_REPLY_TYPE_OFFSET_HOLE 2 +#define NBD_REPLY_TYPE_BLOCK_STATUS 5 +#define NBD_REPLY_TYPE_BLOCK_STATUS_EXT 6 +#define NBD_REPLY_TYPE_ERROR NBD_REPLY_TYPE_ERR (1) +#define NBD_REPLY_TYPE_ERROR_OFFSET NBD_REPLY_TYPE_ERR (2) /* NBD commands. */ #define NBD_CMD_READ 0 @@ -263,11 +300,12 @@ struct nbd_structured_reply_error { #define NBD_CMD_WRITE_ZEROES 6 #define NBD_CMD_BLOCK_STATUS 7 -#define NBD_CMD_FLAG_FUA (1<<0) -#define NBD_CMD_FLAG_NO_HOLE (1<<1) -#define NBD_CMD_FLAG_DF (1<<2) -#define NBD_CMD_FLAG_REQ_ONE (1<<3) -#define NBD_CMD_FLAG_FAST_ZERO (1<<4) +#define NBD_CMD_FLAG_FUA (1<<0) +#define NBD_CMD_FLAG_NO_HOLE (1<<1) +#define NBD_CMD_FLAG_DF (1<<2) +#define NBD_CMD_FLAG_REQ_ONE (1<<3) +#define NBD_CMD_FLAG_FAST_ZERO (1<<4) +#define NBD_CMD_FLAG_PAYLOAD_LEN (1<<5) /* NBD error codes. */ #define NBD_SUCCESS 0 -- 2.40.1
Eric Blake
2023-May-25 13:00 UTC
[Libguestfs] [libnbd PATCH v3 04/22] states: Prepare to send 64-bit requests
Support sending 64-bit requests if extended headers were negotiated. This includes setting NBD_CMD_FLAG_PAYLOAD_LEN any time we send an extended NBD_CMD_WRITE; this is such a fundamental part of the protocol that for now it is easier to silently ignore whatever value the user passes in for that bit in the flags parameter of nbd_pwrite regardless of the current settings in set_strict_mode, rather than trying to force the user to pass in the correct value to match whether extended mode is negotiated. However, when we later add APIs to give the user more control for interoperability testing, it may be worth adding a new set_strict_mode control knob to explicitly enable the client to intentionally violate the protocol (the testsuite added in this patch would then be updated to match). At this point, h->extended_headers is permanently false (we can't enable it until all other aspects of the protocol have likewise been converted). Support for using FLAG_PAYLOAD_LEN with NBD_CMD_BLOCK_STATUS is less fundamental, and deserves to be in its own patch. Signed-off-by: Eric Blake <eblake at redhat.com> --- lib/internal.h | 10 ++- generator/API.ml | 20 +++-- generator/states-issue-command.c | 29 ++++--- generator/states-reply-structured.c | 2 +- lib/rw.c | 17 +++-- tests/Makefile.am | 4 + tests/pwrite-extended.c | 112 ++++++++++++++++++++++++++++ .gitignore | 1 + 8 files changed, 169 insertions(+), 26 deletions(-) create mode 100644 tests/pwrite-extended.c diff --git a/lib/internal.h b/lib/internal.h index c71980ef..8a5f93d4 100644 --- a/lib/internal.h +++ b/lib/internal.h @@ -110,6 +110,9 @@ struct nbd_handle { char *tls_username; /* Username, NULL = use current username */ char *tls_psk_file; /* PSK filename, NULL = no PSK */ + /* Extended headers. */ + bool extended_headers; /* If we negotiated NBD_OPT_EXTENDED_HEADERS */ + /* Desired metadata contexts. */ bool request_sr; bool request_meta; @@ -263,7 +266,10 @@ struct nbd_handle { /* Issuing a command must use a buffer separate from sbuf, for the * case when we interrupt a request to service a reply. */ - struct nbd_request request; + union { + struct nbd_request compact; + struct nbd_request_ext extended; + } req; bool in_write_payload; bool in_write_shutdown; @@ -364,7 +370,7 @@ struct command { uint16_t type; uint64_t cookie; uint64_t offset; - uint32_t count; + uint64_t count; void *data; /* Buffer for read/write */ struct command_cb cb; bool initialized; /* For read, true if getting a hole may skip memset */ diff --git a/generator/API.ml b/generator/API.ml index 5fcb0e1f..02c1260d 100644 --- a/generator/API.ml +++ b/generator/API.ml @@ -198,11 +198,12 @@ let cmd_flags flag_prefix = "CMD_FLAG"; guard = Some "((h->strict & LIBNBD_STRICT_FLAGS) || flags > UINT16_MAX)"; flags = [ - "FUA", 1 lsl 0; - "NO_HOLE", 1 lsl 1; - "DF", 1 lsl 2; - "REQ_ONE", 1 lsl 3; - "FAST_ZERO", 1 lsl 4; + "FUA", 1 lsl 0; + "NO_HOLE", 1 lsl 1; + "DF", 1 lsl 2; + "REQ_ONE", 1 lsl 3; + "FAST_ZERO", 1 lsl 4; + "PAYLOAD_LEN", 1 lsl 5; ] } let handshake_flags = { @@ -2507,7 +2508,7 @@ "pread_structured", { "pwrite", { default_call with args = [ BytesIn ("buf", "count"); UInt64 "offset" ]; - optargs = [ OFlags ("flags", cmd_flags, Some ["FUA"]) ]; + optargs = [ OFlags ("flags", cmd_flags, Some ["FUA"; "PAYLOAD_LEN"]) ]; ret = RErr; permitted_states = [ Connected ]; shortdesc = "write to the NBD server"; @@ -2530,7 +2531,10 @@ "pwrite", { C<LIBNBD_CMD_FLAG_FUA> meaning that the server should not return until the data has been committed to permanent storage (if that is supported - some servers cannot do this, see -L<nbd_can_fua(3)>)." +L<nbd_can_fua(3)>). For convenience, libnbd ignores the presence +or absence of the flag C<LIBNBD_CMD_FLAG_PAYLOAD_LEN> in C<flags>, +while correctly using the flag over the wire according to whether +extended headers were negotiated." ^ strict_call_description; see_also = [Link "can_fua"; Link "is_read_only"; Link "aio_pwrite"; Link "get_block_size"; @@ -3220,7 +3224,7 @@ "aio_pwrite", { default_call with args = [ BytesPersistIn ("buf", "count"); UInt64 "offset" ]; optargs = [ OClosure completion_closure; - OFlags ("flags", cmd_flags, Some ["FUA"]) ]; + OFlags ("flags", cmd_flags, Some ["FUA"; "PAYLOAD_LEN"]) ]; ret = RCookie; permitted_states = [ Connected ]; shortdesc = "write to the NBD server"; diff --git a/generator/states-issue-command.c b/generator/states-issue-command.c index 111e131c..79136b61 100644 --- a/generator/states-issue-command.c +++ b/generator/states-issue-command.c @@ -41,15 +41,24 @@ ISSUE_COMMAND.START: return 0; } - h->request.magic = htobe32 (NBD_REQUEST_MAGIC); - h->request.flags = htobe16 (cmd->flags); - h->request.type = htobe16 (cmd->type); - h->request.handle = htobe64 (cmd->cookie); - h->request.offset = htobe64 (cmd->offset); - h->request.count = htobe32 (cmd->count); + /* These fields are coincident between req.compact and req.extended */ + h->req.compact.flags = htobe16 (cmd->flags); + h->req.compact.type = htobe16 (cmd->type); + h->req.compact.handle = htobe64 (cmd->cookie); + h->req.compact.offset = htobe64 (cmd->offset); + if (h->extended_headers) { + h->req.extended.magic = htobe32 (NBD_EXTENDED_REQUEST_MAGIC); + h->req.extended.count = htobe64 (cmd->count); + h->wlen = sizeof (h->req.extended); + } + else { + assert (cmd->count <= UINT32_MAX); + h->req.compact.magic = htobe32 (NBD_REQUEST_MAGIC); + h->req.compact.count = htobe32 (cmd->count); + h->wlen = sizeof (h->req.compact); + } h->chunks_sent++; - h->wbuf = &h->request; - h->wlen = sizeof (h->request); + h->wbuf = &h->req; if (cmd->type == NBD_CMD_WRITE || cmd->next) h->wflags = MSG_MORE; SET_NEXT_STATE (%SEND_REQUEST); @@ -74,7 +83,7 @@ ISSUE_COMMAND.PREPARE_WRITE_PAYLOAD: assert (h->cmds_to_issue != NULL); cmd = h->cmds_to_issue; - assert (cmd->cookie == be64toh (h->request.handle)); + assert (cmd->cookie == be64toh (h->req.compact.handle)); if (cmd->type == NBD_CMD_WRITE) { h->wbuf = cmd->data; h->wlen = cmd->count; @@ -120,7 +129,7 @@ ISSUE_COMMAND.FINISH: assert (!h->wlen); assert (h->cmds_to_issue != NULL); cmd = h->cmds_to_issue; - assert (cmd->cookie == be64toh (h->request.handle)); + assert (cmd->cookie == be64toh (h->req.compact.handle)); h->cmds_to_issue = cmd->next; if (h->cmds_to_issue_tail == cmd) h->cmds_to_issue_tail = NULL; diff --git a/generator/states-reply-structured.c b/generator/states-reply-structured.c index 6f96945a..df509216 100644 --- a/generator/states-reply-structured.c +++ b/generator/states-reply-structured.c @@ -34,7 +34,7 @@ structured_reply_in_bounds (uint64_t offset, uint32_t length, offset + length > cmd->offset + cmd->count) { set_error (0, "range of structured reply is out of bounds, " "offset=%" PRIu64 ", cmd->offset=%" PRIu64 ", " - "length=%" PRIu32 ", cmd->count=%" PRIu32 ": " + "length=%" PRIu32 ", cmd->count=%" PRIu64 ": " "this is likely to be a bug in the NBD server", offset, cmd->offset, length, cmd->count); return false; diff --git a/lib/rw.c b/lib/rw.c index 3dc3499e..8b2bd4cc 100644 --- a/lib/rw.c +++ b/lib/rw.c @@ -223,13 +223,11 @@ nbd_internal_command_common (struct nbd_handle *h, } break; - /* Other commands are currently limited by the 32 bit field in the - * command structure on the wire, but in future we hope to support - * 64 bit values here with a change to the NBD protocol which is - * being discussed upstream. + /* Other commands are limited by the 32 bit field in the command + * structure on the wire, unless extended headers were negotiated. */ default: - if (count > UINT32_MAX) { + if (!h->extended_headers && count > UINT32_MAX) { set_error (ERANGE, "request too large: maximum request size is %" PRIu32, UINT32_MAX); goto err; @@ -358,6 +356,15 @@ nbd_unlocked_aio_pwrite (struct nbd_handle *h, const void *buf, return -1; } } + /* It is more convenient to manage PAYLOAD_LEN by what was negotiated + * than to require the user to have to set it correctly. + * TODO: Add new h->strict bit to allow intentional protocol violation + * for interoperability testing. + */ + if (h->extended_headers) + flags |= LIBNBD_CMD_FLAG_PAYLOAD_LEN; + else + flags &= ~LIBNBD_CMD_FLAG_PAYLOAD_LEN; SET_CALLBACK_TO_NULL (*completion); return nbd_internal_command_common (h, flags, NBD_CMD_WRITE, offset, count, diff --git a/tests/Makefile.am b/tests/Makefile.am index 3a93251e..8b839bf5 100644 --- a/tests/Makefile.am +++ b/tests/Makefile.am @@ -232,6 +232,7 @@ check_PROGRAMS += \ closure-lifetimes \ pread-initialize \ socket-activation-name \ + pwrite-extended \ $(NULL) TESTS += \ @@ -650,6 +651,9 @@ socket_activation_name_SOURCES = \ requires.h socket_activation_name_LDADD = $(top_builddir)/lib/libnbd.la +pwrite_extended_SOURCES = pwrite-extended.c +pwrite_extended_LDADD = $(top_builddir)/lib/libnbd.la + #---------------------------------------------------------------------- # Testing TLS support. diff --git a/tests/pwrite-extended.c b/tests/pwrite-extended.c new file mode 100644 index 00000000..f0b5a3f3 --- /dev/null +++ b/tests/pwrite-extended.c @@ -0,0 +1,112 @@ +/* NBD client library in userspace + * Copyright Red Hat + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +/* Check behavior of pwrite with PAYLOAD_LEN flag for extended headers. */ + +#include <config.h> + +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <errno.h> +#include <unistd.h> +#include <sys/stat.h> + +#include <libnbd.h> + +static char *progname; +static char buf[512]; + +static void +check (int experr, const char *prefix) +{ + const char *msg = nbd_get_error (); + int errnum = nbd_get_errno (); + + fprintf (stderr, "error: \"%s\"\n", msg); + fprintf (stderr, "errno: %d (%s)\n", errnum, strerror (errnum)); + if (strncmp (msg, prefix, strlen (prefix)) != 0) { + fprintf (stderr, "%s: test failed: missing context prefix: %s\n", + progname, msg); + exit (EXIT_FAILURE); + } + if (errnum != experr) { + fprintf (stderr, "%s: test failed: " + "expected errno = %d (%s), but got %d\n", + progname, experr, strerror (experr), errnum); + exit (EXIT_FAILURE); + } +} + +int +main (int argc, char *argv[]) +{ + struct nbd_handle *nbd; + const char *cmd[] = { + "nbdkit", "-s", "-v", "--exit-with-parent", "memory", "1048576", NULL + }; + uint32_t strict; + + progname = argv[0]; + + nbd = nbd_create (); + if (nbd == NULL) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + + /* Connect to the server. */ + if (nbd_connect_command (nbd, (char **)cmd) == -1) { + fprintf (stderr, "%s: %s\n", argv[0], nbd_get_error ()); + exit (EXIT_FAILURE); + } + + /* FIXME: future API addition to test if server negotiated extended mode. + * Until then, strict flags must ignore the PAYLOAD_LEN flag for pwrite, + * even though it is rejected for other commands. + */ + strict = nbd_get_strict_mode (nbd); + if (!(strict & LIBNBD_STRICT_FLAGS)) { + fprintf (stderr, "%s: test failed: " + "nbd_get_strict_mode did not have expected flag set\n", + argv[0]); + exit (EXIT_FAILURE); + } + if (nbd_aio_pread (nbd, buf, 512, 0, NBD_NULL_COMPLETION, + LIBNBD_CMD_FLAG_PAYLOAD_LEN) != -1) { + fprintf (stderr, "%s: test failed: " + "nbd_aio_pread did not fail with unexpected flag\n", + argv[0]); + exit (EXIT_FAILURE); + } + check (EINVAL, "nbd_aio_pread: "); + + if (nbd_aio_pwrite (nbd, buf, 512, 0, NBD_NULL_COMPLETION, + LIBNBD_CMD_FLAG_PAYLOAD_LEN) == -1) { + fprintf (stderr, "%s: %s\n", argv[0], nbd_get_error ()); + exit (EXIT_FAILURE); + } + + if (nbd_aio_pwrite (nbd, buf, 512, 0, NBD_NULL_COMPLETION, 0) == -1) { + fprintf (stderr, "%s: %s\n", argv[0], nbd_get_error ()); + exit (EXIT_FAILURE); + } + + nbd_close (nbd); + exit (EXIT_SUCCESS); +} diff --git a/.gitignore b/.gitignore index fe7feffa..bc7c2c37 100644 --- a/.gitignore +++ b/.gitignore @@ -250,6 +250,7 @@ Makefile.in /tests/pki/ /tests/pread-initialize /tests/private-data +/tests/pwrite-extended /tests/read-only-flag /tests/read-write-flag /tests/server-death -- 2.40.1
Eric Blake
2023-May-25 13:00 UTC
[Libguestfs] [libnbd PATCH v3 05/22] states: Prepare to receive 64-bit replies
Support receiving headers for 64-bit replies if extended headers were negotiated. We already insist that the server not send us too much payload in one reply, so we can exploit that and merge the 64-bit length back into a normalized 32-bit field for the rest of the payload length calculations. The NBD protocol specifically documents that extended mode takes precedence over structured replies, and that there are no simple replies in extended mode. We can also take advantage that the handle field is in the same offset in all the various reply types. Note that if we negotiate extended headers, but a non-compliant server replies with a non-extended header, this patch will stall waiting for the server to send more bytes rather than noticing that the magic number is wrong (for aio operations, you'll get a magic number mismatch once you send a second command that elicits a reply; but for blocking operations, we basically deadlock). The easy alternative would be to read just the first 4 bytes of magic, then determine how many more bytes to expect; but that would require more states and syscalls, and not worth it since the typical server will be compliant. The other alternative is what the next patch implements: teaching REPLY.RECV_REPLY to handle short reads that were at least long enough to transmit magic to specifically look for magic number mismatch. At this point, h->extended_headers is permanently false (we can't enable it until all other aspects of the protocol have likewise been converted). Signed-off-by: Eric Blake <eblake at redhat.com> --- lib/internal.h | 1 + generator/states-reply-structured.c | 52 +++++++++++++++++++---------- generator/states-reply.c | 34 ++++++++++++------- 3 files changed, 58 insertions(+), 29 deletions(-) diff --git a/lib/internal.h b/lib/internal.h index 8a5f93d4..e4e21a4d 100644 --- a/lib/internal.h +++ b/lib/internal.h @@ -243,6 +243,7 @@ struct nbd_handle { union { struct nbd_simple_reply simple; struct nbd_structured_reply structured; + struct nbd_extended_reply extended; } hdr; union { struct nbd_structured_reply_offset_data offset_data; diff --git a/generator/states-reply-structured.c b/generator/states-reply-structured.c index df509216..c53aed7b 100644 --- a/generator/states-reply-structured.c +++ b/generator/states-reply-structured.c @@ -45,14 +45,20 @@ structured_reply_in_bounds (uint64_t offset, uint32_t length, STATE_MACHINE { REPLY.STRUCTURED_REPLY.START: - /* We've only read the simple_reply. The structured_reply is longer, - * so read the remaining part. + /* If we have extended headers, we've already read the entire header. + * Otherwise, we've only read enough for a simple_reply; since structured + * replies are longer, read the remaining part. */ - h->rbuf = &h->sbuf; - h->rbuf = (char *)h->rbuf + sizeof h->sbuf.reply.hdr.simple; - h->rlen = sizeof h->sbuf.reply.hdr.structured; - h->rlen -= sizeof h->sbuf.reply.hdr.simple; - SET_NEXT_STATE (%RECV_REMAINING); + if (h->extended_headers) { + assert (h->rbuf == sizeof h->sbuf.reply.hdr.extended + (char*)&h->sbuf); + SET_NEXT_STATE (%CHECK); + } + else { + assert (h->rbuf == sizeof h->sbuf.reply.hdr.simple + (char*)&h->sbuf); + h->rlen = sizeof h->sbuf.reply.hdr.structured - + sizeof h->sbuf.reply.hdr.simple; + SET_NEXT_STATE (%RECV_REMAINING); + } return 0; REPLY.STRUCTURED_REPLY.RECV_REMAINING: @@ -69,11 +75,18 @@ REPLY.STRUCTURED_REPLY.RECV_REMAINING: REPLY.STRUCTURED_REPLY.CHECK: struct command *cmd = h->reply_cmd; uint16_t flags, type; - uint32_t length; + uint64_t length; + uint64_t offset = -1; + assert (cmd); flags = be16toh (h->sbuf.reply.hdr.structured.flags); type = be16toh (h->sbuf.reply.hdr.structured.type); - length = be32toh (h->sbuf.reply.hdr.structured.length); + if (h->extended_headers) { + length = be64toh (h->sbuf.reply.hdr.extended.length); + offset = be64toh (h->sbuf.reply.hdr.extended.offset); + } + else + length = be32toh (h->sbuf.reply.hdr.structured.length); /* Reject a server that replies with too much information, but don't * reject a single structured reply to NBD_CMD_READ on the largest @@ -83,13 +96,18 @@ REPLY.STRUCTURED_REPLY.CHECK: * not worth keeping the connection alive. */ if (length > MAX_REQUEST_SIZE + sizeof h->sbuf.reply.payload.offset_data) { - set_error (0, "invalid server reply length %" PRIu32, length); + set_error (0, "invalid server reply length %" PRIu64, length); SET_NEXT_STATE (%.DEAD); return 0; } + /* For convenience, we now normalize extended replies into compact, + * doable since we validated that payload length fits in 32 bits. + */ + h->sbuf.reply.hdr.structured.length = length; /* Skip an unexpected structured reply, including to an unknown cookie. */ - if (cmd == NULL || !h->structured_replies) + if (cmd == NULL || !h->structured_replies || + (h->extended_headers && offset != cmd->offset)) goto resync; switch (type) { @@ -168,7 +186,7 @@ REPLY.STRUCTURED_REPLY.RECV_ERROR: SET_NEXT_STATE (%.READY); return 0; case 0: - length = be32toh (h->sbuf.reply.hdr.structured.length); + length = h->sbuf.reply.hdr.structured.length; /* normalized in CHECK */ assert (length >= sizeof h->sbuf.reply.payload.error.error.error); assert (cmd); @@ -207,7 +225,7 @@ REPLY.STRUCTURED_REPLY.RECV_ERROR_MESSAGE: SET_NEXT_STATE (%.READY); return 0; case 0: - length = be32toh (h->sbuf.reply.hdr.structured.length); + length = h->sbuf.reply.hdr.structured.length; /* normalized in CHECK */ msglen = be16toh (h->sbuf.reply.payload.error.error.len); type = be16toh (h->sbuf.reply.hdr.structured.type); @@ -307,7 +325,7 @@ REPLY.STRUCTURED_REPLY.RECV_OFFSET_DATA: SET_NEXT_STATE (%.READY); return 0; case 0: - length = be32toh (h->sbuf.reply.hdr.structured.length); + length = h->sbuf.reply.hdr.structured.length; /* normalized in CHECK */ offset = be64toh (h->sbuf.reply.payload.offset_data.offset); assert (cmd); /* guaranteed by CHECK */ @@ -346,7 +364,7 @@ REPLY.STRUCTURED_REPLY.RECV_OFFSET_DATA_DATA: SET_NEXT_STATE (%.READY); return 0; case 0: - length = be32toh (h->sbuf.reply.hdr.structured.length); + length = h->sbuf.reply.hdr.structured.length; /* normalized in CHECK */ offset = be64toh (h->sbuf.reply.payload.offset_data.offset); assert (cmd); /* guaranteed by CHECK */ @@ -426,7 +444,7 @@ REPLY.STRUCTURED_REPLY.RECV_BS_HEADER: SET_NEXT_STATE (%.READY); return 0; case 0: - length = be32toh (h->sbuf.reply.hdr.structured.length); + length = h->sbuf.reply.hdr.structured.length; /* normalized in CHECK */ assert (cmd); /* guaranteed by CHECK */ assert (cmd->type == NBD_CMD_BLOCK_STATUS); @@ -460,7 +478,7 @@ REPLY.STRUCTURED_REPLY.RECV_BS_ENTRIES: SET_NEXT_STATE (%.READY); return 0; case 0: - length = be32toh (h->sbuf.reply.hdr.structured.length); + length = h->sbuf.reply.hdr.structured.length; /* normalized in CHECK */ assert (cmd); /* guaranteed by CHECK */ assert (cmd->type == NBD_CMD_BLOCK_STATUS); diff --git a/generator/states-reply.c b/generator/states-reply.c index 31e4bd2f..4e9f2dde 100644 --- a/generator/states-reply.c +++ b/generator/states-reply.c @@ -62,15 +62,19 @@ REPLY.START: */ ssize_t r; - /* We read all replies initially as if they are simple replies, but - * check the magic in CHECK_SIMPLE_OR_STRUCTURED_REPLY below. - * This works because the structured_reply header is larger. + /* With extended headers, there is only one size to read, so we can do it + * all in one syscall. But for older structured replies, we don't know if + * we have a simple or structured reply until we read the magic number, + * requiring a two-part read with CHECK_SIMPLE_OR_STRUCTURED_REPLY below. */ assert (h->reply_cmd == NULL); assert (h->rlen == 0); h->rbuf = &h->sbuf.reply.hdr; - h->rlen = sizeof h->sbuf.reply.hdr.simple; + if (h->extended_headers) + h->rlen = sizeof h->sbuf.reply.hdr.extended; + else + h->rlen = sizeof h->sbuf.reply.hdr.simple; r = h->sock->ops->recv (h, h->sock, h->rbuf, h->rlen); if (r == -1) { @@ -116,16 +120,22 @@ REPLY.CHECK_SIMPLE_OR_STRUCTURED_REPLY: uint64_t cookie; magic = be32toh (h->sbuf.reply.hdr.simple.magic); - if (magic == NBD_SIMPLE_REPLY_MAGIC) { + switch (magic) { + case NBD_SIMPLE_REPLY_MAGIC: + if (h->extended_headers) + goto invalid; SET_NEXT_STATE (%SIMPLE_REPLY.START); - } - else if (magic == NBD_STRUCTURED_REPLY_MAGIC) { + break; + case NBD_STRUCTURED_REPLY_MAGIC: + case NBD_EXTENDED_REPLY_MAGIC: + if ((magic == NBD_STRUCTURED_REPLY_MAGIC) == h->extended_headers) + goto invalid; SET_NEXT_STATE (%STRUCTURED_REPLY.START); - } - else { - /* We've probably lost synchronization. */ - SET_NEXT_STATE (%.DEAD); - set_error (0, "invalid reply magic 0x%" PRIx32, magic); + break; + default: + invalid: + SET_NEXT_STATE (%.DEAD); /* We've probably lost synchronization. */ + set_error (0, "invalid or unexpected reply magic 0x%" PRIx32, magic); #if 0 /* uncomment to see desynchronized data */ nbd_internal_hexdump (&h->sbuf.reply.hdr.simple, sizeof (h->sbuf.reply.hdr.simple), -- 2.40.1
Eric Blake
2023-May-25 13:00 UTC
[Libguestfs] [libnbd PATCH v3 06/22] states: Break deadlock if server goofs on extended replies
One of the benefits of extended replies is that we can do a fixed-length read for the entire header of every server reply, which is fewer syscalls than the split-read approach required by structured replies. But one of the drawbacks of doing a large read is that if the server is non-compliant (not a problem for normal servers, but something I hit rather more than I'd like to admit while developing extended header support in servers), nbd_pwrite() and friends will deadlock if the server replies with the wrong header. Add in some code to catch that failure mode and move the state machine to DEAD sooner, to make it easier to diagnose the fault in the server. Unlike in the case of an unexpected simply reply from a structured server (where we never over-read, and can therefore commit b31e7bac can merely fail the command with EPROTO and successfully move on to the next server reply), in this case we really do have to move to DEAD: in addition to having already read the 16 or 20 bytes that the server sent in its (short) reply for this command, we may have already read the initial bytes of the server's next reply, but we have no way to push those extra bytes back onto our read stream for parsing on our next pass through the state machine. Signed-off-by: Eric Blake <eblake at redhat.com> --- generator/states-reply.c | 23 ++++++++++++++++++++++- 1 file changed, 22 insertions(+), 1 deletion(-) diff --git a/generator/states-reply.c b/generator/states-reply.c index 4e9f2dde..d4710d91 100644 --- a/generator/states-reply.c +++ b/generator/states-reply.c @@ -109,7 +109,28 @@ REPLY.START: REPLY.RECV_REPLY: switch (recv_into_rbuf (h)) { case -1: SET_NEXT_STATE (%.DEAD); return 0; - case 1: SET_NEXT_STATE (%.READY); return 0; + case 1: SET_NEXT_STATE (%.READY); + /* Special case: if we have a short read, but got at least far + * enough to decode the magic number, we can check if the server + * is matching our expectations. This lets us avoid deadlocking if + * a buggy server sends only 16 bytes of a simple reply, and is + * waiting for our next command, while we are blocked waiting for + * the server to send 32 bytes of an extended reply. + */ + if (h->extended_headers && + (char *)h->rbuf >= (char *)&h->sbuf.reply.hdr.extended.flags) { + uint32_t magic = be32toh (h->sbuf.reply.hdr.extended.magic); + if (magic != NBD_EXTENDED_REPLY_MAGIC) { + SET_NEXT_STATE (%.DEAD); /* We've probably lost synchronization. */ + set_error (0, "invalid or unexpected reply magic 0x%" PRIx32, magic); +#if 0 /* uncomment to see desynchronized data */ + nbd_internal_hexdump (&h->sbuf.reply.hdr.extended.flags, + sizeof (h->sbuf.reply.hdr.extended.flags), + stderr); +#endif + } + } + return 0; case 0: SET_NEXT_STATE (%CHECK_SIMPLE_OR_STRUCTURED_REPLY); } return 0; -- 2.40.1
Eric Blake
2023-May-25 13:00 UTC
[Libguestfs] [libnbd PATCH v3 07/22] generator: Add struct nbd_extent in prep for 64-bit extents
The existing nbd_block_status() callback is permanently stuck with an array of uint32_t pairs (len/2 extents), which is both constrained on maximum extent size (no larger than 4G) and on the status flags (must fit in 32 bits). While the "base:allocation" metacontext will never exceed 32 bits, it is not hard to envision other extension metacontexts where a 64-bit status would be useful (for example, Zoned Block Devices expressing a 64-bit offset[1]). Exposing 64-bit extents will require a new API; we now have the decision of whether to copy the existing API practice of returning a bare array containing len/2 extent pairs, or with a saner idea of an array of structs with len extents. Returning an array of structs is actually easier to map to various language bindings, particularly since we know the length field will generally fit in 63-bits (fallout from the fact that NBD exports are effectively capped in size by signed off_t), but where the status field must represent a full 64 bits (assuming that the user wants to connect to a metadata extension that utilizes 64 bits, rather than existing metadata contexts that only expose 32 bits). This patch introduces the struct we plan to use in the new API, along with language bindings. The bindings for Python and OCaml were relatively straightforward; the Golang bindings took a bit more effort for me to write. Temporary unused attributes are needed to keep the compiler happy until a later patch exposes a new API using the new callback type. Note that 'struct nbd_block_descriptor_ext' in lib/nbd-protocol.h is exactly the same as what we want to use in C. But it is easier to stick a new public type in <libnbd.h> than to figure out how to expose just part of a header we only want to use internally. [1] https://zonedstorage.io/docs/linux/zbd-api Signed-off-by: Eric Blake <eblake at redhat.com> --- generator/API.mli | 1 + generator/API.ml | 12 +++++++++++- generator/C.ml | 24 +++++++++++++++++++++--- generator/GoLang.ml | 24 ++++++++++++++++++++++++ generator/OCaml.ml | 24 +++++++++++++++++++++--- generator/Python.ml | 21 ++++++++++++++++++++- ocaml/helpers.c | 20 ++++++++++++++++++++ ocaml/nbd-c.h | 1 + golang/handle.go | 6 ++++++ 9 files changed, 125 insertions(+), 8 deletions(-) diff --git a/generator/API.mli b/generator/API.mli index c5bba8c2..96626c9b 100644 --- a/generator/API.mli +++ b/generator/API.mli @@ -52,6 +52,7 @@ and | BytesPersistOut of string * string | Closure of closure (** function pointer + void *opaque *) | Enum of string * enum (** enum/union type, int in C *) +| Extent64 of string (** extent descriptor, struct nbd_extent in C *) | Fd of string (** file descriptor *) | Flags of string * flags (** flags, uint32_t in C *) | Int of string (** small int *) diff --git a/generator/API.ml b/generator/API.ml index 02c1260d..1da1ec53 100644 --- a/generator/API.ml +++ b/generator/API.ml @@ -42,6 +42,7 @@ | BytesPersistOut of string * string | Closure of closure | Enum of string * enum +| Extent64 of string | Fd of string | Flags of string * flags | Int of string @@ -157,6 +158,14 @@ let extent_closure "nr_entries"); CBMutable (Int "error") ] } +let extent64_closure = { + cbname = "extent64"; + cbargs = [ CBString "metacontext"; + CBUInt64 "offset"; + CBArrayAndLen (Extent64 "entries", + "nr_entries"); + CBMutable (Int "error") ] +} let list_closure = { cbname = "list"; cbargs = [ CBString "name"; CBString "description" ] @@ -166,7 +175,8 @@ let context_closure cbargs = [ CBString "name" ] } let all_closures = [ chunk_closure; completion_closure; - debug_closure; extent_closure; list_closure; + debug_closure; extent_closure; extent64_closure; + list_closure; context_closure ] (* Enums. *) diff --git a/generator/C.ml b/generator/C.ml index f772117c..d8472153 100644 --- a/generator/C.ml +++ b/generator/C.ml @@ -94,6 +94,7 @@ let | Closure { cbname } -> [ sprintf "%s_callback" cbname; sprintf "%s_user_data" cbname ] | Enum (n, _) -> [n] + | Extent64 n -> [n] | Fd n -> [n] | Flags (n, _) -> [n] | Int n -> [n] @@ -129,7 +130,7 @@ let (* list of strings should be marked as non-null *) | StringList n -> [ true ] (* numeric and other non-pointer types are not able to be null *) - | Bool _ | Closure _ | Enum _ | Fd _ | Flags _ + | Bool _ | Closure _ | Enum _ | Extent64 _ | Fd _ | Flags _ | Int _ | Int64 _ | SizeT _ | UInt _ | UInt32 _ | UInt64 _ | UIntPtr _ -> [ false ] @@ -191,6 +192,9 @@ and | Enum (n, _) -> if types then pr "int "; pr "%s" n + | Extent64 n -> + if types then pr "nbd_extent "; + pr "%s" n | Flags (n, _) -> if types then pr "uint32_t "; pr "%s" n @@ -305,6 +309,11 @@ and pr "%s, " n; if types then pr "size_t "; pr "%s" len + | CBArrayAndLen (Extent64 n, len) -> + if types then pr "nbd_extent *"; + pr "%s, " n; + if types then pr "size_t "; + pr "%s" len | CBArrayAndLen _ -> assert false | CBBytesIn (n, len) -> if types then pr "const void *"; @@ -477,6 +486,13 @@ let pr "extern int nbd_get_errno (void);\n"; pr "#define LIBNBD_HAVE_NBD_GET_ERRNO 1\n"; pr "\n"; + pr "/* This is used in the callback for nbd_block_status_64.\n"; + pr " */\n"; + pr "typedef struct {\n"; + pr " uint64_t length;\n"; + pr " uint64_t flags;\n"; + pr "} nbd_extent;\n"; + pr "\n"; print_closure_structs (); List.iter ( fun (name, { args; optargs; ret }) -> @@ -740,7 +756,7 @@ let pr " char *%s_printable =\n" n; pr " nbd_internal_printable_string_list (%s);\n" n | BytesOut _ | BytesPersistOut _ - | Bool _ | Closure _ | Enum _ | Flags _ | Fd _ | Int _ + | Bool _ | Closure _ | Enum _ | Extent64 _ | Flags _ | Fd _ | Int _ | Int64 _ | SizeT _ | SockAddrAndLen _ | UInt _ | UInt32 _ | UInt64 _ | UIntPtr _ -> () ) args; @@ -766,6 +782,7 @@ let pr " %s=\\\"%%s\\\" %s=%%zu" n count | Closure { cbname } -> pr " %s=%%s" cbname | Enum (n, _) -> pr " %s=%%d" n + | Extent64 _ -> assert false (* only used in extent64_closure *) | Flags (n, _) -> pr " %s=0x%%x" n | Fd n | Int n -> pr " %s=%%d" n | Int64 n -> pr " %s=%%\"PRIi64\"" n @@ -804,6 +821,7 @@ let pr "%s_printable ? %s_printable : \"\", %s" n n count | Closure _ -> pr "\"<fun>\"" | Enum (n, _) -> pr "%s" n + | Extent64 _ -> assert false (* only used in extent64_closure *) | Flags (n, _) -> pr "%s" n | Fd n | Int n | Int64 n | SizeT n -> pr "%s" n | SockAddrAndLen (_, len) -> pr "(int) %s" len @@ -836,7 +854,7 @@ let | StringList n -> pr " free (%s_printable);\n" n | BytesOut _ | BytesPersistOut _ - | Bool _ | Closure _ | Enum _ | Flags _ | Fd _ | Int _ + | Bool _ | Closure _ | Enum _ | Extent64 _ | Flags _ | Fd _ | Int _ | Int64 _ | SizeT _ | SockAddrAndLen _ | UInt _ | UInt32 _ | UInt64 _ | UIntPtr _ -> () ) args; diff --git a/generator/GoLang.ml b/generator/GoLang.ml index 4ab6b262..8922812b 100644 --- a/generator/GoLang.ml +++ b/generator/GoLang.ml @@ -49,6 +49,7 @@ let | BytesPersistOut (n, len) -> n | Closure { cbname } -> cbname | Enum (n, _) -> n + | Extent64 n -> n | Fd n -> n | Flags (n, _) -> n | Int n -> n @@ -71,6 +72,7 @@ let | BytesPersistOut _ -> "AioBuffer" | Closure { cbname } -> sprintf "%sCallback" (camel_case cbname) | Enum (_, { enum_prefix }) -> camel_case enum_prefix + | Extent64 _ -> assert false (* only used in extent64_closure *) | Fd _ -> "int" | Flags (_, { flag_prefix }) -> camel_case flag_prefix | Int _ -> "int" @@ -262,6 +264,7 @@ let pr " c_%s.user_data = C.alloc_cbid(C.long(%s_cbid))\n" cbname cbname | Enum (n, _) -> pr " c_%s := C.int (%s)\n" n n + | Extent64 _ -> assert false (* only used in extent64_closure *) | Fd n -> pr " c_%s := C.int (%s)\n" n n | Flags (n, _) -> @@ -334,6 +337,7 @@ let | BytesPersistOut (n, len) -> pr ", c_%s, c_%s" n len | Closure { cbname } -> pr ", c_%s" cbname | Enum (n, _) -> pr ", c_%s" n + | Extent64 _ -> assert false (* only used in extent64_closure *) | Fd n -> pr ", c_%s" n | Flags (n, _) -> pr ", c_%s" n | Int n -> pr ", c_%s" n @@ -524,6 +528,18 @@ let copy(ret, s) return ret } + +func copy_extent_array (entries *C.nbd_extent, count C.size_t) []LibnbdExtent { + unsafePtr := unsafe.Pointer(entries) + arrayPtr := (*[1 << 20]C.nbd_extent)(unsafePtr) + slice := arrayPtr[:count:count] + ret := make([]LibnbdExtent, count) + for i := 0; i < int (count); i++ { + ret[i].Length = uint64 (slice[i].length) + ret[i].Flags = uint64 (slice[i].flags) + } + return ret +} "; List.iter ( @@ -537,6 +553,8 @@ let match cbarg with | CBArrayAndLen (UInt32 n, _) -> pr "%s []uint32" n; + | CBArrayAndLen (Extent64 n, _) -> + pr "%s []LibnbdExtent" n; | CBBytesIn (n, len) -> pr "%s []byte" n; | CBInt n -> @@ -563,6 +581,8 @@ let match cbarg with | CBArrayAndLen (UInt32 n, count) -> pr "%s *C.uint32_t, %s C.size_t" n count + | CBArrayAndLen (Extent64 n, count) -> + pr "%s *C.nbd_extent, %s C.size_t" n count | CBBytesIn (n, len) -> pr "%s unsafe.Pointer, %s C.size_t" n len | CBInt n -> @@ -605,6 +625,8 @@ let match cbarg with | CBArrayAndLen (UInt32 n, count) -> pr "copy_uint32_array (%s, %s)" n count + | CBArrayAndLen (Extent64 n, count) -> + pr "copy_extent_array (%s, %s)" n count | CBBytesIn (n, len) -> pr "C.GoBytes (%s, C.int (%s))" n len | CBInt n -> @@ -756,6 +778,8 @@ let match cbarg with | CBArrayAndLen (UInt32 n, count) -> pr "uint32_t *%s, size_t %s" n count + | CBArrayAndLen (Extent64 n, count) -> + pr "nbd_extent *%s, size_t %s" n count | CBBytesIn (n, len) -> pr "void *%s, size_t %s" n len | CBInt n -> diff --git a/generator/OCaml.ml b/generator/OCaml.ml index edb81f25..ea3b044a 100644 --- a/generator/OCaml.ml +++ b/generator/OCaml.ml @@ -44,6 +44,7 @@ and | Closure { cbargs } -> sprintf "(%s)" (ocaml_closuredecl_to_string cbargs) | Enum (_, { enum_prefix }) -> sprintf "%s.t" enum_prefix + | Extent64 _ -> "extent" | Fd _ -> "Unix.file_descr" | Flags (_, { flag_prefix }) -> sprintf "%s.t list" flag_prefix | Int _ -> "int" @@ -102,6 +103,7 @@ let | BytesPersistOut (n, len) -> n | Closure { cbname } -> cbname | Enum (n, _) -> n + | Extent64 n -> n | Fd n -> n | Flags (n, _) -> n | Int n -> n @@ -147,6 +149,9 @@ let type cookie = int64 +type extent = int64 * int64 +(** Length and flags of an extent in block_status_64 callback. *) + "; List.iter ( @@ -268,6 +273,7 @@ let exception Error of string * Unix.error option exception Closed of string type cookie = int64 +type extent = int64 * int64 (* Give the exceptions names so that they can be raised from the C code. *) let () @@ -498,7 +504,8 @@ let let argnames List.map ( function - | CBArrayAndLen (UInt32 n, _) | CBBytesIn (n, _) + | CBArrayAndLen (UInt32 n, _) | CBArrayAndLen (Extent64 n, _) + | CBBytesIn (n, _) | CBInt n | CBInt64 n | CBMutable (Int n) | CBString n | CBUInt n | CBUInt64 n -> n ^ "v" @@ -531,6 +538,14 @@ let pr "%s %s,\n" indent n; pr "%s %s\n" indent count; pr "%s);\n" indent + | CBArrayAndLen (Extent64 n, count) -> + pr " %sv = " n; + let fncol = output_column () in + let indent = spaces fncol in + pr "nbd_internal_ocaml_alloc_extent64_array (\n"; + pr "%s %s,\n" indent n; + pr "%s %s\n" indent count; + pr "%s);\n" indent | CBBytesIn (n, len) -> pr " %sv = caml_alloc_initialized_string (%s, %s);\n" n len n | CBInt n | CBUInt n -> @@ -554,7 +569,7 @@ let List.iter ( function - | CBArrayAndLen (UInt32 _, _) + | CBArrayAndLen (_, _) | CBBytesIn _ | CBInt _ | CBInt64 _ @@ -563,7 +578,7 @@ let | CBUInt64 _ -> () | CBMutable (Int n) -> pr " *%s = Int_val (Field (%sv, 0));\n" n n - | CBArrayAndLen _ | CBMutable _ -> assert false + | CBMutable _ -> assert false ) cbargs; pr " if (Is_exception_result (rv)) {\n"; @@ -578,6 +593,7 @@ let pr "}\n"; pr "\n"; pr "static int\n"; + pr "__attribute__ ((unused)) /* XXX temporary hack */\n"; pr "%s_wrapper " cbname; C.print_cbarg_list ~wrap:true cbargs; pr "\n"; @@ -695,6 +711,7 @@ let pr " %s_callback.free = free_user_data;\n" cbname | Enum (n, { enum_prefix }) -> pr " int %s = %s_val (%sv);\n" n enum_prefix n + | Extent64 _ -> assert false (* only used in extent64_closure *) | Fd n -> pr " /* OCaml Unix.file_descr is just an int, at least on Unix. */\n"; pr " int %s = Int_val (%sv);\n" n n @@ -780,6 +797,7 @@ let | BytesPersistOut _ | Closure _ | Enum _ + | Extent64 _ | Fd _ | Flags _ | Int _ diff --git a/generator/Python.ml b/generator/Python.ml index c81878de..ccac5fed 100644 --- a/generator/Python.ml +++ b/generator/Python.ml @@ -158,6 +158,7 @@ let let print_python_closure_wrapper { cbname; cbargs } pr "/* Wrapper for %s callback. */\n" cbname; pr "static int\n"; + pr "__attribute__ ((unused)) /* XXX temporary hack */\n"; pr "%s_wrapper " cbname; C.print_cbarg_list ~wrap:true cbargs; pr "\n"; @@ -170,6 +171,7 @@ let List.iter ( function | CBArrayAndLen (UInt32 n, _) + | CBArrayAndLen (Extent64 n, _) | CBBytesIn (n, _) | CBMutable (Int n) -> pr " PyObject *py_%s = NULL;\n" n @@ -187,6 +189,16 @@ let pr " if (!py_e_%s) { PyErr_PrintEx (0); goto out; }\n" n; pr " PyList_SET_ITEM (py_%s, i_%s, py_e_%s);\n" n n n; pr " }\n" + | CBArrayAndLen (Extent64 n, len) -> + pr " py_%s = PyList_New (%s);\n" n len; + pr " size_t i_%s;\n" n; + pr " for (i_%s = 0; i_%s < %s; ++i_%s) {\n" n n len n; + pr " PyObject *py_e_%s = Py_BuildValue (\"OO\",\n" n; + pr " PyLong_FromUnsignedLong (%s[i_%s].length),\n" n n; + pr " PyLong_FromUnsignedLong (%s[i_%s].flags));\n" n n; + pr " if (!py_e_%s) { PyErr_PrintEx (0); goto out; }\n" n; + pr " PyList_SET_ITEM (py_%s, i_%s, py_e_%s);\n" n n n; + pr " }\n" | CBBytesIn (n, len) -> pr " py_%s = nbd_internal_py_get_subview (data->view, %s, %s);\n" n n len; pr " if (!py_%s) { PyErr_PrintEx (0); goto out; }\n" n @@ -206,6 +218,7 @@ let List.map ( function | CBArrayAndLen (UInt32 n, _) -> "O", sprintf "py_%s" n + | CBArrayAndLen (Extent64 n, _) -> "O", sprintf "py_%s" n | CBBytesIn (n, _) -> "O", sprintf "py_%s" n | CBInt n -> "i", n | CBInt64 n -> "L", n @@ -254,7 +267,8 @@ let pr " out:\n"; List.iter ( function - | CBArrayAndLen (UInt32 n, _) -> + | CBArrayAndLen (UInt32 n, _) + | CBArrayAndLen (Extent64 n, _) -> pr " Py_XDECREF (py_%s);\n" n | CBBytesIn (n, _) -> pr " Py_XDECREF (py_%s);\n" n @@ -307,6 +321,7 @@ let pr ".callback = %s_wrapper, .free = free_user_data" cbname); pr " };\n" | Enum (n, _) -> pr " int %s;\n" n + | Extent64 _ -> assert false (* only used in extent64_closure *) | Flags (n, _) -> pr " uint32_t %s_u32;\n" n; pr " unsigned int %s; /* really uint32_t */\n" n @@ -363,6 +378,7 @@ let "O", sprintf "&%s" n, sprintf "py_%s->buf, py_%s->len" n n | Closure { cbname } -> "O", sprintf "&py_%s_fn" cbname, cbname | Enum (n, _) -> "i", sprintf "&%s" n, n + | Extent64 _ -> assert false (* only used in extent64_closure *) | Flags (n, _) -> "I", sprintf "&%s" n, sprintf "%s_u32" n | Fd n | Int n -> "i", sprintf "&%s" n, n | Int64 n -> "L", sprintf "&%s" n, sprintf "%s_i64" n @@ -454,6 +470,7 @@ let pr " if (!chunk_user_data->view) goto out;\n" ) | Enum _ -> () + | Extent64 _ -> () | Flags (n, _) -> pr " %s_u32 = %s;\n" n n | Fd _ | Int _ -> () | Int64 n -> pr " %s_i64 = %s;\n" n n @@ -552,6 +569,7 @@ let | Closure { cbname } -> pr " free_user_data (%s_user_data);\n" cbname | Enum _ -> () + | Extent64 _ -> () | Flags _ -> () | Fd _ | Int _ -> () | Int64 _ -> () @@ -881,6 +899,7 @@ let | BytesPersistOut (n, _) -> n, None | Closure { cbname } -> cbname, None | Enum (n, _) -> n, None + | Extent64 _ -> assert false (* only used in extent64_closure *) | Flags (n, _) -> n, None | Fd n | Int n -> n, None | Int64 n -> n, None diff --git a/ocaml/helpers.c b/ocaml/helpers.c index 3361a696..09666daf 100644 --- a/ocaml/helpers.c +++ b/ocaml/helpers.c @@ -133,6 +133,26 @@ nbd_internal_ocaml_alloc_i64_from_u32_array (uint32_t *a, size_t len) CAMLreturn (rv); } +value +nbd_internal_ocaml_alloc_extent64_array (nbd_extent *a, size_t len) +{ + CAMLparam0 (); + CAMLlocal3 (s, v, rv); + size_t i; + + rv = caml_alloc (len, 0); + for (i = 0; i < len; ++i) { + s = caml_alloc (2, 0); + v = caml_copy_int64 (a[i].length); + Store_field (s, 0, v); + v = caml_copy_int64 (a[i].flags); + Store_field (s, 1, v); + Store_field (rv, i, s); + } + + CAMLreturn (rv); +} + /* Convert a Unix.sockaddr to a C struct sockaddr. */ void nbd_internal_unix_sockaddr_to_sa (value sockaddrv, diff --git a/ocaml/nbd-c.h b/ocaml/nbd-c.h index e3abb912..adcdd15a 100644 --- a/ocaml/nbd-c.h +++ b/ocaml/nbd-c.h @@ -62,6 +62,7 @@ extern void nbd_internal_ocaml_raise_closed (const char *func) Noreturn; extern const char **nbd_internal_ocaml_string_list (value); extern value nbd_internal_ocaml_alloc_i64_from_u32_array (uint32_t *, size_t); +extern value nbd_internal_ocaml_alloc_extent64_array (nbd_extent *, size_t); extern void nbd_internal_unix_sockaddr_to_sa (value, struct sockaddr_storage *, socklen_t *); extern void nbd_internal_ocaml_exception_in_wrapper (const char *, value); diff --git a/golang/handle.go b/golang/handle.go index 5fe4ed4f..0b126005 100644 --- a/golang/handle.go +++ b/golang/handle.go @@ -58,6 +58,12 @@ func (h *Libnbd) String() string { return "&Libnbd{}" } +/* Used for block status callback. */ +type LibnbdExtent struct { + Length uint64 // length of the extent + Flags uint64 // flags describing properties of the extent +} + /* All functions (except Close) return ([result,] LibnbdError). */ type LibnbdError struct { Op string // operation which failed -- 2.40.1
Eric Blake
2023-May-25 13:00 UTC
[Libguestfs] [libnbd PATCH v3 08/22] block_status: Track 64-bit extents internally
When extended headers are in use, the server is required to use a response that can include 64-bit extent lengths and flags (even if it chooses to keep the extent length under 4G, and even for metacontexts like "base:allocation" where the flags never need more than 32 bits). For maximum flexibility, we are better off storing 64-bit data internally, with a goal of letting the client's 32-bit interface work as much as possible, and for a future API addition of a 64-bit interface to work even with older servers that only give 32-bit results. For backwards compatibility, a client that never negotiates a 64-bit status context can be handled without errors by truncating any 64-bit lengths down to just under 4G; so the old 32-bit interface will continue to work in most cases. A server should not advertise a metacontext that requires flags larger than 32 bits unless the client negotiated extended headers; but if such a client still tries to use the 32-bit interface, the code now reports EOVERFLOW for a server response with a flags value that cannot be represented without using the future API for 64-bit extent queries. Note that the existing 32-bit nbd_block_status() API is now slightly slower, particularly when talking with a server that lacks extended headers: we are doing two size conversions (32 to 64 internally, 64 back to 32 to the user). But this speed penalty is likely in the noise compared to the network delays, and ideally clients will switch over to the new 64-bit interfaces as more and more servers start supporting extended headers. One of the trickier aspects of this patch is auditing that both the user's extent and our malloc'd shim get cleaned up once on all possible paths, so that there is neither a leak nor a double free. Signed-off-by: Eric Blake <eblake at redhat.com> --- lib/internal.h | 8 +++- generator/states-reply-structured.c | 30 ++++++++----- lib/handle.c | 2 +- lib/rw.c | 70 ++++++++++++++++++++++++++++- 4 files changed, 95 insertions(+), 15 deletions(-) diff --git a/lib/internal.h b/lib/internal.h index e4e21a4d..11186e24 100644 --- a/lib/internal.h +++ b/lib/internal.h @@ -80,7 +80,7 @@ struct export { struct command_cb { union { - nbd_extent_callback extent; + nbd_extent64_callback extent; nbd_chunk_callback chunk; nbd_list_callback list; nbd_context_callback context; @@ -303,7 +303,11 @@ struct nbd_handle { size_t querynum; /* When receiving block status, this is used. */ - uint32_t *bs_entries; + union { + char *storage; /* malloc's view */ + nbd_extent *normal; /* Our 64-bit preferred internal form; n slots */ + uint32_t *narrow; /* 32-bit NBD_REPLY_TYPE_BLOCK_STATUS form; n*2 slots */ + } bs_entries; /* Commands which are waiting to be issued [meaning the request * packet is sent to the server]. This is used as a simple linked diff --git a/generator/states-reply-structured.c b/generator/states-reply-structured.c index c53aed7b..784adb78 100644 --- a/generator/states-reply-structured.c +++ b/generator/states-reply-structured.c @@ -436,6 +436,7 @@ REPLY.STRUCTURED_REPLY.RECV_OFFSET_HOLE: REPLY.STRUCTURED_REPLY.RECV_BS_HEADER: struct command *cmd = h->reply_cmd; uint32_t length; + uint32_t count; switch (recv_into_rbuf (h)) { case -1: SET_NEXT_STATE (%.DEAD); return 0; @@ -450,15 +451,19 @@ REPLY.STRUCTURED_REPLY.RECV_BS_HEADER: assert (cmd->type == NBD_CMD_BLOCK_STATUS); assert (length >= 12); length -= sizeof h->sbuf.reply.payload.bs_hdr; + count = length / (2 * sizeof (uint32_t)); - free (h->bs_entries); - h->bs_entries = malloc (length); - if (h->bs_entries == NULL) { + /* Read raw data into a subset of h->bs_entries, then expand it + * into place later during byte-swapping. + */ + free (h->bs_entries.storage); + h->bs_entries.storage = malloc (count * sizeof *h->bs_entries.normal); + if (h->bs_entries.storage == NULL) { SET_NEXT_STATE (%.DEAD); set_error (errno, "malloc"); return 0; } - h->rbuf = h->bs_entries; + h->rbuf = h->bs_entries.narrow; h->rlen = length; SET_NEXT_STATE (%RECV_BS_ENTRIES); } @@ -470,6 +475,7 @@ REPLY.STRUCTURED_REPLY.RECV_BS_ENTRIES: uint32_t count; size_t i; uint32_t context_id; + uint32_t *raw; switch (recv_into_rbuf (h)) { case -1: SET_NEXT_STATE (%.DEAD); return 0; @@ -483,17 +489,21 @@ REPLY.STRUCTURED_REPLY.RECV_BS_ENTRIES: assert (cmd); /* guaranteed by CHECK */ assert (cmd->type == NBD_CMD_BLOCK_STATUS); assert (CALLBACK_IS_NOT_NULL (cmd->cb.fn.extent)); - assert (h->bs_entries); + assert (h->bs_entries.normal); assert (length >= 12); assert (h->meta_valid); count = (length - sizeof h->sbuf.reply.payload.bs_hdr) / - sizeof *h->bs_entries; + (2 * sizeof (uint32_t)); /* Need to byte-swap the entries returned, but apart from that we - * don't validate them. + * don't validate them. Reverse order is essential, since we are + * expanding in-place from narrow to wider type. */ - for (i = 0; i < count; ++i) - h->bs_entries[i] = be32toh (h->bs_entries[i]); + raw = h->bs_entries.narrow; + for (i = count; i-- > 0; ) { + h->bs_entries.normal[i].flags = be32toh (raw[i * 2 + 1]); + h->bs_entries.normal[i].length = be32toh (raw[i * 2]); + } /* Look up the context ID. */ context_id = be32toh (h->sbuf.reply.payload.bs_hdr.context_id); @@ -507,7 +517,7 @@ REPLY.STRUCTURED_REPLY.RECV_BS_ENTRIES: if (CALL_CALLBACK (cmd->cb.fn.extent, h->meta_contexts.ptr[i].name, cmd->offset, - h->bs_entries, count, + h->bs_entries.normal, count, &error) == -1) if (cmd->error == 0) cmd->error = error ? error : EPROTO; diff --git a/lib/handle.c b/lib/handle.c index 0f11bee5..fba6d1c4 100644 --- a/lib/handle.c +++ b/lib/handle.c @@ -130,7 +130,7 @@ nbd_close (struct nbd_handle *h) nbd_unlocked_clear_debug_callback (h); string_vector_empty (&h->querylist); - free (h->bs_entries); + free (h->bs_entries.storage); nbd_internal_reset_size_and_flags (h); for (i = 0; i < h->meta_contexts.len; ++i) free (h->meta_contexts.ptr[i].name); diff --git a/lib/rw.c b/lib/rw.c index 8b2bd4cc..58b05a4b 100644 --- a/lib/rw.c +++ b/lib/rw.c @@ -42,6 +42,61 @@ wait_for_command (struct nbd_handle *h, int64_t cookie) return r == -1 ? -1 : 0; } +/* Convert from 64-bit to 32-bit extent callback. */ +static int +nbd_convert_extent (void *data, const char *metacontext, uint64_t offset, + nbd_extent *entries, size_t nr_entries, int *error) +{ + nbd_extent_callback *cb = data; + uint32_t *array = malloc (nr_entries * 2 * sizeof *array); + size_t i; + int ret; + bool fail = false; + + if (array == NULL) { + set_error (*error = errno, "malloc"); + return -1; + } + + for (i = 0; i < nr_entries; i++) { + array[i * 2] = entries[i].length; + array[i * 2 + 1] = entries[i].flags; + /* If an extent is larger than 32 bits, silently truncate the rest + * of the server's response; the client can then make progress + * instead of needing to see failure. Rather than track the + * connection's alignment, just clamp the large extent to 4G-64M. + * However, if flags doesn't fit in 32 bits, it's better to inform + * the caller of an EOVERFLOW failure. + * + * Technically, a server with 64-bit answers is non-compliant if + * the client did not negotiate extended headers - contexts that + * include 64-bit flags should not have been negotiated in that + * case. + */ + if (entries[i].length > UINT32_MAX) { + array[i++ * 2] = -MAX_REQUEST_SIZE; + break; + } + if (entries[i].flags > UINT32_MAX) { + *error = EOVERFLOW; + fail = true; + break; + } + } + + ret = CALL_CALLBACK (*cb, metacontext, offset, array, i * 2, error); + free (array); + return fail ? -1 : ret; +} + +static void +nbd_convert_extent_free (void *data) +{ + nbd_extent_callback *cb = data; + FREE_CALLBACK (*cb); + free (cb); +} + /* Issue a read command and wait for the reply. */ int nbd_unlocked_pread (struct nbd_handle *h, void *buf, @@ -487,12 +542,23 @@ nbd_unlocked_aio_block_status (struct nbd_handle *h, nbd_completion_callback *completion, uint32_t flags) { - struct command_cb cb = { .fn.extent = *extent, + nbd_extent_callback *shim = malloc (sizeof *shim); + struct command_cb cb = { .fn.extent.callback = nbd_convert_extent, + .fn.extent.user_data = shim, + .fn.extent.free = nbd_convert_extent_free, .completion = *completion }; + if (shim == NULL) { + set_error (errno, "malloc"); + return -1; + } + *shim = *extent; + SET_CALLBACK_TO_NULL (*extent); + if (h->strict & LIBNBD_STRICT_COMMANDS) { if (!h->structured_replies) { set_error (ENOTSUP, "server does not support structured replies"); + FREE_CALLBACK (cb.fn.extent); return -1; } @@ -500,11 +566,11 @@ nbd_unlocked_aio_block_status (struct nbd_handle *h, set_error (ENOTSUP, "did not negotiate any metadata contexts, " "either you did not call nbd_add_meta_context before " "connecting or the server does not support it"); + FREE_CALLBACK (cb.fn.extent); return -1; } } - SET_CALLBACK_TO_NULL (*extent); SET_CALLBACK_TO_NULL (*completion); return nbd_internal_command_common (h, flags, NBD_CMD_BLOCK_STATUS, offset, count, EINVAL, NULL, &cb); -- 2.40.1
Eric Blake
2023-May-25 13:00 UTC
[Libguestfs] [libnbd PATCH v3 09/22] block_status: Accept 64-bit extents during block status
Support a server giving us a 64-bit extent. Note that the protocol says a server should not give a 64-bit answer when extended headers are not negotiated; we can handle that by reporting EPROTO but otherwise accepting the information. Conversely, when extended headers are in effect, the server must respond with the 64-bit extent type, even if the extent length and flags both fit in 32 bits. Since we already store 64-bit extents internally, the user's 32-bit callback doesn't have to care which reply chunk the server uses (the shim takes care of that, and an upcoming patch adds new APIs to let the client use a 64-bit callback). Of course, until a later patch enables extended headers negotiation, no compliant server will trigger the new code here. Implementation-wise, 'normal' and 'wide' are two different types but have the same underlying size; keeping the two names makes it easier to reason about when values are still in network byte order from the server or native endian for local processing. Signed-off-by: Eric Blake <eblake at redhat.com> --- lib/internal.h | 3 + generator/states-reply-structured.c | 94 +++++++++++++++++++++++------ 2 files changed, 78 insertions(+), 19 deletions(-) diff --git a/lib/internal.h b/lib/internal.h index 11186e24..e5423245 100644 --- a/lib/internal.h +++ b/lib/internal.h @@ -249,6 +249,7 @@ struct nbd_handle { struct nbd_structured_reply_offset_data offset_data; struct nbd_structured_reply_offset_hole offset_hole; struct nbd_structured_reply_block_status_hdr bs_hdr; + struct nbd_structured_reply_block_status_ext_hdr bs_ext_hdr; struct { struct nbd_structured_reply_error error; char msg[NBD_MAX_STRING]; /* Common to all error types */ @@ -307,6 +308,8 @@ struct nbd_handle { char *storage; /* malloc's view */ nbd_extent *normal; /* Our 64-bit preferred internal form; n slots */ uint32_t *narrow; /* 32-bit NBD_REPLY_TYPE_BLOCK_STATUS form; n*2 slots */ + struct nbd_block_descriptor_ext *wide; + /* 64-bit NBD_REPLY_TYPE_BLOCK_STATUS_EXT; n slots */ } bs_entries; /* Commands which are waiting to be issued [meaning the request diff --git a/generator/states-reply-structured.c b/generator/states-reply-structured.c index 784adb78..55d6bb2d 100644 --- a/generator/states-reply-structured.c +++ b/generator/states-reply-structured.c @@ -22,6 +22,8 @@ #include <stdint.h> #include <inttypes.h> +#include "minmax.h" + /* Structured reply must be completely inside the bounds of the * requesting command. */ @@ -144,9 +146,34 @@ REPLY.STRUCTURED_REPLY.CHECK: length < 12 || ((length-4) & 7) != 0) goto resync; assert (CALLBACK_IS_NOT_NULL (cmd->cb.fn.extent)); + if (h->extended_headers) { + debug (h, "unexpected 32-bit block status with extended headers, " + "this is probably a server bug"); + if (cmd->error == 0) + cmd->error = EPROTO; + } /* Start by reading the context ID. */ h->rbuf = &h->sbuf.reply.payload.bs_hdr; h->rlen = sizeof h->sbuf.reply.payload.bs_hdr; + h->sbuf.reply.payload.bs_ext_hdr.count = 0; + SET_NEXT_STATE (%RECV_BS_HEADER); + break; + + case NBD_REPLY_TYPE_BLOCK_STATUS_EXT: + if (cmd->type != NBD_CMD_BLOCK_STATUS || + length < 24 || + (length-8) % sizeof (struct nbd_block_descriptor_ext)) + goto resync; + assert (CALLBACK_IS_NOT_NULL (cmd->cb.fn.extent)); + if (!h->extended_headers) { + debug (h, "unexpected 64-bit block status without extended headers, " + "this is probably a server bug"); + if (cmd->error == 0) + cmd->error = EPROTO; + } + /* Start by reading the context ID. */ + h->rbuf = &h->sbuf.reply.payload.bs_ext_hdr; + h->rlen = sizeof h->sbuf.reply.payload.bs_ext_hdr; SET_NEXT_STATE (%RECV_BS_HEADER); break; @@ -437,6 +464,7 @@ REPLY.STRUCTURED_REPLY.RECV_BS_HEADER: struct command *cmd = h->reply_cmd; uint32_t length; uint32_t count; + uint16_t type; switch (recv_into_rbuf (h)) { case -1: SET_NEXT_STATE (%.DEAD); return 0; @@ -446,24 +474,43 @@ REPLY.STRUCTURED_REPLY.RECV_BS_HEADER: return 0; case 0: length = h->sbuf.reply.hdr.structured.length; /* normalized in CHECK */ + type = be16toh (h->sbuf.reply.hdr.structured.type); assert (cmd); /* guaranteed by CHECK */ assert (cmd->type == NBD_CMD_BLOCK_STATUS); - assert (length >= 12); - length -= sizeof h->sbuf.reply.payload.bs_hdr; - count = length / (2 * sizeof (uint32_t)); - /* Read raw data into a subset of h->bs_entries, then expand it + if (type == NBD_REPLY_TYPE_BLOCK_STATUS) { + assert (length >= 12); + length -= sizeof h->sbuf.reply.payload.bs_hdr; + count = length / (2 * sizeof (uint32_t)); + } + else { + assert (type == NBD_REPLY_TYPE_BLOCK_STATUS_EXT); + assert (length >= 24); + length -= sizeof h->sbuf.reply.payload.bs_ext_hdr; + count = length / sizeof (struct nbd_block_descriptor_ext); + if (count != be32toh (h->sbuf.reply.payload.bs_ext_hdr.count)) { + h->rbuf = NULL; + h->rlen = length; + SET_NEXT_STATE (%RESYNC); + return 0; + } + } + /* Normalize count for later use. */ + h->sbuf.reply.payload.bs_ext_hdr.count = count; + + /* Read raw data into an overlap of h->bs_entries, then move it * into place later during byte-swapping. */ free (h->bs_entries.storage); - h->bs_entries.storage = malloc (count * sizeof *h->bs_entries.normal); + h->bs_entries.storage = malloc (MAX (count * sizeof *h->bs_entries.normal, + length)); if (h->bs_entries.storage == NULL) { SET_NEXT_STATE (%.DEAD); set_error (errno, "malloc"); return 0; } - h->rbuf = h->bs_entries.narrow; + h->rbuf = h->bs_entries.storage; h->rlen = length; SET_NEXT_STATE (%RECV_BS_ENTRIES); } @@ -471,11 +518,10 @@ REPLY.STRUCTURED_REPLY.RECV_BS_HEADER: REPLY.STRUCTURED_REPLY.RECV_BS_ENTRIES: struct command *cmd = h->reply_cmd; - uint32_t length; uint32_t count; size_t i; uint32_t context_id; - uint32_t *raw; + uint16_t type; switch (recv_into_rbuf (h)) { case -1: SET_NEXT_STATE (%.DEAD); return 0; @@ -484,25 +530,35 @@ REPLY.STRUCTURED_REPLY.RECV_BS_ENTRIES: SET_NEXT_STATE (%.READY); return 0; case 0: - length = h->sbuf.reply.hdr.structured.length; /* normalized in CHECK */ + type = be16toh (h->sbuf.reply.hdr.structured.type); + count = h->sbuf.reply.payload.bs_ext_hdr.count; /* normalized in BS_HEADER */ assert (cmd); /* guaranteed by CHECK */ assert (cmd->type == NBD_CMD_BLOCK_STATUS); assert (CALLBACK_IS_NOT_NULL (cmd->cb.fn.extent)); - assert (h->bs_entries.normal); - assert (length >= 12); + assert (h->bs_entries.normal && count); assert (h->meta_valid); - count = (length - sizeof h->sbuf.reply.payload.bs_hdr) / - (2 * sizeof (uint32_t)); /* Need to byte-swap the entries returned, but apart from that we - * don't validate them. Reverse order is essential, since we are - * expanding in-place from narrow to wider type. + * don't validate them. */ - raw = h->bs_entries.narrow; - for (i = count; i-- > 0; ) { - h->bs_entries.normal[i].flags = be32toh (raw[i * 2 + 1]); - h->bs_entries.normal[i].length = be32toh (raw[i * 2]); + if (type == NBD_REPLY_TYPE_BLOCK_STATUS) { + uint32_t *raw = h->bs_entries.narrow; + + /* Expanding in-place from narrow to wide, must use reverse order. */ + for (i = count; i-- > 0; ) { + h->bs_entries.normal[i].flags = be32toh (raw[i * 2 + 1]); + h->bs_entries.normal[i].length = be32toh (raw[i * 2]); + } + } + else { + struct nbd_block_descriptor_ext *wide = h->bs_entries.wide; + + assert (type == NBD_REPLY_TYPE_BLOCK_STATUS_EXT); + for (i = 0; i < count; i++) { + h->bs_entries.normal[i].length = be64toh (wide[i].length); + h->bs_entries.normal[i].flags = be64toh (wide[i].status_flags); + } } /* Look up the context ID. */ -- 2.40.1
Eric Blake
2023-May-25 13:00 UTC
[Libguestfs] [libnbd PATCH v3 10/22] api: Add [aio_]nbd_block_status_64
Overcome the inherent 32-bit limitation of our existing nbd_block_status command by adding a 64-bit variant. The command sent to the server does not change, but the user's callback is now handed 64-bit information regardless of whether the server replies with 32- or 64-bit extents. Unit tests prove that the new API works in each of C, Python, OCaml, and Go bindings. We can also get rid of the temporary hack added to appease the compiler in an earlier patch. Signed-off-by: Eric Blake <eblake at redhat.com> --- docs/libnbd.pod | 18 +-- sh/nbdsh.pod | 2 +- generator/API.ml | 146 +++++++++++++++++++--- generator/OCaml.ml | 1 - generator/Python.ml | 1 - lib/rw.c | 48 +++++-- python/t/465-block-status-64.py | 56 +++++++++ ocaml/tests/Makefile.am | 1 + ocaml/tests/test_465_block_status_64.ml | 58 +++++++++ tests/meta-base-allocation.c | 104 +++++++++++++-- fuzzing/libnbd-fuzz-wrapper.c | 20 ++- golang/Makefile.am | 1 + golang/libnbd_465_block_status_64_test.go | 119 ++++++++++++++++++ 13 files changed, 527 insertions(+), 48 deletions(-) create mode 100644 python/t/465-block-status-64.py create mode 100644 ocaml/tests/test_465_block_status_64.ml create mode 100644 golang/libnbd_465_block_status_64_test.go diff --git a/docs/libnbd.pod b/docs/libnbd.pod index 72f74053..ddbb09f3 100644 --- a/docs/libnbd.pod +++ b/docs/libnbd.pod @@ -661,14 +661,14 @@ In order to utilize block status, the client must call L<nbd_add_meta_context(3)> prior to connecting, for each meta context in which it is interested, then check L<nbd_can_meta_context(3)> after connection to see which contexts the server actually supports. If a -context is supported, the client can then use L<nbd_block_status(3)> -with a callback function that will receive an array of 32-bit integer -pairs describing consecutive extents within a context. In each pair, -the first integer is the length of the extent, the second is a bitmask -description of that extent (for the "base:allocation" context, the -bitmask may include C<LIBNBD_STATE_HOLE> for unallocated portions of -the file, and/or C<LIBNBD_STATE_ZERO> for portions of the file known -to read as zero). +context is supported, the client can then use +L<nbd_block_status_64(3)> with a callback function that will receive +an array of structs describing consecutive extents within a context. +Each struct gives the length of the extent, then a bitmask description +of that extent (for the "base:allocation" context, the bitmask may +include C<LIBNBD_STATE_HOLE> for unallocated portions of the file, +and/or C<LIBNBD_STATE_ZERO> for portions of the file known to read as +zero). There is a full example of requesting meta context and using block status available at @@ -917,7 +917,7 @@ will tie up resources until L<nbd_close(3)> is eventually reached). =head2 Callbacks with C<int *error> parameter Some of the high-level commands (L<nbd_pread_structured(3)>, -L<nbd_block_status(3)>) involve the use of a callback function invoked +L<nbd_block_status_64(3)>) involve the use of a callback function invoked by the state machine at appropriate points in the server's reply before the overall command is complete. These callback functions, along with all of the completion callbacks, include a parameter diff --git a/sh/nbdsh.pod b/sh/nbdsh.pod index d50b738a..d222c0ff 100644 --- a/sh/nbdsh.pod +++ b/sh/nbdsh.pod @@ -59,7 +59,7 @@ Display brief command line help and exit. =item B<--base-allocation> Request the use of the "base:allocation" meta context, which is the -most common context used with L<nbd_block_status(3)>. This is +most common context used with L<nbd_block_status_64(3)>. This is equivalent to calling S<C<h.set_meta_context(nbd.CONTEXT_BASE_ALLOCATION)>> in the shell prior to connecting, and works even when combined with C<--uri> (while diff --git a/generator/API.ml b/generator/API.ml index 1da1ec53..f04f4b79 100644 --- a/generator/API.ml +++ b/generator/API.ml @@ -1565,7 +1565,7 @@ "add_meta_context", { During connection libnbd can negotiate zero or more metadata contexts with the server. Metadata contexts are features (such as C<\"base:allocation\">) which describe information returned -by the L<nbd_block_status(3)> command (for C<\"base:allocation\"> +by the L<nbd_block_status_64(3)> command (for C<\"base:allocation\"> this is whether blocks of data are allocated, zero or sparse). This call adds one metadata context to the list to be negotiated. @@ -1592,7 +1592,7 @@ "add_meta_context", { Other metadata contexts are server-specific, but include C<\"qemu:dirty-bitmap:...\"> and C<\"qemu:allocation-depth\"> for qemu-nbd (see qemu-nbd I<-B> and I<-A> options)."; - see_also = [Link "block_status"; Link "can_meta_context"; + see_also = [Link "block_status_64"; Link "can_meta_context"; Link "get_nr_meta_contexts"; Link "get_meta_context"; Link "clear_meta_contexts"]; }; @@ -1605,14 +1605,14 @@ "get_nr_meta_contexts", { During connection libnbd can negotiate zero or more metadata contexts with the server. Metadata contexts are features (such as C<\"base:allocation\">) which describe information returned -by the L<nbd_block_status(3)> command (for C<\"base:allocation\"> +by the L<nbd_block_status_64(3)> command (for C<\"base:allocation\"> this is whether blocks of data are allocated, zero or sparse). This command returns how many meta contexts have been added to the list to request from the server via L<nbd_add_meta_context(3)>. The server is not obligated to honor all of the requests; to see what it actually supports, see L<nbd_can_meta_context(3)>."; - see_also = [Link "block_status"; Link "can_meta_context"; + see_also = [Link "block_status_64"; Link "can_meta_context"; Link "add_meta_context"; Link "get_meta_context"; Link "clear_meta_contexts"]; }; @@ -1625,13 +1625,13 @@ "get_meta_context", { During connection libnbd can negotiate zero or more metadata contexts with the server. Metadata contexts are features (such as C<\"base:allocation\">) which describe information returned -by the L<nbd_block_status(3)> command (for C<\"base:allocation\"> +by the L<nbd_block_status_64(3)> command (for C<\"base:allocation\"> this is whether blocks of data are allocated, zero or sparse). This command returns the i'th meta context request, as added by L<nbd_add_meta_context(3)>, and bounded by L<nbd_get_nr_meta_contexts(3)>."; - see_also = [Link "block_status"; Link "can_meta_context"; + see_also = [Link "block_status_64"; Link "can_meta_context"; Link "add_meta_context"; Link "get_nr_meta_contexts"; Link "clear_meta_contexts"]; }; @@ -1645,7 +1645,7 @@ "clear_meta_contexts", { During connection libnbd can negotiate zero or more metadata contexts with the server. Metadata contexts are features (such as C<\"base:allocation\">) which describe information returned -by the L<nbd_block_status(3)> command (for C<\"base:allocation\"> +by the L<nbd_block_status_64(3)> command (for C<\"base:allocation\"> this is whether blocks of data are allocated, zero or sparse). This command resets the list of meta contexts to request back to @@ -1654,7 +1654,7 @@ "clear_meta_contexts", { negotiation mode is selected (see L<nbd_set_opt_mode(3)>), for altering the list of attempted contexts between subsequent export queries."; - see_also = [Link "block_status"; Link "can_meta_context"; + see_also = [Link "block_status_64"; Link "can_meta_context"; Link "add_meta_context"; Link "get_nr_meta_contexts"; Link "get_meta_context"; Link "set_opt_mode"]; }; @@ -2281,7 +2281,7 @@ "can_meta_context", { ^ non_blocking_test_call_description; see_also = [SectionLink "Flag calls"; Link "opt_info"; Link "add_meta_context"; - Link "block_status"; Link "aio_block_status"; + Link "block_status_64"; Link "aio_block_status_64"; Link "set_request_meta_context"; Link "opt_set_meta_context"]; }; @@ -2712,7 +2712,7 @@ "block_status", { optargs = [ OFlags ("flags", cmd_flags, Some ["REQ_ONE"]) ]; ret = RErr; permitted_states = [ Connected ]; - shortdesc = "send block status command to the NBD server"; + shortdesc = "send block status command, with 32-bit callback"; longdesc = "\ Issue the block status command to the NBD server. If supported by the server, this causes metadata context @@ -2727,7 +2727,15 @@ "block_status", { The NBD protocol does not yet have a way for a client to learn if the server will enforce an even smaller maximum block status size, although a future extension may add a constraint visible in -L<nbd_get_block_size(3)>. +L<nbd_get_block_size(3)>. Furthermore, this function is inherently +limited to 32-bit values. If the server replies with a larger +extent, the length of that extent will be truncated to just +below 32 bits and any further extents from the server will be +ignored. If the server replies with a status value larger than +32 bits (only possible when extended headers are in use), the +callback function will be passed an C<EOVERFLOW> error. To get +the full extent information from a server that supports 64-bit +extents, you must use L<nbd_block_status_64(3)>. Depending on which metadata contexts were enabled before connecting (see L<nbd_add_meta_context(3)>) and which are @@ -2747,7 +2755,7 @@ "block_status", { array is an array of pairs of integers with the first entry in each pair being the length (in bytes) of the block and the second entry being a status/flags field which is specific to the metadata context. -(The number of pairs passed to the function is C<nr_entries/2>.) The +The number of pairs passed to the function is C<nr_entries/2>. The NBD protocol document in the section about C<NBD_REPLY_TYPE_BLOCK_STATUS> describes the meaning of this array; for contexts known to libnbd, B<E<lt>libnbd.hE<gt>> contains constants @@ -2771,10 +2779,79 @@ "block_status", { does not exceed C<count> bytes; however, libnbd does not validate that the server obeyed the flag." ^ strict_call_description; - see_also = [Link "add_meta_context"; Link "can_meta_context"; + see_also = [Link "block_status_64"; + Link "add_meta_context"; Link "can_meta_context"; Link "aio_block_status"; Link "set_strict_mode"]; }; + "block_status_64", { + default_call with + args = [ UInt64 "count"; UInt64 "offset"; Closure extent64_closure ]; + optargs = [ OFlags ("flags", cmd_flags, Some ["REQ_ONE"]) ]; + ret = RErr; + permitted_states = [ Connected ]; + shortdesc = "send block status command, with 64-bit callback"; + longdesc = "\ +Issue the block status command to the NBD server. If +supported by the server, this causes metadata context +information about blocks beginning from the specified +offset to be returned. The C<count> parameter is a hint: the +server may choose to return less status, or the final block +may extend beyond the requested range. If multiple contexts +are supported, the number of blocks and cumulative length +of those blocks need not be identical between contexts. + +Note that not all servers can support a C<count> of 4GiB or larger. +The NBD protocol does not yet have a way for a client to learn if +the server will enforce an even smaller maximum block status size, +although a future extension may add a constraint visible in +L<nbd_get_block_size(3)>. + +Depending on which metadata contexts were enabled before +connecting (see L<nbd_add_meta_context(3)>) and which are +supported by the server (see L<nbd_can_meta_context(3)>) this call +returns information about extents by calling back to the +C<extent64> function. The callback cannot call C<nbd_*> APIs on the +same handle since it holds the handle lock and will +cause a deadlock. If the callback returns C<-1>, and no earlier +error has been detected, then the overall block status command +will fail with any non-zero value stored into the callback's +C<error> parameter (with a default of C<EPROTO>); but any further +contexts will still invoke the callback. + +The C<extent64> function is called once per type of metadata available, +with the C<user_data> passed to this function. The C<metacontext> +parameter is a string such as C<\"base:allocation\">. The C<entries> +array is an array of B<nbd_extent> structs, containing length (in bytes) +of the block and a status/flags field which is specific to the metadata +context. The number of array entries passed to the function is +C<nr_entries>. The NBD protocol document in the section about +C<NBD_REPLY_TYPE_BLOCK_STATUS> describes the meaning of this array; +for contexts known to libnbd, B<E<lt>libnbd.hE<gt>> contains constants +beginning with C<LIBNBD_STATE_> that may help decipher the values. +On entry to the callback, the C<error> parameter contains the errno +value of any previously detected error. + +It is possible for the extent function to be called +more times than you expect (if the server is buggy), +so always check the C<metacontext> field to ensure you +are receiving the data you expect. It is also possible +that the extent function is not called at all, even for +metadata contexts that you requested. This indicates +either that the server doesn't support the context +or for some other reason cannot return the data. + +The C<flags> parameter may be C<0> for no flags, or may contain +C<LIBNBD_CMD_FLAG_REQ_ONE> meaning that the server should +return only one extent per metadata context where that extent +does not exceed C<count> bytes; however, libnbd does not +validate that the server obeyed the flag." +^ strict_call_description; + see_also = [Link "block_status"; + Link "add_meta_context"; Link "can_meta_context"; + Link "aio_block_status_64"; Link "set_strict_mode"]; + }; + "poll", { default_call with args = [ Int "timeout" ]; ret = RInt; @@ -3369,7 +3446,7 @@ "aio_block_status", { OFlags ("flags", cmd_flags, Some ["REQ_ONE"]) ]; ret = RCookie; permitted_states = [ Connected ]; - shortdesc = "send block status command to the NBD server"; + shortdesc = "send block status command, with 32-bit callback"; longdesc = "\ Send the block status command to the NBD server. @@ -3377,13 +3454,48 @@ "aio_block_status", { Or supply the optional C<completion_callback> which will be invoked as described in L<libnbd(3)/Completion callbacks>. -Other parameters behave as documented in L<nbd_block_status(3)>." +Other parameters behave as documented in L<nbd_block_status(3)>. + +This function is inherently limited to 32-bit values. If the +server replies with a larger extent, the length of that extent +will be truncated to just below 32 bits and any further extents +from the server will be ignored. If the server replies with a +status value larger than 32 bits (only possible when extended +headers are in use), the callback function will be passed an +C<EOVERFLOW> error. To get the full extent information from a +server that supports 64-bit extents, you must use +L<nbd_aio_block_status_64(3)>. +" ^ strict_call_description; see_also = [SectionLink "Issuing asynchronous commands"; + Link "aio_block_status_64"; Link "can_meta_context"; Link "block_status"; Link "set_strict_mode"]; }; + "aio_block_status_64", { + default_call with + args = [ UInt64 "count"; UInt64 "offset"; Closure extent64_closure ]; + optargs = [ OClosure completion_closure; + OFlags ("flags", cmd_flags, Some ["REQ_ONE"]) ]; + ret = RCookie; + permitted_states = [ Connected ]; + shortdesc = "send block status command, with 64-bit callback"; + longdesc = "\ +Send the block status command to the NBD server. + +To check if the command completed, call L<nbd_aio_command_completed(3)>. +Or supply the optional C<completion_callback> which will be invoked +as described in L<libnbd(3)/Completion callbacks>. + +Other parameters behave as documented in L<nbd_block_status_64(3)>." +^ strict_call_description; + see_also = [SectionLink "Issuing asynchronous commands"; + Link "aio_block_status"; + Link "can_meta_context"; Link "block_status_64"; + Link "set_strict_mode"]; + }; + "aio_get_fd", { default_call with args = []; ret = RFd; @@ -3909,6 +4021,10 @@ let first_version "set_socket_activation_name", (1, 16); "get_socket_activation_name", (1, 16); + (* Added in 1.17.x development cycle, will be stable and supported in 1.18. *) + "block_status_64", (1, 18); + "aio_block_status_64", (1, 18); + (* These calls are proposed for a future version of libnbd, but * have not been added to any released version so far. "get_tls_certificates", (1, ??); diff --git a/generator/OCaml.ml b/generator/OCaml.ml index ea3b044a..b05cd1d2 100644 --- a/generator/OCaml.ml +++ b/generator/OCaml.ml @@ -593,7 +593,6 @@ let pr "}\n"; pr "\n"; pr "static int\n"; - pr "__attribute__ ((unused)) /* XXX temporary hack */\n"; pr "%s_wrapper " cbname; C.print_cbarg_list ~wrap:true cbargs; pr "\n"; diff --git a/generator/Python.ml b/generator/Python.ml index ccac5fed..6f87b0cf 100644 --- a/generator/Python.ml +++ b/generator/Python.ml @@ -158,7 +158,6 @@ let let print_python_closure_wrapper { cbname; cbargs } pr "/* Wrapper for %s callback. */\n" cbname; pr "static int\n"; - pr "__attribute__ ((unused)) /* XXX temporary hack */\n"; pr "%s_wrapper " cbname; C.print_cbarg_list ~wrap:true cbargs; pr "\n"; diff --git a/lib/rw.c b/lib/rw.c index 58b05a4b..bea55fa1 100644 --- a/lib/rw.c +++ b/lib/rw.c @@ -205,7 +205,7 @@ nbd_unlocked_zero (struct nbd_handle *h, return wait_for_command (h, cookie); } -/* Issue a block status command and wait for the reply. */ +/* Issue a block status command and wait for the reply, 32-bit callback. */ int nbd_unlocked_block_status (struct nbd_handle *h, uint64_t count, uint64_t offset, @@ -223,6 +223,25 @@ nbd_unlocked_block_status (struct nbd_handle *h, return wait_for_command (h, cookie); } +/* Issue a block status command and wait for the reply, 64-bit callback. */ +int +nbd_unlocked_block_status_64 (struct nbd_handle *h, + uint64_t count, uint64_t offset, + nbd_extent64_callback *extent64, + uint32_t flags) +{ + int64_t cookie; + nbd_completion_callback c = NBD_NULL_COMPLETION; + + cookie = nbd_unlocked_aio_block_status_64 (h, count, offset, extent64, &c, + flags); + if (cookie == -1) + return -1; + + assert (CALLBACK_IS_NULL (*extent64)); + return wait_for_command (h, cookie); +} + /* count_err represents the errno to return if bounds check fail */ int64_t nbd_internal_command_common (struct nbd_handle *h, @@ -543,10 +562,10 @@ nbd_unlocked_aio_block_status (struct nbd_handle *h, uint32_t flags) { nbd_extent_callback *shim = malloc (sizeof *shim); - struct command_cb cb = { .fn.extent.callback = nbd_convert_extent, - .fn.extent.user_data = shim, - .fn.extent.free = nbd_convert_extent_free, - .completion = *completion }; + nbd_extent64_callback wrapper = { .callback = nbd_convert_extent, + .user_data = shim, + .free = nbd_convert_extent_free, }; + int ret; if (shim == NULL) { set_error (errno, "malloc"); @@ -555,10 +574,25 @@ nbd_unlocked_aio_block_status (struct nbd_handle *h, *shim = *extent; SET_CALLBACK_TO_NULL (*extent); + ret = nbd_unlocked_aio_block_status_64 (h, count, offset, &wrapper, + completion, flags); + FREE_CALLBACK (wrapper); + return ret; +} + +int64_t +nbd_unlocked_aio_block_status_64 (struct nbd_handle *h, + uint64_t count, uint64_t offset, + nbd_extent64_callback *extent64, + nbd_completion_callback *completion, + uint32_t flags) +{ + struct command_cb cb = { .fn.extent = *extent64, + .completion = *completion }; + if (h->strict & LIBNBD_STRICT_COMMANDS) { if (!h->structured_replies) { set_error (ENOTSUP, "server does not support structured replies"); - FREE_CALLBACK (cb.fn.extent); return -1; } @@ -566,11 +600,11 @@ nbd_unlocked_aio_block_status (struct nbd_handle *h, set_error (ENOTSUP, "did not negotiate any metadata contexts, " "either you did not call nbd_add_meta_context before " "connecting or the server does not support it"); - FREE_CALLBACK (cb.fn.extent); return -1; } } + SET_CALLBACK_TO_NULL (*extent64); SET_CALLBACK_TO_NULL (*completion); return nbd_internal_command_common (h, flags, NBD_CMD_BLOCK_STATUS, offset, count, EINVAL, NULL, &cb); diff --git a/python/t/465-block-status-64.py b/python/t/465-block-status-64.py new file mode 100644 index 00000000..a8a8eaea --- /dev/null +++ b/python/t/465-block-status-64.py @@ -0,0 +1,56 @@ +# libnbd Python bindings +# Copyright Red Hat +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program; if not, write to the Free Software +# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + +import os + +import nbd + +script = "%s/../tests/meta-base-allocation.sh" % os.getenv("srcdir", ".") + +h = nbd.NBD() +h.add_meta_context("base:allocation") +h.connect_command(["nbdkit", "-s", "--exit-with-parent", "-v", "sh", script]) + +entries = [] + + +def f(user_data, metacontext, offset, e, err): + global entries + assert user_data == 42 + assert err.value == 0 + if metacontext != "base:allocation": + return + entries = e + + +h.block_status_64(65536, 0, lambda *args: f(42, *args)) +print("entries = %r" % entries) +assert entries == [(8192, 0), + (8192, 1), + (16384, 3), + (16384, 2), + (16384, 0)] + +h.block_status_64(1024, 32256, lambda *args: f(42, *args)) +print("entries = %r" % entries) +assert entries == [(512, 3), + (16384, 2)] + +h.block_status_64(1024, 32256, lambda *args: f(42, *args), + nbd.CMD_FLAG_REQ_ONE) +print("entries = %r" % entries) +assert entries == [(512, 3)] diff --git a/ocaml/tests/Makefile.am b/ocaml/tests/Makefile.am index 20546350..4ddd0f9f 100644 --- a/ocaml/tests/Makefile.am +++ b/ocaml/tests/Makefile.am @@ -40,6 +40,7 @@ ML_TESTS = \ test_405_pread_structured.ml \ test_410_pwrite.ml \ test_460_block_status.ml \ + test_465_block_status_64.ml \ test_500_aio_pread.ml \ test_505_aio_pread_structured_callback.ml \ test_510_aio_pwrite.ml \ diff --git a/ocaml/tests/test_465_block_status_64.ml b/ocaml/tests/test_465_block_status_64.ml new file mode 100644 index 00000000..0634a025 --- /dev/null +++ b/ocaml/tests/test_465_block_status_64.ml @@ -0,0 +1,58 @@ +(* hey emacs, this is OCaml code: -*- tuareg -*- *) +(* libnbd OCaml test case + * Copyright Red Hat + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + *) + +open Printf + +let script + try + let srcdir = Sys.getenv "srcdir" in + sprintf "%s/../../tests/meta-base-allocation.sh" srcdir + with + Not_found -> failwith "error: srcdir is not defined" + +let entries = ref [||] +let f user_data metacontext offset e err + assert (user_data = 42); + assert (!err = 0); + if metacontext = "base:allocation" then + entries := e; + 0 + +let () + let nbd = NBD.create () in + NBD.add_meta_context nbd "base:allocation"; + NBD.connect_command nbd ["nbdkit"; "-s"; "--exit-with-parent"; "-v"; + "sh"; script]; + + NBD.block_status_64 nbd 65536_L 0_L (f 42); + assert (!entries = [| 8192_L, 0_L; + 8192_L, 1_L; + 16384_L, 3_L; + 16384_L, 2_L; + 16384_L, 0_L |]); + + NBD.block_status_64 nbd 1024_L 32256_L (f 42); + assert (!entries = [| 512_L, 3_L; + 16384_L, 2_L |]); + + let flags = let open NBD.CMD_FLAG in [REQ_ONE] in + NBD.block_status_64 nbd 1024_L 32256_L (f 42) ~flags; + assert (!entries = [| 512_L, 3_L |]) + +let () = Gc.compact () diff --git a/tests/meta-base-allocation.c b/tests/meta-base-allocation.c index a7b3af09..7697b5da 100644 --- a/tests/meta-base-allocation.c +++ b/tests/meta-base-allocation.c @@ -32,10 +32,13 @@ #define BOGUS_CONTEXT "x-libnbd:nosuch" -static int check_extent (void *data, - const char *metacontext, - uint64_t offset, - uint32_t *entries, size_t nr_entries, int *error); +static int check_extent32 (void *data, const char *metacontext, + uint64_t offset, + uint32_t *entries, size_t nr_entries, int *error); + +static int check_extent64 (void *data, const char *metacontext, + uint64_t offset, + nbd_extent *entries, size_t nr_entries, int *error); int main (int argc, char *argv[]) @@ -43,8 +46,10 @@ main (int argc, char *argv[]) struct nbd_handle *nbd; char plugin_path[256]; int id; - nbd_extent_callback extent_callback = { .callback = check_extent, - .user_data = &id }; + nbd_extent_callback extent32_callback = { .callback = check_extent32, + .user_data = &id }; + nbd_extent64_callback extent64_callback = { .callback = check_extent64, + .user_data = &id }; int r; const char *s; char *tmp; @@ -150,23 +155,36 @@ main (int argc, char *argv[]) /* Read the block status. */ id = 1; - if (nbd_block_status (nbd, 65536, 0, extent_callback, 0) == -1) { + if (nbd_block_status (nbd, 65536, 0, extent32_callback, 0) == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + if (nbd_block_status_64 (nbd, 65536, 0, extent64_callback, 0) == -1) { fprintf (stderr, "%s\n", nbd_get_error ()); exit (EXIT_FAILURE); } id = 2; - if (nbd_block_status (nbd, 1024, 32768-512, extent_callback, 0) == -1) { + if (nbd_block_status (nbd, 1024, 32768-512, extent32_callback, 0) == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + if (nbd_block_status_64 (nbd, 1024, 32768-512, extent64_callback, 0) == -1) { fprintf (stderr, "%s\n", nbd_get_error ()); exit (EXIT_FAILURE); } id = 3; - if (nbd_block_status (nbd, 1024, 32768-512, extent_callback, + if (nbd_block_status (nbd, 1024, 32768-512, extent32_callback, LIBNBD_CMD_FLAG_REQ_ONE) == -1) { fprintf (stderr, "%s\n", nbd_get_error ()); exit (EXIT_FAILURE); } + if (nbd_block_status_64 (nbd, 1024, 32768-512, extent64_callback, + LIBNBD_CMD_FLAG_REQ_ONE) == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } if (nbd_shutdown (nbd, 0) == -1) { fprintf (stderr, "%s\n", nbd_get_error ()); @@ -178,10 +196,8 @@ main (int argc, char *argv[]) } static int -check_extent (void *data, - const char *metacontext, - uint64_t offset, - uint32_t *entries, size_t nr_entries, int *error) +check_extent32 (void *data, const char *metacontext, uint64_t offset, + uint32_t *entries, size_t nr_entries, int *error) { size_t i; int id; @@ -235,3 +251,65 @@ check_extent (void *data, return 0; } + +static int +check_extent64 (void *data, const char *metacontext, uint64_t offset, + nbd_extent *entries, size_t nr_entries, int *error) +{ + size_t i; + int id; + + id = * (int *)data; + + printf ("extent: id=%d, metacontext=%s, offset=%" PRIu64 ", " + "nr_entries=%zu, error=%d\n", + id, metacontext, offset, nr_entries, *error); + + assert (*error == 0); + if (strcmp (metacontext, LIBNBD_CONTEXT_BASE_ALLOCATION) == 0) { + for (i = 0; i < nr_entries; i++) { + printf ("\t%zu\tlength=%" PRIu64 ", status=%" PRIu64 "\n", + i, entries[i].length, entries[i].flags); + } + fflush (stdout); + + switch (id) { + case 1: + assert (nr_entries == 5); + assert (entries[0].length == 8192); + assert (entries[0].flags == 0); + assert (entries[1].length == 8192); + assert (entries[1].flags == LIBNBD_STATE_HOLE); + assert (entries[2].length == 16384); + assert (entries[2].flags == (LIBNBD_STATE_HOLE|LIBNBD_STATE_ZERO)); + assert (entries[3].length == 16384); + assert (entries[3].flags == LIBNBD_STATE_ZERO); + assert (entries[4].length == 16384); + assert (entries[4].flags == 0); + break; + + case 2: + assert (nr_entries == 2); + assert (entries[0].length == 512); + assert (entries[0].flags == (LIBNBD_STATE_HOLE|LIBNBD_STATE_ZERO)); + assert (entries[1].length == 16384); + assert (entries[1].flags == LIBNBD_STATE_ZERO); + break; + + case 3: + assert (nr_entries == 1); + assert (entries[0].length == 512); + assert (entries[0].flags == (LIBNBD_STATE_HOLE|LIBNBD_STATE_ZERO)); + break; + + default: + abort (); + } + + } + else + fprintf (stderr, "warning: ignored unexpected meta context %s\n", + metacontext); + + return 0; +} diff --git a/fuzzing/libnbd-fuzz-wrapper.c b/fuzzing/libnbd-fuzz-wrapper.c index e111866f..cbd55380 100644 --- a/fuzzing/libnbd-fuzz-wrapper.c +++ b/fuzzing/libnbd-fuzz-wrapper.c @@ -165,6 +165,17 @@ extent_callback (void *user_data, return 0; } +/* Block status (extents64) callback, does nothing. */ +static int +extent64_callback (void *user_data, + const char *metacontext, + uint64_t offset, nbd_extent *entries, + size_t nr_entries, int *error) +{ + //fprintf (stderr, "extent64 called, nr_entries = %zu\n", nr_entries); + return 0; +} + /* This is the client (parent process) running libnbd. */ static char buf[512]; static char prbuf[65536]; @@ -213,7 +224,7 @@ client (int sock) }, 0); - /* Test block status. */ + /* Test both sizes of block status. */ nbd_block_status (nbd, length, 0, (nbd_extent_callback) { .callback = extent_callback, @@ -221,6 +232,13 @@ client (int sock) .free = NULL }, 0); + nbd_block_status_64 (nbd, length, 0, + (nbd_extent64_callback) { + .callback = extent64_callback, + .user_data = NULL, + .free = NULL + }, + 0); nbd_shutdown (nbd, 0); } diff --git a/golang/Makefile.am b/golang/Makefile.am index fac65248..b4151250 100644 --- a/golang/Makefile.am +++ b/golang/Makefile.am @@ -43,6 +43,7 @@ source_files = \ libnbd_405_pread_structured_test.go \ libnbd_410_pwrite_test.go \ libnbd_460_block_status_test.go \ + libnbd_465_block_status_64_test.go \ libnbd_500_aio_pread_test.go \ libnbd_510_aio_pwrite_test.go \ libnbd_590_aio_copy_test.go \ diff --git a/golang/libnbd_465_block_status_64_test.go b/golang/libnbd_465_block_status_64_test.go new file mode 100644 index 00000000..7659a21d --- /dev/null +++ b/golang/libnbd_465_block_status_64_test.go @@ -0,0 +1,119 @@ +/* libnbd golang tests + * Copyright Red Hat + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +package libnbd + +import ( + "fmt" + "os" + "strings" + "testing" +) + +var entries64 []LibnbdExtent + +func mcf64(metacontext string, offset uint64, e []LibnbdExtent, error *int) int { + if *error != 0 { + panic("expected *error == 0") + } + if metacontext == "base:allocation" { + entries64 = e + } + return 0 +} + +// Seriously WTF? +func mc64_compare(a1 []LibnbdExtent, a2 []LibnbdExtent) bool { + if len(a1) != len(a2) { + return false + } + for i := 0; i < len(a1); i++ { + if a1[i] != a2[i] { + return false + } + } + return true +} + +func mc64_to_string(a []LibnbdExtent) string { + ss := make([]string, len(a)) + for i := 0; i < len(a); i++ { + ss[i] = fmt.Sprintf("%#v", a[i]) + } + return strings.Join(ss, ", ") +} + +func Test465BlockStatus64(t *testing.T) { + srcdir := os.Getenv("abs_top_srcdir") + script := srcdir + "/tests/meta-base-allocation.sh" + + h, err := Create() + if err != nil { + t.Fatalf("could not create handle: %s", err) + } + defer h.Close() + + err = h.AddMetaContext("base:allocation") + if err != nil { + t.Fatalf("%s", err) + } + err = h.ConnectCommand([]string{ + "nbdkit", "-s", "--exit-with-parent", "-v", + "sh", script, + }) + if err != nil { + t.Fatalf("%s", err) + } + + err = h.BlockStatus64(65536, 0, mcf64, nil) + if err != nil { + t.Fatalf("%s", err) + } + if !mc64_compare(entries64, []LibnbdExtent{ + {8192, 0}, + {8192, 1}, + {16384, 3}, + {16384, 2}, + {16384, 0}, + }) { + t.Fatalf("unexpected entries (1): %s", mc64_to_string(entries64)) + } + + err = h.BlockStatus64(1024, 32256, mcf64, nil) + if err != nil { + t.Fatalf("%s", err) + } + if !mc64_compare(entries64, []LibnbdExtent{ + {512, 3}, + {16384, 2}, + }) { + t.Fatalf("unexpected entries (2): %s", mc64_to_string(entries64)) + } + + var optargs BlockStatus64Optargs + optargs.FlagsSet = true + optargs.Flags = CMD_FLAG_REQ_ONE + err = h.BlockStatus64(1024, 32256, mcf64, &optargs) + if err != nil { + t.Fatalf("%s", err) + } + if !mc64_compare(entries64, []LibnbdExtent{{512, 3}}) { + t.Fatalf("unexpected entries (3): %s", mc64_to_string(entries64)) + } + +} -- 2.40.1
Eric Blake
2023-May-25 13:00 UTC
[Libguestfs] [libnbd PATCH v3 11/22] api: Add several functions for controlling extended headers
The new NBD_OPT_EXTENDED_HEADERS feature is worth using by default, but there may be cases where the user explicitly wants to stick with the older 32-bit headers. nbd_set_request_extended_headers() will let the client override the default, nbd_get_request_extended_headers() determines the current state of the request, and nbd_get_extended_headers_negotiated() determines what the client and server actually settled on. These use nbd_set_request_structured_replies() and friends as a template. Note that this patch just adds the API for controlling the defaults, but ignores the state variable; a later one will then tweak the state machine to actually request structured headers when the state variable is set, as well as add nbd_opt_extended_headers() for manual control. The intent is that because extended headers takes priority over structured replies, the functions will interact as follows during the automatic handshaking that takes place prior to nbd_opt_set_mode() relinquishing control to the client in negotiation mode: 1. client default: request_eh=1, request_sr=1 => try EXTENDED_HEADERS - server supports it: nothing further to do, use extended headers, but also report structured replies as active - server lacks it: behave like case 2 2. request_eh=0 (explicit client downgrade), request_sr=1 (left at default) => try STRUCTURED_REPLY - server supports it: expect structured replies - server lacks it: expect simple replies 3. request_sr=0 (explicit client downgrade), request_eh ignored => don't try either handshake - expect simple replies Client code that wants to manually force simple replies only has to disable structured replies; and only code that wants to disable extended headers but still use structured replies has to use the new nbd_set_request_extended_headers() API. Until a later patch adds an explicit API nbd_opt_extended_headers(), there is no way to request extended headers after structured replies are already negotiated. Signed-off-by: Eric Blake <eblake at redhat.com> --- lib/internal.h | 1 + generator/API.ml | 115 ++++++++++++++++-- generator/states-newstyle-opt-starttls.c | 1 + .../states-newstyle-opt-structured-reply.c | 3 +- lib/handle.c | 23 ++++ python/t/110-defaults.py | 1 + python/t/120-set-non-defaults.py | 2 + ocaml/tests/test_110_defaults.ml | 2 + ocaml/tests/test_120_set_non_defaults.ml | 3 + golang/libnbd_110_defaults_test.go | 8 ++ golang/libnbd_120_set_non_defaults_test.go | 12 ++ 11 files changed, 160 insertions(+), 11 deletions(-) diff --git a/lib/internal.h b/lib/internal.h index e5423245..2948b77b 100644 --- a/lib/internal.h +++ b/lib/internal.h @@ -111,6 +111,7 @@ struct nbd_handle { char *tls_psk_file; /* PSK filename, NULL = no PSK */ /* Extended headers. */ + bool request_eh; /* Whether to request extended headers */ bool extended_headers; /* If we negotiated NBD_OPT_EXTENDED_HEADERS */ /* Desired metadata contexts. */ diff --git a/generator/API.ml b/generator/API.ml index f04f4b79..7616990a 100644 --- a/generator/API.ml +++ b/generator/API.ml @@ -808,6 +808,77 @@ "get_tls_psk_file", { }; *) + "set_request_extended_headers", { + default_call with + args = [Bool "request"]; ret = RErr; + permitted_states = [ Created ]; + shortdesc = "control use of extended headers"; + longdesc = "\ +By default, libnbd tries to negotiate extended headers with the +server, as this protocol extension permits the use of 64-bit +zero, trim, and block status actions. However, +for integration testing, it can be useful to clear this flag +rather than find a way to alter the server to fail the negotiation +request. + +For backwards compatibility, the setting of this knob is ignored +if L<nbd_set_request_structured_replies(3)> is also set to false, +since the use of extended headers implies structured replies."; + see_also = [Link "get_request_extended_headers"; + Link "set_handshake_flags"; Link "set_strict_mode"; + Link "get_extended_headers_negotiated"; + Link "zero"; Link "trim"; Link "cache"; + Link "block_status_64"; + Link "set_request_structured_replies"]; + }; + + "get_request_extended_headers", { + default_call with + args = []; ret = RBool; + may_set_error = false; + shortdesc = "see if extended headers are attempted"; + longdesc = "\ +Return the state of the request extended headers flag on this +handle. + +B<Note:> If you want to find out if extended headers were actually +negotiated on a particular connection use +L<nbd_get_extended_headers_negotiated(3)> instead."; + see_also = [Link "set_request_extended_headers"; + Link "get_extended_headers_negotiated"; + Link "get_request_extended_headers"]; + }; + + "get_extended_headers_negotiated", { + default_call with + args = []; ret = RBool; + permitted_states = [ Negotiating; Connected; Closed ]; + shortdesc = "see if extended headers are in use"; + longdesc = "\ +After connecting you may call this to find out if the connection is +using extended headers. + +When extended headers are not in use, commands are limited to a +32-bit length, even when the libnbd API uses a 64-bit parameter +to express the length. But even when extended headers are +supported, the server may enforce other limits, visible through +L<nbd_get_block_size(3)>. + +Note that when extended headers are negotiated, you should +prefer the use of L<nbd_block_status_64(3)> instead of +L<nbd_block_status(3)> if any of the meta contexts you requested +via L<nbd_add_meta_context(3)> might return 64-bit status +values; however, all of the well-known meta contexts covered +by current C<LIBNBD_CONTEXT_*> constants only return 32-bit +status."; + see_also = [Link "set_request_extended_headers"; + Link "get_request_extended_headers"; + Link "zero"; Link "trim"; Link "cache"; + Link "block_status_64"; Link "get_block_size"; + Link "get_protocol"; + Link "get_structured_replies_negotiated"]; + }; + "set_request_structured_replies", { default_call with args = [Bool "request"]; ret = RErr; @@ -821,12 +892,16 @@ "set_request_structured_replies", { rather than find a way to alter the server to fail the negotiation request. It is also useful to set this to false prior to using L<nbd_set_opt_mode(3)> if it is desired to control when to send -L<nbd_opt_structured_reply(3)> during negotiation."; +L<nbd_opt_structured_reply(3)> during negotiation. + +Note that setting this knob to false also disables any automatic +request for extended headers."; see_also = [Link "get_request_structured_replies"; Link "set_handshake_flags"; Link "set_strict_mode"; Link "opt_structured_reply"; Link "get_structured_replies_negotiated"; - Link "can_meta_context"; Link "can_df"]; + Link "can_meta_context"; Link "can_df"; + Link "set_request_extended_headers"]; }; "get_request_structured_replies", { @@ -842,7 +917,8 @@ "get_request_structured_replies", { negotiated on a particular connection use L<nbd_get_structured_replies_negotiated(3)> instead."; see_also = [Link "set_request_structured_replies"; - Link "get_structured_replies_negotiated"]; + Link "get_structured_replies_negotiated"; + Link "get_request_extended_headers"]; }; "get_structured_replies_negotiated", { @@ -854,11 +930,17 @@ "get_structured_replies_negotiated", { After connecting you may call this to find out if the connection is using structured replies. Note that this setting is sticky; this can return true even after a second L<nbd_opt_structured_reply(3)> -returns false because the server detected a duplicate request."; +returns false because the server detected a duplicate request. + +Note that if the connection negotiates extended headers, this +function returns true (as extended headers imply structured +replies) even if no explicit request for structured replies was +attempted."; see_also = [Link "set_request_structured_replies"; Link "get_request_structured_replies"; Link "opt_structured_reply"; - Link "get_protocol"]; + Link "get_protocol"; + Link "get_extended_headers_negotiated"]; }; "set_request_meta_context", { @@ -2623,7 +2705,9 @@ "trim", { or there is an error. Note this will generally return an error if L<nbd_can_trim(3)> is false or L<nbd_is_read_only(3)> is true. -Note that not all servers can support a C<count> of 4GiB or larger. +Note that not all servers can support a C<count> of 4GiB or larger; +L<nbd_get_extended_headers_negotiated(3)> indicates which servers +will parse a request larger than 32 bits. The NBD protocol does not yet have a way for a client to learn if the server will enforce an even smaller maximum trim size, although a future extension may add a constraint visible in @@ -2654,7 +2738,9 @@ "cache", { this command. Note this will generally return an error if L<nbd_can_cache(3)> is false. -Note that not all servers can support a C<count> of 4GiB or larger. +Note that not all servers can support a C<count> of 4GiB or larger; +L<nbd_get_extended_headers_negotiated(3)> indicates which servers +will parse a request larger than 32 bits. The NBD protocol does not yet have a way for a client to learn if the server will enforce an even smaller maximum cache size, although a future extension may add a constraint visible in @@ -2683,7 +2769,9 @@ "zero", { or there is an error. Note this will generally return an error if L<nbd_can_zero(3)> is false or L<nbd_is_read_only(3)> is true. -Note that not all servers can support a C<count> of 4GiB or larger. +Note that not all servers can support a C<count> of 4GiB or larger; +L<nbd_get_extended_headers_negotiated(3)> indicates which servers +will parse a request larger than 32 bits. The NBD protocol does not yet have a way for a client to learn if the server will enforce an even smaller maximum zero size, although a future extension may add a constraint visible in @@ -2723,7 +2811,9 @@ "block_status", { are supported, the number of blocks and cumulative length of those blocks need not be identical between contexts. -Note that not all servers can support a C<count> of 4GiB or larger. +Note that not all servers can support a C<count> of 4GiB or larger; +L<nbd_get_extended_headers_negotiated(3)> indicates which servers +will parse a request larger than 32 bits. The NBD protocol does not yet have a way for a client to learn if the server will enforce an even smaller maximum block status size, although a future extension may add a constraint visible in @@ -2801,7 +2891,9 @@ "block_status_64", { are supported, the number of blocks and cumulative length of those blocks need not be identical between contexts. -Note that not all servers can support a C<count> of 4GiB or larger. +Note that not all servers can support a C<count> of 4GiB or larger; +L<nbd_get_extended_headers_negotiated(3)> indicates which servers +will parse a request larger than 32 bits. The NBD protocol does not yet have a way for a client to learn if the server will enforce an even smaller maximum block status size, although a future extension may add a constraint visible in @@ -4024,6 +4116,9 @@ let first_version (* Added in 1.17.x development cycle, will be stable and supported in 1.18. *) "block_status_64", (1, 18); "aio_block_status_64", (1, 18); + "set_request_extended_headers", (1, 18); + "get_request_extended_headers", (1, 18); + "get_extended_headers_negotiated", (1, 18); (* These calls are proposed for a future version of libnbd, but * have not been added to any released version so far. diff --git a/generator/states-newstyle-opt-starttls.c b/generator/states-newstyle-opt-starttls.c index 700604dd..e497548c 100644 --- a/generator/states-newstyle-opt-starttls.c +++ b/generator/states-newstyle-opt-starttls.c @@ -86,6 +86,7 @@ NEWSTYLE.OPT_STARTTLS.CHECK_REPLY: } nbd_internal_reset_size_and_flags (h); h->structured_replies = false; + h->extended_headers = false; h->meta_valid = false; new_sock = nbd_internal_crypto_create_session (h, h->sock); if (new_sock == NULL) { diff --git a/generator/states-newstyle-opt-structured-reply.c b/generator/states-newstyle-opt-structured-reply.c index 18245054..bb94b70a 100644 --- a/generator/states-newstyle-opt-structured-reply.c +++ b/generator/states-newstyle-opt-structured-reply.c @@ -25,7 +25,7 @@ NEWSTYLE.OPT_STRUCTURED_REPLY.START: assert (h->opt_mode); else { assert (CALLBACK_IS_NULL (h->opt_cb.completion)); - if (!h->request_sr) { + if (!h->request_sr || h->structured_replies) { if (h->opt_mode) SET_NEXT_STATE (%.NEGOTIATING); else @@ -84,6 +84,7 @@ NEWSTYLE.OPT_STRUCTURED_REPLY.CHECK_REPLY: err = 0; break; case NBD_REP_ERR_INVALID: + case NBD_REP_ERR_EXT_HEADER_REQD: err = EINVAL; /* fallthrough */ default: diff --git a/lib/handle.c b/lib/handle.c index fba6d1c4..71034835 100644 --- a/lib/handle.c +++ b/lib/handle.c @@ -64,6 +64,7 @@ nbd_create (void) h->unique = 1; h->tls_verify_peer = true; + h->request_eh = true; h->request_sr = true; h->request_meta = true; h->request_block_size = true; @@ -436,6 +437,28 @@ nbd_unlocked_clear_meta_contexts (struct nbd_handle *h) return 0; } + +int +nbd_unlocked_set_request_extended_headers (struct nbd_handle *h, + bool request) +{ + h->request_eh = request; + return 0; +} + +/* NB: may_set_error = false. */ +int +nbd_unlocked_get_request_extended_headers (struct nbd_handle *h) +{ + return h->request_eh; +} + +int +nbd_unlocked_get_extended_headers_negotiated (struct nbd_handle *h) +{ + return h->extended_headers; +} + int nbd_unlocked_set_request_structured_replies (struct nbd_handle *h, bool request) diff --git a/python/t/110-defaults.py b/python/t/110-defaults.py index ed60b786..2e59bf39 100644 --- a/python/t/110-defaults.py +++ b/python/t/110-defaults.py @@ -21,6 +21,7 @@ assert h.get_export_name() == "" assert h.get_full_info() is False assert h.get_tls() == nbd.TLS_DISABLE +assert h.get_request_extended_headers() is True assert h.get_request_structured_replies() is True assert h.get_request_meta_context() is True assert h.get_request_block_size() is True diff --git a/python/t/120-set-non-defaults.py b/python/t/120-set-non-defaults.py index a2fafe0c..3320f8cf 100644 --- a/python/t/120-set-non-defaults.py +++ b/python/t/120-set-non-defaults.py @@ -31,6 +31,8 @@ if h.supports_tls(): h.set_tls(nbd.TLS_ALLOW) assert h.get_tls() == nbd.TLS_ALLOW +h.set_request_extended_headers(False) +assert h.get_request_extended_headers() is False h.set_request_structured_replies(False) assert h.get_request_structured_replies() is False h.set_request_meta_context(False) diff --git a/ocaml/tests/test_110_defaults.ml b/ocaml/tests/test_110_defaults.ml index b95cd1f0..d7b0463f 100644 --- a/ocaml/tests/test_110_defaults.ml +++ b/ocaml/tests/test_110_defaults.ml @@ -26,6 +26,8 @@ let assert (not info); let tls = NBD.get_tls nbd in assert (tls = NBD.TLS.DISABLE); + let eh = NBD.get_request_extended_headers nbd in + assert (eh = true); let sr = NBD.get_request_structured_replies nbd in assert sr; let meta = NBD.get_request_meta_context nbd in diff --git a/ocaml/tests/test_120_set_non_defaults.ml b/ocaml/tests/test_120_set_non_defaults.ml index 478a88f3..4f5c2eb5 100644 --- a/ocaml/tests/test_120_set_non_defaults.ml +++ b/ocaml/tests/test_120_set_non_defaults.ml @@ -39,6 +39,9 @@ let let tls = NBD.get_tls nbd in assert (tls = NBD.TLS.ALLOW); ); + NBD.set_request_extended_headers nbd false; + let eh = NBD.get_request_extended_headers nbd in + assert (eh = false); NBD.set_request_structured_replies nbd false; let sr = NBD.get_request_structured_replies nbd in assert (not sr); diff --git a/golang/libnbd_110_defaults_test.go b/golang/libnbd_110_defaults_test.go index 119cb7ca..510baa39 100644 --- a/golang/libnbd_110_defaults_test.go +++ b/golang/libnbd_110_defaults_test.go @@ -51,6 +51,14 @@ func Test110Defaults(t *testing.T) { t.Fatalf("unexpected tls state") } + eh, err := h.GetRequestExtendedHeaders() + if err != nil { + t.Fatalf("could not get extended headers state: %s", err) + } + if eh != true { + t.Fatalf("unexpected extended headers state") + } + sr, err := h.GetRequestStructuredReplies() if err != nil { t.Fatalf("could not get structured replies state: %s", err) diff --git a/golang/libnbd_120_set_non_defaults_test.go b/golang/libnbd_120_set_non_defaults_test.go index b096c4f6..4c6bb3ce 100644 --- a/golang/libnbd_120_set_non_defaults_test.go +++ b/golang/libnbd_120_set_non_defaults_test.go @@ -81,6 +81,18 @@ func Test120SetNonDefaults(t *testing.T) { } } + err = h.SetRequestExtendedHeaders(false) + if err != nil { + t.Fatalf("could not set extended headers state: %s", err) + } + eh, err := h.GetRequestExtendedHeaders() + if err != nil { + t.Fatalf("could not get extended headers state: %s", err) + } + if eh != false { + t.Fatalf("unexpected extended headers state") + } + err = h.SetRequestStructuredReplies(false) if err != nil { t.Fatalf("could not set structured replies state: %s", err) -- 2.40.1
Eric Blake
2023-May-25 13:00 UTC
[Libguestfs] [libnbd PATCH v3 12/22] copy: Update nbdcopy to use 64-bit block status
Although our use of "base:allocation" doesn't require the use of the 64-bit API for flags, we might perform slightly faster for a server that does give us 64-bit extent lengths and honors larger nbd_zero lengths. Signed-off-by: Eric Blake <eblake at redhat.com> --- copy/nbd-ops.c | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/copy/nbd-ops.c b/copy/nbd-ops.c index f3b3bed3..71ee0a3b 100644 --- a/copy/nbd-ops.c +++ b/copy/nbd-ops.c @@ -428,7 +428,7 @@ nbd_ops_asynch_notify_write (struct rw *rw, size_t index) * request for extents in a single round trip. */ static int add_extent (void *vp, const char *metacontext, - uint64_t offset, uint32_t *entries, size_t nr_entries, + uint64_t offset, nbd_extent *entries, size_t nr_entries, int *error); static void @@ -449,11 +449,11 @@ nbd_ops_get_extents (struct rw *rw, size_t index, size_t i; exts.len = 0; - if (nbd_block_status (nbd, count, offset, - (nbd_extent_callback) { - .user_data = &exts, - .callback = add_extent - }, 0) == -1) { + if (nbd_block_status_64 (nbd, count, offset, + (nbd_extent64_callback) { + .user_data = &exts, + .callback = add_extent + }, 0) == -1) { /* XXX We could call default_get_extents, but unclear if it's * the right thing to do if the server is returning errors. */ @@ -496,7 +496,7 @@ nbd_ops_get_extents (struct rw *rw, size_t index, static int add_extent (void *vp, const char *metacontext, - uint64_t offset, uint32_t *entries, size_t nr_entries, + uint64_t offset, nbd_extent *entries, size_t nr_entries, int *error) { extent_list *ret = vp; @@ -505,25 +505,25 @@ add_extent (void *vp, const char *metacontext, if (strcmp (metacontext, "base:allocation") != 0 || *error) return 0; - for (i = 0; i < nr_entries; i += 2) { + for (i = 0; i < nr_entries; i++) { struct extent e; e.offset = offset; - e.length = entries[i]; + e.length = entries[i].length; /* Note we deliberately don't care about the HOLE flag. There is * no need to read extent that reads as zeroes. We will convert * to it to a hole or allocated extents based on the command line * arguments. */ - e.zero = (entries[i+1] & LIBNBD_STATE_ZERO) != 0; + e.zero = (entries[i].flags & LIBNBD_STATE_ZERO) != 0; if (extent_list_append (ret, e) == -1) { perror ("realloc"); exit (EXIT_FAILURE); } - offset += entries[i]; + offset += entries[i].length; } return 0; -- 2.40.1
Eric Blake
2023-May-25 13:00 UTC
[Libguestfs] [libnbd PATCH v3 13/22] dump: Update nbddump to use 64-bit block status
Although our use of "base:allocation" doesn't require the use of the 64-bit API for flags, we might perform slightly faster for a server that does give us 64-bit extent lengths. Signed-off-by: Eric Blake <eblake at redhat.com> --- dump/dump.c | 27 ++++++++++++++------------- 1 file changed, 14 insertions(+), 13 deletions(-) diff --git a/dump/dump.c b/dump/dump.c index b4aebe84..71053277 100644 --- a/dump/dump.c +++ b/dump/dump.c @@ -38,7 +38,7 @@ #include "version.h" #include "vector.h" -DEFINE_VECTOR_TYPE (uint32_vector, uint32_t); +DEFINE_VECTOR_TYPE (uint64_vector, uint64_t); static const char *progname; static struct nbd_handle *nbd; @@ -262,10 +262,10 @@ catch_signal (int sig) static int extent_callback (void *user_data, const char *metacontext, uint64_t offset, - uint32_t *entries, size_t nr_entries, + nbd_extent *entries, size_t nr_entries, int *error) { - uint32_vector *list = user_data; + uint64_vector *list = user_data; size_t i; if (strcmp (metacontext, LIBNBD_CONTEXT_BASE_ALLOCATION) != 0) @@ -273,7 +273,8 @@ extent_callback (void *user_data, const char *metacontext, /* Just append the entries we got to the list. */ for (i = 0; i < nr_entries; ++i) { - if (uint32_vector_append (list, entries[i]) == -1) { + if (uint64_vector_append (list, entries[i].length) == -1 || + uint64_vector_append (list, entries[i].flags) == -1) { perror ("realloc"); exit (EXIT_FAILURE); } @@ -284,7 +285,7 @@ extent_callback (void *user_data, const char *metacontext, static bool test_all_zeroes (uint64_t offset, size_t count) { - uint32_vector entries = empty_vector; + uint64_vector entries = empty_vector; size_t i; uint64_t count_read; @@ -296,22 +297,22 @@ test_all_zeroes (uint64_t offset, size_t count) * false, causing the main code to do a full read. We could be * smarter and keep asking the server (XXX). */ - if (nbd_block_status (nbd, count, offset, - (nbd_extent_callback) { - .callback = extent_callback, - .user_data = &entries }, - 0) == -1) { + if (nbd_block_status_64 (nbd, count, offset, + (nbd_extent64_callback) { + .callback = extent_callback, + .user_data = &entries }, + 0) == -1) { fprintf (stderr, "%s: %s\n", progname, nbd_get_error ()); exit (EXIT_FAILURE); } count_read = 0; for (i = 0; i < entries.len; i += 2) { - uint32_t len = entries.ptr[i]; - uint32_t type = entries.ptr[i+1]; + uint64_t len = entries.ptr[i]; + uint64_t type = entries.ptr[i+1]; count_read += len; - if (!(type & 2)) /* not zero */ + if (!(type & LIBNBD_STATE_ZERO)) /* not zero */ return false; } -- 2.40.1
Eric Blake
2023-May-25 13:01 UTC
[Libguestfs] [libnbd PATCH v3 14/22] info: Expose extended-headers support through nbdinfo
Add another bit of overall server information, as well as a '--can extended-headers' silent query. For now, the testsuite is written assuming that when nbdkit finally adds extended headers support, it will also add a --no-eh kill switch comparable to its existing --no-sr switch. Signed-off-by: Eric Blake <eblake at redhat.com> --- info/nbdinfo.pod | 11 ++++++++++- info/can.c | 9 +++++++++ info/info-can.sh | 27 +++++++++++++++++++++++++++ info/info-packets.sh | 17 ++++++++++++++++- info/main.c | 7 ++++++- 5 files changed, 68 insertions(+), 3 deletions(-) diff --git a/info/nbdinfo.pod b/info/nbdinfo.pod index 7eb3c1a0..9ea4a278 100644 --- a/info/nbdinfo.pod +++ b/info/nbdinfo.pod @@ -86,6 +86,7 @@ the I<--json> parameter: "protocol": "newstyle-fixed", "TLS": false, "structured": true, + "extended": false, "exports": [ { "export-name": "", @@ -165,6 +166,11 @@ Test if the NBD URI connection is using TLS. Test if server can respond with structured replies (a prerequisite for supporting block status commands). +=item nbdinfo --can extended-headers URI + +Test if server supports extended headers (a prerequisite for +supporting 64-bit commands; implies structured replies as well). + =item nbdinfo --is rotational URI Test if the server export is backed by something which behaves like a @@ -312,6 +318,8 @@ Display brief command line help and exit. =item B<--can df> +=item B<--can extended-headers> + =item B<--can fast-zero> =item B<--can flush> @@ -341,7 +349,8 @@ and the following libnbd functions: L<nbd_can_cache(3)>, L<nbd_can_df(3)>, L<nbd_can_fast_zero(3)>, L<nbd_can_flush(3)>, L<nbd_can_fua(3)>, L<nbd_can_multi_conn(3)>, L<nbd_can_trim(3)>, L<nbd_can_zero(3)>, L<nbd_is_read_only(3)>, -L<nbd_get_structured_replies_negotiated(3)>. +L<nbd_get_structured_replies_negotiated(3)>, +L<nbd_get_extended_headers_negotiated(3)>. =item B<--color> diff --git a/info/can.c b/info/can.c index 01ab4806..31c4a1ca 100644 --- a/info/can.c +++ b/info/can.c @@ -50,6 +50,15 @@ do_can (void) strcasecmp (can, "structured_replies") == 0) feature = nbd_get_structured_replies_negotiated (nbd); + else if (strcasecmp (can, "eh") == 0 || + strcasecmp (can, "extended header") == 0 || + strcasecmp (can, "extended-header") == 0 || + strcasecmp (can, "extended_header") == 0 || + strcasecmp (can, "extended headers") == 0 || + strcasecmp (can, "extended-headers") == 0 || + strcasecmp (can, "extended_headers") == 0) + feature = nbd_get_extended_headers_negotiated (nbd); + else if (strcasecmp (can, "readonly") == 0 || strcasecmp (can, "read-only") == 0 || strcasecmp (can, "read_only") == 0) diff --git a/info/info-can.sh b/info/info-can.sh index 6cc8cbf4..8154d1ce 100755 --- a/info/info-can.sh +++ b/info/info-can.sh @@ -61,6 +61,33 @@ esac EOF test $st = 2 +# --can extended-headers cannot be positively tested until nbdkit gains +# --no-eh support. Otherwise, it is similar to --can structured-reply. + +no_eh+if nbdkit --no-eh --help >/dev/null 2>/dev/null; then + no_eh=--no-eh + nbdkit -v -U - sh - \ + --run '$VG nbdinfo --can extended-headers "nbd+unix:///?socket=$unixsocket"' <<'EOF' +case "$1" in + get_size) echo 1024 ;; + pread) ;; + *) exit 2 ;; +esac +EOF +fi + +st=0 +nbdkit -v -U - $no_eh sh - \ + --run '$VG nbdinfo --can extended-headers "nbd+unix:///?socket=$unixsocket"' <<'EOF' || st=$? +case "$1" in + get_size) echo 1024 ;; + pread) ;; + *) exit 2 ;; +esac +EOF +test $st = 2 + # --can cache and --can fua require special handling because in # nbdkit-sh-plugin we must print "native" or "none". Also the can_fua # flag is only sent if the export is writable (hence can_write below). diff --git a/info/info-packets.sh b/info/info-packets.sh index 2460052e..410faef8 100755 --- a/info/info-packets.sh +++ b/info/info-packets.sh @@ -27,12 +27,27 @@ requires nbdkit --no-sr memory --version out=info-packets.out cleanup_fn rm -f $out +# Older nbdkit does not support extended headers; --no-eh is a reliable +# witness of whether nbdkit is new enough. + +no_eh+if nbdkit --no-eh --help >/dev/null 2>/dev/null; then + no_eh=--no-eh +fi + nbdkit --no-sr -U - memory size=1M \ --run '$VG nbdinfo "nbd+unix:///?socket=$unixsocket"' > $out cat $out grep "protocol: .*using simple packets" $out -nbdkit -U - memory size=1M \ +nbdkit $no_eh -U - memory size=1M \ --run '$VG nbdinfo "nbd+unix:///?socket=$unixsocket"' > $out cat $out grep "protocol: .*using structured packets" $out + +if test x != "x$no_eh"; then + nbdkit -U - memory size=1M \ + --run '$VG nbdinfo "nbd+unix:///?socket=$unixsocket"' > $out + cat $out + grep "protocol: .*using extended packets" $out +fi diff --git a/info/main.c b/info/main.c index 1b99e089..8c923266 100644 --- a/info/main.c +++ b/info/main.c @@ -302,11 +302,13 @@ main (int argc, char *argv[]) const char *protocol; int tls_negotiated; int sr_negotiated; + int eh_negotiated; /* Print per-connection fields. */ protocol = nbd_get_protocol (nbd); tls_negotiated = nbd_get_tls_negotiated (nbd); sr_negotiated = nbd_get_structured_replies_negotiated (nbd); + eh_negotiated = nbd_get_extended_headers_negotiated (nbd); if (!json_output) { if (protocol) { @@ -314,8 +316,9 @@ main (int argc, char *argv[]) fprintf (fp, "protocol: %s", protocol); if (tls_negotiated >= 0) fprintf (fp, " %s TLS", tls_negotiated ? "with" : "without"); - if (sr_negotiated >= 0) + if (eh_negotiated >= 0 && sr_negotiated >= 0) fprintf (fp, ", using %s packets", + eh_negotiated ? "extended" : sr_negotiated ? "structured" : "simple"); fprintf (fp, "\n"); ansi_restore (fp); @@ -333,6 +336,8 @@ main (int argc, char *argv[]) fprintf (fp, "\"TLS\": %s,\n", tls_negotiated ? "true" : "false"); if (sr_negotiated >= 0) fprintf (fp, "\"structured\": %s,\n", sr_negotiated ? "true" : "false"); + if (eh_negotiated >= 0) + fprintf (fp, "\"extended\": %s,\n", eh_negotiated ? "true" : "false"); } if (!list_all) -- 2.40.1
Eric Blake
2023-May-25 13:01 UTC
[Libguestfs] [libnbd PATCH v3 15/22] info: Update nbdinfo --map to use 64-bit block status
Although we usually map "base:allocation" which doesn't require the use of the 64-bit API for flags, this application IS intended to map out other metacontexts that might have 64-bit flags. And when extended headers are in use, we might as well ask for the server to give us extents as large as it wants, rather than breaking things up at 4G boundaries. At the time this patch was written, there are no known servers that actually provide a metacontext with 64-bit flags. However, that is planned for the nbdkit v3 protocol. Signed-off-by: Eric Blake <eblake at redhat.com> --- info/map.c | 65 ++++++++++++++++++++++++++++-------------------------- 1 file changed, 34 insertions(+), 31 deletions(-) diff --git a/info/map.c b/info/map.c index 1169fce4..50b058f2 100644 --- a/info/map.c +++ b/info/map.c @@ -36,13 +36,13 @@ #include "nbdinfo.h" -DEFINE_VECTOR_TYPE (uint32_vector, uint32_t); +DEFINE_VECTOR_TYPE (uint64_vector, uint64_t); -static void print_extents (uint32_vector *entries); -static void print_totals (uint32_vector *entries, int64_t size); +static void print_extents (uint64_vector *entries); +static void print_totals (uint64_vector *entries, int64_t size); static int extent_callback (void *user_data, const char *metacontext, uint64_t offset, - uint32_t *entries, size_t nr_entries, + nbd_extent *entries, size_t nr_entries, int *error); void @@ -50,7 +50,7 @@ do_map (void) { size_t i; int64_t size; - uint32_vector entries = empty_vector; + uint64_vector entries = empty_vector; uint64_t offset, align, max_len; size_t prev_entries_size; @@ -69,14 +69,16 @@ do_map (void) fprintf (stderr, "%s: %s\n", progname, nbd_get_error ()); exit (EXIT_FAILURE); } + if (nbd_get_extended_headers_negotiated (nbd) == 1) + max_len = size; for (offset = 0; offset < size;) { prev_entries_size = entries.len; - if (nbd_block_status (nbd, MIN (size - offset, max_len), offset, - (nbd_extent_callback) { - .callback = extent_callback, - .user_data = &entries }, - 0) == -1) { + if (nbd_block_status_64 (nbd, MIN (size - offset, max_len), offset, + (nbd_extent64_callback) { + .callback = extent_callback, + .user_data = &entries }, + 0) == -1) { fprintf (stderr, "%s: %s\n", progname, nbd_get_error ()); exit (EXIT_FAILURE); } @@ -99,18 +101,18 @@ do_map (void) } /* Callback handling --map. */ -static void print_one_extent (uint64_t offset, uint64_t len, uint32_t type); -static void extent_description (const char *metacontext, uint32_t type, +static void print_one_extent (uint64_t offset, uint64_t len, uint64_t type); +static void extent_description (const char *metacontext, uint64_t type, char **descr, bool *free_descr, const char **fg, const char **bg); static int extent_callback (void *user_data, const char *metacontext, uint64_t offset, - uint32_t *entries, size_t nr_entries, + nbd_extent *entries, size_t nr_entries, int *error) { - uint32_vector *list = user_data; + uint64_vector *list = user_data; size_t i; if (strcmp (metacontext, map) != 0) @@ -120,7 +122,8 @@ extent_callback (void *user_data, const char *metacontext, * print_extents below. */ for (i = 0; i < nr_entries; ++i) { - if (uint32_vector_append (list, entries[i]) == -1) { + if (uint64_vector_append (list, entries[i].length) == -1 || + uint64_vector_append (list, entries[i].flags) == -1) { perror ("realloc"); exit (EXIT_FAILURE); } @@ -129,7 +132,7 @@ extent_callback (void *user_data, const char *metacontext, } static void -print_extents (uint32_vector *entries) +print_extents (uint64_vector *entries) { size_t i, j; uint64_t offset = 0; /* end of last extent printed + 1 */ @@ -138,7 +141,7 @@ print_extents (uint32_vector *entries) if (json_output) fprintf (fp, "[\n"); for (i = 0; i < entries->len; i += 2) { - uint32_t type = entries->ptr[last+1]; + uint64_t type = entries->ptr[last+1]; /* If we're coalescing and the current type is different from the * previous one then we should print everything up to this entry. @@ -157,7 +160,7 @@ print_extents (uint32_vector *entries) /* Print the last extent if there is one. */ if (last != i) { - uint32_t type = entries->ptr[last+1]; + uint64_t type = entries->ptr[last+1]; uint64_t len; for (j = last, len = 0; j < i; j += 2) @@ -169,7 +172,7 @@ print_extents (uint32_vector *entries) } static void -print_one_extent (uint64_t offset, uint64_t len, uint32_t type) +print_one_extent (uint64_t offset, uint64_t len, uint64_t type) { static bool comma = false; char *descr; @@ -185,7 +188,7 @@ print_one_extent (uint64_t offset, uint64_t len, uint32_t type) ansi_colour (bg, fp); fprintf (fp, "%10" PRIu64 " " "%10" PRIu64 " " - "%3" PRIu32, + "%3" PRIu64, offset, len, type); if (descr) fprintf (fp, " %s", descr); @@ -199,7 +202,7 @@ print_one_extent (uint64_t offset, uint64_t len, uint32_t type) fprintf (fp, "{ \"offset\": %" PRIu64 ", " "\"length\": %" PRIu64 ", " - "\"type\": %" PRIu32, + "\"type\": %" PRIu64, offset, len, type); if (descr) { fprintf (fp, ", \"description\": "); @@ -215,9 +218,9 @@ print_one_extent (uint64_t offset, uint64_t len, uint32_t type) /* --map --totals suboption */ static void -print_totals (uint32_vector *entries, int64_t size) +print_totals (uint64_vector *entries, int64_t size) { - uint32_t type; + uint64_t type; bool comma = false; /* This is necessary to avoid a divide by zero below, but if the @@ -237,16 +240,16 @@ print_totals (uint32_vector *entries, int64_t size) */ type = 0; for (;;) { - uint64_t next_type = (uint64_t)UINT32_MAX + 1; + uint64_t next_type = 0; uint64_t c = 0; size_t i; for (i = 0; i < entries->len; i += 2) { - uint32_t t = entries->ptr[i+1]; + uint64_t t = entries->ptr[i+1]; if (t == type) c += entries->ptr[i]; - else if (type < t && t < next_type) + else if (type < t && (next_type == 0 || t < next_type)) next_type = t; } @@ -263,7 +266,7 @@ print_totals (uint32_vector *entries, int64_t size) ansi_colour (fg, fp); if (bg) ansi_colour (bg, fp); - fprintf (fp, "%10" PRIu64 " %5.1f%% %3" PRIu32, + fprintf (fp, "%10" PRIu64 " %5.1f%% %3" PRIu64, c, percent, type); if (descr) fprintf (fp, " %s", descr); @@ -278,7 +281,7 @@ print_totals (uint32_vector *entries, int64_t size) fprintf (fp, "{ \"size\": %" PRIu64 ", " "\"percent\": %g, " - "\"type\": %" PRIu32, + "\"type\": %" PRIu64, c, percent, type); if (descr) { fprintf (fp, ", \"description\": "); @@ -292,7 +295,7 @@ print_totals (uint32_vector *entries, int64_t size) free (descr); } - if (next_type == (uint64_t)UINT32_MAX + 1) + if (next_type == 0) break; type = next_type; } @@ -301,7 +304,7 @@ print_totals (uint32_vector *entries, int64_t size) } static void -extent_description (const char *metacontext, uint32_t type, +extent_description (const char *metacontext, uint64_t type, char **descr, bool *free_descr, const char **fg, const char **bg) { @@ -348,7 +351,7 @@ extent_description (const char *metacontext, uint32_t type, *fg = ANSI_FG_BRIGHT_WHITE; *bg = ANSI_BG_BLACK; return; default: - if (asprintf (descr, "backing depth %u", type) == -1) { + if (asprintf (descr, "backing depth %" PRIu64, type) == -1) { perror ("asprintf"); exit (EXIT_FAILURE); } -- 2.40.1
Eric Blake
2023-May-25 13:01 UTC
[Libguestfs] [libnbd PATCH v3 16/22] examples: Update copy-libev to use 64-bit block status
Although our use of "base:allocation" doesn't require the use of the 64-bit API for flags, we might perform slightly faster for a server that does give us 64-bit extent lengths and honors larger nbd_zero lengths. Signed-off-by: Eric Blake <eblake at redhat.com> --- examples/copy-libev.c | 21 ++++++++++----------- 1 file changed, 10 insertions(+), 11 deletions(-) diff --git a/examples/copy-libev.c b/examples/copy-libev.c index 32cb46b3..9346faf7 100644 --- a/examples/copy-libev.c +++ b/examples/copy-libev.c @@ -96,7 +96,7 @@ struct request { }; struct extent { - uint32_t length; + uint64_t length; bool zero; }; @@ -184,7 +184,7 @@ get_events (struct connection *c) static int extent_callback (void *user_data, const char *metacontext, uint64_t offset, - uint32_t *entries, size_t nr_entries, int *error) + nbd_extent *entries, size_t nr_entries, int *error) { struct request *r = user_data; @@ -199,22 +199,21 @@ extent_callback (void *user_data, const char *metacontext, uint64_t offset, return 1; } - /* Libnbd returns uint32_t pair (length, flags) for each extent. */ - extents_len = nr_entries / 2; + extents_len = nr_entries; extents = malloc (extents_len * sizeof *extents); if (extents == NULL) FAIL ("Cannot allocated extents: %s", strerror (errno)); /* Copy libnbd entries to extents array. */ - for (int i = 0, j = 0; i < extents_len; i++, j=i*2) { - extents[i].length = entries[j]; + for (int i = 0; i < extents_len; i++) { + extents[i].length = entries[i].length; /* Libnbd exposes both ZERO and HOLE flags. We care only about * ZERO status, meaning we can copy this extent using efficinet * zero method. */ - extents[i].zero = (entries[j + 1] & LIBNBD_STATE_ZERO) != 0; + extents[i].zero = (entries[i].flags & LIBNBD_STATE_ZERO) != 0; } DEBUG ("r%zu: received %zu extents for %s", @@ -286,10 +285,10 @@ start_extents (struct request *r) DEBUG ("r%zu: start extents offset=%" PRIi64 " count=%zu", r->index, offset, count); - cookie = nbd_aio_block_status ( + cookie = nbd_aio_block_status_64 ( src.nbd, count, offset, - (nbd_extent_callback) { .callback=extent_callback, - .user_data=r }, + (nbd_extent64_callback) { .callback=extent_callback, + .user_data=r }, (nbd_completion_callback) { .callback=extents_completed, .user_data=r }, 0); @@ -324,7 +323,7 @@ next_extent (struct request *r) limit = MIN (REQUEST_SIZE, size - offset); while (length < limit) { - DEBUG ("e%zu: offset=%" PRIi64 " len=%" PRIu32 " zero=%d", + DEBUG ("e%zu: offset=%" PRIi64 " len=%" PRIu64 " zero=%d", extents_pos, offset, extents[extents_pos].length, is_zero); /* If this extent is too large, steal some data from it to -- 2.40.1
Eric Blake
2023-May-25 13:01 UTC
[Libguestfs] [libnbd PATCH v3 17/22] ocaml: Add example for 64-bit extents
Since our example program for 32-bit extents is inherently limited to 32-bit lengths, it is also worth demonstrating the 64-bit extent API, including the difference in the array indexing being saner. Signed-off-by: Eric Blake <eblake at redhat.com> --- ocaml/examples/Makefile.am | 1 + ocaml/examples/extents64.ml | 42 +++++++++++++++++++++++++++++++++++++ 2 files changed, 43 insertions(+) create mode 100644 ocaml/examples/extents64.ml diff --git a/ocaml/examples/Makefile.am b/ocaml/examples/Makefile.am index 28b4ab94..a4eb47a5 100644 --- a/ocaml/examples/Makefile.am +++ b/ocaml/examples/Makefile.am @@ -20,6 +20,7 @@ include $(top_srcdir)/subdir-rules.mk ml_examples = \ asynch_copy.ml \ extents.ml \ + extents64.ml \ get_size.ml \ open_qcow2.ml \ server_flags.ml \ diff --git a/ocaml/examples/extents64.ml b/ocaml/examples/extents64.ml new file mode 100644 index 00000000..8ee7e218 --- /dev/null +++ b/ocaml/examples/extents64.ml @@ -0,0 +1,42 @@ +open Printf + +let () + NBD.with_handle ( + fun nbd -> + NBD.add_meta_context nbd "base:allocation"; + NBD.connect_command nbd + ["nbdkit"; "-s"; "--exit-with-parent"; "-r"; + "sparse-random"; "8G"]; + + (* Read the extents and print them. *) + let size = NBD.get_size nbd in + let cap + match NBD.get_extended_headers_negotiated nbd with + | true -> size + | false -> 0x8000_0000_L in + let fetch_offset = ref 0_L in + while !fetch_offset < size do + let remaining = Int64.sub size !fetch_offset in + let fetch_size = min remaining cap in + NBD.block_status_64 nbd fetch_size !fetch_offset ( + fun meta _ entries err -> + printf "nbd_block_status callback: meta=%s err=%d\n" meta !err; + if meta = "base:allocation" then ( + printf "index\t%16s %16s %s\n" "offset" "length" "flags"; + for i = 0 to Array.length entries - 1 do + let len = fst entries.(i) + and flags + match snd entries.(i) with + | 0_L -> "data" + | 1_L -> "hole" + | 2_L -> "zero" + | 3_L -> "hole+zero" + | unknown -> sprintf "unknown (%Ld)" unknown in + printf "%d:\t%16Ld %16Ld %s\n" i !fetch_offset len flags; + fetch_offset := Int64.add !fetch_offset len + done; + ); + 0 + ) (* NBD.block_status *) + done + ) -- 2.40.1
Eric Blake
2023-May-25 13:01 UTC
[Libguestfs] [libnbd PATCH v3 18/22] generator: Actually request extended headers
This is the culmination of the previous patches' preparation work for using extended headers when possible. The new states in the state machine are copied extensively from our handling of OPT_STRUCTURED_REPLY. The next patch will then expose a new API nbd_opt_extended_headers() for manual control. At the same time I posted this patch, I had patches for qemu-nbd to support extended headers as server (nbdkit is a bit tougher). The next patches will add some interop tests that pass when using a new enough qemu-nbd, showing that we have cross-project interoperability and therefore an extension worth standardizing. Signed-off-by: Eric Blake <eblake at redhat.com> --- generator/API.ml | 87 ++++++++--------- generator/Makefile.am | 1 + generator/state_machine.ml | 41 ++++++++ .../states-newstyle-opt-extended-headers.c | 94 +++++++++++++++++++ generator/states-newstyle-opt-starttls.c | 6 +- 5 files changed, 184 insertions(+), 45 deletions(-) create mode 100644 generator/states-newstyle-opt-extended-headers.c diff --git a/generator/API.ml b/generator/API.ml index 7616990a..078f140f 100644 --- a/generator/API.ml +++ b/generator/API.ml @@ -953,23 +953,24 @@ "set_request_meta_context", { (all C<nbd_connect_*> calls when L<nbd_set_opt_mode(3)> is false, or L<nbd_opt_go(3)> and L<nbd_opt_info(3)> when option mode is enabled) will also try to issue NBD_OPT_SET_META_CONTEXT when -the server supports structured replies and any contexts were -registered by L<nbd_add_meta_context(3)>. The default setting -is true; however the extra step of negotiating meta contexts is -not always desirable: performing both info and go on the same -export works without needing to re-negotiate contexts on the -second call; integration testing of other servers may benefit -from manual invocation of L<nbd_opt_set_meta_context(3)> at -other times in the negotiation sequence; and even when using just -L<nbd_opt_info(3)>, it can be faster to collect the server's +the server supports structured replies or extended headers and +any contexts were registered by L<nbd_add_meta_context(3)>. The +default setting is true; however the extra step of negotiating +meta contexts is not always desirable: performing both info and +go on the same export works without needing to re-negotiate +contexts on the second call; integration testing of other servers +may benefit from manual invocation of L<nbd_opt_set_meta_context(3)> +at other times in the negotiation sequence; and even when using +just L<nbd_opt_info(3)>, it can be faster to collect the server's results by relying on the callback function passed to L<nbd_opt_list_meta_context(3)> than a series of post-process calls to L<nbd_can_meta_context(3)>. Note that this control has no effect if the server does not -negotiate structured replies, or if the client did not request -any contexts via L<nbd_add_meta_context(3)>. Setting this -control to false may cause L<nbd_block_status(3)> to fail."; +negotiate structured replies or extended headers, or if the +client did not request any contexts via L<nbd_add_meta_context(3)>. +Setting this control to false may cause L<nbd_block_status(3)> +to fail."; see_also = [Link "set_opt_mode"; Link "opt_go"; Link "opt_info"; Link "opt_list_meta_context"; Link "opt_set_meta_context"; Link "get_structured_replies_negotiated"; @@ -1404,11 +1405,11 @@ "opt_info", { If successful, functions like L<nbd_is_read_only(3)> and L<nbd_get_size(3)> will report details about that export. If L<nbd_set_request_meta_context(3)> is set (the default) and -structured replies were negotiated, it is also valid to use -L<nbd_can_meta_context(3)> after this call. However, it may be -more efficient to clear that setting and manually utilize -L<nbd_opt_list_meta_context(3)> with its callback approach, for -learning which contexts an export supports. In general, if +structured replies or extended headers were negotiated, it is also +valid to use L<nbd_can_meta_context(3)> after this call. However, +it may be more efficient to clear that setting and manually +utilize L<nbd_opt_list_meta_context(3)> with its callback approach, +for learning which contexts an export supports. In general, if L<nbd_opt_go(3)> is called next, that call will likely succeed with the details remaining the same, although this is not guaranteed by all servers. @@ -1538,12 +1539,12 @@ "opt_set_meta_context", { recent L<nbd_set_export_name(3)> or L<nbd_connect_uri(3)>. This can only be used if L<nbd_set_opt_mode(3)> enabled option mode. Normally, this function is redundant, as L<nbd_opt_go(3)> -automatically does the same task if structured replies have -already been negotiated. But manual control over meta context -requests can be useful for fine-grained testing of how a server -handles unusual negotiation sequences. Often, use of this -function is coupled with L<nbd_set_request_meta_context(3)> to -bypass the automatic context request normally performed by +automatically does the same task if structured replies or extended +headers have already been negotiated. But manual control over +meta context requests can be useful for fine-grained testing of +how a server handles unusual negotiation sequences. Often, use +of this function is coupled with L<nbd_set_request_meta_context(3)> +to bypass the automatic context request normally performed by L<nbd_opt_go(3)>. The NBD protocol allows a client to decide how many queries to ask @@ -1597,12 +1598,13 @@ "opt_set_meta_context_queries", { or L<nbd_connect_uri(3)>. This can only be used if L<nbd_set_opt_mode(3)> enabled option mode. Normally, this function is redundant, as L<nbd_opt_go(3)> automatically does -the same task if structured replies have already been -negotiated. But manual control over meta context requests can -be useful for fine-grained testing of how a server handles -unusual negotiation sequences. Often, use of this function is -coupled with L<nbd_set_request_meta_context(3)> to bypass the -automatic context request normally performed by L<nbd_opt_go(3)>. +the same task if structured replies or extended headers have +already been negotiated. But manual control over meta context +requests can be useful for fine-grained testing of how a server +handles unusual negotiation sequences. Often, use of this +function is coupled with L<nbd_set_request_meta_context(3)> to +bypass the automatic context request normally performed by +L<nbd_opt_go(3)>. The NBD protocol allows a client to decide how many queries to ask the server. This function takes an explicit list of queries; to @@ -3281,13 +3283,13 @@ "aio_opt_set_meta_context", { recent L<nbd_set_export_name(3)> or L<nbd_connect_uri(3)>. This can only be used if L<nbd_set_opt_mode(3)> enabled option mode. Normally, this function is redundant, as L<nbd_opt_go(3)> -automatically does the same task if structured replies have -already been negotiated. But manual control over meta context -requests can be useful for fine-grained testing of how a server -handles unusual negotiation sequences. Often, use of this -function is coupled with L<nbd_set_request_meta_context(3)> to -bypass the automatic context request normally performed by -L<nbd_opt_go(3)>. +automatically does the same task if structured replies or +extended headers have already been negotiated. But manual +control over meta context requests can be useful for fine-grained +testing of how a server handles unusual negotiation sequences. +Often, use of this function is coupled with +L<nbd_set_request_meta_context(3)> to bypass the automatic +context request normally performed by L<nbd_opt_go(3)>. To determine when the request completes, wait for L<nbd_aio_is_connecting(3)> to return false. Or supply the optional @@ -3314,12 +3316,13 @@ "aio_opt_set_meta_context_queries", { or L<nbd_connect_uri(3)>. This can only be used if L<nbd_set_opt_mode(3)> enabled option mode. Normally, this function is redundant, as L<nbd_opt_go(3)> automatically does -the same task if structured replies have already been -negotiated. But manual control over meta context requests can -be useful for fine-grained testing of how a server handles -unusual negotiation sequences. Often, use of this function is -coupled with L<nbd_set_request_meta_context(3)> to bypass the -automatic context request normally performed by L<nbd_opt_go(3)>. +the same task if structured replies or extended headers have +already been negotiated. But manual control over meta context +requests can be useful for fine-grained testing of how a server +handles unusual negotiation sequences. Often, use of this +function is coupled with L<nbd_set_request_meta_context(3)> to +bypass the automatic context request normally performed by +L<nbd_opt_go(3)>. To determine when the request completes, wait for L<nbd_aio_is_connecting(3)> to return false. Or supply the optional diff --git a/generator/Makefile.am b/generator/Makefile.am index 91dbde5c..fc79b1b9 100644 --- a/generator/Makefile.am +++ b/generator/Makefile.am @@ -30,6 +30,7 @@ states_code = \ states-issue-command.c \ states-magic.c \ states-newstyle-opt-export-name.c \ + states-newstyle-opt-extended-headers.c \ states-newstyle-opt-list.c \ states-newstyle-opt-go.c \ states-newstyle-opt-meta-context.c \ diff --git a/generator/state_machine.ml b/generator/state_machine.ml index 1f0d00b0..d09ac792 100644 --- a/generator/state_machine.ml +++ b/generator/state_machine.ml @@ -297,6 +297,7 @@ and * NEGOTIATING after OPT_STRUCTURED_REPLY or any failed OPT_GO. *) Group ("OPT_STARTTLS", newstyle_opt_starttls_state_machine); + Group ("OPT_EXTENDED_HEADERS", newstyle_opt_extended_headers_state_machine); Group ("OPT_STRUCTURED_REPLY", newstyle_opt_structured_reply_state_machine); Group ("OPT_META_CONTEXT", newstyle_opt_meta_context_state_machine); Group ("OPT_GO", newstyle_opt_go_state_machine); @@ -441,6 +442,46 @@ and }; ] +(* Fixed newstyle NBD_OPT_EXTENDED_HEADERS option. + * Implementation: generator/states-newstyle-opt-extended-headers.c + *) +and newstyle_opt_extended_headers_state_machine = [ + State { + default_state with + name = "START"; + comment = "Try to negotiate newstyle NBD_OPT_EXTENDED_HEADERS"; + external_events = []; + }; + + State { + default_state with + name = "SEND"; + comment = "Send newstyle NBD_OPT_EXTENDED_HEADERS negotiation request"; + external_events = [ NotifyWrite, "" ]; + }; + + State { + default_state with + name = "RECV_REPLY"; + comment = "Receive newstyle NBD_OPT_EXTENDED_HEADERS option reply"; + external_events = [ NotifyRead, "" ]; + }; + + State { + default_state with + name = "RECV_REPLY_PAYLOAD"; + comment = "Receive any newstyle NBD_OPT_EXTENDED_HEADERS reply payload"; + external_events = [ NotifyRead, "" ]; + }; + + State { + default_state with + name = "CHECK_REPLY"; + comment = "Check newstyle NBD_OPT_EXTENDED_HEADERS option reply"; + external_events = []; + }; +] + (* Fixed newstyle NBD_OPT_STRUCTURED_REPLY option. * Implementation: generator/states-newstyle-opt-structured-reply.c *) diff --git a/generator/states-newstyle-opt-extended-headers.c b/generator/states-newstyle-opt-extended-headers.c new file mode 100644 index 00000000..1ec25e97 --- /dev/null +++ b/generator/states-newstyle-opt-extended-headers.c @@ -0,0 +1,94 @@ +/* nbd client library in userspace: state machine + * Copyright Red Hat + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +/* State machine for negotiating NBD_OPT_EXTENDED_HEADERS. */ + +STATE_MACHINE { + NEWSTYLE.OPT_EXTENDED_HEADERS.START: + assert (h->gflags & LIBNBD_HANDSHAKE_FLAG_FIXED_NEWSTYLE); + assert (h->opt_current != NBD_OPT_EXTENDED_HEADERS); + assert (CALLBACK_IS_NULL (h->opt_cb.completion)); + if (!h->request_eh || !h->request_sr) { + SET_NEXT_STATE (%^OPT_STRUCTURED_REPLY.START); + return 0; + } + + h->sbuf.option.version = htobe64 (NBD_NEW_VERSION); + h->sbuf.option.option = htobe32 (NBD_OPT_EXTENDED_HEADERS); + h->sbuf.option.optlen = htobe32 (0); + h->chunks_sent++; + h->wbuf = &h->sbuf; + h->wlen = sizeof h->sbuf.option; + SET_NEXT_STATE (%SEND); + return 0; + + NEWSTYLE.OPT_EXTENDED_HEADERS.SEND: + switch (send_from_wbuf (h)) { + case -1: SET_NEXT_STATE (%.DEAD); return 0; + case 0: + h->rbuf = &h->sbuf; + h->rlen = sizeof h->sbuf.or.option_reply; + SET_NEXT_STATE (%RECV_REPLY); + } + return 0; + + NEWSTYLE.OPT_EXTENDED_HEADERS.RECV_REPLY: + switch (recv_into_rbuf (h)) { + case -1: SET_NEXT_STATE (%.DEAD); return 0; + case 0: + if (prepare_for_reply_payload (h, NBD_OPT_EXTENDED_HEADERS) == -1) { + SET_NEXT_STATE (%.DEAD); + return 0; + } + SET_NEXT_STATE (%RECV_REPLY_PAYLOAD); + } + return 0; + + NEWSTYLE.OPT_EXTENDED_HEADERS.RECV_REPLY_PAYLOAD: + switch (recv_into_rbuf (h)) { + case -1: SET_NEXT_STATE (%.DEAD); return 0; + case 0: SET_NEXT_STATE (%CHECK_REPLY); + } + return 0; + + NEWSTYLE.OPT_EXTENDED_HEADERS.CHECK_REPLY: + uint32_t reply; + + reply = be32toh (h->sbuf.or.option_reply.reply); + switch (reply) { + case NBD_REP_ACK: + debug (h, "negotiated extended headers on this connection"); + h->extended_headers = true; + /* Extended headers trump structured replies, so skip ahead. */ + h->structured_replies = true; + break; + default: + if (handle_reply_error (h) == -1) { + SET_NEXT_STATE (%.DEAD); + return 0; + } + + debug (h, "extended headers are not supported by this server"); + break; + } + + /* Next option. */ + SET_NEXT_STATE (%^OPT_STRUCTURED_REPLY.START); + return 0; + +} /* END STATE MACHINE */ diff --git a/generator/states-newstyle-opt-starttls.c b/generator/states-newstyle-opt-starttls.c index e497548c..1e2997a3 100644 --- a/generator/states-newstyle-opt-starttls.c +++ b/generator/states-newstyle-opt-starttls.c @@ -26,7 +26,7 @@ NEWSTYLE.OPT_STARTTLS.START: else { /* If TLS was not requested we skip this option and go to the next one. */ if (h->tls == LIBNBD_TLS_DISABLE) { - SET_NEXT_STATE (%^OPT_STRUCTURED_REPLY.START); + SET_NEXT_STATE (%^OPT_EXTENDED_HEADERS.START); return 0; } assert (CALLBACK_IS_NULL (h->opt_cb.completion)); @@ -128,7 +128,7 @@ NEWSTYLE.OPT_STARTTLS.CHECK_REPLY: SET_NEXT_STATE (%.NEGOTIATING); else { debug (h, "continuing with unencrypted connection"); - SET_NEXT_STATE (%^OPT_STRUCTURED_REPLY.START); + SET_NEXT_STATE (%^OPT_EXTENDED_HEADERS.START); } return 0; } @@ -185,7 +185,7 @@ NEWSTYLE.OPT_STARTTLS.TLS_HANDSHAKE_DONE: if (h->opt_current == NBD_OPT_STARTTLS) SET_NEXT_STATE (%.NEGOTIATING); else - SET_NEXT_STATE (%^OPT_STRUCTURED_REPLY.START); + SET_NEXT_STATE (%^OPT_EXTENDED_HEADERS.START); return 0; } /* END STATE MACHINE */ -- 2.40.1
Eric Blake
2023-May-25 13:01 UTC
[Libguestfs] [libnbd PATCH v3 19/22] api: Add nbd_[aio_]opt_extended_headers()
Very similar to the recent addition of nbd_opt_structured_reply, giving us fine-grained control over an extended headers request. Because nbdkit does not yet support extended headers, testsuite coverage is limited to interop testing with qemu-nbd. It shows that extended headers imply structured replies, regardless of which order the two are manually negotiated in. Signed-off-by: Eric Blake <eblake at redhat.com> --- generator/API.ml | 79 +++++++-- .../states-newstyle-opt-extended-headers.c | 30 +++- generator/states-newstyle.c | 3 + lib/opt.c | 44 +++++ interop/Makefile.am | 6 + interop/opt-extended-headers.c | 153 ++++++++++++++++++ interop/opt-extended-headers.sh | 29 ++++ .gitignore | 1 + 8 files changed, 329 insertions(+), 16 deletions(-) create mode 100644 interop/opt-extended-headers.c create mode 100755 interop/opt-extended-headers.sh diff --git a/generator/API.ml b/generator/API.ml index 078f140f..85625bbd 100644 --- a/generator/API.ml +++ b/generator/API.ml @@ -825,6 +825,7 @@ "set_request_extended_headers", { if L<nbd_set_request_structured_replies(3)> is also set to false, since the use of extended headers implies structured replies."; see_also = [Link "get_request_extended_headers"; + Link "opt_extended_headers"; Link "set_handshake_flags"; Link "set_strict_mode"; Link "get_extended_headers_negotiated"; Link "zero"; Link "trim"; Link "cache"; @@ -856,7 +857,9 @@ "get_extended_headers_negotiated", { shortdesc = "see if extended headers are in use"; longdesc = "\ After connecting you may call this to find out if the connection is -using extended headers. +using extended headers. Note that this setting is sticky; this +can return true even after a second L<nbd_opt_extended_headers(3)> +returns false because the server detected a duplicate request. When extended headers are not in use, commands are limited to a 32-bit length, even when the libnbd API uses a 64-bit parameter @@ -938,7 +941,7 @@ "get_structured_replies_negotiated", { attempted."; see_also = [Link "set_request_structured_replies"; Link "get_request_structured_replies"; - Link "opt_structured_reply"; + Link "opt_structured_reply"; Link "opt_extended_headers"; Link "get_protocol"; Link "get_extended_headers_negotiated"]; }; @@ -1211,12 +1214,13 @@ "set_opt_mode", { newstyle server. This setting has no effect when connecting to an oldstyle server. -Note that libnbd defaults to attempting C<NBD_OPT_STARTTLS> and -C<NBD_OPT_STRUCTURED_REPLY> before letting you control remaining -negotiation steps; if you need control over these steps as well, -first set L<nbd_set_tls(3)> to C<LIBNBD_TLS_DISABLE> and -L<nbd_set_request_structured_replies(3)> to false before starting -the connection attempt. +Note that libnbd defaults to attempting C<NBD_OPT_STARTTLS>, +C<NBD_OPT_EXTENDED_HEADERS>, and C<NBD_OPT_STRUCTURED_REPLY> +before letting you control remaining negotiation steps; if you +need control over these steps as well, first set L<nbd_set_tls(3)> +to C<LIBNBD_TLS_DISABLE>, and L<nbd_set_request_extended_headers(3)> +or L<nbd_set_request_structured_replies(3)> to false, before +starting the connection attempt. When option mode is enabled, you have fine-grained control over which options are negotiated, compared to the default of the server @@ -1324,6 +1328,35 @@ "opt_starttls", { Link "supports_tls"] }; + "opt_extended_headers", { + default_call with + args = []; ret = RBool; + permitted_states = [ Negotiating ]; + shortdesc = "request the server to enable extended headers"; + longdesc = "\ +Request that the server use extended headers, by sending +C<NBD_OPT_EXTENDED_HEADERS>. This can only be used if +L<nbd_set_opt_mode(3)> enabled option mode; furthermore, libnbd +defaults to automatically requesting this unless you use +L<nbd_set_request_extended_headers(3)> or +L<nbd_set_request_structured_replies(3)> prior to connecting. +This function is mainly useful for integration testing of corner +cases in server handling. + +This function returns true if the server replies with success, +false if the server replies with an error, and fails only if +the server does not reply (such as for a loss of connection). +Note that some servers fail a second request as redundant; +libnbd assumes that once one request has succeeded, then +extended headers are supported (as visible by +L<nbd_get_extended_headers_negotiated(3)>) regardless if +later calls to this function return false. If this function +returns true, the use of structured replies is implied."; + see_also = [Link "set_opt_mode"; Link "aio_opt_extended_headers"; + Link "opt_go"; Link "set_request_extended_headers"; + Link "set_request_structured_replies"] + }; + "opt_structured_reply", { default_call with args = []; ret = RBool; @@ -1345,7 +1378,9 @@ "opt_structured_reply", { libnbd assumes that once one request has succeeded, then structured replies are supported (as visible by L<nbd_get_structured_replies_negotiated(3)>) regardless if -later calls to this function return false."; +later calls to this function return false. Similarly, a +server may fail this request if extended headers are already +negotiated, since extended headers take priority."; see_also = [Link "set_opt_mode"; Link "aio_opt_structured_reply"; Link "opt_go"; Link "set_request_structured_replies"] }; @@ -3146,6 +3181,30 @@ "aio_opt_starttls", { see_also = [Link "set_opt_mode"; Link "opt_starttls"]; }; + "aio_opt_extended_headers", { + default_call with + args = []; + optargs = [ OClosure completion_closure ]; + ret = RErr; + permitted_states = [ Negotiating ]; + shortdesc = "request the server to enable extended headers"; + longdesc = "\ +Request that the server use extended headers, by sending +C<NBD_OPT_EXTENDED_HEADERS>. This behaves like the synchronous +counterpart L<nbd_opt_extended_headers(3)>, except that it does +not wait for the server's response. + +To determine when the request completes, wait for +L<nbd_aio_is_connecting(3)> to return false. Or supply the optional +C<completion_callback> which will be invoked as described in +L<libnbd(3)/Completion callbacks>, except that it is automatically +retired regardless of return value. Note that detecting whether the +server returns an error (as is done by the return value of the +synchronous counterpart) is only possible with a completion +callback."; + see_also = [Link "set_opt_mode"; Link "opt_extended_headers"]; + }; + "aio_opt_structured_reply", { default_call with args = []; @@ -4122,6 +4181,8 @@ let first_version "set_request_extended_headers", (1, 18); "get_request_extended_headers", (1, 18); "get_extended_headers_negotiated", (1, 18); + "opt_extended_headers", (1, 18); + "aio_opt_extended_headers", (1, 18); (* These calls are proposed for a future version of libnbd, but * have not been added to any released version so far. diff --git a/generator/states-newstyle-opt-extended-headers.c b/generator/states-newstyle-opt-extended-headers.c index 1ec25e97..5017a629 100644 --- a/generator/states-newstyle-opt-extended-headers.c +++ b/generator/states-newstyle-opt-extended-headers.c @@ -21,11 +21,14 @@ STATE_MACHINE { NEWSTYLE.OPT_EXTENDED_HEADERS.START: assert (h->gflags & LIBNBD_HANDSHAKE_FLAG_FIXED_NEWSTYLE); - assert (h->opt_current != NBD_OPT_EXTENDED_HEADERS); - assert (CALLBACK_IS_NULL (h->opt_cb.completion)); - if (!h->request_eh || !h->request_sr) { - SET_NEXT_STATE (%^OPT_STRUCTURED_REPLY.START); - return 0; + if (h->opt_current == NBD_OPT_EXTENDED_HEADERS) + assert (h->opt_mode); + else { + assert (CALLBACK_IS_NULL (h->opt_cb.completion)); + if (!h->request_eh || !h->request_sr) { + SET_NEXT_STATE (%^OPT_STRUCTURED_REPLY.START); + return 0; + } } h->sbuf.option.version = htobe64 (NBD_NEW_VERSION); @@ -68,6 +71,7 @@ NEWSTYLE.OPT_EXTENDED_HEADERS.RECV_REPLY_PAYLOAD: NEWSTYLE.OPT_EXTENDED_HEADERS.CHECK_REPLY: uint32_t reply; + int err = ENOTSUP; reply = be32toh (h->sbuf.or.option_reply.reply); switch (reply) { @@ -76,19 +80,31 @@ NEWSTYLE.OPT_EXTENDED_HEADERS.CHECK_REPLY: h->extended_headers = true; /* Extended headers trump structured replies, so skip ahead. */ h->structured_replies = true; + err = 0; break; + case NBD_REP_ERR_INVALID: + err = EINVAL; + /* fallthrough */ default: if (handle_reply_error (h) == -1) { SET_NEXT_STATE (%.DEAD); return 0; } - debug (h, "extended headers are not supported by this server"); + if (h->extended_headers) + debug (h, "extended headers already negotiated"); + else + debug (h, "extended headers are not supported by this server"); break; } /* Next option. */ - SET_NEXT_STATE (%^OPT_STRUCTURED_REPLY.START); + if (h->opt_current == NBD_OPT_EXTENDED_HEADERS) + SET_NEXT_STATE (%.NEGOTIATING); + else + SET_NEXT_STATE (%^OPT_STRUCTURED_REPLY.START); + CALL_CALLBACK (h->opt_cb.completion, &err); + nbd_internal_free_option (h); return 0; } /* END STATE MACHINE */ diff --git a/generator/states-newstyle.c b/generator/states-newstyle.c index ad5bbf72..45893a8b 100644 --- a/generator/states-newstyle.c +++ b/generator/states-newstyle.c @@ -146,6 +146,9 @@ NEWSTYLE.START: case NBD_OPT_STRUCTURED_REPLY: SET_NEXT_STATE (%OPT_STRUCTURED_REPLY.START); return 0; + case NBD_OPT_EXTENDED_HEADERS: + SET_NEXT_STATE (%OPT_EXTENDED_HEADERS.START); + return 0; case NBD_OPT_STARTTLS: SET_NEXT_STATE (%OPT_STARTTLS.START); return 0; diff --git a/lib/opt.c b/lib/opt.c index f58d5e19..d48acdd1 100644 --- a/lib/opt.c +++ b/lib/opt.c @@ -164,6 +164,31 @@ nbd_unlocked_opt_starttls (struct nbd_handle *h) return r; } +/* Issue NBD_OPT_EXTENDED_HEADERS and wait for the reply. */ +int +nbd_unlocked_opt_extended_headers (struct nbd_handle *h) +{ + int err; + nbd_completion_callback c = { .callback = go_complete, .user_data = &err }; + int r = nbd_unlocked_aio_opt_extended_headers (h, &c); + + if (r == -1) + return r; + + r = wait_for_option (h); + if (r == 0) { + if (nbd_internal_is_state_negotiating (get_next_state (h))) + r = err == 0; + else { + assert (nbd_internal_is_state_dead (get_next_state (h))); + set_error (err, + "failed to get response to opt_extended_headers request"); + r = -1; + } + } + return r; +} + /* Issue NBD_OPT_STRUCTURED_REPLY and wait for the reply. */ int nbd_unlocked_opt_structured_reply (struct nbd_handle *h) @@ -386,6 +411,25 @@ nbd_unlocked_aio_opt_starttls (struct nbd_handle *h, #endif } +/* Issue NBD_OPT_EXTENDED_HEADERS without waiting. */ +int +nbd_unlocked_aio_opt_extended_headers (struct nbd_handle *h, + nbd_completion_callback *complete) +{ + if ((h->gflags & LIBNBD_HANDSHAKE_FLAG_FIXED_NEWSTYLE) == 0) { + set_error (ENOTSUP, "server is not using fixed newstyle protocol"); + return -1; + } + + h->opt_current = NBD_OPT_EXTENDED_HEADERS; + h->opt_cb.completion = *complete; + SET_CALLBACK_TO_NULL (*complete); + + if (nbd_internal_run (h, cmd_issue) == -1) + debug (h, "option queued, ignoring state machine failure"); + return 0; +} + /* Issue NBD_OPT_STRUCTURED_REPLY without waiting. */ int nbd_unlocked_aio_opt_structured_reply (struct nbd_handle *h, diff --git a/interop/Makefile.am b/interop/Makefile.am index ec8ea0b2..3f81df0c 100644 --- a/interop/Makefile.am +++ b/interop/Makefile.am @@ -25,6 +25,7 @@ EXTRA_DIST = \ list-exports-test-dir/disk1 \ list-exports-test-dir/disk2 \ structured-read.sh \ + opt-extended-headers.sh \ $(NULL) TESTS_ENVIRONMENT = \ @@ -134,6 +135,7 @@ check_PROGRAMS += \ socket-activation-qemu-nbd \ dirty-bitmap \ structured-read \ + opt-extended-headers \ $(NULL) TESTS += \ interop-qemu-nbd \ @@ -144,6 +146,7 @@ TESTS += \ dirty-bitmap.sh \ structured-read.sh \ interop-qemu-block-size.sh \ + opt-extended-headers.sh \ $(NULL) interop_qemu_nbd_SOURCES = \ @@ -235,6 +238,9 @@ dirty_bitmap_LDADD = $(top_builddir)/lib/libnbd.la structured_read_SOURCES = structured-read.c structured_read_LDADD = $(top_builddir)/lib/libnbd.la +opt_extended_headers_SOURCES = opt-extended-headers.c +opt_extended_headers_LDADD = $(top_builddir)/lib/libnbd.la + endif HAVE_QEMU_NBD #---------------------------------------------------------------------- diff --git a/interop/opt-extended-headers.c b/interop/opt-extended-headers.c new file mode 100644 index 00000000..f50cd78f --- /dev/null +++ b/interop/opt-extended-headers.c @@ -0,0 +1,153 @@ +/* NBD client library in userspace + * Copyright Red Hat + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +/* Demonstrate low-level use of nbd_opt_extended_headers(). */ + +#include <config.h> + +#include <inttypes.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <errno.h> +#include <unistd.h> +#include <sys/stat.h> + +#include <libnbd.h> + +#define check(got, exp) do_check (#got, got, exp) + +static void +do_check (const char *act, int64_t got, int64_t exp) +{ + fprintf (stderr, "trying %s\n", act); + if (got == -1) + fprintf (stderr, "%s\n", nbd_get_error ()); + else + fprintf (stderr, "succeeded, result %" PRId64 "\n", got); + if (got != exp) { + fprintf (stderr, "got %" PRId64 ", but expected %" PRId64 "\n", got, exp); + exit (EXIT_FAILURE); + } +} + +static int +cb (void *data, const char *metacontext, uint64_t offset, + nbd_extent *entries, size_t nr_entries, int *error) +{ + /* If we got here, extents worked, implying at least structured replies */ + bool *seen = data; + + *seen = true; + return 0; +} + +struct nbd_handle * +prep (bool sr, bool eh, char **cmd) +{ + struct nbd_handle *nbd; + + nbd = nbd_create (); + if (nbd == NULL) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + + /* Connect to the server in opt mode, disable client-side failsafes so + * that we are testing server response even when client breaks protocol. + */ + check (nbd_set_opt_mode (nbd, true), 0); + check (nbd_set_strict_mode (nbd, 0), 0); + check (nbd_add_meta_context (nbd, LIBNBD_CONTEXT_BASE_ALLOCATION), 0); + check (nbd_set_request_structured_replies (nbd, sr), 0); + check (nbd_set_request_extended_headers (nbd, eh), 0); + check (nbd_connect_systemd_socket_activation (nbd, cmd), 0); + + return nbd; +} + +void +cleanup (struct nbd_handle *nbd, bool extents_exp) +{ + bool extents = false; + + check (nbd_opt_go (nbd), 0); + check (nbd_can_meta_context (nbd, LIBNBD_CONTEXT_BASE_ALLOCATION), + extents_exp); + check (nbd_block_status_64 (nbd, 512, 0, + (nbd_extent64_callback) { .callback = cb, + .user_data = &extents }, + 0), extents_exp ? 0 : -1); + check (extents, extents_exp); + nbd_close (nbd); +} + +int +main (int argc, char *argv[]) +{ + struct nbd_handle *nbd; + int64_t bytes_sent; + + if (argc < 2) { + fprintf (stderr, "%s qemu-nbd [args ...]\n", argv[0]); + exit (EXIT_FAILURE); + } + + /* Default setup tries eh first, and skips sr request when eh works... */ + nbd = prep (true, true, &argv[1]); + bytes_sent = nbd_stats_bytes_sent (nbd); + check (nbd_get_extended_headers_negotiated (nbd), true); + check (nbd_get_structured_replies_negotiated (nbd), true); + /* Duplicate eh request is no-op as redundant, but does not change state */ + check (nbd_opt_extended_headers (nbd), false); + /* Trying sr after eh is no-op as redundant, but does not change state */ + check (nbd_opt_structured_reply (nbd), false); + check (nbd_get_extended_headers_negotiated (nbd), true); + check (nbd_get_structured_replies_negotiated (nbd), true); + cleanup (nbd, true); + + /* ...which should result in the same amount of initial negotiation + * traffic as explicitly requesting just structured replies, albeit + * with different results on what got negotiated. + */ + nbd = prep (true, false, &argv[1]); + check (nbd_stats_bytes_sent (nbd), bytes_sent); + check (nbd_get_extended_headers_negotiated (nbd), false); + check (nbd_get_structured_replies_negotiated (nbd), true); + cleanup (nbd, true); + + /* request_eh is ignored if request_sr is false. */ + nbd = prep (false, true, &argv[1]); + check (nbd_get_extended_headers_negotiated (nbd), false); + check (nbd_get_structured_replies_negotiated (nbd), false); + cleanup (nbd, false); + + /* Swap order, requesting structured replies before extended headers */ + nbd = prep (false, false, &argv[1]); + check (nbd_get_extended_headers_negotiated (nbd), false); + check (nbd_get_structured_replies_negotiated (nbd), false); + check (nbd_opt_structured_reply (nbd), true); + check (nbd_get_extended_headers_negotiated (nbd), false); + check (nbd_get_structured_replies_negotiated (nbd), true); + check (nbd_opt_extended_headers (nbd), true); + check (nbd_get_extended_headers_negotiated (nbd), true); + check (nbd_get_structured_replies_negotiated (nbd), true); + cleanup (nbd, true); + + exit (EXIT_SUCCESS); +} diff --git a/interop/opt-extended-headers.sh b/interop/opt-extended-headers.sh new file mode 100755 index 00000000..41322f36 --- /dev/null +++ b/interop/opt-extended-headers.sh @@ -0,0 +1,29 @@ +#!/usr/bin/env bash +# nbd client library in userspace +# Copyright Red Hat +# +# This library is free software; you can redistribute it and/or +# modify it under the terms of the GNU Lesser General Public +# License as published by the Free Software Foundation; either +# version 2 of the License, or (at your option) any later version. +# +# This library is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +# Lesser General Public License for more details. +# +# You should have received a copy of the GNU Lesser General Public +# License along with this library; if not, write to the Free Software +# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + +# Test low-level nbd_opt_extended_headers() details with qemu-nbd + +source ../tests/functions.sh +set -e +set -x + +requires qemu-nbd --version +requires nbdinfo --can extended-headers -- [ qemu-nbd -r -f raw "$0" ] + +# Run the test. +$VG ./opt-extended-headers qemu-nbd -r -f raw "$0" diff --git a/.gitignore b/.gitignore index bc7c2c37..24642748 100644 --- a/.gitignore +++ b/.gitignore @@ -118,6 +118,7 @@ Makefile.in /interop/list-exports-nbdkit /interop/list-exports-qemu-nbd /interop/nbd-server-tls.conf +/interop/opt-extended-headers /interop/requires.c /interop/socket-activation-nbdkit /interop/socket-activation-qemu-nbd -- 2.40.1
Eric Blake
2023-May-25 13:01 UTC
[Libguestfs] [libnbd PATCH v3 20/22] interop: Add test of 64-bit block status
Prove that we can round-trip a block status request larger than 4G through a new-enough qemu-nbd. Also serves as a unit test of our shim for converting internal 64-bit representation back to the older 32-bit nbd_block_status callback interface. Signed-off-by: Eric Blake <eblake at redhat.com> --- interop/Makefile.am | 6 ++ interop/large-status.c | 186 ++++++++++++++++++++++++++++++++++++++++ interop/large-status.sh | 49 +++++++++++ .gitignore | 1 + 4 files changed, 242 insertions(+) create mode 100644 interop/large-status.c create mode 100755 interop/large-status.sh diff --git a/interop/Makefile.am b/interop/Makefile.am index 3f81df0c..9a7a5967 100644 --- a/interop/Makefile.am +++ b/interop/Makefile.am @@ -21,6 +21,7 @@ EXTRA_DIST = \ dirty-bitmap.sh \ interop-qemu-storage-daemon.sh \ interop-qemu-block-size.sh \ + large-status.sh \ list-exports-nbd-config \ list-exports-test-dir/disk1 \ list-exports-test-dir/disk2 \ @@ -134,6 +135,7 @@ check_PROGRAMS += \ list-exports-qemu-nbd \ socket-activation-qemu-nbd \ dirty-bitmap \ + large-status \ structured-read \ opt-extended-headers \ $(NULL) @@ -144,6 +146,7 @@ TESTS += \ list-exports-qemu-nbd \ socket-activation-qemu-nbd \ dirty-bitmap.sh \ + large-status.sh \ structured-read.sh \ interop-qemu-block-size.sh \ opt-extended-headers.sh \ @@ -235,6 +238,9 @@ socket_activation_qemu_nbd_LDADD = $(top_builddir)/lib/libnbd.la dirty_bitmap_SOURCES = dirty-bitmap.c dirty_bitmap_LDADD = $(top_builddir)/lib/libnbd.la +large_status_SOURCES = large-status.c +large_status_LDADD = $(top_builddir)/lib/libnbd.la + structured_read_SOURCES = structured-read.c structured_read_LDADD = $(top_builddir)/lib/libnbd.la diff --git a/interop/large-status.c b/interop/large-status.c new file mode 100644 index 00000000..36415653 --- /dev/null +++ b/interop/large-status.c @@ -0,0 +1,186 @@ +/* NBD client library in userspace + * Copyright Red Hat + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +/* Test 64-bit block status with qemu. */ + +#include <config.h> + +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <unistd.h> +#include <assert.h> +#include <stdbool.h> +#include <errno.h> + +#include <libnbd.h> + +static const char *bitmap; + +struct data { + bool req_one; /* input: true if req_one was passed to request */ + int count; /* input: count of expected remaining calls */ + bool seen_base; /* output: true if base:allocation encountered */ + bool seen_dirty; /* output: true if qemu:dirty-bitmap encountered */ +}; + +static int +cb32 (void *opaque, const char *metacontext, uint64_t offset, + uint32_t *entries, size_t len, int *error) +{ + struct data *data = opaque; + + assert (offset == 0); + assert (data->count-- > 0); + + if (strcmp (metacontext, LIBNBD_CONTEXT_BASE_ALLOCATION) == 0) { + assert (!data->seen_base); + data->seen_base = true; + + /* Data block offset 0 size 64k, remainder is hole */ + assert (len == 4); + assert (entries[0] == 65536); + assert (entries[1] == 0); + /* libnbd had to truncate qemu's >4G answer */ + assert (entries[2] == 4227858432); + assert (entries[3] == (LIBNBD_STATE_HOLE|LIBNBD_STATE_ZERO)); + } + else if (strcmp (metacontext, bitmap) == 0) { + assert (!data->seen_dirty); + data->seen_dirty = true; + + /* Dirty block at offset 5G-64k, remainder is clean */ + /* libnbd had to truncate qemu's >4G answer */ + assert (len == 2); + assert (entries[0] == 4227858432); + assert (entries[1] == 0); + } + else { + fprintf (stderr, "unexpected context %s\n", metacontext); + exit (EXIT_FAILURE); + } + return 0; +} + +static int +cb64 (void *opaque, const char *metacontext, uint64_t offset, + nbd_extent *entries, size_t len, int *error) +{ + struct data *data = opaque; + + assert (offset == 0); + assert (data->count-- > 0); + + if (strcmp (metacontext, LIBNBD_CONTEXT_BASE_ALLOCATION) == 0) { + assert (!data->seen_base); + data->seen_base = true; + + /* Data block offset 0 size 64k, remainder is hole */ + assert (len == 2); + assert (entries[0].length == 65536); + assert (entries[0].flags == 0); + assert (entries[1].length == 5368643584ULL); + assert (entries[1].flags == (LIBNBD_STATE_HOLE|LIBNBD_STATE_ZERO)); + } + else if (strcmp (metacontext, bitmap) == 0) { + assert (!data->seen_dirty); + data->seen_dirty = true; + + /* Dirty block at offset 5G-64k, remainder is clean */ + assert (len == 2); + assert (entries[0].length == 5368643584ULL); + assert (entries[0].flags == 0); + assert (entries[1].length == 65536); + assert (entries[1].flags == 1); + } + else { + fprintf (stderr, "unexpected context %s\n", metacontext); + exit (EXIT_FAILURE); + } + return 0; +} + +int +main (int argc, char *argv[]) +{ + struct nbd_handle *nbd; + int64_t exportsize; + struct data data; + + if (argc < 3) { + fprintf (stderr, "%s bitmap qemu-nbd [args ...]\n", argv[0]); + exit (EXIT_FAILURE); + } + bitmap = argv[1]; + + nbd = nbd_create (); + if (nbd == NULL) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + + nbd_add_meta_context (nbd, LIBNBD_CONTEXT_BASE_ALLOCATION); + nbd_add_meta_context (nbd, bitmap); + + if (nbd_connect_systemd_socket_activation (nbd, &argv[2]) == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + + exportsize = nbd_get_size (nbd); + if (exportsize == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + + if (nbd_get_extended_headers_negotiated (nbd) != 1) { + fprintf (stderr, "skipping: qemu-nbd lacks extended headers\n"); + exit (77); + } + + /* Prove that we can round-trip a >4G block status request */ + data = (struct data) { .count = 2, }; + if (nbd_block_status_64 (nbd, exportsize, 0, + (nbd_extent64_callback) { .callback = cb64, + .user_data = &data }, + 0) == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + assert (data.seen_base && data.seen_dirty); + + /* Check libnbd's handling of a >4G response through older interface */ + data = (struct data) { .count = 2, }; + if (nbd_block_status (nbd, exportsize, 0, + (nbd_extent_callback) { .callback = cb32, + .user_data = &data }, + 0) == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + assert (data.seen_base && data.seen_dirty); + + if (nbd_shutdown (nbd, 0) == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + + nbd_close (nbd); + + exit (EXIT_SUCCESS); +} diff --git a/interop/large-status.sh b/interop/large-status.sh new file mode 100755 index 00000000..46810dc3 --- /dev/null +++ b/interop/large-status.sh @@ -0,0 +1,49 @@ +#!/usr/bin/env bash +# nbd client library in userspace +# Copyright Red Hat +# +# This library is free software; you can redistribute it and/or +# modify it under the terms of the GNU Lesser General Public +# License as published by the Free Software Foundation; either +# version 2 of the License, or (at your option) any later version. +# +# This library is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +# Lesser General Public License for more details. +# +# You should have received a copy of the GNU Lesser General Public +# License along with this library; if not, write to the Free Software +# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + +# Test reading qemu dirty-bitmap. + +source ../tests/functions.sh +set -e +set -x + +requires qemu-img bitmap --help +requires qemu-nbd --version + +# This test uses the qemu-nbd -B option. +if ! qemu-nbd --help | grep -sq -- -B; then + echo "$0: skipping because qemu-nbd does not support the -B option" + exit 77 +fi + +files="large-status.qcow2" +rm -f $files +cleanup_fn rm -f $files + +# Create mostly-sparse file with intentionally different data vs. dirty areas +# (64k data, 5G-64k hole,zero; 5G-64k clean, 64k dirty) +qemu-img create -f qcow2 large-status.qcow2 5G +qemu-img bitmap --add --enable -f qcow2 large-status.qcow2 bitmap0 +qemu-io -f qcow2 -c "w -z $((5*1024*1024*1024 - 64*1024)) 64k" \ + large-status.qcow2 +qemu-img bitmap --disable -f qcow2 large-status.qcow2 bitmap0 +qemu-io -f qcow2 -c 'w 0 64k' large-status.qcow2 + +# Run the test. +$VG ./large-status qemu:dirty-bitmap:bitmap0 \ + qemu-nbd -f qcow2 -B bitmap0 large-status.qcow2 diff --git a/.gitignore b/.gitignore index 24642748..fd81357b 100644 --- a/.gitignore +++ b/.gitignore @@ -114,6 +114,7 @@ Makefile.in /interop/interop-qemu-nbd /interop/interop-qemu-nbd-tls-certs /interop/interop-qemu-nbd-tls-psk +/interop/large-status /interop/list-exports-nbd-server /interop/list-exports-nbdkit /interop/list-exports-qemu-nbd -- 2.40.1
Eric Blake
2023-May-25 13:01 UTC
[Libguestfs] [libnbd PATCH v3 21/22] api: Add nbd_can_block_status_payload()
In the recent NBD protocol extensions to add 64-bit commands [1], an additional option was added to allow NBD_CMD_BLOCK_STATUS pass a client payload instructing the server to filter its answers in nbd.git commit e6f3b94a (mainly useful when the client requests more than one meta context with NBD_OPT_SET_META_CONTEXT). This patch lays the groundwork by exposing servers that advertise this capability, although libnbd does not yet actually utilize it until the next patch. At the time this patch was written, qemu-nbd was also patched to provide such support; hence, an interop/ test shows the API in action. [1] https://github.com/NetworkBlockDevice/nbd/blob/extension-ext-header/doc/ Signed-off-by: Eric Blake <eblake at redhat.com> --- info/nbdinfo.pod | 10 ++- lib/nbd-protocol.h | 29 +++++--- generator/API.ml | 18 +++++ lib/flags.c | 12 +++ examples/server-flags.c | 7 +- interop/Makefile.am | 6 ++ interop/block-status-payload.c | 126 ++++++++++++++++++++++++++++++++ interop/block-status-payload.sh | 68 +++++++++++++++++ .gitignore | 1 + info/can.c | 5 ++ info/show.c | 9 ++- 11 files changed, 274 insertions(+), 17 deletions(-) create mode 100644 interop/block-status-payload.c create mode 100755 interop/block-status-payload.sh diff --git a/info/nbdinfo.pod b/info/nbdinfo.pod index 9ea4a278..f5dc53fa 100644 --- a/info/nbdinfo.pod +++ b/info/nbdinfo.pod @@ -178,6 +178,8 @@ rotating disk: accessing nearby blocks may be faster than random access and requests should be sorted to improve performance. Many servers do not or cannot report this accurately. +=item nbdinfo --can block-status-payload URI + =item nbdinfo --can cache URI =item nbdinfo --can df URI @@ -345,10 +347,10 @@ The command does not print anything. Instead it exits with success For further information see the L<NBD protocol|https://github.com/NetworkBlockDevice/nbd/blob/master/doc/proto.md> -and the following libnbd functions: L<nbd_can_cache(3)>, -L<nbd_can_df(3)>, L<nbd_can_fast_zero(3)>, L<nbd_can_flush(3)>, -L<nbd_can_fua(3)>, L<nbd_can_multi_conn(3)>, L<nbd_can_trim(3)>, -L<nbd_can_zero(3)>, L<nbd_is_read_only(3)>, +and the following libnbd functions: L<nbd_can_block_status_payload(3)>, +L<nbd_can_cache(3)>, L<nbd_can_df(3)>, L<nbd_can_fast_zero(3)>, +L<nbd_can_flush(3)>, L<nbd_can_fua(3)>, L<nbd_can_multi_conn(3)>, +L<nbd_can_trim(3)>, L<nbd_can_zero(3)>, L<nbd_is_read_only(3)>, L<nbd_get_structured_replies_negotiated(3)>, L<nbd_get_extended_headers_negotiated(3)>. diff --git a/lib/nbd-protocol.h b/lib/nbd-protocol.h index b6fa9b8a..9e358122 100644 --- a/lib/nbd-protocol.h +++ b/lib/nbd-protocol.h @@ -102,17 +102,18 @@ struct nbd_fixed_new_option_reply { #define NBD_FLAG_NO_ZEROES (1 << 1) /* Per-export flags. */ -#define NBD_FLAG_HAS_FLAGS (1 << 0) -#define NBD_FLAG_READ_ONLY (1 << 1) -#define NBD_FLAG_SEND_FLUSH (1 << 2) -#define NBD_FLAG_SEND_FUA (1 << 3) -#define NBD_FLAG_ROTATIONAL (1 << 4) -#define NBD_FLAG_SEND_TRIM (1 << 5) -#define NBD_FLAG_SEND_WRITE_ZEROES (1 << 6) -#define NBD_FLAG_SEND_DF (1 << 7) -#define NBD_FLAG_CAN_MULTI_CONN (1 << 8) -#define NBD_FLAG_SEND_CACHE (1 << 10) -#define NBD_FLAG_SEND_FAST_ZERO (1 << 11) +#define NBD_FLAG_HAS_FLAGS (1 << 0) +#define NBD_FLAG_READ_ONLY (1 << 1) +#define NBD_FLAG_SEND_FLUSH (1 << 2) +#define NBD_FLAG_SEND_FUA (1 << 3) +#define NBD_FLAG_ROTATIONAL (1 << 4) +#define NBD_FLAG_SEND_TRIM (1 << 5) +#define NBD_FLAG_SEND_WRITE_ZEROES (1 << 6) +#define NBD_FLAG_SEND_DF (1 << 7) +#define NBD_FLAG_CAN_MULTI_CONN (1 << 8) +#define NBD_FLAG_SEND_CACHE (1 << 10) +#define NBD_FLAG_SEND_FAST_ZERO (1 << 11) +#define NBD_FLAG_BLOCK_STATUS_PAYLOAD (1 << 12) /* NBD options (new style handshake only). */ #define NBD_OPT_EXPORT_NAME 1 @@ -204,6 +205,12 @@ struct nbd_request_ext { uint64_t count; /* Request effect or payload length. */ } NBD_ATTRIBUTE_PACKED; +/* Extended request payload for NBD_CMD_BLOCK_STATUS, when supported. */ +struct nbd_block_status_payload { + uint64_t length; /* Effective length of client request */ + /* followed by array of uint32_t ids */ +} NBD_ATTRIBUTE_PACKED; + /* Simple reply (server -> client). */ struct nbd_simple_reply { uint32_t magic; /* NBD_SIMPLE_REPLY_MAGIC. */ diff --git a/generator/API.ml b/generator/API.ml index 85625bbd..5a31ce3b 100644 --- a/generator/API.ml +++ b/generator/API.ml @@ -2327,6 +2327,23 @@ "can_fast_zero", { example = Some "examples/server-flags.c"; }; + "can_block_status_payload", { + default_call with + args = []; ret = RBool; + permitted_states = [ Negotiating; Connected; Closed ]; + shortdesc = "does the server support the block status payload flag?"; + longdesc = "\ +Returns true if the server supports the use of the +C<LIBNBD_CMD_FLAG_PAYLOAD_LEN> flag to allow filtering of the +block status command. Returns +false if the server does not. Note that this will never return +true if L<nbd_get_extended_headers_negotiated(3)> is false." +^ non_blocking_test_call_description; + see_also = [SectionLink "Flag calls"; Link "opt_info"; + Link "get_extended_headers_negotiated"]; + example = Some "examples/server-flags.c"; + }; + "can_df", { default_call with args = []; ret = RBool; @@ -4183,6 +4200,7 @@ let first_version "get_extended_headers_negotiated", (1, 18); "opt_extended_headers", (1, 18); "aio_opt_extended_headers", (1, 18); + "can_block_status_payload", (1, 18); (* These calls are proposed for a future version of libnbd, but * have not been added to any released version so far. diff --git a/lib/flags.c b/lib/flags.c index be880acf..7e6ddedd 100644 --- a/lib/flags.c +++ b/lib/flags.c @@ -66,6 +66,12 @@ nbd_internal_set_size_and_flags (struct nbd_handle *h, eflags &= ~NBD_FLAG_SEND_DF; } + if (eflags & NBD_FLAG_BLOCK_STATUS_PAYLOAD && !h->extended_headers) { + debug (h, "server lacks extended headers, ignoring claim " + "of block status payload"); + eflags &= ~NBD_FLAG_BLOCK_STATUS_PAYLOAD; + } + if (eflags & NBD_FLAG_SEND_FAST_ZERO && !(eflags & NBD_FLAG_SEND_WRITE_ZEROES)) { debug (h, "server lacks write zeroes, ignoring claim of fast zero"); @@ -213,6 +219,12 @@ nbd_unlocked_can_cache (struct nbd_handle *h) return get_flag (h, NBD_FLAG_SEND_CACHE); } +int +nbd_unlocked_can_block_status_payload (struct nbd_handle *h) +{ + return get_flag (h, NBD_FLAG_BLOCK_STATUS_PAYLOAD); +} + int nbd_unlocked_can_meta_context (struct nbd_handle *h, const char *name) { diff --git a/examples/server-flags.c b/examples/server-flags.c index d156aced..f53b86ed 100644 --- a/examples/server-flags.c +++ b/examples/server-flags.c @@ -78,8 +78,13 @@ main (int argc, char *argv[]) PRINT_FLAG (nbd_can_multi_conn); PRINT_FLAG (nbd_can_trim); PRINT_FLAG (nbd_can_zero); -#if LIBNBD_HAVE_NBD_CAN_FAST_ZERO /* Added in 1.2 */ +#if LIBNBD_HAVE_NBD_CAN_FAST_ZERO + /* Added in 1.2 */ PRINT_FLAG (nbd_can_fast_zero); +#endif +#if LIBNBD_HAVE_NBD_CAN_BLOCK_STATUS_PAYLOAD + /* Added in 1.18 */ + PRINT_FLAG (nbd_can_block_status_payload); #endif PRINT_FLAG (nbd_is_read_only); PRINT_FLAG (nbd_is_rotational); diff --git a/interop/Makefile.am b/interop/Makefile.am index 9a7a5967..f8c4cb7d 100644 --- a/interop/Makefile.am +++ b/interop/Makefile.am @@ -27,6 +27,7 @@ EXTRA_DIST = \ list-exports-test-dir/disk2 \ structured-read.sh \ opt-extended-headers.sh \ + block-status-payload.sh \ $(NULL) TESTS_ENVIRONMENT = \ @@ -138,6 +139,7 @@ check_PROGRAMS += \ large-status \ structured-read \ opt-extended-headers \ + block-status-payload \ $(NULL) TESTS += \ interop-qemu-nbd \ @@ -150,6 +152,7 @@ TESTS += \ structured-read.sh \ interop-qemu-block-size.sh \ opt-extended-headers.sh \ + block-status-payload.sh \ $(NULL) interop_qemu_nbd_SOURCES = \ @@ -247,6 +250,9 @@ structured_read_LDADD = $(top_builddir)/lib/libnbd.la opt_extended_headers_SOURCES = opt-extended-headers.c opt_extended_headers_LDADD = $(top_builddir)/lib/libnbd.la +block_status_payload_SOURCES = block-status-payload.c +block_status_payload_LDADD = $(top_builddir)/lib/libnbd.la + endif HAVE_QEMU_NBD #---------------------------------------------------------------------- diff --git a/interop/block-status-payload.c b/interop/block-status-payload.c new file mode 100644 index 00000000..9603dfe5 --- /dev/null +++ b/interop/block-status-payload.c @@ -0,0 +1,126 @@ +/* NBD client library in userspace + * Copyright Red Hat + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +/* Test interaction with qemu using block status payload filtering. */ + +#include <config.h> + +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <unistd.h> +#include <assert.h> +#include <stdbool.h> +#include <errno.h> + +#include <libnbd.h> + +#include "array-size.h" + +static const char *contexts[] = { + "base:allocation", + "qemu:allocation-depth", + "qemu:dirty-bitmap:bitmap0", + "qemu:dirty-bitmap:bitmap1", +}; + +static int +cb (void *opaque, const char *metacontext, uint64_t offset, + nbd_extent *entries, size_t len, int *error) +{ + /* Adjust seen according to which context was visited */ + unsigned int *seen = opaque; + size_t i; + + for (i = 0; i < ARRAY_SIZE (contexts); i++) + if (strcmp (contexts[i], metacontext) == 0) + break; + *seen |= 1 << i; + return 0; +} + +int +main (int argc, char *argv[]) +{ + struct nbd_handle *nbd; + int64_t exportsize; + unsigned int seen; + size_t i; + int r; + + if (argc < 2) { + fprintf (stderr, "%s qemu-nbd [args ...]\n", argv[0]); + exit (EXIT_FAILURE); + } + + nbd = nbd_create (); + if (nbd == NULL) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + + assert (ARRAY_SIZE (contexts) == 4); + for (i = 0; i < ARRAY_SIZE (contexts); i++) { + if (nbd_add_meta_context (nbd, contexts[i]) == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + } + + if (nbd_connect_systemd_socket_activation (nbd, &argv[1]) == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + + r = nbd_can_block_status_payload (nbd); + if (r == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + if (r != 1) { + fprintf (stderr, "expecting block status payload support from qemu\n"); + exit (EXIT_FAILURE); + } + + exportsize = nbd_get_size (nbd); + if (exportsize == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + + /* An unfiltered call should see all four contexts */ + seen = 0; + if (nbd_block_status_64 (nbd, exportsize, 0, + (nbd_extent64_callback) { .callback = cb, + .user_data = &seen }, + 0) == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + assert (seen == 0xf); + + /* FIXME: Test filtered calls once the API is added */ + if (nbd_shutdown (nbd, 0) == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + + nbd_close (nbd); + + exit (EXIT_SUCCESS); +} diff --git a/interop/block-status-payload.sh b/interop/block-status-payload.sh new file mode 100755 index 00000000..a12cfc8a --- /dev/null +++ b/interop/block-status-payload.sh @@ -0,0 +1,68 @@ +#!/usr/bin/env bash +# nbd client library in userspace +# Copyright Red Hat +# +# This library is free software; you can redistribute it and/or +# modify it under the terms of the GNU Lesser General Public +# License as published by the Free Software Foundation; either +# version 2 of the License, or (at your option) any later version. +# +# This library is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +# Lesser General Public License for more details. +# +# You should have received a copy of the GNU Lesser General Public +# License along with this library; if not, write to the Free Software +# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + +# Test use of block status payload for server filtering + +source ../tests/functions.sh +set -e +set -x + +requires qemu-img bitmap --help +# This test uses the qemu-nbd -A and -B options. +requires qemu-nbd -A -BA --version + +file="block-status-payload.qcow2" +rm -f $file +cleanup_fn rm -f $file + +# Create sparse file with two bitmaps. +qemu-img create -f qcow2 $file 1M +qemu-img bitmap --add --enable -f qcow2 $file bitmap0 +qemu-img bitmap --add --enable -f qcow2 $file bitmap1 + +# Unconditional part of test: qemu should not advertise block status payload +# support if extended headers are not in use +nbdsh -c ' +h.set_request_extended_headers(False) +h.add_meta_context("base:allocation") +h.add_meta_context("qemu:allocation-depth") +h.add_meta_context("qemu:dirty-bitmap:bitmap0") +h.add_meta_context("qemu:dirty-bitmap:bitmap1") +h.set_opt_mode(True) +args = ["qemu-nbd", "-f", "qcow2", "-A", "-B", "bitmap0", "-B", "bitmap1", + "'"$file"'"] +h.connect_systemd_socket_activation(args) +assert h.aio_is_negotiating() is True +assert h.get_extended_headers_negotiated() is False +# Flag not available until info or go +try: + h.can_block_status_payload() + assert False +except nbd.Error: + pass +h.opt_info() +assert h.can_block_status_payload() is False +assert h.can_meta_context("base:allocation") is True +h.opt_abort() +' + +# Conditional part of test: if qemu is new enough to support extended +# headers, we assume it can also support block status payload. +requires nbdinfo --can extended-headers -- [ qemu-nbd -r -f qcow2 "$file" ] +$VG ./block-status-payload \ + qemu-nbd -f qcow2 -A -B bitmap0 -B bitmap1 $file diff --git a/.gitignore b/.gitignore index fd81357b..a2d052bd 100644 --- a/.gitignore +++ b/.gitignore @@ -101,6 +101,7 @@ Makefile.in /info/nbdinfo /info/nbdinfo.1 /install-sh +/interop/block-status-payload /interop/dirty-bitmap /interop/interop-nbd-server /interop/interop-nbd-server-tls diff --git a/info/can.c b/info/can.c index 31c4a1ca..6dd68eeb 100644 --- a/info/can.c +++ b/info/can.c @@ -72,6 +72,11 @@ do_can (void) else if (strcasecmp (can, "rotational") == 0) feature = nbd_is_rotational (nbd); + else if (strcasecmp (can, "block status payload") == 0 || + strcasecmp (can, "block-status-payload") == 0 || + strcasecmp (can, "block_status_payload") == 0) + feature = nbd_can_block_status_payload (nbd); + else if (strcasecmp (can, "cache") == 0) feature = nbd_can_cache (nbd); diff --git a/info/show.c b/info/show.c index 920bbb0a..8914f927 100644 --- a/info/show.c +++ b/info/show.c @@ -54,7 +54,7 @@ show_one_export (struct nbd_handle *nbd, const char *desc, char *uri = NULL; int is_rotational, is_read_only; int can_cache, can_df, can_fast_zero, can_flush, can_fua, - can_multi_conn, can_trim, can_zero; + can_multi_conn, can_trim, can_zero, can_block_status_payload; int64_t block_minimum, block_preferred, block_maximum; string_vector contexts = empty_vector; bool show_context = false; @@ -120,6 +120,7 @@ show_one_export (struct nbd_handle *nbd, const char *desc, can_multi_conn = nbd_can_multi_conn (nbd); can_trim = nbd_can_trim (nbd); can_zero = nbd_can_zero (nbd); + can_block_status_payload = nbd_can_block_status_payload (nbd); block_minimum = nbd_get_block_size (nbd, LIBNBD_SIZE_MINIMUM); block_preferred = nbd_get_block_size (nbd, LIBNBD_SIZE_PREFERRED); block_maximum = nbd_get_block_size (nbd, LIBNBD_SIZE_MAXIMUM); @@ -161,6 +162,8 @@ show_one_export (struct nbd_handle *nbd, const char *desc, if (is_read_only >= 0) fprintf (fp, "\t%s: %s\n", "is_read_only", is_read_only ? "true" : "false"); + if (can_block_status_payload >= 0) + show_boolean ("can_block_status_payload", can_block_status_payload); if (can_cache >= 0) show_boolean ("can_cache", can_cache); if (can_df >= 0) @@ -230,6 +233,10 @@ show_one_export (struct nbd_handle *nbd, const char *desc, if (is_read_only >= 0) fprintf (fp, "\t\"%s\": %s,\n", "is_read_only", is_read_only ? "true" : "false"); + if (can_block_status_payload >= 0) + fprintf (fp, "\t\"%s\": %s,\n", + "can_block_status_payload", + can_block_status_payload ? "true" : "false"); if (can_cache >= 0) fprintf (fp, "\t\"%s\": %s,\n", "can_cache", can_cache ? "true" : "false"); -- 2.40.1
Eric Blake
2023-May-25 13:01 UTC
[Libguestfs] [libnbd PATCH v3 22/22] api: Add nbd_[aio_]block_status_filter()
As part of extending NBD to support 64-bit lengths, the protocol also added an option for servers to allow clients to request filtered responses to NBD_CMD_BLOCK_STATUS when more than one meta-context is negotiated (see NBD commit e6f3b94a). At the same time as this patch, qemu-nbd was taught to support and advertise this feature as a server, but does not utilize it as a client (qemu doesn't yet need to connect to multiple contexts at once). Thus, addding generic client support and enhancing the interop/ test in libnbd is needed to prove that the feature is viable and worth standardizing. --- lib/internal.h | 5 +- generator/API.ml | 71 +++++++++++++++-- generator/states-issue-command.c | 4 +- lib/aio.c | 7 +- lib/rw.c | 127 ++++++++++++++++++++++++++++++- interop/block-status-payload.c | 117 +++++++++++++++++++++++++++- interop/block-status-payload.sh | 14 +++- info/info-can.sh | 3 + 8 files changed, 336 insertions(+), 12 deletions(-) diff --git a/lib/internal.h b/lib/internal.h index 2948b77b..64921de9 100644 --- a/lib/internal.h +++ b/lib/internal.h @@ -73,6 +73,8 @@ struct meta_context { }; DEFINE_VECTOR_TYPE (meta_vector, struct meta_context); +DEFINE_VECTOR_TYPE(uint32_vector, uint32_t); + struct export { char *name; char *description; @@ -380,7 +382,8 @@ struct command { uint64_t cookie; uint64_t offset; uint64_t count; - void *data; /* Buffer for read/write */ + void *data; /* Buffer for read/write, uint32_vector* for status payload */ + uint32_vector *ids; /* For block status with payload */ struct command_cb cb; bool initialized; /* For read, true if getting a hole may skip memset */ uint32_t data_seen; /* For read, cumulative size of data chunks seen */ diff --git a/generator/API.ml b/generator/API.ml index 5a31ce3b..a26ed1da 100644 --- a/generator/API.ml +++ b/generator/API.ml @@ -2335,12 +2335,13 @@ "can_block_status_payload", { longdesc = "\ Returns true if the server supports the use of the C<LIBNBD_CMD_FLAG_PAYLOAD_LEN> flag to allow filtering of the -block status command. Returns +block status command (see L<nbd_block_status_filter(3)>). Returns false if the server does not. Note that this will never return true if L<nbd_get_extended_headers_negotiated(3)> is false." ^ non_blocking_test_call_description; see_also = [SectionLink "Flag calls"; Link "opt_info"; - Link "get_extended_headers_negotiated"]; + Link "get_extended_headers_negotiated"; + Link "block_status_filter"]; example = Some "examples/server-flags.c"; }; @@ -2409,6 +2410,10 @@ "can_meta_context", { meta contexts were requested but there is a missing or failed attempt at NBD_OPT_SET_META_CONTEXT during option negotiation. +If the server supports block status filtering (see +L<nbd_can_block_status_payload(3)>, this function must return +true for any filter name passed to L<nbd_block_status_filter(3)>. + The single parameter is the name of the metadata context, for example C<LIBNBD_CONTEXT_BASE_ALLOCATION>. B<E<lt>libnbd.hE<gt>> includes defined constants for well-known @@ -2941,9 +2946,12 @@ "block_status_64", { information about blocks beginning from the specified offset to be returned. The C<count> parameter is a hint: the server may choose to return less status, or the final block -may extend beyond the requested range. If multiple contexts +may extend beyond the requested range. When multiple contexts are supported, the number of blocks and cumulative length -of those blocks need not be identical between contexts. +of those blocks need not be identical between contexts; this +command generally returns the status of all negotiated contexts, +while some servers also support a filtered request (see +L<nbd_can_block_status_payload(3)>, L<nbd_block_status_filter(3)>). Note that not all servers can support a C<count> of 4GiB or larger; L<nbd_get_extended_headers_negotiated(3)> indicates which servers @@ -2993,11 +3001,38 @@ "block_status_64", { does not exceed C<count> bytes; however, libnbd does not validate that the server obeyed the flag." ^ strict_call_description; - see_also = [Link "block_status"; + see_also = [Link "block_status"; Link "block_status_filter"; Link "add_meta_context"; Link "can_meta_context"; Link "aio_block_status_64"; Link "set_strict_mode"]; }; + "block_status_filter", { + default_call with + args = [ UInt64 "count"; UInt64 "offset"; StringList "contexts"; + Closure extent64_closure ]; + optargs = [ OFlags ("flags", cmd_flags, Some ["REQ_ONE"; "PAYLOAD_LEN"]) ]; + ret = RErr; + permitted_states = [ Connected ]; + shortdesc = "send filtered block status command, with 64-bit callback"; + longdesc = "\ +Issue a filtered block status command to the NBD server. If +supported by the server (see L<nbd_can_block_status_payload(3)>), +this causes metadata context information about blocks beginning +from the specified offset to be returned, and with the result +limited to just the contexts specified in C<filter>. Note that +all strings in C<filter> must be supported by +L<nbd_can_meta_context(3)>. + +All other parameters to this function have the same semantics +as in L<nbd_block_status_64(3)>; except that for convenience, +the C<flags> parameter may additionally contain or omit +C<LIBNBD_CMD_FLAG_PAYLOAD_LEN>." +^ strict_call_description; + see_also = [Link "block_status_64"; + Link "can_block_status_payload"; Link "can_meta_context"; + Link "aio_block_status_filter"; Link "set_strict_mode"]; + }; + "poll", { default_call with args = [ Int "timeout" ]; ret = RInt; @@ -3667,6 +3702,30 @@ "aio_block_status_64", { Link "set_strict_mode"]; }; + "aio_block_status_filter", { + default_call with + args = [ UInt64 "count"; UInt64 "offset"; StringList "contexts"; + Closure extent64_closure ]; + optargs = [ OClosure completion_closure; + OFlags ("flags", cmd_flags, Some ["REQ_ONE"; "PAYLOAD_LEN"]) ]; + ret = RCookie; + permitted_states = [ Connected ]; + shortdesc = "send filtered block status command to the NBD server"; + longdesc = "\ +Send a filtered block status command to the NBD server. + +To check if the command completed, call L<nbd_aio_command_completed(3)>. +Or supply the optional C<completion_callback> which will be invoked +as described in L<libnbd(3)/Completion callbacks>. + +Other parameters behave as documented in L<nbd_block_status_filter(3)>." +^ strict_call_description; + see_also = [SectionLink "Issuing asynchronous commands"; + Link "aio_block_status_64"; Link "block_status_filter"; + Link "can_meta_context"; Link "can_block_status_payload"; + Link "set_strict_mode"]; + }; + "aio_get_fd", { default_call with args = []; ret = RFd; @@ -4201,6 +4260,8 @@ let first_version "opt_extended_headers", (1, 18); "aio_opt_extended_headers", (1, 18); "can_block_status_payload", (1, 18); + "block_status_filter", (1, 18); + "aio_block_status_filter", (1, 18); (* These calls are proposed for a future version of libnbd, but * have not been added to any released version so far. diff --git a/generator/states-issue-command.c b/generator/states-issue-command.c index 79136b61..5307731d 100644 --- a/generator/states-issue-command.c +++ b/generator/states-issue-command.c @@ -84,7 +84,9 @@ ISSUE_COMMAND.PREPARE_WRITE_PAYLOAD: assert (h->cmds_to_issue != NULL); cmd = h->cmds_to_issue; assert (cmd->cookie == be64toh (h->req.compact.handle)); - if (cmd->type == NBD_CMD_WRITE) { + if (cmd->type == NBD_CMD_WRITE || + (h->extended_headers && cmd->type == NBD_CMD_BLOCK_STATUS && + cmd->flags & NBD_CMD_FLAG_PAYLOAD_LEN)) { h->wbuf = cmd->data; h->wlen = cmd->count; if (cmd->next && cmd->count < 64 * 1024) diff --git a/lib/aio.c b/lib/aio.c index a419ac32..77b20c32 100644 --- a/lib/aio.c +++ b/lib/aio.c @@ -32,8 +32,13 @@ void nbd_internal_retire_and_free_command (struct command *cmd) { /* Free the callbacks. */ - if (cmd->type == NBD_CMD_BLOCK_STATUS) + if (cmd->type == NBD_CMD_BLOCK_STATUS) { + if (cmd->ids) { + uint32_vector_reset (cmd->ids); + free (cmd->ids); + } FREE_CALLBACK (cmd->cb.fn.extent); + } if (cmd->type == NBD_CMD_READ) FREE_CALLBACK (cmd->cb.fn.chunk); FREE_CALLBACK (cmd->cb.completion); diff --git a/lib/rw.c b/lib/rw.c index bea55fa1..db6bc0bc 100644 --- a/lib/rw.c +++ b/lib/rw.c @@ -242,6 +242,26 @@ nbd_unlocked_block_status_64 (struct nbd_handle *h, return wait_for_command (h, cookie); } +/* Issue a filtered block status command and wait for the reply. */ +int +nbd_unlocked_block_status_filter (struct nbd_handle *h, + uint64_t count, uint64_t offset, + char **filter, + nbd_extent64_callback *extent64, + uint32_t flags) +{ + int64_t cookie; + nbd_completion_callback c = NBD_NULL_COMPLETION; + + cookie = nbd_unlocked_aio_block_status_filter (h, count, offset, filter, + extent64, &c, flags); + if (cookie == -1) + return -1; + + assert (CALLBACK_IS_NULL (*extent64)); + return wait_for_command (h, cookie); +} + /* count_err represents the errno to return if bounds check fail */ int64_t nbd_internal_command_common (struct nbd_handle *h, @@ -250,6 +270,7 @@ nbd_internal_command_common (struct nbd_handle *h, void *data, struct command_cb *cb) { struct command *cmd; + uint32_vector *ids = NULL; if (h->disconnect_request) { set_error (EINVAL, "cannot request more commands after NBD_CMD_DISC"); @@ -297,10 +318,23 @@ nbd_internal_command_common (struct nbd_handle *h, } break; + case NBD_CMD_BLOCK_STATUS: + if (data) { + ids = data; + count = ids->len * sizeof (uint32_t); + data = ids->ptr; + if (count > MAX_REQUEST_SIZE || + (h->strict & LIBNBD_STRICT_PAYLOAD && count > h->payload_maximum)) { + set_error (ERANGE, "filter set too large"); + goto err; + } + break; + } + /* fallthrough */ + default: /* Other commands are limited by the 32 bit field in the command * structure on the wire, unless extended headers were negotiated. */ - default: if (!h->extended_headers && count > UINT32_MAX) { set_error (ERANGE, "request too large: maximum request size is %" PRIu32, UINT32_MAX); @@ -320,6 +354,7 @@ nbd_internal_command_common (struct nbd_handle *h, cmd->offset = offset; cmd->count = count; cmd->data = data; + cmd->ids = ids; if (cb) cmd->cb = *cb; @@ -364,8 +399,13 @@ nbd_internal_command_common (struct nbd_handle *h, err: /* Since we did not queue the command, we must free the callbacks. */ if (cb) { - if (type == NBD_CMD_BLOCK_STATUS) + if (type == NBD_CMD_BLOCK_STATUS) { + if (ids) { + uint32_vector_reset (ids); + free (ids); + } FREE_CALLBACK (cb->fn.extent); + } if (type == NBD_CMD_READ) FREE_CALLBACK (cb->fn.chunk); FREE_CALLBACK (cb->completion); @@ -609,3 +649,86 @@ nbd_unlocked_aio_block_status_64 (struct nbd_handle *h, return nbd_internal_command_common (h, flags, NBD_CMD_BLOCK_STATUS, offset, count, EINVAL, NULL, &cb); } + +int64_t +nbd_unlocked_aio_block_status_filter (struct nbd_handle *h, + uint64_t count, uint64_t offset, + char **filter, + nbd_extent64_callback *extent64, + nbd_completion_callback *completion, + uint32_t flags) +{ + struct command_cb cb = { .fn.extent = *extent64, + .completion = *completion }; + uint32_vector *ids; + char *name; + size_t i; + + /* Because this affects wire format, it is more convenient to manage + * PAYLOAD_LEN by what was negotiated than to require the user to + * have to set it correctly. + */ + if (!h->extended_headers) { + set_error (ENOTSUP, "server does not support extended headers"); + return -1; + } + flags |= LIBNBD_CMD_FLAG_PAYLOAD_LEN; + + if (h->strict & LIBNBD_STRICT_COMMANDS) { + if (nbd_unlocked_can_block_status_payload (h) != 1) { + set_error (EINVAL, + "server does not support the block status payload flag"); + return -1; + } + + if (!h->meta_valid || h->meta_contexts.len == 0) { + set_error (ENOTSUP, "did not negotiate any metadata contexts, " + "either you did not call nbd_add_meta_context before " + "connecting or the server does not support it"); + return -1; + } + } + + ids = calloc (1, sizeof *ids); + if (ids == NULL) { + set_error (errno, "calloc"); + return -1; + } + if (uint32_vector_append (ids, htobe32 (count >> 32)) == -1 || + uint32_vector_append (ids, htobe32 (count)) == -1) { + set_error (errno, "realloc"); + goto fail; + } + + /* O(n^2) search - hopefully filter and negotiated contexts are both small */ + for ( ; (name = *filter) != NULL; filter++) { + if (!h->meta_valid) { + set_error (EINVAL, "context %s not negotiated", name); + goto fail; + } + for (i = 0; i < h->meta_contexts.len; i++) { + struct meta_context *meta = &h->meta_contexts.ptr[i]; + if (strcmp (name, meta->name) == 0) { + if (uint32_vector_append (ids, htobe32 (meta->context_id)) == -1) { + set_error (errno, "realloc"); + goto fail; + } + break; + } + } + if (i == h->meta_contexts.len) { + set_error (EINVAL, "context %s not negotiated", name); + goto fail; + } + } + + SET_CALLBACK_TO_NULL (*extent64); + SET_CALLBACK_TO_NULL (*completion); + return nbd_internal_command_common (h, flags, NBD_CMD_BLOCK_STATUS, offset, + count, EINVAL, ids, &cb); + + fail: + uint32_vector_reset (ids); + free (ids); + return -1; +} diff --git a/interop/block-status-payload.c b/interop/block-status-payload.c index 9603dfe5..704b25aa 100644 --- a/interop/block-status-payload.c +++ b/interop/block-status-payload.c @@ -54,11 +54,26 @@ cb (void *opaque, const char *metacontext, uint64_t offset, return 0; } +static char ** +list (unsigned int use) +{ + static const char *array[ARRAY_SIZE (contexts) + 1]; + size_t i, j; + + assert (use < 1 << ARRAY_SIZE (contexts)); + for (i = j = 0; i < ARRAY_SIZE (contexts); i++) + if (use & (1 << i)) + array[j++] = contexts[i]; + array[j] = NULL; + return (char **) array; +} + int main (int argc, char *argv[]) { struct nbd_handle *nbd; int64_t exportsize; + uint64_t bytes_sent; unsigned int seen; size_t i; int r; @@ -114,7 +129,107 @@ main (int argc, char *argv[]) } assert (seen == 0xf); - /* FIXME: Test filtered calls once the API is added */ + /* Filtering with all contexts listed, same effect as unfilitered call */ + seen = 0; + if (nbd_block_status_filter (nbd, exportsize, 0, list (0xf), + (nbd_extent64_callback) { .callback = cb, + .user_data = &seen }, + 0) == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + assert (seen == 0xf); + + /* Filtering with just two out of four contexts; test optional flag */ + seen = 0; + if (nbd_block_status_filter (nbd, exportsize, 0, list (0x5), + (nbd_extent64_callback) { .callback = cb, + .user_data = &seen }, + LIBNBD_CMD_FLAG_PAYLOAD_LEN) == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + assert (seen == 0x5); + + /* Filtering with one context, near end of file (to make sure the + * payload length isn't confused with the effect length) + */ + seen = 0; + if (nbd_block_status_filter (nbd, 1, exportsize - 1, list (0x2), + (nbd_extent64_callback) { .callback = cb, + .user_data = &seen }, + 0) == -1) { + fprintf (stderr, "%s\n", nbd_get_error ()); + exit (EXIT_FAILURE); + } + assert (seen == 0x2); + + /* Filtering with no contexts - pointless, so qemu rejects it */ + bytes_sent = nbd_stats_bytes_sent (nbd); + seen = 0; + if (nbd_block_status_filter (nbd, exportsize, 0, list (0x0), + (nbd_extent64_callback) { .callback = cb, + .user_data = &seen }, + 0) != -1) { + fprintf (stderr, "expecting block status failure\n"); + exit (EXIT_FAILURE); + } + assert (seen == 0x0); + if (nbd_get_errno () != EINVAL) { + fprintf (stderr, "expecting EINVAL after block status failure\n"); + exit (EXIT_FAILURE); + } + if (nbd_stats_bytes_sent (nbd) <= bytes_sent) { + fprintf (stderr, "expecting server-side rejection of bad request\n"); + exit (EXIT_FAILURE); + } + + /* Giving unknown string triggers EINVAL from libnbd */ + bytes_sent = nbd_stats_bytes_sent (nbd); + seen = 0; + { + const char *bogus[] = { "qemu:dirty-bitmap:bitmap2", NULL }; + if (nbd_block_status_filter (nbd, exportsize, 0, (char **) bogus, + (nbd_extent64_callback) { .callback = cb, + .user_data = &seen }, + 0) != -1) { + fprintf (stderr, "expecting block status failure\n"); + exit (EXIT_FAILURE); + } + } + if (nbd_get_errno () != EINVAL) { + fprintf (stderr, "expecting EINVAL after block status failure\n"); + exit (EXIT_FAILURE); + } + assert (seen == 0x0); + if (nbd_stats_bytes_sent (nbd) != bytes_sent) { + fprintf (stderr, "expecting client-side rejection of bad request\n"); + exit (EXIT_FAILURE); + } + + /* Giving same string twice triggers EINVAL from qemu */ + seen = 0; + { + const char *dupes[] = { "base:allocation", "base:allocation", NULL }; + if (nbd_block_status_filter (nbd, exportsize, 0, (char **) dupes, + (nbd_extent64_callback) { .callback = cb, + .user_data = &seen }, + 0) != -1) { + fprintf (stderr, "expecting block status failure\n"); + exit (EXIT_FAILURE); + } + } + if (nbd_get_errno () != EINVAL) { + fprintf (stderr, "expecting EINVAL after block status failure\n"); + exit (EXIT_FAILURE); + } + assert (seen == 0x0); + if (nbd_stats_bytes_sent (nbd) <= bytes_sent) { + fprintf (stderr, "expecting server-side rejection of bad request\n"); + exit (EXIT_FAILURE); + } + + /* Done */ if (nbd_shutdown (nbd, 0) == -1) { fprintf (stderr, "%s\n", nbd_get_error ()); exit (EXIT_FAILURE); diff --git a/interop/block-status-payload.sh b/interop/block-status-payload.sh index a12cfc8a..0e6681b6 100755 --- a/interop/block-status-payload.sh +++ b/interop/block-status-payload.sh @@ -49,6 +49,7 @@ args = ["qemu-nbd", "-f", "qcow2", "-A", "-B", "bitmap0", "-B", "bitmap1", h.connect_systemd_socket_activation(args) assert h.aio_is_negotiating() is True assert h.get_extended_headers_negotiated() is False + # Flag not available until info or go try: h.can_block_status_payload() @@ -58,7 +59,18 @@ except nbd.Error: h.opt_info() assert h.can_block_status_payload() is False assert h.can_meta_context("base:allocation") is True -h.opt_abort() + +# Filter request not allowed if not advertised +def f(): + assert False +h.opt_go() +assert h.can_block_status_payload() is False +try: + h.block_status_filter(0, 512, ["base:allocation"], f) + assert False +except nbd.Error: + pass +h.shutdown() ' # Conditional part of test: if qemu is new enough to support extended diff --git a/info/info-can.sh b/info/info-can.sh index 8154d1ce..097837d2 100755 --- a/info/info-can.sh +++ b/info/info-can.sh @@ -38,6 +38,9 @@ requires bash -c "nbdkit sh --dump-plugin | grep has_can_cache=1" # and oldstyle never, but that feels like depending a bit too much on # the implementation. +# --can block-status-payload is not supported by nbdkit yet. Testing +# is done during interop with new-enough qemu. + # --can structured-reply is not a per-export setting, but rather # something set on the server as a whole. -- 2.40.1
Richard W.M. Jones
2023-Jun-08 09:56 UTC
[Libguestfs] [libnbd PATCH v3 00/22] NBD 64-bit extensions (libnbd portion)
Sorry it took me so long to get around to this one. Hopefully the next version will need only cursory approval. I did read all of the patches, and I basically agree with Laszlo on the ones he reviewed already. Therefore I didn't add any comment except where necessary. You can assume Acked-by on those patches. Thanks! Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com Fedora Windows cross-compiler. Compile Windows programs, test, and build Windows installers. Over 100 libraries supported. http://fedoraproject.org/wiki/MinGW