Eric Blake
2023-May-25 13:00 UTC
[Libguestfs] [libnbd PATCH v3 02/22] internal: Refactor layout of replies in sbuf
In order to more easily add a third reply type with an even larger header, but where the payload will look the same for both structured and extended replies, it is nicer if simple and structured replies are nested inside the same layer of sbuf.reply.hdr. While at it, note that while .or and .sr are structs declared within the overall sbuf union, we never read into both halves of those structs at the same time, so it does not matter if their two halves are consecutive. Dropping the packed notation on those structs means the compiler can align .payload more naturally, which may slightly improve performance on some platforms, even if it makes the overall union a few bytes larger due to padding. Visually, this patch changes the layout from: offset simple structured +------------------------------------------------------------+ | union sbuf | | +---------------------+------------------------------+ | | | struct simple_reply | union sr | | | | +-----------------+ | +--------------------------+ | | | | | | | | struct structured_reply | | | | | | | | | +----------------------+ | | | | 0 | | uint32_t magic | | | | uint32_t magic | | | | | 4 | | uint32_t error | | | | uint16_t flags | | | | | 6 | | | | | | uint16_t type | | | | | 8 | | uint64_t handle | | | | uint64_t handle | | | | | | +-----------------+ | | | | | | | | 16 | [padding] | | | uint32_t length | | | | | | | | +----------------------+ | | | | | | | union payload | | | | | | | +-----------+----------+ | | | | 20 | | | | ... | ... | | | | | | | | +-----------+----------+ | | | | | | +--------------------------+ | | | +---------------------+------------------------------+ | +------------------------------------------------------------+ to: offset simple structured +-------------------------------------------------------------+ | union sbuf | | +-----------------------------------------------------+ | | | struct reply | | | | +-------------------------------------------------+ | | | | | union hdr | | | | | | +--------------------+------------------------+ | | | | | | | struct simple | struct structured | | | | | | | | +----------------+ | +--------------------+ | | | | | 0 | | | | uint32_t magic | | | uint32_t magic | | | | | | 4 | | | | uint32_t error | | | uint16_t flags | | | | | | 6 | | | | | | | uint16_t type | | | | | | 8 | | | | uint64_t handle| | | uint64_t handle | | | | | | | | | +----------------+ | | | | | | | | 16 | | | [padding] | | uint32_t length | | | | | | | | | | +--------------------+ | | | | | 20 | | | | [padding] | | | | | | | +--------------------+------------------------+ | | | | | | union payload | | | | | | +--------------------+------------------------+ | | | | 24 | | | ... | ... | | | | | | | +--------------------+------------------------+ | | | | | +-------------------------------------------------+ | | | +-----------------------------------------------------+ | +-------------------------------------------------------------+ Technically, whether the payload union offset moves to byte 24 (with 20-23 now padding) or stays at 20 depends on compiler ABI; but many systems prefer that any struct with a uint64_t provide 8-byte alignment to its containing union. The commit is largely mechanical, and there should be no semantic change. Signed-off-by: Eric Blake <eblake at redhat.com> --- lib/internal.h | 12 ++-- generator/states-reply-simple.c | 4 +- generator/states-reply-structured.c | 103 ++++++++++++++-------------- generator/states-reply.c | 14 ++-- 4 files changed, 68 insertions(+), 65 deletions(-) diff --git a/lib/internal.h b/lib/internal.h index 25eeea34..c71980ef 100644 --- a/lib/internal.h +++ b/lib/internal.h @@ -231,14 +231,16 @@ struct nbd_handle { struct { struct nbd_fixed_new_option_reply_meta_context context; char str[NBD_MAX_STRING]; - } __attribute__ ((packed)) context; + } __attribute__ ((packed)) context; char err_msg[NBD_MAX_STRING]; } payload; - } __attribute__ ((packed)) or; + } or; struct nbd_export_name_option_reply export_name_reply; - struct nbd_simple_reply simple_reply; struct { - struct nbd_structured_reply structured_reply; + union { + struct nbd_simple_reply simple; + struct nbd_structured_reply structured; + } hdr; union { struct nbd_structured_reply_offset_data offset_data; struct nbd_structured_reply_offset_hole offset_hole; @@ -249,7 +251,7 @@ struct nbd_handle { uint64_t offset; /* Only used for NBD_REPLY_TYPE_ERROR_OFFSET */ } __attribute__ ((packed)) error; } payload; - } __attribute__ ((packed)) sr; + } reply; uint16_t gflags; uint32_t cflags; uint32_t len; diff --git a/generator/states-reply-simple.c b/generator/states-reply-simple.c index 8fd9f62a..e6f1ee23 100644 --- a/generator/states-reply-simple.c +++ b/generator/states-reply-simple.c @@ -23,7 +23,7 @@ REPLY.SIMPLE_REPLY.START: struct command *cmd = h->reply_cmd; uint32_t error; - error = be32toh (h->sbuf.simple_reply.error); + error = be32toh (h->sbuf.reply.hdr.simple.error); if (cmd == NULL) { /* Unexpected reply. If error was set or we have structured @@ -39,7 +39,7 @@ REPLY.SIMPLE_REPLY.START: if (error || h->structured_replies) SET_NEXT_STATE (%^FINISH_COMMAND); else { - uint64_t cookie = be64toh (h->sbuf.simple_reply.handle); + uint64_t cookie = be64toh (h->sbuf.reply.hdr.simple.handle); SET_NEXT_STATE (%.DEAD); set_error (EPROTO, "no matching cookie %" PRIu64 " found for server reply, " diff --git a/generator/states-reply-structured.c b/generator/states-reply-structured.c index 96182222..6f96945a 100644 --- a/generator/states-reply-structured.c +++ b/generator/states-reply-structured.c @@ -49,9 +49,9 @@ REPLY.STRUCTURED_REPLY.START: * so read the remaining part. */ h->rbuf = &h->sbuf; - h->rbuf = (char *)h->rbuf + sizeof h->sbuf.simple_reply; - h->rlen = sizeof h->sbuf.sr.structured_reply; - h->rlen -= sizeof h->sbuf.simple_reply; + h->rbuf = (char *)h->rbuf + sizeof h->sbuf.reply.hdr.simple; + h->rlen = sizeof h->sbuf.reply.hdr.structured; + h->rlen -= sizeof h->sbuf.reply.hdr.simple; SET_NEXT_STATE (%RECV_REMAINING); return 0; @@ -71,9 +71,9 @@ REPLY.STRUCTURED_REPLY.CHECK: uint16_t flags, type; uint32_t length; - flags = be16toh (h->sbuf.sr.structured_reply.flags); - type = be16toh (h->sbuf.sr.structured_reply.type); - length = be32toh (h->sbuf.sr.structured_reply.length); + flags = be16toh (h->sbuf.reply.hdr.structured.flags); + type = be16toh (h->sbuf.reply.hdr.structured.type); + length = be32toh (h->sbuf.reply.hdr.structured.length); /* Reject a server that replies with too much information, but don't * reject a single structured reply to NBD_CMD_READ on the largest @@ -82,7 +82,7 @@ REPLY.STRUCTURED_REPLY.CHECK: * oversized reply is going to take long enough to resync that it is * not worth keeping the connection alive. */ - if (length > MAX_REQUEST_SIZE + sizeof h->sbuf.sr.payload.offset_data) { + if (length > MAX_REQUEST_SIZE + sizeof h->sbuf.reply.payload.offset_data) { set_error (0, "invalid server reply length %" PRIu32, length); SET_NEXT_STATE (%.DEAD); return 0; @@ -105,19 +105,19 @@ REPLY.STRUCTURED_REPLY.CHECK: * them as an extension, so we use < instead of <=. */ if (cmd->type != NBD_CMD_READ || - length < sizeof h->sbuf.sr.payload.offset_data) + length < sizeof h->sbuf.reply.payload.offset_data) goto resync; - h->rbuf = &h->sbuf.sr.payload.offset_data; - h->rlen = sizeof h->sbuf.sr.payload.offset_data; + h->rbuf = &h->sbuf.reply.payload.offset_data; + h->rlen = sizeof h->sbuf.reply.payload.offset_data; SET_NEXT_STATE (%RECV_OFFSET_DATA); break; case NBD_REPLY_TYPE_OFFSET_HOLE: if (cmd->type != NBD_CMD_READ || - length != sizeof h->sbuf.sr.payload.offset_hole) + length != sizeof h->sbuf.reply.payload.offset_hole) goto resync; - h->rbuf = &h->sbuf.sr.payload.offset_hole; - h->rlen = sizeof h->sbuf.sr.payload.offset_hole; + h->rbuf = &h->sbuf.reply.payload.offset_hole; + h->rlen = sizeof h->sbuf.reply.payload.offset_hole; SET_NEXT_STATE (%RECV_OFFSET_HOLE); break; @@ -127,8 +127,8 @@ REPLY.STRUCTURED_REPLY.CHECK: goto resync; assert (CALLBACK_IS_NOT_NULL (cmd->cb.fn.extent)); /* Start by reading the context ID. */ - h->rbuf = &h->sbuf.sr.payload.bs_hdr; - h->rlen = sizeof h->sbuf.sr.payload.bs_hdr; + h->rbuf = &h->sbuf.reply.payload.bs_hdr; + h->rlen = sizeof h->sbuf.reply.payload.bs_hdr; SET_NEXT_STATE (%RECV_BS_HEADER); break; @@ -139,10 +139,10 @@ REPLY.STRUCTURED_REPLY.CHECK: * compliant, will favor the wire error over EPROTO during more * length checks in RECV_ERROR_MESSAGE and RECV_ERROR_TAIL. */ - if (length < sizeof h->sbuf.sr.payload.error.error.error) + if (length < sizeof h->sbuf.reply.payload.error.error.error) goto resync; - h->rbuf = &h->sbuf.sr.payload.error.error; - h->rlen = MIN (length, sizeof h->sbuf.sr.payload.error.error); + h->rbuf = &h->sbuf.reply.payload.error.error; + h->rlen = MIN (length, sizeof h->sbuf.reply.payload.error.error); SET_NEXT_STATE (%RECV_ERROR); } else @@ -168,19 +168,19 @@ REPLY.STRUCTURED_REPLY.RECV_ERROR: SET_NEXT_STATE (%.READY); return 0; case 0: - length = be32toh (h->sbuf.sr.structured_reply.length); - assert (length >= sizeof h->sbuf.sr.payload.error.error.error); + length = be32toh (h->sbuf.reply.hdr.structured.length); + assert (length >= sizeof h->sbuf.reply.payload.error.error.error); assert (cmd); - if (length < sizeof h->sbuf.sr.payload.error.error) + if (length < sizeof h->sbuf.reply.payload.error.error) goto resync; - msglen = be16toh (h->sbuf.sr.payload.error.error.len); - if (msglen > length - sizeof h->sbuf.sr.payload.error.error || - msglen > sizeof h->sbuf.sr.payload.error.msg) + msglen = be16toh (h->sbuf.reply.payload.error.error.len); + if (msglen > length - sizeof h->sbuf.reply.payload.error.error || + msglen > sizeof h->sbuf.reply.payload.error.msg) goto resync; - h->rbuf = h->sbuf.sr.payload.error.msg; + h->rbuf = h->sbuf.reply.payload.error.msg; h->rlen = msglen; SET_NEXT_STATE (%RECV_ERROR_MESSAGE); } @@ -188,11 +188,11 @@ REPLY.STRUCTURED_REPLY.RECV_ERROR: resync: /* Favor the error packet's errno over RESYNC's EPROTO. */ - error = be32toh (h->sbuf.sr.payload.error.error.error); + error = be32toh (h->sbuf.reply.payload.error.error.error); if (cmd->error == 0) cmd->error = nbd_internal_errno_of_nbd_error (error); h->rbuf = NULL; - h->rlen = length - MIN (length, sizeof h->sbuf.sr.payload.error.error); + h->rlen = length - MIN (length, sizeof h->sbuf.reply.payload.error.error); SET_NEXT_STATE (%RESYNC); return 0; @@ -207,15 +207,15 @@ REPLY.STRUCTURED_REPLY.RECV_ERROR_MESSAGE: SET_NEXT_STATE (%.READY); return 0; case 0: - length = be32toh (h->sbuf.sr.structured_reply.length); - msglen = be16toh (h->sbuf.sr.payload.error.error.len); - type = be16toh (h->sbuf.sr.structured_reply.type); + length = be32toh (h->sbuf.reply.hdr.structured.length); + msglen = be16toh (h->sbuf.reply.payload.error.error.len); + type = be16toh (h->sbuf.reply.hdr.structured.type); - length -= sizeof h->sbuf.sr.payload.error.error + msglen; + length -= sizeof h->sbuf.reply.payload.error.error + msglen; if (msglen) debug (h, "structured error server message: %.*s", (int)msglen, - h->sbuf.sr.payload.error.msg); + h->sbuf.reply.payload.error.msg); /* Special case two specific errors; silently ignore tail for all others */ h->rbuf = NULL; @@ -227,11 +227,11 @@ REPLY.STRUCTURED_REPLY.RECV_ERROR_MESSAGE: "the server may have a bug"); break; case NBD_REPLY_TYPE_ERROR_OFFSET: - if (length != sizeof h->sbuf.sr.payload.error.offset) + if (length != sizeof h->sbuf.reply.payload.error.offset) debug (h, "unable to safely extract error offset, " "the server may have a bug"); else - h->rbuf = &h->sbuf.sr.payload.error.offset; + h->rbuf = &h->sbuf.reply.payload.error.offset; break; } SET_NEXT_STATE (%RECV_ERROR_TAIL); @@ -250,8 +250,8 @@ REPLY.STRUCTURED_REPLY.RECV_ERROR_TAIL: SET_NEXT_STATE (%.READY); return 0; case 0: - error = be32toh (h->sbuf.sr.payload.error.error.error); - type = be16toh (h->sbuf.sr.structured_reply.type); + error = be32toh (h->sbuf.reply.payload.error.error.error); + type = be16toh (h->sbuf.reply.hdr.structured.type); assert (cmd); /* guaranteed by CHECK */ @@ -266,7 +266,7 @@ REPLY.STRUCTURED_REPLY.RECV_ERROR_TAIL: * user callback if present. Ignore the offset if it was bogus. */ if (type == NBD_REPLY_TYPE_ERROR_OFFSET && h->rbuf) { - uint64_t offset = be64toh (h->sbuf.sr.payload.error.offset); + uint64_t offset = be64toh (h->sbuf.reply.payload.error.offset); if (structured_reply_in_bounds (offset, 0, cmd) && cmd->type == NBD_CMD_READ && CALLBACK_IS_NOT_NULL (cmd->cb.fn.chunk)) { @@ -307,8 +307,8 @@ REPLY.STRUCTURED_REPLY.RECV_OFFSET_DATA: SET_NEXT_STATE (%.READY); return 0; case 0: - length = be32toh (h->sbuf.sr.structured_reply.length); - offset = be64toh (h->sbuf.sr.payload.offset_data.offset); + length = be32toh (h->sbuf.reply.hdr.structured.length); + offset = be64toh (h->sbuf.reply.payload.offset_data.offset); assert (cmd); /* guaranteed by CHECK */ @@ -346,8 +346,8 @@ REPLY.STRUCTURED_REPLY.RECV_OFFSET_DATA_DATA: SET_NEXT_STATE (%.READY); return 0; case 0: - length = be32toh (h->sbuf.sr.structured_reply.length); - offset = be64toh (h->sbuf.sr.payload.offset_data.offset); + length = be32toh (h->sbuf.reply.hdr.structured.length); + offset = be64toh (h->sbuf.reply.payload.offset_data.offset); assert (cmd); /* guaranteed by CHECK */ if (CALLBACK_IS_NOT_NULL (cmd->cb.fn.chunk)) { @@ -377,8 +377,8 @@ REPLY.STRUCTURED_REPLY.RECV_OFFSET_HOLE: SET_NEXT_STATE (%.READY); return 0; case 0: - offset = be64toh (h->sbuf.sr.payload.offset_hole.offset); - length = be32toh (h->sbuf.sr.payload.offset_hole.length); + offset = be64toh (h->sbuf.reply.payload.offset_hole.offset); + length = be32toh (h->sbuf.reply.payload.offset_hole.length); assert (cmd); /* guaranteed by CHECK */ @@ -426,12 +426,12 @@ REPLY.STRUCTURED_REPLY.RECV_BS_HEADER: SET_NEXT_STATE (%.READY); return 0; case 0: - length = be32toh (h->sbuf.sr.structured_reply.length); + length = be32toh (h->sbuf.reply.hdr.structured.length); assert (cmd); /* guaranteed by CHECK */ assert (cmd->type == NBD_CMD_BLOCK_STATUS); assert (length >= 12); - length -= sizeof h->sbuf.sr.payload.bs_hdr; + length -= sizeof h->sbuf.reply.payload.bs_hdr; free (h->bs_entries); h->bs_entries = malloc (length); @@ -460,7 +460,7 @@ REPLY.STRUCTURED_REPLY.RECV_BS_ENTRIES: SET_NEXT_STATE (%.READY); return 0; case 0: - length = be32toh (h->sbuf.sr.structured_reply.length); + length = be32toh (h->sbuf.reply.hdr.structured.length); assert (cmd); /* guaranteed by CHECK */ assert (cmd->type == NBD_CMD_BLOCK_STATUS); @@ -468,7 +468,8 @@ REPLY.STRUCTURED_REPLY.RECV_BS_ENTRIES: assert (h->bs_entries); assert (length >= 12); assert (h->meta_valid); - count = (length - sizeof h->sbuf.sr.payload.bs_hdr) / sizeof *h->bs_entries; + count = (length - sizeof h->sbuf.reply.payload.bs_hdr) / + sizeof *h->bs_entries; /* Need to byte-swap the entries returned, but apart from that we * don't validate them. @@ -477,7 +478,7 @@ REPLY.STRUCTURED_REPLY.RECV_BS_ENTRIES: h->bs_entries[i] = be32toh (h->bs_entries[i]); /* Look up the context ID. */ - context_id = be32toh (h->sbuf.sr.payload.bs_hdr.context_id); + context_id = be32toh (h->sbuf.reply.payload.bs_hdr.context_id); for (i = 0; i < h->meta_contexts.len; ++i) if (context_id == h->meta_contexts.ptr[i].context_id) break; @@ -524,8 +525,8 @@ REPLY.STRUCTURED_REPLY.RESYNC: SET_NEXT_STATE (%^FINISH_COMMAND); return 0; } - type = be16toh (h->sbuf.sr.structured_reply.type); - length = be32toh (h->sbuf.sr.structured_reply.length); + type = be16toh (h->sbuf.reply.hdr.structured.type); + length = be32toh (h->sbuf.reply.hdr.structured.length); debug (h, "unexpected reply type %u or payload length %" PRIu32 " for cookie %" PRIu64 " and command %" PRIu32 ", this is probably a server bug", @@ -539,7 +540,7 @@ REPLY.STRUCTURED_REPLY.RESYNC: REPLY.STRUCTURED_REPLY.FINISH: uint16_t flags; - flags = be16toh (h->sbuf.sr.structured_reply.flags); + flags = be16toh (h->sbuf.reply.hdr.structured.flags); if (flags & NBD_REPLY_FLAG_DONE) { SET_NEXT_STATE (%^FINISH_COMMAND); } diff --git a/generator/states-reply.c b/generator/states-reply.c index f7888154..31e4bd2f 100644 --- a/generator/states-reply.c +++ b/generator/states-reply.c @@ -69,8 +69,8 @@ REPLY.START: assert (h->reply_cmd == NULL); assert (h->rlen == 0); - h->rbuf = &h->sbuf; - h->rlen = sizeof h->sbuf.simple_reply; + h->rbuf = &h->sbuf.reply.hdr; + h->rlen = sizeof h->sbuf.reply.hdr.simple; r = h->sock->ops->recv (h, h->sock, h->rbuf, h->rlen); if (r == -1) { @@ -115,7 +115,7 @@ REPLY.CHECK_SIMPLE_OR_STRUCTURED_REPLY: uint32_t magic; uint64_t cookie; - magic = be32toh (h->sbuf.simple_reply.magic); + magic = be32toh (h->sbuf.reply.hdr.simple.magic); if (magic == NBD_SIMPLE_REPLY_MAGIC) { SET_NEXT_STATE (%SIMPLE_REPLY.START); } @@ -127,8 +127,8 @@ REPLY.CHECK_SIMPLE_OR_STRUCTURED_REPLY: SET_NEXT_STATE (%.DEAD); set_error (0, "invalid reply magic 0x%" PRIx32, magic); #if 0 /* uncomment to see desynchronized data */ - nbd_internal_hexdump (&h->sbuf.simple_reply, - sizeof (h->sbuf.simple_reply), + nbd_internal_hexdump (&h->sbuf.reply.hdr.simple, + sizeof (h->sbuf.reply.hdr.simple), stderr); #endif return 0; @@ -138,7 +138,7 @@ REPLY.CHECK_SIMPLE_OR_STRUCTURED_REPLY: * handle (our cookie) is stored at the same offset. */ h->chunks_received++; - cookie = be64toh (h->sbuf.simple_reply.handle); + cookie = be64toh (h->sbuf.reply.hdr.simple.handle); /* Find the command amongst the commands in flight. If the server sends * a reply for an unknown cookie, FINISH will diagnose that later. */ @@ -157,7 +157,7 @@ REPLY.FINISH_COMMAND: /* NB: This works for both simple and structured replies because the * handle (our cookie) is stored at the same offset. */ - cookie = be64toh (h->sbuf.simple_reply.handle); + cookie = be64toh (h->sbuf.reply.hdr.simple.handle); /* Find the command amongst the commands in flight. */ for (cmd = h->cmds_in_flight, prev_cmd = NULL; cmd != NULL; -- 2.40.1
Laszlo Ersek
2023-May-26 15:53 UTC
[Libguestfs] [libnbd PATCH v3 02/22] internal: Refactor layout of replies in sbuf
On 5/25/23 15:00, Eric Blake wrote:> In order to more easily add a third reply type with an even larger > header, but where the payload will look the same for both structured > and extended replies, it is nicer if simple and structured replies are > nested inside the same layer of sbuf.reply.hdr.This makes sense, and the following, strictly related code change in the patch body corresponds to the explanation (except for the "__attribute__ ((packed))" removal, but more on that later): - struct nbd_simple_reply simple_reply; struct { - struct nbd_structured_reply structured_reply; + union { + struct nbd_simple_reply simple; + struct nbd_structured_reply structured; + } hdr; union { struct nbd_structured_reply_offset_data offset_data; struct nbd_structured_reply_offset_hole offset_hole; struct nbd_structured_reply_block_status_hdr bs_hdr; struct { struct nbd_structured_reply_error error; char msg[NBD_MAX_STRING]; /* Common to all error types */ uint64_t offset; /* Only used for NBD_REPLY_TYPE_ERROR_OFFSET */ } __attribute__ ((packed)) error; } payload; - } __attribute__ ((packed)) sr; + } reply; One interesting consequence is that the payload union now becomes possible after the simple header, which IIUC makes no sense per NBD spec. But, it doesn't hurt either. At the same time:> While at it, note > that while .or and .sr are structs declared within the overall sbuf > union, we never read into both halves of those structs at the same > time, so it does not matter if their two halves are consecutive.I've been staring at this part for an extremely long time, and I just can't make any sense of it. My (strongly held) opinion is that this part of the patch should be split off to a separate patch. Here's a possible interpretation: - The "sbuf.or" ("option reply") member of the "sbuf" union is a structure. It has two members: "sbuf.or.option_reply" (another structure) and "sbuf.or.payload" (a union). We never read into "sbuf.or.option_reply" and "sbuf.or.payload" at the same time (i.e., we never cross the structure field boundary between them, with a single recv operation), therefore we can remove the "packed" attribute from "sbuf.or". This permits the compiler to insert padding between "sbuf.or.option_reply" and "sbuf.or.payload", and/or after "sbuf.or". - Similarly, the "sbuf.sr" ("structured reply") member of the "sbuf" union is a structure. It has two members: "sbuf.sr.structured_reply" (a structure) and "sbuf.sr.payload" (a union). We never read into "sbuf.sr.structured_reply" and "sbuf.sr.payload" at the same time (i.e., we never cross the structure field boundary between them, with a single recv operation), therefore we can remove the "packed" attribute from "sbuf.sr". This permits the compiler to insert padding between "sbuf.sr.structured_reply" and "sbuf.sr.payload", and/or after "sbuf.sr". In turn, my problem with this interpretation is that, if we never cross the struct member boundary within "sbuf.or" (i.e., between "option_reply" and "payload"), and similarly we never cross the struct member boundary within "sbuf.sr" (i.e., between "structured_reply" and "payload"), then *why* are "sbuf.or" and "sbuf.sr" structures in the first place (and not unions)? Now, I can imagine that the answer sounds like this: indeed we don't cross those field boundaries with a single recv, however we still need the data from both fields *at the same time*, later on. And this certainly sounds like a valid explanation, and dropping the "packed" attributes is fine / justified as well, under this interpreation, but it absolutely must be a separate patch, IMO. I think I've spent at least half an hour explaining it to myself.> Dropping the packed notation on those structs means the compiler can > align .payload more naturally, which may slightly improve performance > on some platforms, even if it makes the overall union a few bytes > larger due to padding. > > Visually, this patch changes the layout from: > > offset simple structured > +------------------------------------------------------------+ > | union sbuf | > | +---------------------+------------------------------+ | > | | struct simple_reply | union sr | | > | | +-----------------+ | +--------------------------+ | | > | | | | | | struct structured_reply | | | > | | | | | | +----------------------+ | | | > | 0 | | uint32_t magic | | | | uint32_t magic | | | | > | 4 | | uint32_t error | | | | uint16_t flags | | | | > | 6 | | | | | | uint16_t type | | | | > | 8 | | uint64_t handle | | | | uint64_t handle | | | | > | | +-----------------+ | | | | | | | > | 16 | [padding] | | | uint32_t length | | | | > | | | | +----------------------+ | | | > | | | | union payload | | | > | | | | +-----------+----------+ | | | > | 20 | | | | ... | ... | | | | > | | | | +-----------+----------+ | | | > | | | +--------------------------+ | | > | +---------------------+------------------------------+ | > +------------------------------------------------------------+ > > to: > > offset simple structured > +-------------------------------------------------------------+ > | union sbuf | > | +-----------------------------------------------------+ | > | | struct reply | | > | | +-------------------------------------------------+ | | > | | | union hdr | | | > | | | +--------------------+------------------------+ | | | > | | | | struct simple | struct structured | | | | > | | | | +----------------+ | +--------------------+ | | | | > | 0 | | | | uint32_t magic | | | uint32_t magic | | | | | > | 4 | | | | uint32_t error | | | uint16_t flags | | | | | > | 6 | | | | | | | uint16_t type | | | | | > | 8 | | | | uint64_t handle| | | uint64_t handle | | | | | > | | | | +----------------+ | | | | | | | > | 16 | | | [padding] | | uint32_t length | | | | | > | | | | | +--------------------+ | | | | > | 20 | | | | [padding] | | | | > | | | +--------------------+------------------------+ | | | > | | | union payload | | | > | | | +--------------------+------------------------+ | | | > | 24 | | | ... | ... | | | | > | | | +--------------------+------------------------+ | | | > | | +-------------------------------------------------+ | | > | +-----------------------------------------------------+ | > +-------------------------------------------------------------+ > > Technically, whether the payload union offset moves to byte 24 (with > 20-23 now padding) or stays at 20 depends on compiler ABI; but many > systems prefer that any struct with a uint64_t provide 8-byte > alignment to its containing union. > > The commit is largely mechanical, and there should be no semantic > change. > > Signed-off-by: Eric Blake <eblake at redhat.com> > --- > lib/internal.h | 12 ++-- > generator/states-reply-simple.c | 4 +- > generator/states-reply-structured.c | 103 ++++++++++++++-------------- > generator/states-reply.c | 14 ++-- > 4 files changed, 68 insertions(+), 65 deletions(-) > > diff --git a/lib/internal.h b/lib/internal.h > index 25eeea34..c71980ef 100644 > --- a/lib/internal.h > +++ b/lib/internal.h > @@ -231,14 +231,16 @@ struct nbd_handle { > struct { > struct nbd_fixed_new_option_reply_meta_context context; > char str[NBD_MAX_STRING]; > - } __attribute__ ((packed)) context; > + } __attribute__ ((packed)) context;Whitespace fixup, tolerable as a part of this patch.> char err_msg[NBD_MAX_STRING]; > } payload; > - } __attribute__ ((packed)) or; > + } or;So I disagree with doing this here. I'd like it to be its own patch.> struct nbd_export_name_option_reply export_name_reply; > - struct nbd_simple_reply simple_reply; > struct { > - struct nbd_structured_reply structured_reply; > + union { > + struct nbd_simple_reply simple; > + struct nbd_structured_reply structured; > + } hdr; > union { > struct nbd_structured_reply_offset_data offset_data; > struct nbd_structured_reply_offset_hole offset_hole;Yes, fine (although formatting it with a much larger context would have helped -- anyway I'm reviewing the series with -W in my local clone).> @@ -249,7 +251,7 @@ struct nbd_handle { > uint64_t offset; /* Only used for NBD_REPLY_TYPE_ERROR_OFFSET */ > } __attribute__ ((packed)) error; > } payload; > - } __attribute__ ((packed)) sr; > + } reply; > uint16_t gflags; > uint32_t cflags; > uint32_t len;Here too, the __attribute__ ((packed)) removal shoould be in a separate patch. Renaming "sr" to "reply" is "in scope", of course.> diff --git a/generator/states-reply-simple.c b/generator/states-reply-simple.c > index 8fd9f62a..e6f1ee23 100644 > --- a/generator/states-reply-simple.c > +++ b/generator/states-reply-simple.c > @@ -23,7 +23,7 @@ REPLY.SIMPLE_REPLY.START: > struct command *cmd = h->reply_cmd; > uint32_t error; > > - error = be32toh (h->sbuf.simple_reply.error); > + error = be32toh (h->sbuf.reply.hdr.simple.error); > > if (cmd == NULL) { > /* Unexpected reply. If error was set or we have structuredYes.> @@ -39,7 +39,7 @@ REPLY.SIMPLE_REPLY.START: > if (error || h->structured_replies) > SET_NEXT_STATE (%^FINISH_COMMAND); > else { > - uint64_t cookie = be64toh (h->sbuf.simple_reply.handle); > + uint64_t cookie = be64toh (h->sbuf.reply.hdr.simple.handle); > SET_NEXT_STATE (%.DEAD); > set_error (EPROTO, > "no matching cookie %" PRIu64 " found for server reply, "Yes; same thing, just different name.> diff --git a/generator/states-reply-structured.c b/generator/states-reply-structured.c > index 96182222..6f96945a 100644 > --- a/generator/states-reply-structured.c > +++ b/generator/states-reply-structured.c > @@ -49,9 +49,9 @@ REPLY.STRUCTURED_REPLY.START: > * so read the remaining part. > */ > h->rbuf = &h->sbuf; > - h->rbuf = (char *)h->rbuf + sizeof h->sbuf.simple_reply; > - h->rlen = sizeof h->sbuf.sr.structured_reply; > - h->rlen -= sizeof h->sbuf.simple_reply; > + h->rbuf = (char *)h->rbuf + sizeof h->sbuf.reply.hdr.simple; > + h->rlen = sizeof h->sbuf.reply.hdr.structured; > + h->rlen -= sizeof h->sbuf.reply.hdr.simple; > SET_NEXT_STATE (%RECV_REMAINING); > return 0; >Here I disagree with the mechanical approach. I even take issue with the pre-patch code. Pre-patch, we fill in "h->sbuf.simple_reply" (in "generator/states-reply.c"), i.e., one member of the top-level "sbuf" union, but then continue filling a member of a *different* member (i.e., "sr.structured_reply") of the "sbuf" union here. This looks undefined by the C standard, but even if it's not undefined, it's supremely confusing. Now, if structure "nbd_simple_reply" were the first member of "nbd_structured_reply", maybe we could salvage this somehow -- but that's not the case. ... Right, in "generator/states-reply.c", we have: /* We read all replies initially as if they are simple replies, but * check the magic in CHECK_SIMPLE_OR_STRUCTURED_REPLY below. * This works because the structured_reply header is larger. */ This is precisely my problem. The common initial subsequence is only the "uint32_t magic" field. Therefore the code in "generator/states-reply.c" should only read the magic field at first, and then branch to simple vs. structured reply. Post-patch, the sbuf.reply.hdr union actually reflects this well! The structure members of that union all begin with the common subsequence "uint32_t magic", so using the special guarantee from the C standard, it's fine to fill in "magic" via one union member, and read it back via another. Pre-patch, the guarantee doesn't apply, because "sbuf.simple_reply" and "sbuf.sr.structured_reply" are not members of the same union. However, even with this patch applied (i.e., with "sbuf.reply.hdr" existing), the code performing the reading continues making me uncomfortable. We effectively have this sequence: - take address of "h->sbuf.reply.hdr" - read "sizeof h->sbuf.reply.hdr.simple" bytes into it - (at this point we can safely access "h->sbuf.reply.hdr.simple", *and* "h->sbuf.reply.hdr.structured.magic" as well -- but not the rest of "h->sbuf.reply.hdr.structured"!) - take the address of "h->sbuf.reply.hdr.structured.length" -- but not by name; instead, only by virtue of the previous read having filled in "magic", "flags", "type", and "handle", as a *happenstance*, via populating "h->sbuf.reply.hdr.simple" - read as many bytes as necessary in order to complete "h->sbuf.reply.hdr.structured", from that point onward. This smells to me. Optimally, the simple reply and the structured reply should look like this: struct nbd_reply_header { uint32_t magic; union { struct { uint32_t error; uint64_t handle; } simple; struct { uint16_t flags; uint16_t type; uint64_t handle; uint32_t length; } structured; } magic_specific; }; and we should have separate automaton states for reading "magic_specific.simple" and "magic_specific.structured". In REPLY.START, we should only read "magic". We should have a sepate state called REPLY.SIMPLE_REPLY.START, for reading "magic_specific.simple". In REPLY.STRUCTURED_REPLY.START, we should point h->rbuf at "magic_specific.structured", and read "sizeof magic_specific.structured" bytes. I'll stop here because there's already much to discuss. Thanks, Laszlo