I'm not sure if this is worth pursuing. On paper, it makes sense (if
we know we have multiple commands batched to send over the wire, AND
those commands are short in length, we might as well use MSG_MORE),
but the measurement numbers with it applied might just be in the
noise.

Eric Blake (2):
  examples: Enhance access patterns of threaded-reads-and-writes
  states: Another use for MSG_MORE

 examples/threaded-reads-and-writes.c | 12 ++++++++----
 generator/states-issue-command.c     |  4 +++-
 2 files changed, 11 insertions(+), 5 deletions(-)

-- 
2.20.1
Eric Blake
2019-Jun-12 22:04 UTC
[Libguestfs] [libnbd PATCH 1/2] examples: Enhance access patterns of threaded-reads-and-writes
A typical client will probably be interleaving small and large
requests, rather than using uniform requests everywhere. Update the
test to randomly simulate this. For me, this cuts test runtime from
23.1s to 12.6s.
---
 examples/threaded-reads-and-writes.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/examples/threaded-reads-and-writes.c b/examples/threaded-reads-and-writes.c
index 1f66f2f..6d09cfc 100644
--- a/examples/threaded-reads-and-writes.c
+++ b/examples/threaded-reads-and-writes.c
@@ -46,7 +46,7 @@ static int64_t exportsize;
  */
 #define MAX_IN_FLIGHT 64
 
-/* The size of reads and writes. */
+/* The size of large reads and writes, must be > 512. */
 #define BUFFER_SIZE (1024*1024)
 
 /* Number of commands we issue (per thread). */
@@ -188,7 +188,9 @@ start_thread (void *arg)
   uint64_t handles[MAX_IN_FLIGHT];
   size_t in_flight;        /* counts number of requests in flight */
   int dir, r, cmd;
+  size_t size;
 
+  assert (512 < BUFFER_SIZE);
   buf = malloc (BUFFER_SIZE);
   if (buf == NULL) {
     perror ("malloc");
@@ -237,14 +239,16 @@ start_thread (void *arg)
    * the same buffer for multiple in-flight requests.  It doesn't
    * matter here because we're just trying to write random stuff,
    * but that would be Very Bad in a real application.
+   * Simulate a mix of large and small requests.
    */
   while (i > 0 && in_flight < MAX_IN_FLIGHT) {
-    offset = rand () % (exportsize - BUFFER_SIZE);
+    size = (rand() & 1) ? BUFFER_SIZE : 512;
+    offset = rand () % (exportsize - size);
     cmd = rand () & 1;
     if (cmd == 0)
-      handle = nbd_aio_pwrite (nbd, buf, BUFFER_SIZE, offset, 0);
+      handle = nbd_aio_pwrite (nbd, buf, size, offset, 0);
     else
-      handle = nbd_aio_pread (nbd, buf, BUFFER_SIZE, offset, 0);
+      handle = nbd_aio_pread (nbd, buf, size, offset, 0);
     if (handle == -1) {
       fprintf (stderr, "%s\n", nbd_get_error ());
       goto error;
-- 
2.20.1
Eric Blake
2019-Jun-12 22:04 UTC
[Libguestfs] [libnbd PATCH 2/2] states: Another use for MSG_MORE
Following up to cf1a3045, if we know that we have more requests to
transmit (because the user was queueing up requests while we were busy
elsewhere), and our current request is short (a non-write, or a write
with a small payload), then our current command can be batched with
the next command.

The numbers here were not as dramatic and may be merely in the noise;
over three runs of:

$ time ~/nbdkit/nbdkit memory 100M --run 'examples/threaded-reads-and-writes localhost 10809'

my machine improved runtime from 12.68s to 12.59s.
---
 generator/states-issue-command.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/generator/states-issue-command.c b/generator/states-issue-command.c
index cce43d7..5d2a7e6 100644
--- a/generator/states-issue-command.c
+++ b/generator/states-issue-command.c
@@ -42,7 +42,7 @@
   h->request.count = htobe32 ((uint32_t) cmd->count);
   h->wbuf = &h->request;
   h->wlen = sizeof (h->request);
-  if (cmd->type == NBD_CMD_WRITE)
+  if (cmd->type == NBD_CMD_WRITE || cmd->next)
     h->wflags = MSG_MORE;
   SET_NEXT_STATE (%SEND_REQUEST);
   return 0;
@@ -70,6 +70,8 @@
   if (cmd->type == NBD_CMD_WRITE) {
     h->wbuf = cmd->data;
     h->wlen = cmd->count;
+    if (cmd->next && cmd->count < 64 * 1024)
+      h->wflags = MSG_MORE;
     SET_NEXT_STATE (%SEND_WRITE_PAYLOAD);
   }
   else
-- 
2.20.1
Richard W.M. Jones
2019-Jun-14 07:57 UTC
Re: [Libguestfs] [libnbd PATCH 2/2] states: Another use for MSG_MORE
On Wed, Jun 12, 2019 at 05:04:05PM -0500, Eric Blake wrote:
> Following up to cf1a3045, if we know that we have more requests to
> transmit (because the user was queueing up requests while we were busy
> elsewhere), and our current request is short (a non-write, or a write
> with a small payload), then our current command can be batched with
> the next command.
> 
> The numbers here were not as dramatic and may be merely in the noise;
> over three runs of:
> 
> $ time ~/nbdkit/nbdkit memory 100M --run 'examples/threaded-reads-and-writes localhost 10809'
> 
> my machine improved runtime from 12.68s to 12.59s.
> ---
>  generator/states-issue-command.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/generator/states-issue-command.c b/generator/states-issue-command.c
> index cce43d7..5d2a7e6 100644
> --- a/generator/states-issue-command.c
> +++ b/generator/states-issue-command.c
> @@ -42,7 +42,7 @@
>    h->request.count = htobe32 ((uint32_t) cmd->count);
>    h->wbuf = &h->request;
>    h->wlen = sizeof (h->request);
> -  if (cmd->type == NBD_CMD_WRITE)
> +  if (cmd->type == NBD_CMD_WRITE || cmd->next)
>      h->wflags = MSG_MORE;
>    SET_NEXT_STATE (%SEND_REQUEST);
>    return 0;
> @@ -70,6 +70,8 @@
>    if (cmd->type == NBD_CMD_WRITE) {
>      h->wbuf = cmd->data;
>      h->wlen = cmd->count;
> +    if (cmd->next && cmd->count < 64 * 1024)
> +      h->wflags = MSG_MORE;
>      SET_NEXT_STATE (%SEND_WRITE_PAYLOAD);
>    }
>    else

We may as well do this, ACK series.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html