Richard W.M. Jones
2019-Jun-09 18:34 UTC
[Libguestfs] [PATCH libnbd] states: In recv_into_rbuf and send_from_wbuf loop until EAGAIN.
I thought this would produce a fairly dramatic performance gain; in fact I couldn't measure any performance difference at all. I think what's happening is that we're actually paying an extra syscall (to discover the socket would block) and then doing the poll anyway, so I don't know if this patch is worth having. It could be argued that it makes the code shorter (and therefore simpler), so I'm posting it, but I'm ambivalent.

Rich.
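To see where that extra syscall lands, here is a minimal sketch of a poll-driven caller of recv_into_rbuf. This is a hypothetical driver loop for illustration only (libnbd's real caller is the generated state machine), assuming only the return-value contract visible in the patch below: 0 = move to next state, 1 = would block, -1 = error.

#include <poll.h>

struct nbd_handle;                       /* opaque here */

/* Same contract as recv_into_rbuf in states.c (see patch below). */
extern int recv_into_rbuf (struct nbd_handle *h);

static int
drive_receive (struct nbd_handle *h, int fd)
{
  for (;;) {
    int r = recv_into_rbuf (h);
    if (r == -1)
      return -1;                         /* set_error was already called */
    if (r == 0)
      return 0;                          /* rbuf filled: next state */
    /* r == 1: with the patch applied, the recv(2) that just happened
     * returned EAGAIN -- a syscall spent only to learn that we must
     * block -- and now we pay for the poll(2) we would have done anyway.
     */
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    if (poll (&pfd, 1, -1) == -1)
      return -1;
  }
}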
Previously we performed a single call to recv(2) or send(2) (or the
GnuTLS equivalents), and even if more data/space was immediately
available to receive/send we would return to poll. Instead of this,
loop until the socket returns EAGAIN.
---
 generator/states.c | 91 ++++++++++++++++++++++------------------------
 1 file changed, 43 insertions(+), 48 deletions(-)

diff --git a/generator/states.c b/generator/states.c
index 145e8c1..cde934a 100644
--- a/generator/states.c
+++ b/generator/states.c
@@ -46,43 +46,40 @@ recv_into_rbuf (struct nbd_handle *h)
   void *rbuf;
   size_t rlen;
 
-  if (h->rlen == 0)
-    return 0; /* move to next state */
+  while (h->rlen > 0) {
+    /* As a special case h->rbuf is allowed to be NULL, meaning
+     * throw away the data.
+     */
+    if (h->rbuf) {
+      rbuf = h->rbuf;
+      rlen = h->rlen;
+    }
+    else {
+      rbuf = &buf;
+      rlen = h->rlen > sizeof buf ? sizeof buf : h->rlen;
+    }
 
-  /* As a special case h->rbuf is allowed to be NULL, meaning
-   * throw away the data.
-   */
-  if (h->rbuf) {
-    rbuf = h->rbuf;
-    rlen = h->rlen;
-  }
-  else {
-    rbuf = &buf;
-    rlen = h->rlen > sizeof buf ? sizeof buf : h->rlen;
-  }
-
-  r = h->sock->ops->recv (h, h->sock, rbuf, rlen);
-  if (r == -1) {
-    if (errno == EAGAIN || errno == EWOULDBLOCK)
-      return 1; /* more data */
-    /* sock->ops->recv called set_error already. */
-    return -1;
-  }
-  if (r == 0) {
-    set_error (0, "recv: server disconnected unexpectedly");
-    return -1;
-  }
+    r = h->sock->ops->recv (h, h->sock, rbuf, rlen);
+    if (r == -1) {
+      if (errno == EAGAIN || errno == EWOULDBLOCK)
+        return 1; /* more data */
+      /* sock->ops->recv called set_error already. */
+      return -1;
+    }
+    if (r == 0) {
+      set_error (0, "recv: server disconnected unexpectedly");
+      return -1;
+    }
 #ifdef DUMP_PACKETS
-  if (h->rbuf != NULL)
-    nbd_internal_hexdump (h->rbuf, r, stderr);
+    if (h->rbuf != NULL)
+      nbd_internal_hexdump (h->rbuf, r, stderr);
 #endif
-  if (h->rbuf)
-    h->rbuf += r;
-  h->rlen -= r;
-  if (h->rlen == 0)
-    return 0; /* move to next state */
-  else
-    return 1; /* more data */
+    if (h->rbuf)
+      h->rbuf += r;
+    h->rlen -= r;
+  }
+
+  return 0; /* move to next state */
 }
 
 static int
@@ -90,21 +87,19 @@ send_from_wbuf (struct nbd_handle *h)
 {
   ssize_t r;
 
-  if (h->wlen == 0)
-    return 0; /* move to next state */
-  r = h->sock->ops->send (h, h->sock, h->wbuf, h->wlen);
-  if (r == -1) {
-    if (errno == EAGAIN || errno == EWOULDBLOCK)
-      return 1; /* more data */
-    /* sock->ops->send called set_error already. */
-    return -1;
+  while (h->wlen > 0) {
+    r = h->sock->ops->send (h, h->sock, h->wbuf, h->wlen);
+    if (r == -1) {
+      if (errno == EAGAIN || errno == EWOULDBLOCK)
+        return 1; /* more data */
+      /* sock->ops->send called set_error already. */
+      return -1;
+    }
+    h->wbuf += r;
+    h->wlen -= r;
   }
-  h->wbuf += r;
-  h->wlen -= r;
-  if (h->wlen == 0)
-    return 0; /* move to next state */
-  else
-    return 1; /* more data */
+
+  return 0; /* move to next state */
 }
 
 /*----- End of prologue. -----*/
-- 
2.21.0
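For reference, the EAGAIN/EWOULDBLOCK handling above only arises because the socket is in non-blocking mode. A minimal standalone sketch of that setup (plain POSIX fcntl(2); not libnbd's actual connect path):

#include <fcntl.h>

/* Put a connected socket into non-blocking mode, so that recv(2) and
 * send(2) return -1 with errno == EAGAIN/EWOULDBLOCK instead of
 * blocking when no data/space is available.
 */
static int
set_nonblock (int fd)
{
  int flags = fcntl (fd, F_GETFL, 0);
  if (flags == -1)
    return -1;
  return fcntl (fd, F_SETFL, flags | O_NONBLOCK);
}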
Eric Blake
2019-Jun-10 22:48 UTC
Re: [Libguestfs] [PATCH libnbd] states: In recv_into_rbuf and send_from_wbuf loop until EAGAIN.
On 6/9/19 1:34 PM, Richard W.M. Jones wrote:
> Previously we performed a single call to recv(2) or send(2) (or the
> GnuTLS equivalents), and even if more data/space was immediately
> available to receive/send we would return to poll. Instead of this,
> loop until the socket returns EAGAIN.

In general, if you get a short read() or write(), it's BECAUSE the socket blocked (and an immediate additional read or write will likely just fail with EAGAIN, because you are still blocked). So I don't think this patch as-is is worthwhile.

But unless we have a larger buffer to read into, we're really limiting ourselves by the length we read. I think you might actually get a performance boost on read if you rework things to maintain a 4k or similar buffer of pre-read data and consult that instead of going into read(2) (I already added the .pending() callback since gnutls in fact does something like that already) - but then you have to try to read a full buffer amount.

So, pre-patch, you had something like:

Most replies:
  read(4) = 4
  next state
  read(4) = 4
  next state
  back to READY
  poll()

vs. NBD_CMD_READ reply:
  read(4) = 4
  next state
  read(1M) = 64k
  block
  poll()
  read(1M-64k) = 64k
  block
  poll()
  ...

where this patch didn't really change short replies, but changes large reads into:

  read(4) = 4
  next state
  read(1M) = 64k
  read(1M-64k) = -1/EAGAIN
  block
  poll()
  read(1M-64k) = 64k
  ...

so yeah, this patch does NOT look beneficial as-is. But with my idea of reading into a 4k buffer or so, we change to something like:

  buffer is empty
  service 4 to read simple_reply.magic
  read(4k) = 16; buffer contains 16 bytes pre-read
  consume 4 from buffer; buffer contains 12 bytes
  next state
  service 4 to read simple_reply.error
  consume 4 from buffer; buffer contains 8 bytes - we skipped read :)
  next state
  service 8 to read simple_reply.handle
  consume 8 from buffer; buffer is empty
  next state
  back to READY, time to block
  poll()

or even:

  service 4 to read simple_reply.magic
  read(4k) = 32; buffer contains 32 bytes pre-read
  consume 4 from buffer; buffer contains 28 bytes
  next state
  service 4 to read simple_reply.error
  consume 4 from buffer; buffer contains 24 bytes - we skipped read :)
  next state
  service 8 to read simple_reply.handle
  consume 8 from buffer; buffer contains 16 bytes
  next state
  back to READY, buffer is non-empty, so .pending returns true
  service 4 to read simple_reply.magic
  consume 4 from buffer - we again skipped read
  etc.

We do, however, lose a bit of efficiency where we are no longer reading NBD_CMD_READ replies directly into the user's buffer - we'll have to memcpy() any prefix (up to the length of the buffer, which is why sizing the buffer larger than 4k is probably a loss) before doing a direct read() into the tail of the user's buffer. So we'd have to measure the effects, both with a client that does lots of 512-byte reads and with one that does lots of large reads, to see whether pre-buffered read()ing followed by memcpy() saves enough syscalls to be noticeable.

Then on the write side, there MIGHT be a minor performance boost if we set MSG_MORE any time we complete a request but know that more requests are pending, at which point we can continue the state machine on to that next request and pack multiple commands into one TCP segment. But that's different from trying to call write() in a loop after a short write.
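To make the pre-read buffer idea concrete, here is a minimal sketch under stated assumptions: the names (struct pread_buf, pbuf_consume) are hypothetical, not an existing or proposed libnbd API; it just shows how small header reads get served from one large read(2).

#include <string.h>
#include <unistd.h>

struct pread_buf {
  char data[4096];
  size_t start, len;        /* unconsumed bytes: data[start..start+len) */
};

/* Copy up to n pre-read bytes into out.  Returns the number of bytes
 * copied (may be short), 0 on EOF, or -1 on error (errno == EAGAIN
 * meaning: go back to poll()).
 */
static ssize_t
pbuf_consume (struct pread_buf *b, int fd, void *out, size_t n)
{
  if (b->len == 0) {        /* buffer empty: refill with one big read */
    ssize_t r = read (fd, b->data, sizeof b->data);
    if (r <= 0)
      return r;             /* 0 = EOF, -1 = error or EAGAIN */
    b->start = 0;
    b->len = r;
  }
  if (n > b->len)
    n = b->len;             /* serve a short "read" from the buffer */
  memcpy (out, b->data + b->start, n);
  b->start += n;
  b->len -= n;
  return n;
}

With something like this, a .pending()-style check reduces to b->len > 0, and only NBD_CMD_READ payloads longer than the buffered prefix need a direct read() into the caller's buffer.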
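And the write-side MSG_MORE idea is roughly this (a sketch: MSG_MORE is the real Linux send(2) flag, while more_queued is a hypothetical "another request is already waiting" indicator supplied by the caller):

#include <stdbool.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Send one buffer, hinting to the kernel that more data follows so
 * several small commands can be packed into one TCP segment.
 */
static ssize_t
send_hinted (int fd, const void *buf, size_t len, bool more_queued)
{
  return send (fd, buf, len, more_queued ? MSG_MORE : 0);
}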
Loops like this are good for blocking connections, but we are intentionally using non-blocking sockets, where a short read/write more likely means we are actually blocked than that a signal interrupted us and our next attempt would succeed.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org