Eric Blake
2019-Jun-10 22:11 UTC
[Libguestfs] [nbdkit PATCH] crypto: Tweak handling of SEND_MORE
In the recent commit 3842a080 to add SEND_MORE support, I blindly
implemented the tls code as:

  if (SEND_MORE) {
    cork
    send
  } else {
    send
    uncork
  }

because it showed improvements for my test case of aio-parallel-load
from libnbd. But that test sticks to 64k I/O requests.

But with further investigation, I've learned that even though gnutls
corking works great for smaller NBD_CMD_READ replies, it can actually
pessimize behavior for larger requests (such as a client sending lots
of 2M read requests). This is because gnutls loses time both to
realloc()s to copy the entire packet into memory, as well as CPU time
spent encrypting the entire payload before anything is sent, not to
mention that it still ends up fragmenting the message to fit TCP
segment sizing.

So, let's further tweak the logic to make a decision based on a
heuristic: if we're going to split the reply for exceeding a TCP
segment anyway, then uncork before the data (this sends the header
out early as a partial packet, but that happens while the CPU is
encrypting the large payload); otherwise, there's still hope that we
can merge the previous request and this one into a single TCP segment
(which may or may not happen, based on how much overhead TLS adds,
given that 64k is the maximum TCP segment). It may turn out that we
can tune the limit for slightly better performance (maybe 32k is
smarter than 64k), but since the previous commit saw improvement with
the benchmark using 64k packets, that's the value picked for now.
Signed-off-by: Eric Blake <eblake@redhat.com>
---
 server/crypto.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/server/crypto.c b/server/crypto.c
index 3f3d96b..356f04f 100644
--- a/server/crypto.c
+++ b/server/crypto.c
@@ -345,6 +345,12 @@ crypto_recv (struct connection *conn, void *vbuf, size_t len)
   return 1;
 }

+/* If this send()'s length is so large that it is going to require
+ * multiple TCP segments anyway, there's no need to try and merge it
+ * with any corked data from a previous send that used SEND_MORE.
+ */
+#define MAX_SEND_MORE_LEN (64 * 1024)
+
 /* Write buffer to GnuTLS and either succeed completely
  * (returns 0) or fail (returns -1). flags is ignored for now.
  */
@@ -357,7 +363,11 @@ crypto_send (struct connection *conn, const void *vbuf, size_t len, int flags)

   assert (session != NULL);

-  if (flags & SEND_MORE)
+  if (len >= MAX_SEND_MORE_LEN) {
+    if (gnutls_record_uncork (session, GNUTLS_RECORD_WAIT) < 0)
+      return -1;
+  }
+  else if (flags & SEND_MORE)
     gnutls_record_cork (session);

   while (len > 0) {
-- 
2.20.1
Eric Blake
2019-Jun-12 19:19 UTC
Re: [Libguestfs] [nbdkit PATCH] crypto: Tweak handling of SEND_MORE
On 6/10/19 5:11 PM, Eric Blake wrote:
> In the recent commit 3842a080 to add SEND_MORE support, I blindly
> implemented the tls code as:
> if (SEND_MORE) {
>   cork
>   send
> } else {
>   send
>   uncork
> }
> because it showed improvements for my test case of aio-parallel-load
> from libnbd. But that test sticks to 64k I/O requests.
>...
> (which may or may not happen, based on how much overhead TLS adds,
> given that 64k is the maximum TCP segment). It may turn out that we
> can tune the limit for slightly better performance (maybe 32k is
> smarter than 64k), but since the previous commit saw improvement with
> the benchmark using 64k packets, that's the value picked for now.

After more measurements, and a tweak to libnbd aio-parallel-load to
make it easier to test various command sizes, I came up with this
rough table:

packet size:       512        32k        64k       2m
No cork [1]:    325,209.3  187,598.9  116,304.2  4,856.7
full cork [2]:  461,083.8  198,009.3  118,072.2  4,595.1
32k cork:       456,365.5  196,586.5  117,772.4  4,779.1
64k cork [3]:   450,443.6  194,505.9  117,226.6  4,822.2
128k cork:      455,049.4  197,991.4  117,157.9  4,953.7

[1] Using the code prior to 3842a080
[2] Using the code from 3842a080
[3] The value chosen for this patch

Some of those numbers are in the noise, but I'm feeling more confident
about my choice, and the fact that uncorking early DOES benefit large
payload sizes without too much penalty to small payload sizes. So I'm
going ahead and pushing this patch with one additional tweak:

> @@ -357,7 +363,11 @@ crypto_send (struct connection *conn, const void *vbuf, size_t len, int flags)
>
>    assert (session != NULL);
>
> -  if (flags & SEND_MORE)
> +  if (len >= MAX_SEND_MORE_LEN) {

Here, I'm now using

  if (len + gnutls_record_check_corked (session) > MAX_SEND_MORE_LEN) {

after verifying in the gnutls source code that blindly checking the
corked length is not a significant burden (no syscalls).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
+1-919-301-3226 Virtualization: qemu.org | libvirt.org
Richard W.M. Jones
2019-Jun-12 20:56 UTC
Re: [Libguestfs] [nbdkit PATCH] crypto: Tweak handling of SEND_MORE
On Mon, Jun 10, 2019 at 05:11:26PM -0500, Eric Blake wrote:
> In the recent commit 3842a080 to add SEND_MORE support, I blindly
> implemented the tls code as:
> if (SEND_MORE) {
>   cork
>   send
> } else {
>   send
>   uncork
> }
> because it showed improvements for my test case of aio-parallel-load
> from libnbd. But that test sticks to 64k I/O requests.
>
> But with further investigation, I've learned that even though gnutls
> corking works great for smaller NBD_CMD_READ replies, it can actually
> pessimize behavior for larger requests (such as a client sending lots
> of 2M read requests). This is because gnutls loses time both to
> realloc()s to copy the entire packet into memory, as well as CPU time
> spent encrypting the entire payload before anything is sent, not to
> mention that it still ends up fragmenting the message to fit TCP
> segment sizing.
>
> So, let's further tweak the logic to make a decision based on a
> heuristic: if we're going to split the reply for exceeding a TCP
> segment anyway, then uncork before the data (this sends the header
> out early as a partial packet, but that happens while the CPU is
> encrypting the large payload); otherwise, there's still hope that we
> can merge the previous request and this one into a single TCP segment
> (which may or may not happen, based on how much overhead TLS adds,
> given that 64k is the maximum TCP segment). It may turn out that we
> can tune the limit for slightly better performance (maybe 32k is
> smarter than 64k), but since the previous commit saw improvement with
> the benchmark using 64k packets, that's the value picked for now.
>
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
>  server/crypto.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/server/crypto.c b/server/crypto.c
> index 3f3d96b..356f04f 100644
> --- a/server/crypto.c
> +++ b/server/crypto.c
> @@ -345,6 +345,12 @@ crypto_recv (struct connection *conn, void *vbuf, size_t len)
>    return 1;
>  }
>
> +/* If this send()'s length is so large that it is going to require
> + * multiple TCP segments anyway, there's no need to try and merge it
> + * with any corked data from a previous send that used SEND_MORE.
> + */
> +#define MAX_SEND_MORE_LEN (64 * 1024)
> +
>  /* Write buffer to GnuTLS and either succeed completely
>   * (returns 0) or fail (returns -1). flags is ignored for now.
>   */
> @@ -357,7 +363,11 @@ crypto_send (struct connection *conn, const void *vbuf, size_t len, int flags)
>
>    assert (session != NULL);
>
> -  if (flags & SEND_MORE)
> +  if (len >= MAX_SEND_MORE_LEN) {
> +    if (gnutls_record_uncork (session, GNUTLS_RECORD_WAIT) < 0)
> +      return -1;
> +  }
> +  else if (flags & SEND_MORE)
>      gnutls_record_cork (session);
>
>    while (len > 0) {

Looks reasonable, ACK.

In answer to the previous question about how to choose the threshold,
I have no idea either :-(

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html