On 5/29/19 10:13 PM, Eric Blake wrote:
> Benchmark-wise, using the same setup as in commit e897ed70, I see
> an order-of-magnitude slowdown:
>
> Pre-patch, the runs averaged 1.266s, 1.30E+08 bits/s
> Post-patch, the runs averaged 11.663s, 1.41E+07 bits/s
>
> This will need further profiling to determine how much is nbdkit's
> fault, and how much is libnbd's.  I think that we are probably holding
> some locks for too long, resulting in some serialized performance.
> Also, the standalone code was able to run read of command 1 in
> parallel with write of command 2 via separate threads, whereas
> libnbd's state machine is serializing everything (whether or not the
> state machine spreads out the I/O to do writes from the thread calling
> nbd_aio_FOO and reads from the reader thread, the two are never run at
> once).

Rich identified and fixed the culprit - libnbd was not setting TCP_NODELAY (disabling Nagle's algorithm) the way nbd-standalone.c did, which meant that any request split across TCP window sizes waits to send the second packet until the server has ACKed the first.  While disabling Nagle's algorithm increases network overhead (you send more short packets, with their per-packet overhead, rather than bundling data into larger packets during the wait for the ACK), the wait itself is an artificial delay (the server can't process anything until the packets arrive).  libnbd 0.1.2 now disables Nagle's algorithm, and my repeat of the tests with that change showed an improvement to 2.180s, which is more easily explained by the remaining serialization: libnbd never read()s from one thread at the same time another thread is write()ing.
I'd have to intentionally cripple nbd-standalone.c to do that same serialization, to see if libnbd actually offers an overall performance gain when I/O is serialized (may not be as easy as it sounds; I was relying on blocking I/O from separate threads, but merely serializing that so that no thread is doing I/O concurrent with another risks deadlock in the case of a client sending a large NBD_CMD_READ followed by NBD_CMD_WRITE to a server that responds strictly serially, where the server is waiting for the client to finish the read's response but the client is waiting for the server to parse the write's request; libnbd works around that by using non-blocking sockets and prioritizing reads over writes rather than insisting on complete transactions).

> -/* Connect to a TCP socket, returning the fd on success */
> -static int
> -nbdplug_connect_tcp (void)
> -{
> -  struct addrinfo hints = { .ai_family = AF_UNSPEC,
> -                            .ai_socktype = SOCK_STREAM, };
> -  struct addrinfo *result, *rp;
> -  int r;
> -  const int optval = 1;
> -  int fd;
> -
> -
> -  if (setsockopt (fd, IPPROTO_TCP, TCP_NODELAY, &optval,
> -                  sizeof (int)) == -1) {
> -    nbdkit_error ("cannot set TCP_NODELAY option: %m");
> -    close (fd);
> -    return -1;
> -  }

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org
Richard W.M. Jones
2019-May-31 08:43 UTC
Re: [Libguestfs] [nbdkit PATCH 3/4] nbd: Use libnbd 0.1
On Thu, May 30, 2019 at 02:00:48PM -0500, Eric Blake wrote:
> [...] which is more easily explained by the
> serialization nature that libnbd is never read()ing from one thread at
> the same time another thread is write()ing.
>
> [...] libnbd
> works around that by using non-blocking sockets and prioritizing reads
> over writes rather than insisting on complete transactions).

Indeed, this is something that even a more complex state machine in
libnbd could not solve.  It fundamentally assumes that there is one
thread running the handle at any one time, but to both write to and
read from the socket simultaneously would require two threads.  That's
a large architectural change (I'm not even sure how to do it).

The question is whether this use case can be covered by using
multi-conn, or if there's a requirement for parallel writes and reads
on the same socket.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/
On 5/31/19 3:43 AM, Richard W.M. Jones wrote:
> On Thu, May 30, 2019 at 02:00:48PM -0500, Eric Blake wrote:
>> [...] which is more easily explained by the
>> serialization nature that libnbd is never read()ing from one thread at
>> the same time another thread is write()ing.
>>
>> [...] libnbd
>> works around that by using non-blocking sockets and prioritizing reads
>> over writes rather than insisting on complete transactions).
>
> Indeed, this is something that even a more complex state machine in
> libnbd could not solve.  It fundamentally assumes that there is one
> thread running the handle at any one time, but to both write to and
> read from the socket simultaneously would require two threads.  That's
> a large architectural change (I'm not even sure how to do it).

There's always POSIX aio_* (which, in glibc, creates a worker thread on
your behalf rather than being truly single-threaded async).  And
https://blog.cloudflare.com/io_submit-the-epoll-alternative-youve-never-heard-about/
states that the Linux-only io_submit() (probably via libaio) can do a
truly async single-threaded network polling loop that can therefore
manage both simultaneous reads and writes; but that requires an entirely
different control loop (io_getevents instead of poll), so it may be a
lot of work to wire up before we'd even know whether it actually adds
any gain.

> The question is whether this use case can be covered by using
> multi-conn, or if there's a requirement for parallel writes and reads
> on the same socket.

A client truly wanting to saturate performance will probably implement
its own multi-conn on top of libnbd.  But if we want libnbd to be used
in qemu, which is currently using coroutines, then yes, qemu is
currently exploiting the fact that coroutines can track simultaneous
read and write on the same socket.  So I don't think we have to
redesign everything now, but it is food for thought for future
performance improvements.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org