Eric Blake
2021-May-26 21:36 UTC
[Libguestfs] FYI: perf commands I'm using to benchmark nbdcopy
On Wed, May 26, 2021 at 03:15:13PM +0100, Richard W.M. Jones wrote:
> On Wed, May 26, 2021 at 04:49:50PM +0300, Nir Soffer wrote:
> > On Wed, May 26, 2021 at 4:03 PM Richard W.M. Jones <rjones at redhat.com> wrote:
> > > In my testing, nbdcopy is a clear 4x faster than qemu-img convert, with
> > > 4 also happening to be the default number of connections/threads.
> > > Why use nbdcopy --connections=1?  That completely disables threads in
> > > nbdcopy.
> >
> > Because qemu-nbd does not report multi-conn when writing, so practically
> > you get one nbd handle for writing.
>
> Let's see if we can fix that.  Crippling nbdcopy because of a missing
> feature in qemu-nbd isn't right.  I wonder what Eric's reasoning for
> multi-conn not being safe is.

multi-conn implies that connection A writes, connection B flushes, and
connection C is then guaranteed to read what connection A wrote.
Furthermore, if clients A and B plan on doing overlapping writes, the
presence of multi-conn means that whoever flushes last is guaranteed to
have that last write stick.  Without multi-conn, even if A writes, B
writes, B flushes, then A flushes, you can end up with A's data (rather
than B's) as the final contents on disk, because the separate
connections are allowed to have separate caching regions, where the
order of flushes determines which cache (with potentially stale data)
gets flushed when.  And note that the effect of overlapping writes may
happen even when your client requests are not overlapping: if clients A
and B both write distinct 512-byte regions within a larger 4k page, a
server performing RMW caching of that page will behave as though the
writes overlapped.

During nbdcopy or qemu-img convert, we aren't reading what we just
wrote and can easily arrange to avoid overlapping writes, so we don't
care about the bulk of the multi-conn semantics (other than as a nice
hint that the server accepts multiple clients).  So at the end of the
day, it boils down to:

If the server advertised multi-conn: connect multiple clients, then
when all of them are done writing, only ONE client has to flush, and
the flush will be valid for what all of the clients wrote.

If the server did not advertise multi-conn, but still allows multiple
clients: connect those clients, write to distinct areas (avoid
overlapping writes, and hopefully your writes are sized large enough
that you are also avoiding overlapping cache granularities); then when
all clients are finished writing, ALL of them must call flush (ideally,
only the first flush takes any real time, and the server can optimize
the later flushes as having nothing further to flush - but we can't
guarantee that).

New enough nbdkit also has the multi-conn filter where you can play
around with different multi-conn policies :)

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
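A minimal sketch of the two flush rules described above, written
against the libnbd C API.  The URI ("nbd://localhost/export"), the
number of connections and the region layout are illustrative
assumptions, not anything taken from this thread:

/* Sketch: one flush suffices when multi-conn is advertised; otherwise
 * every connection must flush its own (possibly separate) cache. */
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <libnbd.h>

#define NR_CONNS 4
#define REGION (1024 * 1024)    /* each connection writes its own 1M region */

int
main (void)
{
  struct nbd_handle *h[NR_CONNS];
  static char buf[REGION];
  size_t i;

  memset (buf, 0x55, sizeof buf);

  for (i = 0; i < NR_CONNS; ++i) {
    h[i] = nbd_create ();
    if (h[i] == NULL ||
        nbd_connect_uri (h[i], "nbd://localhost/export") == -1) {
      fprintf (stderr, "%s\n", nbd_get_error ());
      exit (EXIT_FAILURE);
    }
  }

  /* Non-overlapping writes: connection i owns
   * bytes [i*REGION, (i+1)*REGION). */
  for (i = 0; i < NR_CONNS; ++i) {
    if (nbd_pwrite (h[i], buf, sizeof buf, i * (uint64_t) REGION, 0) == -1) {
      fprintf (stderr, "%s\n", nbd_get_error ());
      exit (EXIT_FAILURE);
    }
  }

  if (nbd_can_multi_conn (h[0]) == 1) {
    /* multi-conn advertised: a single flush covers what every
     * connection wrote. */
    if (nbd_flush (h[0], 0) == -1) {
      fprintf (stderr, "%s\n", nbd_get_error ());
      exit (EXIT_FAILURE);
    }
  }
  else {
    /* No multi-conn: each connection may have its own cache, so
     * every connection must flush before the data is known stable. */
    for (i = 0; i < NR_CONNS; ++i) {
      if (nbd_flush (h[i], 0) == -1) {
        fprintf (stderr, "%s\n", nbd_get_error ());
        exit (EXIT_FAILURE);
      }
    }
  }

  for (i = 0; i < NR_CONNS; ++i) {
    nbd_shutdown (h[i], 0);
    nbd_close (h[i]);
  }
  return 0;
}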
Nir Soffer
2021-May-26 22:11 UTC
[Libguestfs] FYI: perf commands I'm using to benchmark nbdcopy
On Thu, May 27, 2021 at 12:36 AM Eric Blake <eblake at redhat.com> wrote:
>
> On Wed, May 26, 2021 at 03:15:13PM +0100, Richard W.M. Jones wrote:
> > On Wed, May 26, 2021 at 04:49:50PM +0300, Nir Soffer wrote:
> > > On Wed, May 26, 2021 at 4:03 PM Richard W.M. Jones <rjones at redhat.com> wrote:
> > > > In my testing, nbdcopy is a clear 4x faster than qemu-img convert, with
> > > > 4 also happening to be the default number of connections/threads.
> > > > Why use nbdcopy --connections=1?  That completely disables threads in
> > > > nbdcopy.
> > >
> > > Because qemu-nbd does not report multi-conn when writing, so practically
> > > you get one nbd handle for writing.
> >
> > Let's see if we can fix that.  Crippling nbdcopy because of a missing
> > feature in qemu-nbd isn't right.  I wonder what Eric's reasoning for
> > multi-conn not being safe is.
>
> multi-conn implies that connection A writes, connection B flushes, and
> connection C is then guaranteed to read what connection A wrote.
> Furthermore, if clients A and B plan on doing overlapping writes, the
> presence of multi-conn means that whoever flushes last is guaranteed to
> have that last write stick.  Without multi-conn, even if A writes, B
> writes, B flushes, then A flushes, you can end up with A's data (rather
> than B's) as the final contents on disk, because the separate
> connections are allowed to have separate caching regions, where the
> order of flushes determines which cache (with potentially stale data)
> gets flushed when.  And note that the effect of overlapping writes may
> happen even when your client requests are not overlapping: if clients A
> and B both write distinct 512-byte regions within a larger 4k page, a
> server performing RMW caching of that page will behave as though the
> writes overlapped.
>
> During nbdcopy or qemu-img convert, we aren't reading what we just
> wrote and can easily arrange to avoid overlapping writes, so we don't
> care about the bulk of the multi-conn semantics (other than as a nice
> hint that the server accepts multiple clients).  So at the end of the
> day, it boils down to:
>
> If the server advertised multi-conn: connect multiple clients, then
> when all of them are done writing, only ONE client has to flush, and
> the flush will be valid for what all of the clients wrote.
>
> If the server did not advertise multi-conn, but still allows multiple
> clients: connect those clients, write to distinct areas (avoid
> overlapping writes, and hopefully your writes are sized large enough
> that you are also avoiding overlapping cache granularities); then when
> all clients are finished writing, ALL of them must call flush (ideally,
> only the first flush takes any real time, and the server can optimize
> the later flushes as having nothing further to flush - but we can't
> guarantee that).

In the nbdcopy case, all writers write 128 MiB segments, except the
last one, which may be shorter, but that cannot overlap with anything.
So I think we cannot have overlapping writes due to partial blocks.

Optimizing flush to do only one flush when the server supports
multi-conn seems like a very low priority to me, so we can do this
later.

Looks like nbd_ops_flush is already doing the right thing, flushing all
open handles.  So the only change needed seems to be:

diff --git a/copy/main.c b/copy/main.c
index b9dbe1d..924fedb 100644
--- a/copy/main.c
+++ b/copy/main.c
@@ -321,10 +321,6 @@ main (int argc, char *argv[])
     exit (EXIT_FAILURE);
   }
 
-  /* If multi-conn is not supported, force connections to 1. */
-  if (! src->ops->can_multi_conn (src) || ! dst->ops->can_multi_conn (dst))
-    connections = 1;
-
   /* Calculate the number of threads from the number of connections. */
   if (threads == 0) {
     long t;

We always use the requested number of connections, ignoring
can_multi_conn.  What do you think?
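A minimal illustration of the "flushing all open handles" behaviour
referred to above, in libnbd terms.  The function name and the handle
array are hypothetical, not code copied from nbdcopy's nbd_ops_flush:

/* Hypothetical helper: flush every open libnbd handle so the copied
 * data is durable even when the server did not advertise multi-conn.
 * With multi-conn the extra flushes are redundant but harmless. */
#include <stdio.h>
#include <stdlib.h>
#include <libnbd.h>

void
flush_all_handles (struct nbd_handle **handles, size_t nr_handles)
{
  size_t i;

  for (i = 0; i < nr_handles; ++i) {
    if (nbd_flush (handles[i], 0) == -1) {
      fprintf (stderr, "nbd_flush: %s\n", nbd_get_error ());
      exit (EXIT_FAILURE);
    }
  }
}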