Richard W.M. Jones
2021-Jul-27 11:16 UTC
[Libguestfs] Some questions about nbdkit vs qemu performance affecting virt-v2v
Hi Eric, a couple of questions below about nbdkit performance.

Modular virt-v2v will use disk pipelines everywhere.  The input
pipeline looks something like this:

  socket <- cow filter <- cache filter <- nbdkit
                                           curl|vddk

We found there's a notable slowdown in at least one case: when the
source plugin is very slow (e.g. the curl plugin to a slow and remote
website, or VDDK in general), everything runs very slowly.

I made a simple test case to demonstrate this:

$ virt-builder fedora-33
$ time ./nbdkit --filter=cache --filter=delay file /var/tmp/fedora-33.img \
    delay-read=500ms \
    --run 'virt-inspector --format=raw -a "$uri" -vx'

This uses a local file with the delay filter on top injecting
half-second delays into every read.  It "feels" a lot like the slow
case we were observing.  Virt-v2v also does inspection as a first
step when converting an image, so using virt-inspector is somewhat
realistic.

Unfortunately this actually runs far too slowly for me to wait around
- at least 30 mins, and probably a lot longer.  This compares to only
7 seconds if you remove the delay filter.

Reducing the delay to 50ms means at least it finishes in a reasonable
time:

$ time ./nbdkit --filter=cache --filter=delay file /var/tmp/fedora-33.img \
    delay-read=50ms \
    --run 'virt-inspector --format=raw -a "$uri"'

real    5m16.298s
user    0m0.509s
sys     0m2.894s

In the above scenario the cache filter is not actually doing anything
(since virt-inspector does not write).  Adding cache-on-read=true lets
us cache the reads, avoiding going through the "slow" plugin in many
cases, and the result is a lot better:

$ time ./nbdkit --filter=cache --filter=delay file /var/tmp/fedora-33.img \
    delay-read=50ms cache-on-read=true \
    --run 'virt-inspector --format=raw -a "$uri"'

real    0m27.731s
user    0m0.304s
sys     0m1.771s

However this is still slower than the old method which used qcow2 +
qemu's copy-on-read.  It's harder to demonstrate this, but I modified
virt-inspector to use the copy-on-read setting (which it doesn't do
normally).  On top of nbdkit with a 50ms delay and no other filters:

qemu + copy-on-read backed by nbdkit delay-read=50ms file:
real    0m23.251s

So 23s is the time to beat.  (I believe that with longer delays, the
gap between qemu and nbdkit increases in favour of qemu.)

Q1: What other ideas could we explore to improve performance?

- - -

In real scenarios we'll actually want to combine cow + cache, where
cow is caching writes and cache is caching reads:

  socket <- cow filter <- cache filter       <- nbdkit
                          cache-on-read=true    curl|vddk

The cow filter is necessary to prevent changes being written back to
the pristine source image.

This is actually surprisingly efficient, making no noticeable
difference in this test:

$ time ./nbdkit --filter=cow --filter=cache --filter=delay \
    file /var/tmp/fedora-33.img \
    delay-read=50ms cache-on-read=true \
    --run 'virt-inspector --format=raw -a "$uri"'

real    0m27.193s
user    0m0.283s
sys     0m1.776s

Q2: Should we consider a "cow-on-read" flag to the cow filter (thus
removing the need to use the cache filter at all)?

Rich.

--
Richard Jones, Virtualization Group, Red Hat  http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/
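For reference, the read pattern behind these numbers can be captured
by putting nbdkit's log filter in front of the cache in the same
test.  A minimal sketch (the logfile path is arbitrary):

$ ./nbdkit --filter=log --filter=cache --filter=delay \
    file /var/tmp/fedora-33.img \
    logfile=/tmp/inspect.log delay-read=50ms \
    --run 'virt-inspector --format=raw -a "$uri"'

The log records each client request with its offset, size and
completion, which makes it possible to see how many reads the
inspection issues, whether any of them are in flight at the same
time, and how often the same offsets are requested more than once.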
Martin Kletzander
2021-Jul-27 12:18 UTC
[Libguestfs] Some questions about nbdkit vs qemu performance affecting virt-v2v
On Tue, Jul 27, 2021 at 12:16:59PM +0100, Richard W.M. Jones wrote:
>Hi Eric, a couple of questions below about nbdkit performance.
>
>[...]
>
>So 23s is the time to beat.  (I believe that with longer delays, the
>gap between qemu and nbdkit increases in favour of qemu.)
>
>Q1: What other ideas could we explore to improve performance?

First thing that came to mind: could it be that qemu's copy-on-read
caches bigger blocks, making it effectively do some small read-ahead
as well?

>[...]
>
>Q2: Should we consider a "cow-on-read" flag to the cow filter (thus
>removing the need to use the cache filter at all)?

That would make at least some sense, since there is cow-on-cache
already (albeit a little confusing to me personally).
I presume it would not increase the size of the difference (when
using qemu-img rebase) at all, right?  However, I do not see how it
would be faster than the existing:

  cow <- cache[cache-on-read]

Martin
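If the difference really is read-ahead, one way to test that theory
on the nbdkit side is the existing readahead filter, slotted between
the cache and the slow plugin in Rich's test.  A rough sketch
(whether it helps at all depends on how sequential virt-inspector's
access pattern actually is):

$ time ./nbdkit --filter=cache --filter=readahead --filter=delay \
    file /var/tmp/fedora-33.img \
    delay-read=50ms cache-on-read=true \
    --run 'virt-inspector --format=raw -a "$uri"'

Comparing this against the 27s figure above would show whether
batching sequential reads into larger requests closes any of the gap
to qemu's copy-on-read.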
Eric Blake
2021-Jul-29 01:50 UTC
[Libguestfs] Some questions about nbdkit vs qemu performance affecting virt-v2v
On Tue, Jul 27, 2021 at 12:16:59PM +0100, Richard W.M. Jones wrote:
> Hi Eric, a couple of questions below about nbdkit performance.
>
> [...]
>
> Reducing the delay to 50ms means at least it finishes in a reasonable
> time:
>
> $ time ./nbdkit --filter=cache --filter=delay file /var/tmp/fedora-33.img \
>     delay-read=50ms \
>     --run 'virt-inspector --format=raw -a "$uri"'
>
> real    5m16.298s
> user    0m0.509s
> sys     0m2.894s

Sounds like the reads are rather serialized (the application is not
proceeding to do a second read until after it has the result of the
first read) rather than highly parallel (where the application would
be reading multiple sites in the image at once, possibly by
requesting the start of a read at two different offsets before
knowing which of those two offsets is even useful).  There's also a
question of how frequently a given portion of the disk image is
re-read (caching will speed things up if data is revisited multiple
times, but just adds overhead if the reads are truly once-only access
for the life of the process).

> In the above scenario the cache filter is not actually doing anything
> (since virt-inspector does not write).  Adding cache-on-read=true lets
> us cache the reads, avoiding going through the "slow" plugin in many
> cases, and the result is a lot better:
>
> [...]
>
> real    0m27.731s
> user    0m0.304s
> sys     0m1.771s

Okay, that sounds like there is indeed frequent re-reading of
portions of the disk (or at least reading of nearby smaller offsets
that fall within the same larger granularity used by the cache).

> However this is still slower than the old method which used qcow2 +
> qemu's copy-on-read.  It's harder to demonstrate this, but I modified
> virt-inspector to use the copy-on-read setting (which it doesn't do
> normally).  On top of nbdkit with a 50ms delay and no other filters:
>
> qemu + copy-on-read backed by nbdkit delay-read=50ms file:
> real    0m23.251s

qemu's copy-on-read creates a qcow2 image backed by a read-only base
image; any read that the qcow2 can't satisfy causes the entire
cluster to be read from the backing image into the qcow2 file, even
if that cluster is larger than what the client was actually reading.
It will benefit from the same speedups of only hitting a given region
of the backing file once in the life of the process.
But it also assumes the presence of a backing chain.  If you try to
use copy-on-read on something that does not have a backing chain
(such as a direct use of an NBD link), the performance suffers (as we
discussed on IRC).  My understanding is that for every read
operation, the COR code does a block status query to see whether the
data was local or came from the backing chain; but in the case of an
NBD image which does not have a backing chain from qemu's point of
view, EVERY block status operation comes back as being local, and the
COR has nothing further to do - so the performance penalty is because
of the extra time spent on that block status call, particularly if
that results in another round trip NBD command over the wire before
any reading happens.

> So 23s is the time to beat.  (I believe that with longer delays, the
> gap between qemu and nbdkit increases in favour of qemu.)
>
> Q1: What other ideas could we explore to improve performance?

Have you played with block sizing?  (Reading the git log, you
have...)  Part of qemu's COR behavior is that for any read not found
in the qcow2 active layer, the entire cluster is copied up the
backing chain; a 512-byte client read becomes a 32k cluster read for
the default sizing.  Other block sizes may be more efficient, such as
64k or 1M per request actually sent over the wire.

> In real scenarios we'll actually want to combine cow + cache, where
> cow is caching writes and cache is caching reads:
>
>   socket <- cow filter <- cache filter       <- nbdkit
>                           cache-on-read=true    curl|vddk
>
> The cow filter is necessary to prevent changes being written back to
> the pristine source image.
>
> [...]
>
> Q2: Should we consider a "cow-on-read" flag to the cow filter (thus
> removing the need to use the cache filter at all)?

Since cow is already a form of caching (anything we touched now lives
locally, so we don't have to re-visit the original data source), yes,
it makes sense to have a cow-on-read mode that stores even reads
locally.

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
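Following up on the block sizing point: if the qcow2-overlay route is
used, the cluster size of the overlay is the knob that controls how
much each copy-on-read miss pulls from the slow source.  A sketch,
reusing the delayed nbdkit source from the earlier tests (1M is just
one of the sizes worth measuring, and the overlay path is arbitrary):

$ ./nbdkit --filter=delay file /var/tmp/fedora-33.img delay-read=50ms \
    --run '
      # a 1M cluster size means each copy-on-read miss fetches 1M at once
      qemu-img create -f qcow2 -o cluster_size=1M \
        -b "$uri" -F raw /var/tmp/overlay-1M.qcow2
    '

Whether 64k, 1M or something in between wins is workload dependent:
with per-request latencies of tens of milliseconds the trade-off is
fewer round trips versus fetching data that inspection never touches,
so only measurement against the real curl/VDDK source can settle it.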