Richard W.M. Jones
2020-Aug-05 11:15 UTC
[Libguestfs] More parallelism in VDDK driver (was: Re: CFME-5.11.7.3 Perf. Tests)
[NB: Adding PUBLIC mailing list because this is upstream discussion]

On Mon, Aug 03, 2020 at 06:27:04PM +0100, Richard W.M. Jones wrote:
> On Mon, Aug 03, 2020 at 06:03:23PM +0300, Nir Soffer wrote:
> > On Mon, Aug 3, 2020 at 5:47 PM Richard W.M. Jones <rjones@redhat.com> wrote:
> > All this make sense, but when we upload 10 disks we have 10 connections
> > but still we cannot push data fast enough. Avoiding copies will help,
> > but I don't expect huge difference.
> >
> > My guess is the issue is on the other side - pulling data from vmware.
>
> I can believe this too. VDDK is really slow, and especially the way
> we use it is probably not optimal either -- but it has a confusing
> threading model and I don't know if we can safely use a more parallel
> thread model:
>
> https://github.com/libguestfs/nbdkit/blob/89a36b1fab8302ddc370695d386a28a03a74eae7/plugins/vddk/vddk.c#L505
>
> I may have a play around with this tomorrow.

The threading model allowed by VDDK is restrictive. The rules are here:

https://code.vmware.com/docs/11750/virtual-disk-development-kit-programming-guide/GUID-6BE903E8-DC70-46D9-98E4-E34A2002C2AD.html

I did a bit of testing, and it's possible to do better than what we
are doing at the moment. Not sure at present if this will be easy or
will add a lot of complexity. Read on ...

I found through experimentation that it is possible to open multiple
VDDK handles pointing to the same disk. This would allow us to use
SERIALIZE_REQUESTS (instead of SERIALIZE_ALL_REQUESTS) and have
overlapping calls through different handles all pointing back to the
same server/disk. We should have to change all open/close calls to
make the request through a single background thread - see document
above for why.

Adding a background thread and all the RPC needed to marshall these
calls is the part which would add the complexity.

However I suspect we might be able to get away with just adding a
mutex around open/close. The open/close requests would happen on
different threads but would not overlap. This is contrary to the
rules above, but it could be sufficient. This is what I'm testing at
the moment.

It is definitely *not* possible to move to PARALLEL since nbdkit would
make requests in parallel on the same VDDK handle, which is not
allowed. (I did try this to see if the document above was serious,
and it crashed in all kinds of strange ways, so I guess yes they are
serious.)

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW
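
The "mutex around open/close" idea can be sketched in plain C. This is only an illustration under stated assumptions, not the plugin code: it does not link against VDDK or nbdkit, fake_disk_open and fake_disk_close are hypothetical stand-ins for VixDiskLib_Open and VixDiskLib_Close, and plugin_open, plugin_close and run_connections are names invented for this sketch. Each simulated connection runs on its own thread, and one process-wide lock keeps open and close calls from overlapping.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Instrumentation only: how many threads are inside open/close right
   now, and the highest number ever seen at once. */
static atomic_int inside;
static atomic_int max_inside;

static void note_enter(void)
{
  int now = atomic_fetch_add(&inside, 1) + 1;
  int seen = atomic_load(&max_inside);
  /* Raise the high-water mark if this thread pushed it higher. */
  while (now > seen &&
         !atomic_compare_exchange_weak(&max_inside, &seen, now))
    ;
}

static void note_leave(void)
{
  atomic_fetch_sub(&inside, 1);
}

/* Hypothetical stand-ins for VixDiskLib_Open and VixDiskLib_Close. */
struct fake_disk { int conn; };

static struct fake_disk *fake_disk_open(int conn)
{
  struct fake_disk *d = malloc(sizeof *d);
  if (d != NULL)
    d->conn = conn;
  return d;
}

static void fake_disk_close(struct fake_disk *d)
{
  free(d);
}

/* One process-wide lock shared by every connection. */
static pthread_mutex_t open_close_lock = PTHREAD_MUTEX_INITIALIZER;

static struct fake_disk *plugin_open(int conn)
{
  struct fake_disk *d;

  pthread_mutex_lock(&open_close_lock);
  note_enter();
  d = fake_disk_open(conn);
  note_leave();
  pthread_mutex_unlock(&open_close_lock);
  return d;
}

static void plugin_close(struct fake_disk *d)
{
  pthread_mutex_lock(&open_close_lock);
  note_enter();
  fake_disk_close(d);
  note_leave();
  pthread_mutex_unlock(&open_close_lock);
}

/* Each simulated connection opens and closes its own handle repeatedly. */
static void *connection_thread(void *arg)
{
  int conn = *(int *)arg;

  for (int i = 0; i < 1000; i++) {
    struct fake_disk *d = plugin_open(conn);
    if (d != NULL)
      plugin_close(d);
  }
  return NULL;
}

#define MAX_CONNS 64

/* Run nthreads simulated connections and return the largest number of
   threads that were ever inside open/close at the same time. */
int run_connections(int nthreads)
{
  pthread_t tid[MAX_CONNS];
  int ids[MAX_CONNS];
  int started = 0;

  assert(nthreads > 0 && nthreads <= MAX_CONNS);
  atomic_store(&inside, 0);
  atomic_store(&max_inside, 0);

  for (int i = 0; i < nthreads; i++) {
    ids[i] = i;
    if (pthread_create(&tid[started], NULL, connection_thread, &ids[i]) != 0)
      break;
    started++;
  }
  for (int i = 0; i < started; i++)
    pthread_join(tid[i], NULL);

  return atomic_load(&max_inside);
}
```

Calling run_connections(8) should report a maximum overlap of 1, since every open and close passes through the same lock. Note that this only shows the calls do not overlap; they still come from different threads, which is the part that goes against the VDDK rules mentioned above.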
Nir Soffer
2020-Aug-05 11:39 UTC
Re: [Libguestfs] More parallelism in VDDK driver (was: Re: CFME-5.11.7.3 Perf. Tests)
On Wed, Aug 5, 2020 at 2:15 PM Richard W.M. Jones <rjones@redhat.com> wrote:
>
> [NB: Adding PUBLIC mailing list because this is upstream discussion]
>
> On Mon, Aug 03, 2020 at 06:27:04PM +0100, Richard W.M. Jones wrote:
> > On Mon, Aug 03, 2020 at 06:03:23PM +0300, Nir Soffer wrote:
> > > On Mon, Aug 3, 2020 at 5:47 PM Richard W.M. Jones <rjones@redhat.com> wrote:
> > > All this make sense, but when we upload 10 disks we have 10 connections
> > > but still we cannot push data fast enough. Avoiding copies will help,
> > > but I don't expect huge difference.
> > >
> > > My guess is the issue is on the other side - pulling data from vmware.
> >
> > I can believe this too. VDDK is really slow, and especially the way
> > we use it is probably not optimal either -- but it has a confusing
> > threading model and I don't know if we can safely use a more parallel
> > thread model:
> >
> > https://github.com/libguestfs/nbdkit/blob/89a36b1fab8302ddc370695d386a28a03a74eae7/plugins/vddk/vddk.c#L505
> >
> > I may have a play around with this tomorrow.
>
> The threading model allowed by VDDK is restrictive. The rules are here:
>
> https://code.vmware.com/docs/11750/virtual-disk-development-kit-programming-guide/GUID-6BE903E8-DC70-46D9-98E4-E34A2002C2AD.html
>
> I did a bit of testing, and it's possible to do better than what we
> are doing at the moment. Not sure at present if this will be easy or
> will add a lot of complexity. Read on ...
>
> I found through experimentation that it is possible to open multiple
> VDDK handles pointing to the same disk. This would allow us to use
> SERIALIZE_REQUESTS (instead of SERIALIZE_ALL_REQUESTS) and have
> overlapping calls through different handles all pointing back to the
> same server/disk. We should have to change all open/close calls to
> make the request through a single background thread - see document
> above for why.
>
> Adding a background thread and all the RPC needed to marshall these
> calls is the part which would add the complexity.
>
> However I suspect we might be able to get away with just adding a
> mutex around open/close. The open/close requests would happen on
> different threads but would not overlap. This is contrary to the
> rules above, but it could be sufficient. This is what I'm testing at
> the moment.
>
> It is definitely *not* possible to move to PARALLEL since nbdkit would
> make requests in parallel on the same VDDK handle, which is not
> allowed. (I did try this to see if the document above was serious,
> and it crashed in all kinds of strange ways, so I guess yes they are
> serious.)

Can we use something like the file plugin? thread pool of workers,
each keeping open vddk handle, and serving requests in parallel from
the same nbd socket?

This is kind of ugly but simple, and it works great for the file
plugin - we get better performance than qemu-nbd.

But since we get low throughput even when we have 10 concurrent
handles for 10 different disks, I'm sure this will help, and the
issue may be deeper in vmware. Maybe they intentionally throttle the
clients?

> Rich.
Richard W.M. Jones
2020-Aug-05 11:58 UTC
Re: [Libguestfs] More parallelism in VDDK driver (was: Re: CFME-5.11.7.3 Perf. Tests)
On Wed, Aug 05, 2020 at 02:39:44PM +0300, Nir Soffer wrote:
> Can we use something like the file plugin? thread pool of workers,
> each keeping open vddk handle, and serving requests in parallel from
> the same nbd socket?

Yes, but this isn't implemented in the plugins, it's implemented in
the server. The server always uses a thread pool, but plugins can opt
for more or less concurrency by adjusting the thread model:

  http://libguestfs.org/nbdkit-plugin.3.html#Threads

The file plugin uses PARALLEL:

  $ nbdkit file --dump-plugin | grep thread
  max_thread_model=parallel
  thread_model=parallel

The VDDK plugin currently uses SERIALIZE_ALL_REQUESTS:

  $ nbdkit vddk --dump-plugin | grep thread
  max_thread_model=serialize_all_requests
  thread_model=serialize_all_requests

The proposal is to use SERIALIZE_REQUESTS, with an extra mutex added
by the plugin around VixDiskLib_Open and _Close calls. PARALLEL is
not possible.

> This is kind of ugly but simple, and it works great for the file
> plugin - we get better performance than qemu-nbd.
>
> But since we get low throughput even when we have 10 concurrent
> handles for 10 different disks, I'm sure this will help, and the
> issue may be deeper in vmware. Maybe they intentionally throttle the
> clients?

The whole server side seems very heavyweight, judging by how long it
takes to answer single requests. It might just be poor implementation
rather than throttling though.

Rich.
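
The difference between the thread models discussed here can be written down as a small decision function. The sketch below is a simplified model for illustration, not nbdkit code: the enum, parse_thread_model and can_overlap are hypothetical names, and the rules are a summary of the nbdkit-plugin(3) "Threads" section as I read it. The only pieces taken from real output are the strings printed on the thread_model= line of --dump-plugin.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Local model of the nbdkit thread models, least to most concurrent. */
enum thread_model {
  TM_UNKNOWN = -1,
  TM_SERIALIZE_CONNECTIONS = 0,
  TM_SERIALIZE_ALL_REQUESTS = 1,
  TM_SERIALIZE_REQUESTS = 2,
  TM_PARALLEL = 3
};

/* Parse one line such as "thread_model=serialize_requests".
   Lines with any other key (for example "max_thread_model=...")
   yield TM_UNKNOWN. */
enum thread_model parse_thread_model(const char *line)
{
  static const char prefix[] = "thread_model=";
  const char *value;

  assert(line != NULL);
  if (strncmp(line, prefix, sizeof prefix - 1) != 0)
    return TM_UNKNOWN;
  value = line + sizeof prefix - 1;

  if (strcmp(value, "serialize_connections") == 0)
    return TM_SERIALIZE_CONNECTIONS;
  if (strcmp(value, "serialize_all_requests") == 0)
    return TM_SERIALIZE_ALL_REQUESTS;
  if (strcmp(value, "serialize_requests") == 0)
    return TM_SERIALIZE_REQUESTS;
  if (strcmp(value, "parallel") == 0)
    return TM_PARALLEL;
  return TM_UNKNOWN;
}

/* Can two requests be inside the plugin at the same time?
   same_handle says whether both requests use the same handle
   (that is, the same client connection). */
bool can_overlap(enum thread_model model, bool same_handle)
{
  switch (model) {
  case TM_SERIALIZE_CONNECTIONS:   /* one connection at a time */
  case TM_SERIALIZE_ALL_REQUESTS:  /* one request at a time, server-wide */
    return false;
  case TM_SERIALIZE_REQUESTS:      /* one request at a time per handle */
    return !same_handle;
  case TM_PARALLEL:                /* no restriction from the server */
    return true;
  default:
    return false;
  }
}
```

Under this model, can_overlap(TM_SERIALIZE_REQUESTS, false) is true, which is the extra concurrency the proposal is after, while can_overlap(TM_PARALLEL, true) is also true, which is exactly the case VDDK does not allow.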
Richard W.M. Jones
2020-Aug-05 12:28 UTC
Re: [Libguestfs] More parallelism in VDDK driver (was: Re: CFME-5.11.7.3 Perf. Tests)
Nir, BTW what are you using for performance testing?

As far as I can tell it's not possible to make qemu-img convert use
multi-conn when connecting to the source (which is going to be a
problem if we want to use this stuff in virt-v2v).

Instead I've hacked up a copy of this program from libnbd:

https://github.com/libguestfs/libnbd/blob/master/examples/threaded-reads-and-writes.c

so that it only does reads and aligns requests to 512 bytes. At least
this is testing multi-conn, but there should be an easier way ...

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines. Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top
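
For readers who want to try something similar, the "aligns requests to 512 bytes" part could look roughly like the sketch below. This is an assumption about one way to do it, not the modified libnbd example itself: aligned_read_offset and xorshift64 are hypothetical helpers, the libnbd calls that would issue the read are left out, and the disk size and request length are assumed to be multiples of 512.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512

/* Small xorshift PRNG so the sketch has no library dependencies.
   The state must be non-zero. */
static uint64_t xorshift64(uint64_t *state)
{
  uint64_t x = *state;

  x ^= x << 13;
  x ^= x >> 7;
  x ^= x << 17;
  *state = x;
  return x;
}

/* Pick a pseudo-random offset for a read of len bytes on a disk of
   disk_size bytes.  The result is a multiple of 512 and the whole
   read stays inside the disk. */
uint64_t aligned_read_offset(uint64_t disk_size, uint32_t len,
                             uint64_t *state)
{
  uint64_t nr_slots;

  assert(*state != 0);
  assert(len > 0 && len % SECTOR_SIZE == 0);
  assert(disk_size >= len && disk_size % SECTOR_SIZE == 0);

  /* Number of sector-aligned starting positions that fit the read. */
  nr_slots = (disk_size - len) / SECTOR_SIZE + 1;
  return (xorshift64(state) % nr_slots) * SECTOR_SIZE;
}
```

Each reader thread would keep its own state value and pass the returned offset, together with len, to whatever read call it uses on its connection.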