Eric Blake
2022-Feb-21 15:40 UTC
[Libguestfs] [PATCH libnbd 8/8] copy: Adaptive queue size
On Sun, Feb 20, 2022 at 02:14:03PM +0200, Nir Soffer wrote:> Limit the size of the copy queue also by the number of queued bytes. > This allows using many concurrent small requests, required to get good > performance, but limiting the number of large requests that are actually > faster with lower concurrency. > > New --queue-size option added to control the maximum queue size. With 2 > MiB we can have 8 256 KiB requests per connection. The default queue > size is 16 MiB, to match the default --requests value (64) with the > default --request-size (256 KiB). Testing show that using more than 16 > 256 KiB requests with one connection do not improve anything.s/do/does/> > The new option will simplify limiting memory usage when using large > requests, like this change in virt-v2v: > https://github.com/libguestfs/virt-v2v/commit/c943420219fa0ee971fc228aa4d9127c5ce973f7 > > I tested this change with 3 images: > > - Fedora 35 + 3g of random data - hopefully simulates a real image > - Fully allocated image - the special case when every read command is > converted to a write command. > - Zero image - the special case when every read command is converted to > a zero command. > > On 2 machines: > > - laptop: Intel(R) Core(TM) i7-10850H CPU @ 2.70GHz, 12 cpus, > 1.5 MiB L2 cache per 2 cpus, 12 MiB L3 cache. > - server: Intel(R) Xeon(R) Gold 5218R CPU @ 2.10GHz, 80 cpus, > 1 MiB L2 cache per cpu, 27.5 MiB L3 cache. > > In all cases, both source and destination are served by qemu-nbd, using > --cache=none --aio=native. Because qemu-nbd does not support MULTI_CONMULTI_CONN> for writing, we are testing a single connection when copying an toDid you mean 'copying an image to'?> qemu-nbd. I tested also copying to null: since in this case we use 4 > connections (these tests are marked with /ro). > > Results for copying all images on all machines with nbdcopy v1.11.0 and > this change. "before" and "after" are average time of 10 runs. > > image machine before after queue size improvement > ==================================================================> fedora laptop 3.044 2.129 2m +43% > full laptop 4.900 3.136 2m +56% > zero laptop 3.147 2.624 2m +20% > ------------------------------------------------------------------- > fedora server 2.324 2.189 16m +6% > full server 3.521 3.380 8m +4% > zero server 2.297 2.338 16m -2% > ------------------------------------------------------------------- > fedora/ro laptop 2.040 1.663 1m +23% > fedora/ro server 1.585 1.393 2m +14% > > Signed-off-by: Nir Soffer <nsoffer at redhat.com> > --- > copy/main.c | 52 ++++++++++++++++++++++++------------- > copy/multi-thread-copying.c | 18 +++++++------ > copy/nbdcopy.h | 1 + > copy/nbdcopy.pod | 12 +++++++-- > 4 files changed, 55 insertions(+), 28 deletions(-) >> static void __attribute__((noreturn)) > usage (FILE *fp, int exitcode) > { > fprintf (fp, > "\n" > "Copy to and from an NBD server:\n" > "\n" > " nbdcopy [--allocated] [-C N|--connections=N]\n" > " [--destination-is-zero|--target-is-zero] [--flush]\n" > " [--no-extents] [-p|--progress|--progress=FD]\n" > -" [--request-size=N] [-R N|--requests=N] [-S N|--sparse=N]\n" > -" [--synchronous] [-T N|--threads=N] [-v|--verbose]\n" > +" [--request-size=N] [--queue-size=N] [-R N|--requests=N]\n"The options are listed in mostly alphabetic order already, so --queue-size before --request-size makes more sense to me.> @@ -104,33 +106,35 @@ main (int argc, char *argv[]) > { > enum { > HELP_OPTION = CHAR_MAX + 1, > LONG_OPTIONS, > SHORT_OPTIONS, > ALLOCATED_OPTION, > DESTINATION_IS_ZERO_OPTION, > FLUSH_OPTION, > NO_EXTENTS_OPTION, > REQUEST_SIZE_OPTION, > + QUEUE_SIZE_OPTION,Likewise here.> SYNCHRONOUS_OPTION, > }; > const char *short_options = "C:pR:S:T:vV"; > const struct option long_options[] = { > { "help", no_argument, NULL, HELP_OPTION }, > { "long-options", no_argument, NULL, LONG_OPTIONS }, > { "allocated", no_argument, NULL, ALLOCATED_OPTION }, > { "connections", required_argument, NULL, 'C' }, > { "destination-is-zero",no_argument, NULL, DESTINATION_IS_ZERO_OPTION }, > { "flush", no_argument, NULL, FLUSH_OPTION }, > { "no-extents", no_argument, NULL, NO_EXTENTS_OPTION }, > { "progress", optional_argument, NULL, 'p' }, > { "request-size", required_argument, NULL, REQUEST_SIZE_OPTION }, > + { "queue-size", required_argument, NULL, QUEUE_SIZE_OPTION },and here.> { "requests", required_argument, NULL, 'R' }, > { "short-options", no_argument, NULL, SHORT_OPTIONS }, > { "sparse", required_argument, NULL, 'S' }, > { "synchronous", no_argument, NULL, SYNCHRONOUS_OPTION }, > { "target-is-zero", no_argument, NULL, DESTINATION_IS_ZERO_OPTION }, > { "threads", required_argument, NULL, 'T' }, > { "verbose", no_argument, NULL, 'v' }, > { "version", no_argument, NULL, 'V' }, > { NULL } > }; > @@ -212,20 +216,28 @@ main (int argc, char *argv[]) > } > if (request_size < MIN_REQUEST_SIZE || request_size > MAX_REQUEST_SIZE || > !is_power_of_2 (request_size)) { > fprintf (stderr, > "%s: --request-size: must be a power of 2 within %d-%d\n", > prog, MIN_REQUEST_SIZE, MAX_REQUEST_SIZE); > exit (EXIT_FAILURE); > } > break; > > + case QUEUE_SIZE_OPTION:and here.> + if (sscanf (optarg, "%u", &queue_size) != 1) {Matches pre-existing use, but *scanf("%u") has an inherent limitation of being non-portable when it comes to dealing with overflow. Better is to use the strtol* family of functions when parsing user input into an integer - but that fits in a separate patch.> > -/* If the number of requests in flight exceeds the limit, poll > - * waiting for at least one request to finish. This enforces > - * the user --requests option. > +/* If the number of requests in flight or the number of queued bytes > + * exceed the limit, poll waiting for at least one request to finish.Pre-existing difficulty in reading, made worse by more words. I would suggest: If the number of requests or queued bytes in flight exceed limits, then poll until enough requests finish.> + * This enforces the user --requests and --queue-size options. > * > * NB: Unfortunately it's not possible to call this from a callback, > * since it will deadlock trying to grab the libnbd handle lock. This > * means that although the worker thread calls this and enforces the > * limit, when we split up requests into subrequests (eg. doing > * sparseness detection) we will probably exceed the user request > * limit. XXX > */ > static void > -wait_for_request_slots (size_t index) > +wait_for_request_slots (struct worker *worker) > { > - while (in_flight (index) >= max_requests) > - poll_both_ends (index); > + while (in_flight (worker->index) >= max_requests || > + worker->queue_size >= queue_size) > + poll_both_ends (worker->index); > }Looks like a nice improvement.> =head1 SYNOPSIS > > nbdcopy [--allocated] [-C N|--connections=N] > [--destination-is-zero|--target-is-zero] [--flush] > [--no-extents] [-p|--progress|--progress=FD] > - [--request-size=N] [-R N|--requests=N] [-S N|--sparse=N] > - [--synchronous] [-T N|--threads=N] [-v|--verbose] > + [--request-size=N] [--queue-size=N] [-R N|--requests=N]Another spot for my alphabetic sorting request.> @@ -156,20 +157,27 @@ following shell commands: > > Set the maximum request size in bytes. The maximum value is 32 MiB, > specified by the NBD protocol. > > =item B<-R> N > > =item B<--requests=>N > > Set the maximum number of requests in flight per NBD connection. > > +=item B<--queue-size=>NAnd again. I'm okay with you making the fixes that we pointed out, without necessarily needing to see a v2 post. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Nir Soffer
2022-Feb-22 11:48 UTC
[Libguestfs] [PATCH libnbd 8/8] copy: Adaptive queue size
On Mon, Feb 21, 2022 at 5:41 PM Eric Blake <eblake at redhat.com> wrote:> > On Sun, Feb 20, 2022 at 02:14:03PM +0200, Nir Soffer wrote: > > Limit the size of the copy queue also by the number of queued bytes. > > This allows using many concurrent small requests, required to get good > > performance, but limiting the number of large requests that are actually > > faster with lower concurrency. > > > > New --queue-size option added to control the maximum queue size. With 2 > > MiB we can have 8 256 KiB requests per connection. The default queue > > size is 16 MiB, to match the default --requests value (64) with the > > default --request-size (256 KiB). Testing show that using more than 16 > > 256 KiB requests with one connection do not improve anything. > > s/do/does/ > > > > > The new option will simplify limiting memory usage when using large > > requests, like this change in virt-v2v: > > https://github.com/libguestfs/virt-v2v/commit/c943420219fa0ee971fc228aa4d9127c5ce973f7 > > > > I tested this change with 3 images: > > > > - Fedora 35 + 3g of random data - hopefully simulates a real image > > - Fully allocated image - the special case when every read command is > > converted to a write command. > > - Zero image - the special case when every read command is converted to > > a zero command. > > > > On 2 machines: > > > > - laptop: Intel(R) Core(TM) i7-10850H CPU @ 2.70GHz, 12 cpus, > > 1.5 MiB L2 cache per 2 cpus, 12 MiB L3 cache. > > - server: Intel(R) Xeon(R) Gold 5218R CPU @ 2.10GHz, 80 cpus, > > 1 MiB L2 cache per cpu, 27.5 MiB L3 cache. > > > > In all cases, both source and destination are served by qemu-nbd, using > > --cache=none --aio=native. Because qemu-nbd does not support MULTI_CON > > MULTI_CONN > > > for writing, we are testing a single connection when copying an to > > Did you mean 'copying an image to'?Yes> > > qemu-nbd. I tested also copying to null: since in this case we use 4 > > connections (these tests are marked with /ro). > > > > Results for copying all images on all machines with nbdcopy v1.11.0 and > > this change. "before" and "after" are average time of 10 runs. > > > > image machine before after queue size improvement > > ==================================================================> > fedora laptop 3.044 2.129 2m +43% > > full laptop 4.900 3.136 2m +56% > > zero laptop 3.147 2.624 2m +20% > > ------------------------------------------------------------------- > > fedora server 2.324 2.189 16m +6% > > full server 3.521 3.380 8m +4% > > zero server 2.297 2.338 16m -2% > > ------------------------------------------------------------------- > > fedora/ro laptop 2.040 1.663 1m +23% > > fedora/ro server 1.585 1.393 2m +14% > > > > Signed-off-by: Nir Soffer <nsoffer at redhat.com> > > --- > > copy/main.c | 52 ++++++++++++++++++++++++------------- > > copy/multi-thread-copying.c | 18 +++++++------ > > copy/nbdcopy.h | 1 + > > copy/nbdcopy.pod | 12 +++++++-- > > 4 files changed, 55 insertions(+), 28 deletions(-) > > > > > static void __attribute__((noreturn)) > > usage (FILE *fp, int exitcode) > > { > > fprintf (fp, > > "\n" > > "Copy to and from an NBD server:\n" > > "\n" > > " nbdcopy [--allocated] [-C N|--connections=N]\n" > > " [--destination-is-zero|--target-is-zero] [--flush]\n" > > " [--no-extents] [-p|--progress|--progress=FD]\n" > > -" [--request-size=N] [-R N|--requests=N] [-S N|--sparse=N]\n" > > -" [--synchronous] [-T N|--threads=N] [-v|--verbose]\n" > > +" [--request-size=N] [--queue-size=N] [-R N|--requests=N]\n" > > The options are listed in mostly alphabetic order already, so > --queue-size before --request-size makes more sense to me. > > > @@ -104,33 +106,35 @@ main (int argc, char *argv[]) > > { > > enum { > > HELP_OPTION = CHAR_MAX + 1, > > LONG_OPTIONS, > > SHORT_OPTIONS, > > ALLOCATED_OPTION, > > DESTINATION_IS_ZERO_OPTION, > > FLUSH_OPTION, > > NO_EXTENTS_OPTION, > > REQUEST_SIZE_OPTION, > > + QUEUE_SIZE_OPTION, > > Likewise here. > > > SYNCHRONOUS_OPTION, > > }; > > const char *short_options = "C:pR:S:T:vV"; > > const struct option long_options[] = { > > { "help", no_argument, NULL, HELP_OPTION }, > > { "long-options", no_argument, NULL, LONG_OPTIONS }, > > { "allocated", no_argument, NULL, ALLOCATED_OPTION }, > > { "connections", required_argument, NULL, 'C' }, > > { "destination-is-zero",no_argument, NULL, DESTINATION_IS_ZERO_OPTION }, > > { "flush", no_argument, NULL, FLUSH_OPTION }, > > { "no-extents", no_argument, NULL, NO_EXTENTS_OPTION }, > > { "progress", optional_argument, NULL, 'p' }, > > { "request-size", required_argument, NULL, REQUEST_SIZE_OPTION }, > > + { "queue-size", required_argument, NULL, QUEUE_SIZE_OPTION }, > > and here. > > > { "requests", required_argument, NULL, 'R' }, > > { "short-options", no_argument, NULL, SHORT_OPTIONS }, > > { "sparse", required_argument, NULL, 'S' }, > > { "synchronous", no_argument, NULL, SYNCHRONOUS_OPTION }, > > { "target-is-zero", no_argument, NULL, DESTINATION_IS_ZERO_OPTION }, > > { "threads", required_argument, NULL, 'T' }, > > { "verbose", no_argument, NULL, 'v' }, > > { "version", no_argument, NULL, 'V' }, > > { NULL } > > }; > > @@ -212,20 +216,28 @@ main (int argc, char *argv[]) > > } > > if (request_size < MIN_REQUEST_SIZE || request_size > MAX_REQUEST_SIZE || > > !is_power_of_2 (request_size)) { > > fprintf (stderr, > > "%s: --request-size: must be a power of 2 within %d-%d\n", > > prog, MIN_REQUEST_SIZE, MAX_REQUEST_SIZE); > > exit (EXIT_FAILURE); > > } > > break; > > > > + case QUEUE_SIZE_OPTION: > > and here. > > > + if (sscanf (optarg, "%u", &queue_size) != 1) { > > Matches pre-existing use, but *scanf("%u") has an inherent limitation > of being non-portable when it comes to dealing with overflow. Better > is to use the strtol* family of functions when parsing user input into > an integer - but that fits in a separate patch. > > > > > -/* If the number of requests in flight exceeds the limit, poll > > - * waiting for at least one request to finish. This enforces > > - * the user --requests option. > > +/* If the number of requests in flight or the number of queued bytes > > + * exceed the limit, poll waiting for at least one request to finish. > > Pre-existing difficulty in reading, made worse by more words. I would > suggest: > > If the number of requests or queued bytes in flight exceed limits, > then poll until enough requests finish.Nicer, will use this.> > > + * This enforces the user --requests and --queue-size options. > > * > > * NB: Unfortunately it's not possible to call this from a callback, > > * since it will deadlock trying to grab the libnbd handle lock. This > > * means that although the worker thread calls this and enforces the > > * limit, when we split up requests into subrequests (eg. doing > > * sparseness detection) we will probably exceed the user request > > * limit. XXX > > */ > > static void > > -wait_for_request_slots (size_t index) > > +wait_for_request_slots (struct worker *worker) > > { > > - while (in_flight (index) >= max_requests) > > - poll_both_ends (index); > > + while (in_flight (worker->index) >= max_requests || > > + worker->queue_size >= queue_size) > > + poll_both_ends (worker->index); > > } > > Looks like a nice improvement. > > > =head1 SYNOPSIS > > > > nbdcopy [--allocated] [-C N|--connections=N] > > [--destination-is-zero|--target-is-zero] [--flush] > > [--no-extents] [-p|--progress|--progress=FD] > > - [--request-size=N] [-R N|--requests=N] [-S N|--sparse=N] > > - [--synchronous] [-T N|--threads=N] [-v|--verbose] > > + [--request-size=N] [--queue-size=N] [-R N|--requests=N] > > Another spot for my alphabetic sorting request. > > > @@ -156,20 +157,27 @@ following shell commands: > > > > Set the maximum request size in bytes. The maximum value is 32 MiB, > > specified by the NBD protocol. > > > > =item B<-R> N > > > > =item B<--requests=>N > > > > Set the maximum number of requests in flight per NBD connection. > > > > +=item B<--queue-size=>N > > And again. > > I'm okay with you making the fixes that we pointed out, without > necessarily needing to see a v2 post.I did not notice that the options are mostly sorted. I will sort them as you suggest before pushing. Thanks for reviewing! Nir