thr3ads.net - Libguestfs - [Libguestfs] [PATCH libnbd 8/8] copy: Adaptive queue size [Feb 2022]

If this information is useful, please help other people find it:
Share via:

Eric Blake

2022-Feb-21 15:40 UTC

[Libguestfs] [PATCH libnbd 8/8] copy: Adaptive queue size

On Sun, Feb 20, 2022 at 02:14:03PM +0200, Nir Soffer
wrote:> Limit the size of the copy queue also by the number of queued bytes.
> This allows using many concurrent small requests, required to get good
> performance, but limiting the number of large requests that are actually
> faster with lower concurrency.
> 
> New --queue-size option added to control the maximum queue size. With 2
> MiB we can have 8 256 KiB requests per connection. The default queue
> size is 16 MiB, to match the default --requests value (64) with the
> default --request-size (256 KiB). Testing show that using more than 16
> 256 KiB requests with one connection do not improve anything.
s/do/does/
> 
> The new option will simplify limiting memory usage when using large
> requests, like this change in virt-v2v:
>
https://github.com/libguestfs/virt-v2v/commit/c943420219fa0ee971fc228aa4d9127c5ce973f7
> 
> I tested this change with 3 images:
> 
> - Fedora 35 + 3g of random data - hopefully simulates a real image
> - Fully allocated image - the special case when every read command is
>   converted to a write command.
> - Zero image - the special case when every read command is converted to
>   a zero command.
> 
> On 2 machines:
> 
> - laptop: Intel(R) Core(TM) i7-10850H CPU @ 2.70GHz, 12 cpus,
>   1.5 MiB L2 cache per 2 cpus, 12 MiB L3 cache.
> - server: Intel(R) Xeon(R) Gold 5218R CPU @ 2.10GHz, 80 cpus,
>   1 MiB L2 cache per cpu, 27.5 MiB L3 cache.
> 
> In all cases, both source and destination are served by qemu-nbd, using
> --cache=none --aio=native. Because qemu-nbd does not support MULTI_CON
MULTI_CONN
> for writing, we are testing a single connection when copying an to
Did you mean 'copying an image to'?
> qemu-nbd. I tested also copying to null: since in this case we use 4
> connections (these tests are marked with /ro).
> 
> Results for copying all images on all machines with nbdcopy v1.11.0 and
> this change. "before" and "after" are average time of
10 runs.
> 
> image       machine    before    after    queue size    improvement
> ==================================================================>
fedora      laptop      3.044    2.129           2m           +43%
> full        laptop      4.900    3.136           2m           +56%
> zero        laptop      3.147    2.624           2m           +20%
> -------------------------------------------------------------------
> fedora      server      2.324    2.189          16m            +6%
> full        server      3.521    3.380           8m            +4%
> zero        server      2.297    2.338          16m            -2%
> -------------------------------------------------------------------
> fedora/ro   laptop      2.040    1.663           1m           +23%
> fedora/ro   server      1.585    1.393           2m           +14%
> 
> Signed-off-by: Nir Soffer <nsoffer at redhat.com>
> ---
>  copy/main.c                 | 52 ++++++++++++++++++++++++-------------
>  copy/multi-thread-copying.c | 18 +++++++------
>  copy/nbdcopy.h              |  1 +
>  copy/nbdcopy.pod            | 12 +++++++--
>  4 files changed, 55 insertions(+), 28 deletions(-)
> 
>  static void __attribute__((noreturn))
>  usage (FILE *fp, int exitcode)
>  {
>    fprintf (fp,
>  "\n"
>  "Copy to and from an NBD server:\n"
>  "\n"
>  "    nbdcopy [--allocated] [-C N|--connections=N]\n"
>  "            [--destination-is-zero|--target-is-zero]
[--flush]\n"
>  "            [--no-extents] [-p|--progress|--progress=FD]\n"
> -"            [--request-size=N] [-R N|--requests=N] [-S
N|--sparse=N]\n"
> -"            [--synchronous] [-T N|--threads=N]
[-v|--verbose]\n"
> +"            [--request-size=N] [--queue-size=N] [-R
N|--requests=N]\n"
The options are listed in mostly alphabetic order already, so
--queue-size before --request-size makes more sense to me.
> @@ -104,33 +106,35 @@ main (int argc, char *argv[])
>  {
>    enum {
>      HELP_OPTION = CHAR_MAX + 1,
>      LONG_OPTIONS,
>      SHORT_OPTIONS,
>      ALLOCATED_OPTION,
>      DESTINATION_IS_ZERO_OPTION,
>      FLUSH_OPTION,
>      NO_EXTENTS_OPTION,
>      REQUEST_SIZE_OPTION,
> +    QUEUE_SIZE_OPTION,
Likewise here.
>      SYNCHRONOUS_OPTION,
>    };
>    const char *short_options = "C:pR:S:T:vV";
>    const struct option long_options[] = {
>      { "help",               no_argument,       NULL, HELP_OPTION
},
>      { "long-options",       no_argument,       NULL,
LONG_OPTIONS },
>      { "allocated",          no_argument,       NULL,
ALLOCATED_OPTION },
>      { "connections",        required_argument, NULL, 'C'
},
>      { "destination-is-zero",no_argument,       NULL,
DESTINATION_IS_ZERO_OPTION },
>      { "flush",              no_argument,       NULL,
FLUSH_OPTION },
>      { "no-extents",         no_argument,       NULL,
NO_EXTENTS_OPTION },
>      { "progress",           optional_argument, NULL, 'p'
},
>      { "request-size",       required_argument, NULL,
REQUEST_SIZE_OPTION },
> +    { "queue-size",         required_argument, NULL,
QUEUE_SIZE_OPTION },
and here.
>      { "requests",           required_argument, NULL, 'R'
},
>      { "short-options",      no_argument,       NULL,
SHORT_OPTIONS },
>      { "sparse",             required_argument, NULL, 'S'
},
>      { "synchronous",        no_argument,       NULL,
SYNCHRONOUS_OPTION },
>      { "target-is-zero",     no_argument,       NULL,
DESTINATION_IS_ZERO_OPTION },
>      { "threads",            required_argument, NULL, 'T'
},
>      { "verbose",            no_argument,       NULL, 'v'
},
>      { "version",            no_argument,       NULL, 'V'
},
>      { NULL }
>    };
> @@ -212,20 +216,28 @@ main (int argc, char *argv[])
>        }
>        if (request_size < MIN_REQUEST_SIZE || request_size >
MAX_REQUEST_SIZE ||
>                !is_power_of_2 (request_size)) {
>          fprintf (stderr,
>                  "%s: --request-size: must be a power of 2 within
%d-%d\n",
>                   prog, MIN_REQUEST_SIZE, MAX_REQUEST_SIZE);
>          exit (EXIT_FAILURE);
>        }
>        break;
>  
> +    case QUEUE_SIZE_OPTION:
and here.
> +      if (sscanf (optarg, "%u", &queue_size) != 1) {
Matches pre-existing use, but *scanf("%u") has an inherent limitation
of being non-portable when it comes to dealing with overflow.  Better
is to use the strtol* family of functions when parsing user input into
an integer - but that fits in a separate patch.
>  
> -/* If the number of requests in flight exceeds the limit, poll
> - * waiting for at least one request to finish.  This enforces
> - * the user --requests option.
> +/* If the number of requests in flight or the number of queued bytes
> + * exceed the limit, poll waiting for at least one request to finish.
Pre-existing difficulty in reading, made worse by more words.  I would
suggest:

If the number of requests or queued bytes in flight exceed limits,
then poll until enough requests finish.
> + * This enforces the user --requests and --queue-size options.
>   *
>   * NB: Unfortunately it's not possible to call this from a callback,
>   * since it will deadlock trying to grab the libnbd handle lock.  This
>   * means that although the worker thread calls this and enforces the
>   * limit, when we split up requests into subrequests (eg. doing
>   * sparseness detection) we will probably exceed the user request
>   * limit. XXX
>   */
>  static void
> -wait_for_request_slots (size_t index)
> +wait_for_request_slots (struct worker *worker)
>  {
> -  while (in_flight (index) >= max_requests)
> -    poll_both_ends (index);
> +  while (in_flight (worker->index) >= max_requests ||
> +         worker->queue_size >= queue_size)
> +    poll_both_ends (worker->index);
>  }
Looks like a nice improvement.
>  =head1 SYNOPSIS
>  
>   nbdcopy [--allocated] [-C N|--connections=N]
>           [--destination-is-zero|--target-is-zero] [--flush]
>           [--no-extents] [-p|--progress|--progress=FD]
> -         [--request-size=N] [-R N|--requests=N] [-S N|--sparse=N]
> -         [--synchronous] [-T N|--threads=N] [-v|--verbose]
> +         [--request-size=N] [--queue-size=N] [-R N|--requests=N]
Another spot for my alphabetic sorting request.
> @@ -156,20 +157,27 @@ following shell commands:
>  
>  Set the maximum request size in bytes. The maximum value is 32 MiB,
>  specified by the NBD protocol.
>  
>  =item B<-R> N
>  
>  =item B<--requests=>N
>  
>  Set the maximum number of requests in flight per NBD connection.
>  
> +=item B<--queue-size=>N
And again.

I'm okay with you making the fixes that we pointed out, without
necessarily needing to see a v2 post.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Nir Soffer

2022-Feb-22 11:48 UTC

head link

[Libguestfs] [PATCH libnbd 8/8] copy: Adaptive queue size

On Mon, Feb 21, 2022 at 5:41 PM Eric Blake <eblake at redhat.com>
wrote:>
> On Sun, Feb 20, 2022 at 02:14:03PM +0200, Nir Soffer wrote:
> > Limit the size of the copy queue also by the number of queued bytes.
> > This allows using many concurrent small requests, required to get good
> > performance, but limiting the number of large requests that are
actually
> > faster with lower concurrency.
> >
> > New --queue-size option added to control the maximum queue size. With
2
> > MiB we can have 8 256 KiB requests per connection. The default queue
> > size is 16 MiB, to match the default --requests value (64) with the
> > default --request-size (256 KiB). Testing show that using more than 16
> > 256 KiB requests with one connection do not improve anything.
>
> s/do/does/
>
> >
> > The new option will simplify limiting memory usage when using large
> > requests, like this change in virt-v2v:
> >
https://github.com/libguestfs/virt-v2v/commit/c943420219fa0ee971fc228aa4d9127c5ce973f7
> >
> > I tested this change with 3 images:
> >
> > - Fedora 35 + 3g of random data - hopefully simulates a real image
> > - Fully allocated image - the special case when every read command is
> >   converted to a write command.
> > - Zero image - the special case when every read command is converted
to
> >   a zero command.
> >
> > On 2 machines:
> >
> > - laptop: Intel(R) Core(TM) i7-10850H CPU @ 2.70GHz, 12 cpus,
> >   1.5 MiB L2 cache per 2 cpus, 12 MiB L3 cache.
> > - server: Intel(R) Xeon(R) Gold 5218R CPU @ 2.10GHz, 80 cpus,
> >   1 MiB L2 cache per cpu, 27.5 MiB L3 cache.
> >
> > In all cases, both source and destination are served by qemu-nbd,
using
> > --cache=none --aio=native. Because qemu-nbd does not support MULTI_CON
>
> MULTI_CONN
>
> > for writing, we are testing a single connection when copying an to
>
> Did you mean 'copying an image to'?
Yes
>
> > qemu-nbd. I tested also copying to null: since in this case we use 4
> > connections (these tests are marked with /ro).
> >
> > Results for copying all images on all machines with nbdcopy v1.11.0
and
> > this change. "before" and "after" are average time
of 10 runs.
> >
> > image       machine    before    after    queue size    improvement
> > ==================================================================>
> fedora      laptop      3.044    2.129           2m           +43%
> > full        laptop      4.900    3.136           2m           +56%
> > zero        laptop      3.147    2.624           2m           +20%
> > -------------------------------------------------------------------
> > fedora      server      2.324    2.189          16m            +6%
> > full        server      3.521    3.380           8m            +4%
> > zero        server      2.297    2.338          16m            -2%
> > -------------------------------------------------------------------
> > fedora/ro   laptop      2.040    1.663           1m           +23%
> > fedora/ro   server      1.585    1.393           2m           +14%
> >
> > Signed-off-by: Nir Soffer <nsoffer at redhat.com>
> > ---
> >  copy/main.c                 | 52
++++++++++++++++++++++++-------------
> >  copy/multi-thread-copying.c | 18 +++++++------
> >  copy/nbdcopy.h              |  1 +
> >  copy/nbdcopy.pod            | 12 +++++++--
> >  4 files changed, 55 insertions(+), 28 deletions(-)
> >
>
> >  static void __attribute__((noreturn))
> >  usage (FILE *fp, int exitcode)
> >  {
> >    fprintf (fp,
> >  "\n"
> >  "Copy to and from an NBD server:\n"
> >  "\n"
> >  "    nbdcopy [--allocated] [-C N|--connections=N]\n"
> >  "            [--destination-is-zero|--target-is-zero]
[--flush]\n"
> >  "            [--no-extents]
[-p|--progress|--progress=FD]\n"
> > -"            [--request-size=N] [-R N|--requests=N] [-S
N|--sparse=N]\n"
> > -"            [--synchronous] [-T N|--threads=N]
[-v|--verbose]\n"
> > +"            [--request-size=N] [--queue-size=N] [-R
N|--requests=N]\n"
>
> The options are listed in mostly alphabetic order already, so
> --queue-size before --request-size makes more sense to me.
>
> > @@ -104,33 +106,35 @@ main (int argc, char *argv[])
> >  {
> >    enum {
> >      HELP_OPTION = CHAR_MAX + 1,
> >      LONG_OPTIONS,
> >      SHORT_OPTIONS,
> >      ALLOCATED_OPTION,
> >      DESTINATION_IS_ZERO_OPTION,
> >      FLUSH_OPTION,
> >      NO_EXTENTS_OPTION,
> >      REQUEST_SIZE_OPTION,
> > +    QUEUE_SIZE_OPTION,
>
> Likewise here.
>
> >      SYNCHRONOUS_OPTION,
> >    };
> >    const char *short_options = "C:pR:S:T:vV";
> >    const struct option long_options[] = {
> >      { "help",               no_argument,       NULL,
HELP_OPTION },
> >      { "long-options",       no_argument,       NULL,
LONG_OPTIONS },
> >      { "allocated",          no_argument,       NULL,
ALLOCATED_OPTION },
> >      { "connections",        required_argument, NULL,
'C' },
> >      { "destination-is-zero",no_argument,       NULL,
DESTINATION_IS_ZERO_OPTION },
> >      { "flush",              no_argument,       NULL,
FLUSH_OPTION },
> >      { "no-extents",         no_argument,       NULL,
NO_EXTENTS_OPTION },
> >      { "progress",           optional_argument, NULL,
'p' },
> >      { "request-size",       required_argument, NULL,
REQUEST_SIZE_OPTION },
> > +    { "queue-size",         required_argument, NULL,
QUEUE_SIZE_OPTION },
>
> and here.
>
> >      { "requests",           required_argument, NULL,
'R' },
> >      { "short-options",      no_argument,       NULL,
SHORT_OPTIONS },
> >      { "sparse",             required_argument, NULL,
'S' },
> >      { "synchronous",        no_argument,       NULL,
SYNCHRONOUS_OPTION },
> >      { "target-is-zero",     no_argument,       NULL,
DESTINATION_IS_ZERO_OPTION },
> >      { "threads",            required_argument, NULL,
'T' },
> >      { "verbose",            no_argument,       NULL,
'v' },
> >      { "version",            no_argument,       NULL,
'V' },
> >      { NULL }
> >    };
> > @@ -212,20 +216,28 @@ main (int argc, char *argv[])
> >        }
> >        if (request_size < MIN_REQUEST_SIZE || request_size >
MAX_REQUEST_SIZE ||
> >                !is_power_of_2 (request_size)) {
> >          fprintf (stderr,
> >                  "%s: --request-size: must be a power of 2 within
%d-%d\n",
> >                   prog, MIN_REQUEST_SIZE, MAX_REQUEST_SIZE);
> >          exit (EXIT_FAILURE);
> >        }
> >        break;
> >
> > +    case QUEUE_SIZE_OPTION:
>
> and here.
>
> > +      if (sscanf (optarg, "%u", &queue_size) != 1) {
>
> Matches pre-existing use, but *scanf("%u") has an inherent
limitation
> of being non-portable when it comes to dealing with overflow.  Better
> is to use the strtol* family of functions when parsing user input into
> an integer - but that fits in a separate patch.
>
> >
> > -/* If the number of requests in flight exceeds the limit, poll
> > - * waiting for at least one request to finish.  This enforces
> > - * the user --requests option.
> > +/* If the number of requests in flight or the number of queued bytes
> > + * exceed the limit, poll waiting for at least one request to finish.
>
> Pre-existing difficulty in reading, made worse by more words.  I would
> suggest:
>
> If the number of requests or queued bytes in flight exceed limits,
> then poll until enough requests finish.
Nicer, will use this.
>
> > + * This enforces the user --requests and --queue-size options.
> >   *
> >   * NB: Unfortunately it's not possible to call this from a
callback,
> >   * since it will deadlock trying to grab the libnbd handle lock. 
This
> >   * means that although the worker thread calls this and enforces the
> >   * limit, when we split up requests into subrequests (eg. doing
> >   * sparseness detection) we will probably exceed the user request
> >   * limit. XXX
> >   */
> >  static void
> > -wait_for_request_slots (size_t index)
> > +wait_for_request_slots (struct worker *worker)
> >  {
> > -  while (in_flight (index) >= max_requests)
> > -    poll_both_ends (index);
> > +  while (in_flight (worker->index) >= max_requests ||
> > +         worker->queue_size >= queue_size)
> > +    poll_both_ends (worker->index);
> >  }
>
> Looks like a nice improvement.
>
> >  =head1 SYNOPSIS
> >
> >   nbdcopy [--allocated] [-C N|--connections=N]
> >           [--destination-is-zero|--target-is-zero] [--flush]
> >           [--no-extents] [-p|--progress|--progress=FD]
> > -         [--request-size=N] [-R N|--requests=N] [-S N|--sparse=N]
> > -         [--synchronous] [-T N|--threads=N] [-v|--verbose]
> > +         [--request-size=N] [--queue-size=N] [-R N|--requests=N]
>
> Another spot for my alphabetic sorting request.
>
> > @@ -156,20 +157,27 @@ following shell commands:
> >
> >  Set the maximum request size in bytes. The maximum value is 32 MiB,
> >  specified by the NBD protocol.
> >
> >  =item B<-R> N
> >
> >  =item B<--requests=>N
> >
> >  Set the maximum number of requests in flight per NBD connection.
> >
> > +=item B<--queue-size=>N
>
> And again.
>
> I'm okay with you making the fixes that we pointed out, without
> necessarily needing to see a v2 post.
I did not notice that the options are mostly sorted. I will sort them
as you suggest before pushing.

Thanks for reviewing!

Nir

Libguestfs - Feb 2022 - [PATCH libnbd 8/8] copy: Adaptive queue size

[Libguestfs] [PATCH libnbd 8/8] copy: Adaptive queue size

[Libguestfs] [PATCH libnbd 8/8] copy: Adaptive queue size