Nir Soffer
2019-Nov-30 00:17 UTC
[Libguestfs] [PATCH nbdkit 0/3] filters: stats: More useful, more friendly
- Use more friendly output with GiB and MiB/s.
- Measure time per operation, providing finer-grained stats.
- Add missing stats for flush.

I hope that these changes will help to understand and improve virt-v2v
performance.

Nir Soffer (3):
  filters: stats: Show size in GiB, rate in MiB/s
  filters: stats: Measure time per operation
  filters: stats: Add flush stats

 filters/stats/stats.c | 117 ++++++++++++++++++++++++++++++++++--------
 1 file changed, 96 insertions(+), 21 deletions(-)

-- 
2.21.0
Nir Soffer
2019-Nov-30 00:17 UTC
[Libguestfs] [PATCH nbdkit 1/3] filters: stats: Show size in GiB, rate in MiB/s
I find bytes and bits-per-second unhelpful and hard to parse. Using GiB
for sizes works for common disk images, and MiB/s works for common
storage throughput.

Here is an example run with this change:

$ ./nbdkit --foreground \
    --unix /tmp/nbd.sock \
    --exportname '' \
    --filter stats \
    file file=/var/tmp/dst.img \
    statsfile=/dev/stderr \
    --run 'qemu-img convert -p -n -f raw -O raw -T none /var/tmp/fedora-30.img nbd:unix:/tmp/nbd.sock'
    (100.00/100%)
elapsed time: 2.313 s
write: 1271 ops, 1.14 GiB, 502.63 MiB/s
zero: 1027 ops, 4.86 GiB, 2153.24 MiB/s
extents: 1 ops, 2.00 GiB, 885.29 MiB/s
---
 filters/stats/stats.c | 34 +++++++++++++++++++---------------
 1 file changed, 19 insertions(+), 15 deletions(-)

diff --git a/filters/stats/stats.c b/filters/stats/stats.c
index 98282e2..45bedae 100644
--- a/filters/stats/stats.c
+++ b/filters/stats/stats.c
@@ -49,6 +49,10 @@
 #include "cleanup.h"
 #include "tvdiff.h"
 
+#define MiB 1048576.0
+#define GiB 1073741824.0
+#define USEC 1000000.0
+
 static char *filename;
 static bool append;
 static FILE *fp;
@@ -64,34 +68,34 @@ static uint64_t extents_ops, extents_bytes;
 static uint64_t cache_ops, cache_bytes;
 
 static inline double
-calc_bps (uint64_t bytes, int64_t usecs)
+calc_mibps (uint64_t bytes, int64_t usecs)
 {
-  return 8.0 * bytes / usecs * 1000000.;
+  return bytes / MiB / usecs * USEC;
 }
 
 static inline void
 print_stats (int64_t usecs)
 {
-  fprintf (fp, "elapsed time: %g s\n", usecs / 1000000.);
+  fprintf (fp, "elapsed time: %.3f s\n", usecs / USEC);
 
   if (pread_ops > 0)
-    fprintf (fp, "read: %" PRIu64 " ops, %" PRIu64 " bytes, %g bits/s\n",
-             pread_ops, pread_bytes, calc_bps (pread_bytes, usecs));
+    fprintf (fp, "read: %" PRIu64 " ops, %.2f GiB, %.2f MiB/s\n",
+             pread_ops, pread_bytes / GiB, calc_mibps (pread_bytes, usecs));
   if (pwrite_ops > 0)
-    fprintf (fp, "write: %" PRIu64 " ops, %" PRIu64 " bytes, %g bits/s\n",
-             pwrite_ops, pwrite_bytes, calc_bps (pwrite_bytes, usecs));
+    fprintf (fp, "write: %" PRIu64 " ops, %.2f GiB, %.2f MiB/s\n",
+             pwrite_ops, pwrite_bytes / GiB, calc_mibps (pwrite_bytes, usecs));
   if (trim_ops > 0)
-    fprintf (fp, "trim: %" PRIu64 " ops, %" PRIu64 " bytes, %g bits/s\n",
-             trim_ops, trim_bytes, calc_bps (trim_bytes, usecs));
+    fprintf (fp, "trim: %" PRIu64 " ops, %.2f GiB, %.2f MiB/s\n",
+             trim_ops, trim_bytes / GiB, calc_mibps (trim_bytes, usecs));
   if (zero_ops > 0)
-    fprintf (fp, "zero: %" PRIu64 " ops, %" PRIu64 " bytes, %g bits/s\n",
-             zero_ops, zero_bytes, calc_bps (zero_bytes, usecs));
+    fprintf (fp, "zero: %" PRIu64 " ops, %.2f GiB, %.2f MiB/s\n",
+             zero_ops, zero_bytes / GiB, calc_mibps (zero_bytes, usecs));
   if (extents_ops > 0)
-    fprintf (fp, "extents: %" PRIu64 " ops, %" PRIu64 " bytes, %g bits/s\n",
-             extents_ops, extents_bytes, calc_bps (extents_bytes, usecs));
+    fprintf (fp, "extents: %" PRIu64 " ops, %.2f GiB, %.2f MiB/s\n",
+             extents_ops, extents_bytes / GiB, calc_mibps (extents_bytes, usecs));
   if (cache_ops > 0)
-    fprintf (fp, "cache: %" PRIu64 " ops, %" PRIu64 " bytes, %g bits/s\n",
-             cache_ops, cache_bytes, calc_bps (cache_bytes, usecs));
+    fprintf (fp, "cache: %" PRIu64 " ops, %.2f GiB, %.2f MiB/s\n",
+             cache_ops, cache_bytes / GiB, calc_mibps (cache_bytes, usecs));
 
   fflush (fp);
 }
-- 
2.21.0
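A quick sanity check of the conversion in calc_mibps(), using rounded
numbers from the example output (this snippet is only an illustration,
not part of the patch): 1.14 GiB is about 1167 MiB, and 1167 MiB over
2.313 s gives roughly 505 MiB/s, in line with the 502.63 MiB/s printed
above once the rounding of the GiB figure is accounted for. The same
check as a tiny standalone program:

/* Sanity check of calc_mibps() from the patch above, using rounded
 * numbers from the example run; an illustration, not part of the patch. */
#include <stdio.h>
#include <stdint.h>

#define MiB 1048576.0
#define GiB 1073741824.0
#define USEC 1000000.0

static inline double
calc_mibps (uint64_t bytes, int64_t usecs)
{
  return bytes / MiB / usecs * USEC;
}

int
main (void)
{
  uint64_t bytes = (uint64_t) (1.14 * GiB);  /* "1.14 GiB" from the output */
  int64_t usecs = 2313000;                   /* "elapsed time: 2.313 s" */

  /* Prints about 504.7 MiB/s; the reported 502.63 MiB/s differs only
   * because 1.14 GiB is itself rounded to two decimals. */
  printf ("%.2f MiB/s\n", calc_mibps (bytes, usecs));
  return 0;
}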
Nir Soffer
2019-Nov-30 00:17 UTC
[Libguestfs] [PATCH nbdkit 2/3] filters: stats: Measure time per operation
Previously we measured the total time and used it to calculate the rate
of different operations. This is incorrect and hides the real
throughput. A more useful way is to measure the time we spent in each
operation.

Here is an example run with this change:

$ ./nbdkit --foreground \
    --unix /tmp/nbd.sock \
    --exportname '' \
    --filter stats \
    file file=/var/tmp/dst.img \
    statsfile=/dev/stderr \
    --run 'qemu-img convert -p -n -f raw -O raw -T none /var/tmp/fedora-30.img nbd:unix:/tmp/nbd.sock'
    (100.00/100%)
elapsed time: 2.150 s
write: 1271 ops, 1.14 GiB, 0.398 s, 2922.22 MiB/s
zero: 1027 ops, 4.86 GiB, 0.012 s, 414723.03 MiB/s
extents: 1 ops, 2.00 GiB, 0.000 s, 120470559.51 MiB/s

This shows that the actual time waiting for storage was only 0.4
seconds, but elapsed time was 2.1 seconds. I think the missing time is
in flush(), which we do not measure yet.
---
 filters/stats/stats.c | 78 +++++++++++++++++++++++++++++++++----------
 1 file changed, 60 insertions(+), 18 deletions(-)

diff --git a/filters/stats/stats.c b/filters/stats/stats.c
index 45bedae..86439e7 100644
--- a/filters/stats/stats.c
+++ b/filters/stats/stats.c
@@ -60,12 +60,12 @@ static struct timeval start_t;
 
 /* This lock protects all the stats. */
 static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
-static uint64_t pread_ops, pread_bytes;
-static uint64_t pwrite_ops, pwrite_bytes;
-static uint64_t trim_ops, trim_bytes;
-static uint64_t zero_ops, zero_bytes;
-static uint64_t extents_ops, extents_bytes;
-static uint64_t cache_ops, cache_bytes;
+static uint64_t pread_ops, pread_bytes, pread_usecs;
+static uint64_t pwrite_ops, pwrite_bytes, pwrite_usecs;
+static uint64_t trim_ops, trim_bytes, trim_usecs;
+static uint64_t zero_ops, zero_bytes, zero_usecs;
+static uint64_t extents_ops, extents_bytes, extents_usecs;
+static uint64_t cache_ops, cache_bytes, cache_usecs;
 
 static inline double
 calc_mibps (uint64_t bytes, int64_t usecs)
@@ -79,23 +79,23 @@ print_stats (int64_t usecs)
   fprintf (fp, "elapsed time: %.3f s\n", usecs / USEC);
 
   if (pread_ops > 0)
-    fprintf (fp, "read: %" PRIu64 " ops, %.2f GiB, %.2f MiB/s\n",
-             pread_ops, pread_bytes / GiB, calc_mibps (pread_bytes, usecs));
+    fprintf (fp, "read: %" PRIu64 " ops, %.2f GiB, %.3f s, %.2f MiB/s\n",
+             pread_ops, pread_bytes / GiB, pread_usecs / USEC, calc_mibps (pread_bytes, pread_usecs));
   if (pwrite_ops > 0)
-    fprintf (fp, "write: %" PRIu64 " ops, %.2f GiB, %.2f MiB/s\n",
-             pwrite_ops, pwrite_bytes / GiB, calc_mibps (pwrite_bytes, usecs));
+    fprintf (fp, "write: %" PRIu64 " ops, %.2f GiB, %.3f s, %.2f MiB/s\n",
+             pwrite_ops, pwrite_bytes / GiB, pwrite_usecs / USEC, calc_mibps (pwrite_bytes, pwrite_usecs));
   if (trim_ops > 0)
-    fprintf (fp, "trim: %" PRIu64 " ops, %.2f GiB, %.2f MiB/s\n",
-             trim_ops, trim_bytes / GiB, calc_mibps (trim_bytes, usecs));
+    fprintf (fp, "trim: %" PRIu64 " ops, %.2f GiB, %.3f s, %.2f MiB/s\n",
+             trim_ops, trim_bytes / GiB, trim_usecs / USEC, calc_mibps (trim_bytes, trim_usecs));
   if (zero_ops > 0)
-    fprintf (fp, "zero: %" PRIu64 " ops, %.2f GiB, %.2f MiB/s\n",
-             zero_ops, zero_bytes / GiB, calc_mibps (zero_bytes, usecs));
+    fprintf (fp, "zero: %" PRIu64 " ops, %.2f GiB, %.3f s, %.2f MiB/s\n",
+             zero_ops, zero_bytes / GiB, zero_usecs / USEC, calc_mibps (zero_bytes, zero_usecs));
   if (extents_ops > 0)
-    fprintf (fp, "extents: %" PRIu64 " ops, %.2f GiB, %.2f MiB/s\n",
-             extents_ops, extents_bytes / GiB, calc_mibps (extents_bytes, usecs));
+    fprintf (fp, "extents: %" PRIu64 " ops, %.2f GiB, %.3f s, %.2f MiB/s\n",
+             extents_ops, extents_bytes / GiB, extents_usecs / USEC, calc_mibps (extents_bytes, extents_usecs));
   if (cache_ops > 0)
-    fprintf (fp, "cache: %" PRIu64 " ops, %.2f GiB, %.2f MiB/s\n",
-             cache_ops, cache_bytes / GiB, calc_mibps (cache_bytes, usecs));
+    fprintf (fp, "cache: %" PRIu64 " ops, %.2f GiB, %.3f s, %.2f MiB/s\n",
+             cache_ops, cache_bytes / GiB, cache_usecs / USEC, calc_mibps (cache_bytes, cache_usecs));
 
   fflush (fp);
 }
@@ -186,13 +186,20 @@ stats_pread (struct nbdkit_next_ops *next_ops, void *nxdata,
              void *handle, void *buf, uint32_t count, uint64_t offset,
              uint32_t flags, int *err)
 {
+  struct timeval start, end;
+  int64_t usecs;
   int r;
 
+  gettimeofday (&start, NULL);
   r = next_ops->pread (nxdata, buf, count, offset, flags, err);
   if (r == 0) {
+    gettimeofday (&end, NULL);
+    usecs = tvdiff_usec(&start, &end);
+
     ACQUIRE_LOCK_FOR_CURRENT_SCOPE (&lock);
     pread_ops++;
     pread_bytes += count;
+    pread_usecs += usecs;
   }
   return r;
 }
@@ -204,13 +211,20 @@ stats_pwrite (struct nbdkit_next_ops *next_ops, void *nxdata,
              const void *buf, uint32_t count, uint64_t offset,
              uint32_t flags, int *err)
 {
+  struct timeval start, end;
+  int64_t usecs;
   int r;
 
+  gettimeofday (&start, NULL);
   r = next_ops->pwrite (nxdata, buf, count, offset, flags, err);
   if (r == 0) {
+    gettimeofday (&end, NULL);
+    usecs = tvdiff_usec(&start, &end);
+
     ACQUIRE_LOCK_FOR_CURRENT_SCOPE (&lock);
     pwrite_ops++;
     pwrite_bytes += count;
+    pwrite_usecs += usecs;
   }
   return r;
 }
@@ -222,13 +236,20 @@ stats_trim (struct nbdkit_next_ops *next_ops, void *nxdata,
             uint32_t count, uint64_t offset, uint32_t flags,
             int *err)
 {
+  struct timeval start, end;
+  int64_t usecs;
   int r;
 
+  gettimeofday (&start, NULL);
   r = next_ops->trim (nxdata, count, offset, flags, err);
   if (r == 0) {
+    gettimeofday (&end, NULL);
+    usecs = tvdiff_usec(&start, &end);
+
     ACQUIRE_LOCK_FOR_CURRENT_SCOPE (&lock);
     trim_ops++;
     trim_bytes += count;
+    trim_usecs += usecs;
   }
   return r;
 }
@@ -240,13 +261,20 @@ stats_zero (struct nbdkit_next_ops *next_ops, void *nxdata,
             uint32_t count, uint64_t offset, uint32_t flags,
             int *err)
 {
+  struct timeval start, end;
+  int64_t usecs;
   int r;
 
+  gettimeofday (&start, NULL);
   r = next_ops->zero (nxdata, count, offset, flags, err);
   if (r == 0) {
+    gettimeofday (&end, NULL);
+    usecs = tvdiff_usec(&start, &end);
+
     ACQUIRE_LOCK_FOR_CURRENT_SCOPE (&lock);
     zero_ops++;
     zero_bytes += count;
+    zero_usecs += usecs;
   }
   return r;
 }
@@ -258,10 +286,16 @@ stats_extents (struct nbdkit_next_ops *next_ops, void *nxdata,
               uint32_t count, uint64_t offset, uint32_t flags,
               struct nbdkit_extents *extents, int *err)
 {
+  struct timeval start, end;
+  int64_t usecs;
   int r;
 
+  gettimeofday (&start, NULL);
   r = next_ops->extents (nxdata, count, offset, flags, extents, err);
   if (r == 0) {
+    gettimeofday (&end, NULL);
+    usecs = tvdiff_usec(&start, &end);
+
     ACQUIRE_LOCK_FOR_CURRENT_SCOPE (&lock);
     extents_ops++;
     /* XXX There's a case for trying to determine how long the extents
@@ -269,6 +303,7 @@ stats_extents (struct nbdkit_next_ops *next_ops, void *nxdata,
      * the complex rules in the protocol.
      */
     extents_bytes += count;
+    extents_usecs += usecs;
   }
   return r;
 }
@@ -280,13 +315,20 @@ stats_cache (struct nbdkit_next_ops *next_ops, void *nxdata,
             uint32_t count, uint64_t offset, uint32_t flags,
             int *err)
 {
+  struct timeval start, end;
+  int64_t usecs;
   int r;
 
+  gettimeofday (&start, NULL);
   r = next_ops->cache (nxdata, count, offset, flags, err);
   if (r == 0) {
+    gettimeofday (&end, NULL);
+    usecs = tvdiff_usec(&start, &end);
+
     ACQUIRE_LOCK_FOR_CURRENT_SCOPE (&lock);
     cache_ops++;
     cache_bytes += count;
+    cache_usecs += usecs;
   }
   return r;
 }
-- 
2.21.0
Nir Soffer
2019-Nov-30 00:17 UTC
[Libguestfs] [PATCH nbdkit 3/3] filters: stats: Add flush stats
flush is usually called once, but in some cases most of the time is
spent in it, so we cannot ignore it.

Here is an example run with this change:

$ ./nbdkit --foreground \
    --unix /tmp/nbd.sock \
    --exportname '' \
    --filter stats \
    file file=/var/tmp/dst.img \
    statsfile=/dev/stderr \
    --run 'qemu-img convert -p -n -f raw -O raw -T none /var/tmp/fedora-30.img nbd:unix:/tmp/nbd.sock'
    (100.00/100%)
elapsed time: 2.107 s
write: 1271 ops, 1.14 GiB, 0.356 s, 3262.38 MiB/s
zero: 1027 ops, 4.86 GiB, 0.013 s, 390318.00 MiB/s
extents: 1 ops, 2.00 GiB, 0.000 s, 68266650.39 MiB/s
flush: 2 ops, 1.021 s
---
 filters/stats/stats.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/filters/stats/stats.c b/filters/stats/stats.c
index 86439e7..25e425b 100644
--- a/filters/stats/stats.c
+++ b/filters/stats/stats.c
@@ -66,6 +66,7 @@ static uint64_t trim_ops, trim_bytes, trim_usecs;
 static uint64_t zero_ops, zero_bytes, zero_usecs;
 static uint64_t extents_ops, extents_bytes, extents_usecs;
 static uint64_t cache_ops, cache_bytes, cache_usecs;
+static uint64_t flush_ops, flush_usecs;
 
 static inline double
 calc_mibps (uint64_t bytes, int64_t usecs)
@@ -96,6 +97,9 @@ print_stats (int64_t usecs)
   if (cache_ops > 0)
     fprintf (fp, "cache: %" PRIu64 " ops, %.2f GiB, %.3f s, %.2f MiB/s\n",
              cache_ops, cache_bytes / GiB, cache_usecs / USEC, calc_mibps (cache_bytes, cache_usecs));
+  if (flush_ops > 0)
+    fprintf (fp, "flush: %" PRIu64 " ops, %.3f s\n",
+             flush_ops, flush_usecs / USEC);
 
   fflush (fp);
 }
@@ -254,6 +258,30 @@ stats_trim (struct nbdkit_next_ops *next_ops, void *nxdata,
   return r;
 }
 
+/* Flush. */
+static int
+stats_flush (struct nbdkit_next_ops *next_ops, void *nxdata,
+             void *handle, uint32_t flags,
+             int *err)
+{
+  struct timeval start, end;
+  int64_t usecs;
+  int r;
+
+  gettimeofday (&start, NULL);
+  r = next_ops->flush (nxdata, flags, err);
+  if (r == 0) {
+    gettimeofday (&end, NULL);
+    usecs = tvdiff_usec(&start, &end);
+
+    ACQUIRE_LOCK_FOR_CURRENT_SCOPE (&lock);
+    flush_ops++;
+    flush_usecs += usecs;
+  }
+
+  return r;
+}
+
 /* Zero. */
 static int
 stats_zero (struct nbdkit_next_ops *next_ops, void *nxdata,
@@ -343,6 +371,7 @@ static struct nbdkit_filter filter = {
   .pread = stats_pread,
   .pwrite = stats_pwrite,
   .trim = stats_trim,
+  .flush = stats_flush,
   .zero = stats_zero,
   .extents = stats_extents,
   .cache = stats_cache,
-- 
2.21.0
Richard W.M. Jones
2019-Nov-30 06:58 UTC
Re: [Libguestfs] [PATCH nbdkit 1/3] filters: stats: Show size in GiB, rate in MiB/s
On Sat, Nov 30, 2019 at 02:17:05AM +0200, Nir Soffer wrote:
> I find bytes and bits-per-second unhelpful and hard to parse. Using GiB
> for sizes works for common disk images, and MiB/s works for common
> storage throughput.
>
> Here is an example run with this change:
>
> $ ./nbdkit --foreground \
>     --unix /tmp/nbd.sock \
>     --exportname '' \
>     --filter stats \
>     file file=/var/tmp/dst.img \
>     statsfile=/dev/stderr \
>     --run 'qemu-img convert -p -n -f raw -O raw -T none /var/tmp/fedora-30.img nbd:unix:/tmp/nbd.sock'
>     (100.00/100%)
> elapsed time: 2.313 s
> write: 1271 ops, 1.14 GiB, 502.63 MiB/s
> zero: 1027 ops, 4.86 GiB, 2153.24 MiB/s
> extents: 1 ops, 2.00 GiB, 885.29 MiB/s
> ---
>  filters/stats/stats.c | 34 +++++++++++++++++++---------------
>  1 file changed, 19 insertions(+), 15 deletions(-)
>
> diff --git a/filters/stats/stats.c b/filters/stats/stats.c
> index 98282e2..45bedae 100644
> --- a/filters/stats/stats.c
> +++ b/filters/stats/stats.c
> @@ -49,6 +49,10 @@
>  #include "cleanup.h"
>  #include "tvdiff.h"
>
> +#define MiB 1048576.0
> +#define GiB 1073741824.0
> +#define USEC 1000000.0
> +
>  static char *filename;
>  static bool append;
>  static FILE *fp;
> @@ -64,34 +68,34 @@ static uint64_t extents_ops, extents_bytes;
>  static uint64_t cache_ops, cache_bytes;
>
>  static inline double
> -calc_bps (uint64_t bytes, int64_t usecs)
> +calc_mibps (uint64_t bytes, int64_t usecs)
>  {
> -  return 8.0 * bytes / usecs * 1000000.;
> +  return bytes / MiB / usecs * USEC;
>  }

The idea isn't bad, but I think we can lose data doing this.  What
happens if the rate is substantially less than 1 megabit?  We either
ought to scale this according to the size of the number being printed,
or else let the user select it.

For scaling, here's some code from fio:

https://github.com/axboe/fio/blob/bef74db41fb5a1607fd55cb86544165fc08acac1/lib/num2str.c#L72

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/
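For reference, a minimal sketch of the kind of auto-scaling output Rich
is describing; this is only an illustration, not fio's num2str() and
not the code he links to. It simply picks a unit so the printed value
stays in a readable range:

/* Illustrative auto-scaling rate formatter in the spirit of fio's
 * num2str(); not taken from fio or nbdkit. */
#include <stdio.h>

static void
format_rate (double bytes_per_sec, char *buf, size_t len)
{
  static const char *units[] = { "bytes", "KiB", "MiB", "GiB", "TiB" };
  size_t i = 0;

  /* Divide down until the value is in a human-friendly range. */
  while (bytes_per_sec >= 1024.0 && i < sizeof units / sizeof units[0] - 1) {
    bytes_per_sec /= 1024.0;
    i++;
  }
  snprintf (buf, len, "%.2f %s/s", bytes_per_sec, units[i]);
}

int
main (void)
{
  char buf[32];

  format_rate (527045755.0, buf, sizeof buf);  /* ~502.63 MiB/s */
  printf ("%s\n", buf);
  format_rate (120.0, buf, sizeof buf);        /* "120.00 bytes/s" */
  printf ("%s\n", buf);
  return 0;
}

With something like this, a fast stream still prints in MiB/s while a
trickle of a few hundred bytes per second keeps meaningful digits
instead of showing 0.00 MiB/s.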
Richard W.M. Jones
2019-Nov-30 07:13 UTC
Re: [Libguestfs] [PATCH nbdkit 2/3] filters: stats: Measure time per operation
On Sat, Nov 30, 2019 at 02:17:06AM +0200, Nir Soffer wrote:
> Previously we measured the total time and used it to calculate the rate
> of different operations. This is incorrect and hides the real
> throughput. A more useful way is to measure the time we spent in each
> operation.
>
> Here is an example run with this change:
>
> $ ./nbdkit --foreground \
>     --unix /tmp/nbd.sock \
>     --exportname '' \
>     --filter stats \
>     file file=/var/tmp/dst.img \
>     statsfile=/dev/stderr \
>     --run 'qemu-img convert -p -n -f raw -O raw -T none /var/tmp/fedora-30.img nbd:unix:/tmp/nbd.sock'
>     (100.00/100%)
> elapsed time: 2.150 s
> write: 1271 ops, 1.14 GiB, 0.398 s, 2922.22 MiB/s
> zero: 1027 ops, 4.86 GiB, 0.012 s, 414723.03 MiB/s
> extents: 1 ops, 2.00 GiB, 0.000 s, 120470559.51 MiB/s
>
> This shows that the actual time waiting for storage was only 0.4 seconds,
> but elapsed time was 2.1 seconds. I think the missing time is in flush(),
> which we do not measure yet.

I think opinions differ here.  Some people will want to know the
throughput being achieved by the system, and some people will want to
know how well the plugin/backend is performing single operations.  I
know that you're interested in benchmarking imageio and that's
completely valid, but I'm interested in measuring the total
throughput.  IOW: Can't we show both?

The rest of the code is screaming out to be refactored so we aren't
repeating the same change over and over.  How about using something
like the attached patch as an initial refactoring?

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v
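The attached patch is not reproduced in this archive. As a rough,
hypothetical illustration of the refactoring direction only (the names
struct opstats and record() are made up here, not taken from nbdkit or
from the attachment), the per-operation counters could live in one
struct and every callback could funnel its update through one helper:

/* Hypothetical illustration, not the patch Rich attached: keep each
 * operation's counters in a struct and update them through a single
 * helper, so the time-and-accumulate pattern is written only once. */
#include <stdint.h>
#include <pthread.h>
#include <sys/time.h>

struct opstats {
  const char *name;
  uint64_t ops;
  uint64_t bytes;
  uint64_t usecs;
};

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static struct opstats pread_st = { "read" };
static struct opstats pwrite_st = { "write" };

static void
record (struct opstats *st, uint32_t count, const struct timeval *start)
{
  struct timeval end;
  int64_t usecs;

  gettimeofday (&end, NULL);
  usecs = (end.tv_sec - start->tv_sec) * 1000000 +
          (end.tv_usec - start->tv_usec);

  pthread_mutex_lock (&lock);
  st->ops++;
  st->bytes += count;
  st->usecs += usecs;
  pthread_mutex_unlock (&lock);
}

Each stats_* callback would then shrink to taking a start timestamp,
calling into next_ops, and on success calling the helper with the right
struct; the real filter already has tvdiff_usec() and the cleanup.h
lock helpers to build this on.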