Richard W.M. Jones
2021-May-25 16:36 UTC
[Libguestfs] [PATCH nbdkit] sparse-random: Don't generate random content in blocks by default
[I already pushed this upstream, this email is FYI]

As discussed earlier today, when testing nbdcopy with
nbdkit-sparse-random-plugin as the test harness, a very large amount
of time was spent generating random numbers to fill the data blocks.
This was pointless make-work, and this patch fixes it.  More details
in the commit message.

Rich.
Richard W.M. Jones
2021-May-25 16:36 UTC
[Libguestfs] [PATCH nbdkit] sparse-random: Don't generate random content in blocks by default
Testing nbdcopy with nbdkit-sparse-random-plugin as a test harness
under perf showed that 34% of the total time was taken calling
read_block() to generate random data within each block (about 18% when
reading and another 16% when writing and verifying).

While we could probably optimize this a bit, it's all pointless
make-work when testing.  As long as there is some non-zero data in
each block it's still a valid test of nbdcopy.

Therefore add a new flag (random-content=true|false) to enable random
content inside each block.  The new default is random-content=false
which means each block has the same random non-zero byte repeated
across the whole block, which is fast.  To get the old behaviour use
random-content=true.

Total time spent running read_block() went from 34% down to 7%.
Total time spent running nbdkit went from 58% down to 37%.
---
 .../nbdkit-sparse-random-plugin.pod   | 19 +++++++++++--
 plugins/sparse-random/sparse-random.c | 28 ++++++++++++++-----
 2 files changed, 37 insertions(+), 10 deletions(-)

diff --git a/plugins/sparse-random/nbdkit-sparse-random-plugin.pod b/plugins/sparse-random/nbdkit-sparse-random-plugin.pod
index fff98f3c..12e2798b 100644
--- a/plugins/sparse-random/nbdkit-sparse-random-plugin.pod
+++ b/plugins/sparse-random/nbdkit-sparse-random-plugin.pod
@@ -4,7 +4,9 @@ nbdkit-sparse-random-plugin - make sparse random disks
 
 =head1 SYNOPSIS
 
- nbdkit sparse-random [size=]SIZE [seed=SEED] [percent=N] [runlength=N]
+ nbdkit sparse-random [size=]SIZE [seed=SEED]
+                      [percent=N] [runlength=N]
+                      [random-content=true]
 
 =head1 DESCRIPTION
 
@@ -27,7 +29,12 @@ tries to create runs of data and runs of empty space.  The
 C<runlength> parameter controls the average length of each run of
 random data.
 
-The random data is generated using an I<insecure> method.
+The data in each block normally consists of the same random non-zero
+byte repeated over the whole block.  If you want fully random content
+within each block use C<random-content=true>.  This is not the default
+because earlier testing of this plugin showed that a great deal of
+time was spent generating random content.  The random content is
+generated using a method which is I<not> cryptographically secure.
 
 =head2 Writes and testing copying
 
@@ -51,6 +58,12 @@ data versus sparse empty space.  The default is 10 (10%).
 C<percent=0> will create a completely empty disk and C<percent=100>
 will create a completely full disk.
 
+=item B<random-content=true>
+
+By default a single random non-zero byte is repeated over the whole
+block, which is fast to generate and check.  If you want blocks where
+each byte is random, use this setting.
+
 =item B<runlength=>N
 
 Specify the average length of runs of random data.  This is expressed
@@ -109,4 +122,4 @@ Richard W.M. Jones
 
 =head1 COPYRIGHT
 
-Copyright (C) 2018 Red Hat Inc.
+Copyright (C) 2018-2021 Red Hat Inc.
diff --git a/plugins/sparse-random/sparse-random.c b/plugins/sparse-random/sparse-random.c
index 2512130a..34347dbf 100644
--- a/plugins/sparse-random/sparse-random.c
+++ b/plugins/sparse-random/sparse-random.c
@@ -1,5 +1,5 @@
 /* nbdkit
- * Copyright (C) 2017-2020 Red Hat Inc.
+ * Copyright (C) 2017-2021 Red Hat Inc.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions are
@@ -55,6 +55,7 @@ static uint32_t seed;           /* Random seed. */
 static double percent = 10;     /* Percentage of data. */
 static uint64_t runlength =     /* Expected average run length of data (bytes)*/
   UINT64_C(16*1024*1024);
+static int random_content;      /* false: Repeat same byte  true: Random bytes*/
 
 /* We need to store 1 bit per block.  Using a 4K block size means we
  * need 32M to map each 1T of virtual disk.
@@ -111,6 +112,11 @@ sparse_random_config (const char *key, const char *value)
       return -1;
     }
   }
+  else if (strcmp (key, "random-content") == 0) {
+    random_content = nbdkit_parse_bool (value);
+    if (random_content == -1)
+      return -1;
+  }
   else {
     nbdkit_error ("unknown parameter '%s'", key);
     return -1;
@@ -123,7 +129,8 @@ sparse_random_config (const char *key, const char *value)
   "size=<SIZE>  (required) Size of the backing disk\n" \
   "seed=<SEED>             Random number generator seed\n" \
   "percent=<PERCENT>       Percentage of data\n" \
-  "runlength=<BYTES>       Expected average run length of data"
+  "runlength=<BYTES>       Expected average run length of data\n" \
+  "random-content=true     Fully random content in each block"
 
 /* Create the random bitmap of data and holes.
  *
@@ -276,19 +283,26 @@ static void
 read_block (uint64_t blknum, uint64_t offset, void *buf)
 {
   unsigned char *b = buf;
+  uint64_t s;
+  uint32_t i;
+  struct random_state state;
 
   if (bitmap_get_blk (&bm, blknum, 0) == 0) /* hole */
     memset (buf, 0, BLOCKSIZE);
-  else {                        /* data */
-    uint32_t i;
-    struct random_state state;
-
+  else if (!random_content) {   /* data when random-content=false */
+    xsrandom (seed + offset, &state);
+    s = xrandom (&state);
+    s &= 255;
+    if (s == 0) s = 1;
+    memset (buf, (int)s, BLOCKSIZE);
+  }
+  else {                        /* data when random-content=true */
     /* This produces repeatable data for the same offset.  Note it
      * works because we are called on whole blocks only.
      */
     xsrandom (seed + offset, &state);
     for (i = 0; i < BLOCKSIZE; ++i) {
-      uint64_t s = xrandom (&state);
+      s = xrandom (&state);
       s &= 255;
       b[i] = s;
     }
-- 
2.31.1
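
For anyone who wants to try the two fill strategies outside nbdkit, here is a minimal standalone sketch of the data-block branch of read_block() from the patch above.  It is an illustration only: sketch_rng, sketch_seed, sketch_next and fill_data_block are hypothetical names, and the generator is a splitmix64 stand-in rather than nbdkit's xsrandom/xrandom, so the bytes will not match what the plugin produces.  BLOCKSIZE is assumed to be 4096, going by the 4K block size mentioned in the source comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCKSIZE 4096          /* assumption: the "4K block size" in the comment */

/* Hypothetical stand-in PRNG (splitmix64), NOT nbdkit's xsrandom/xrandom. */
struct sketch_rng { uint64_t x; };

static void
sketch_seed (struct sketch_rng *r, uint64_t seed)
{
  r->x = seed;
}

static uint64_t
sketch_next (struct sketch_rng *r)
{
  uint64_t z = (r->x += UINT64_C(0x9e3779b97f4a7c15));
  z = (z ^ (z >> 30)) * UINT64_C(0xbf58476d1ce4e5b9);
  z = (z ^ (z >> 27)) * UINT64_C(0x94d049bb133111eb);
  return z ^ (z >> 31);
}

/* Fill one data (non-hole) block of BLOCKSIZE bytes.  The generator is
 * seeded from seed + offset, so the same offset always gives the same
 * content, which is what makes later verification possible.
 */
static void
fill_data_block (uint64_t seed, uint64_t offset, bool random_content,
                 unsigned char *buf)
{
  struct sketch_rng rng;
  uint64_t s;
  size_t i;

  assert (buf != NULL);
  sketch_seed (&rng, seed + offset);

  if (!random_content) {
    /* Fast path: one generator call, then repeat a single byte. */
    s = sketch_next (&rng) & 255;
    if (s == 0) s = 1;          /* never zero, so data differs from a hole */
    memset (buf, (int) s, BLOCKSIZE);
  }
  else {
    /* Slow path: one generator call per byte of the block. */
    for (i = 0; i < BLOCKSIZE; ++i)
      buf[i] = sketch_next (&rng) & 255;
  }
}
```

Calling fill_data_block() twice with the same seed and offset gives identical buffers in either mode.  With random_content false the cost per block is one generator call plus a memset, rather than BLOCKSIZE generator calls.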