Nir Soffer
2021-Nov-14 07:21 UTC
[Libguestfs] [PATCH libnbd 0/6] Enhance synch-parallel test
I'm working on an application using the sync API. In my tests I see the
best read throughput with 2 nbd connections, which is unexpected. The
best way to investigate this is a benchmark using various combinations
of request size and number of connections.

I found that tests/synch-parallel already does most of what I need.
This series fixes the test and enhances it to make the request size and
the number of connections configurable.

Additional features I plan to add:

- Configurable run time - for benchmarking, running for 10 seconds is
  too short. When running on a laptop, the first benchmark is usually
  faster, since the CPU takes time to heat up, and when it heats up (I
  see 96°C on my laptop) it slows down. We need to run for 30 or 60
  seconds to get more consistent results. For CI, 10 seconds is way too
  slow; running for 0.1 seconds is enough to ensure the program still
  runs.

- Configurable I/O type (read, write, read-write) - manual testing
  shows higher throughput for a read-only test (3.9g/s) compared with
  1.1g/s read and 1.1g/s write when reading and writing.

Issues:

- The test script is much slower now (120 seconds instead of 10). We
  need to separate the benchmark, running many combinations
  (synch-parallel-bench.sh), from the test, running one combination
  (synch-parallel.sh).

- The tests/aio-parallel test seems to have the same issues and needs
  similar changes.

Nir Soffer (6):
  tests/synch-parallel: Show thread run time
  tests/synch-parallel: Fix request loop time limit
  tests/synch-parallel: Remove unneeded memcpy
  tests/synch-parallel: Show throughput in MiB/s
  tests/synch-parallel: Test multiple request sizes
  tests/synch-parallel: Test multiple number of connections

 tests/synch-parallel.c  | 121 +++++++++++++++++++++++++++++-----------
 tests/synch-parallel.sh |  17 ++++--
 2 files changed, 100 insertions(+), 38 deletions(-)

--
2.31.1
Nir Soffer
2021-Nov-14 07:21 UTC
[Libguestfs] [PATCH libnbd 1/6] tests/synch-parallel: Show thread run time
When testing throughput, we care about the time spent sending requests.
Measure and log the time spent in the request loop. This removes the time
needed to start the threads and connect to the nbd server from the
measurement, and shows whether all threads run for the same amount of
time, even if they do not start at exactly the same time.
Running the test multiple times shows that threads run for a random time
between 10 and 11 seconds, but we compute the throughput assuming a run
time of 10 seconds.
Example runs:
$ ./synch-parallel.sh
thread 5: finished OK in 10.054800 seconds
thread 1: finished OK in 10.054912 seconds
thread 2: finished OK in 10.055090 seconds
thread 0: finished OK in 10.055246 seconds
thread 6: finished OK in 10.055210 seconds
thread 7: finished OK in 10.055042 seconds
thread 3: finished OK in 10.055282 seconds
thread 4: finished OK in 10.055420 seconds
TLS: disabled
bytes sent: 6190678016 (619.068 Mbytes/s)
bytes received: 6183370752 (618.337 Mbytes/s)
I/O requests: 755252 (75525.2 IOPS)
$ ./synch-parallel.sh
thread 0: finished OK in 10.982875 seconds
thread 6: finished OK in 10.982651 seconds
thread 4: finished OK in 10.981958 seconds
thread 5: finished OK in 10.982700 seconds
thread 2: finished OK in 10.982774 seconds
thread 1: finished OK in 10.982976 seconds
thread 3: finished OK in 10.982834 seconds
thread 7: finished OK in 10.982610 seconds
TLS: disabled
bytes sent: 6646644736 (664.664 Mbytes/s)
bytes received: 6632275968 (663.228 Mbytes/s)
I/O requests: 810481 (81048.1 IOPS)
If we use the average thread runtime to calculate the stats, the results
would be:
Run 1:
run time: 10.054
bytes sent: 6190678016 (615.742 Mbytes/s)
bytes received: 6183370752 (615.015 Mbytes/s)
I/O requests: 755252 (75119.5 IOPS)
Run 2:
run time: 10.982
bytes sent: 6646644736 (605.230 Mbytes/s)
bytes received: 6632275968 (603.922 Mbytes/s)
I/O requests: 810481 (73800.8 IOPS)
With the assumed run time of 10 seconds, the reported Run 2 results are
too high by ~10%.
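The recalculation above can be sketched in C as follows. This is only an illustration, not code from the patch: mbytes_per_sec() is a hypothetical helper name, and it divides by the measured request-loop time instead of the fixed RUN_TIME.

```c
#include <assert.h>
#include <stdint.h>

#define MICROSECONDS 1000000

/* Hypothetical helper (not part of the patch): throughput in decimal
 * Mbytes/s for `bytes` transferred during `run_usec` microseconds of
 * measured request-loop time.
 */
double
mbytes_per_sec (uint64_t bytes, int64_t run_usec)
{
  double seconds;

  assert (run_usec > 0);
  seconds = (double) run_usec / MICROSECONDS;
  return (double) bytes / seconds / 1000000;
}
```

For Run 2, passing 6646644736 bytes and 10982000 microseconds gives roughly 605.23 Mbytes/s, while passing 10000000 microseconds reproduces the reported 664.664 Mbytes/s. The ratio between the two is simply 10.982 / 10.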
Signed-off-by: Nir Soffer <nsoffer at redhat.com>
---
tests/synch-parallel.c | 20 +++++++++++++++++++-
1 file changed, 19 insertions(+), 1 deletion(-)
diff --git a/tests/synch-parallel.c b/tests/synch-parallel.c
index d5efa3f..686e406 100644
--- a/tests/synch-parallel.c
+++ b/tests/synch-parallel.c
@@ -16,73 +16,85 @@
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
*/
/* Test synchronous parallel high level API requests on different
* handles. There should be no shared state between the handles so
* this should run at full speed (albeit with us only having a single
* command per thread in flight).
*/
#include <config.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <stdint.h>
#include <inttypes.h>
#include <string.h>
#include <unistd.h>
#include <errno.h>
#include <assert.h>
+#include <sys/time.h>
#include <pthread.h>
#include <libnbd.h>
#include "byte-swapping.h"
/* We keep a shadow of the RAM disk so we can check integrity of the data. */
static char *ramdisk;
/* This is also defined in synch-parallel.sh and checked here. */
#define EXPORTSIZE (8*1024*1024)
/* How long (seconds) that the test will run for. */
#define RUN_TIME 10
/* Number of threads. */
#define NR_THREADS 8
+#define MICROSECONDS 1000000
+
/* Unix socket. */
static const char *unixsocket;
struct thread_status {
size_t i; /* Thread index, 0 .. NR_THREADS-1 */
time_t end_time; /* Threads run until this end time. */
uint64_t offset, length; /* Area assigned to this thread. */
int status; /* Return status. */
unsigned requests; /* Total number of requests made. */
uint64_t bytes_sent, bytes_received; /* Bytes sent and received by thread. */
};
static void *start_thread (void *arg);
+static inline int64_t
+microtime (void)
+{
+ struct timeval tv;
+
+ gettimeofday(&tv, NULL);
+ return tv.tv_sec * MICROSECONDS + tv.tv_usec;
+}
+
int
main (int argc, char *argv[])
{
pthread_t threads[NR_THREADS];
struct thread_status status[NR_THREADS];
size_t i;
time_t t;
int err;
unsigned requests, errors;
uint64_t bytes_sent, bytes_received;
if (argc != 2) {
fprintf (stderr, "%s socket\n", argv[0]);
exit (EXIT_FAILURE);
}
unixsocket = argv[1];
/* Get the current time and the end time. */
time (&t);
t += RUN_TIME;
@@ -156,40 +168,41 @@ main (int argc, char *argv[])
printf ("bytes received: %" PRIu64 " (%g Mbytes/s)\n",
bytes_received, (double) bytes_received / RUN_TIME / 1000000);
printf ("I/O requests: %u (%g IOPS)\n",
requests, (double) requests / RUN_TIME);
exit (errors == 0 ? EXIT_SUCCESS : EXIT_FAILURE);
}
#define BUFFER_SIZE 16384
static void *
start_thread (void *arg)
{
struct thread_status *status = arg;
struct nbd_handle *nbd;
char *buf;
int cmd;
uint64_t offset;
time_t t;
+ int64_t start_usec, run_usec;
buf = calloc (BUFFER_SIZE, 1);
if (buf == NULL) {
perror ("calloc");
exit (EXIT_FAILURE);
}
nbd = nbd_create ();
if (nbd == NULL) {
fprintf (stderr, "%s\n", nbd_get_error ());
exit (EXIT_FAILURE);
}
#ifdef TLS
/* Require TLS on the handle and fail if not available or if the
* handshake fails.
*/
if (nbd_set_tls (nbd, LIBNBD_TLS_REQUIRE) == -1) {
fprintf (stderr, "%s\n", nbd_get_error ());
exit (EXIT_FAILURE);
@@ -199,73 +212,78 @@ start_thread (void *arg)
fprintf (stderr, "%s\n", nbd_get_error ());
exit (EXIT_FAILURE);
}
if (nbd_set_tls_psk_file (nbd, "keys.psk") == -1) {
fprintf (stderr, "%s\n", nbd_get_error ());
exit (EXIT_FAILURE);
}
#endif
/* Connect to nbdkit. */
if (nbd_connect_unix (nbd, unixsocket) == -1) {
fprintf (stderr, "%s\n", nbd_get_error ());
exit (EXIT_FAILURE);
}
assert (nbd_get_size (nbd) == EXPORTSIZE);
assert (nbd_can_multi_conn (nbd) > 0);
assert (nbd_is_read_only (nbd) == 0);
+ start_usec = microtime ();
+
/* Issue commands. */
while (1) {
/* Run until the timer expires. */
time (&t);
if (t > status->end_time)
break;
/* Issue a synchronous read or write command. */
offset = status->offset + (rand () % (status->length - BUFFER_SIZE));
cmd = rand () & 1;
if (cmd == 0) {
if (nbd_pwrite (nbd, buf, BUFFER_SIZE, offset, 0) == -1) {
fprintf (stderr, "%s\n", nbd_get_error ());
goto error;
}
status->bytes_sent += BUFFER_SIZE;
memcpy (&ramdisk[offset], buf, BUFFER_SIZE);
}
else {
if (nbd_pread (nbd, buf, BUFFER_SIZE, offset, 0) == -1) {
fprintf (stderr, "%s\n", nbd_get_error ());
goto error;
}
status->bytes_received += BUFFER_SIZE;
if (memcmp (&ramdisk[offset], buf, BUFFER_SIZE) != 0) {
fprintf (stderr, "thread %zu: DATA INTEGRITY ERROR!\n",
status->i);
goto error;
}
}
status->requests++;
}
- printf ("thread %zu: finished OK\n", status->i);
+ run_usec = microtime () - start_usec;
+
+ printf ("thread %zu: finished OK in %.6f seconds\n",
+ status->i, (double) run_usec / MICROSECONDS);
if (nbd_shutdown (nbd, 0) == -1) {
fprintf (stderr, "%s\n", nbd_get_error ());
goto error;
}
nbd_close (nbd);
free (buf);
status->status = 0;
pthread_exit (status);
error:
free (buf);
fprintf (stderr, "thread %zu: failed\n", status->i);
status->status = -1;
pthread_exit (status);
}
--
2.31.1
Nir Soffer
2021-Nov-14 07:21 UTC
[Libguestfs] [PATCH libnbd 2/6] tests/synch-parallel: Fix request loop time limit
Use microtime() for stopping the request loop. With this change all
threads run for 10.000400 seconds (+-0.000400), so using 10.0 seconds
for computing the stats is correct. Because the run time is practically
the same in all runs, we get more stable results across multiple runs.
Here are a few example runs:
$ ./synch-parallel.sh
thread 4: finished OK in 10.000023 seconds
thread 1: finished OK in 10.000090 seconds
thread 0: finished OK in 10.000170 seconds
thread 2: finished OK in 10.000159 seconds
thread 5: finished OK in 10.000124 seconds
thread 3: finished OK in 10.000546 seconds
thread 6: finished OK in 10.000705 seconds
thread 7: finished OK in 10.000203 seconds
TLS: disabled
bytes sent: 6558744576 (655.874 Mbytes/s)
bytes received: 6567395328 (656.74 Mbytes/s)
I/O requests: 801156 (80115.6 IOPS)
$ ./synch-parallel.sh
thread 1: finished OK in 10.000013 seconds
thread 4: finished OK in 10.000083 seconds
thread 2: finished OK in 10.000007 seconds
thread 0: finished OK in 10.000046 seconds
thread 3: finished OK in 10.000080 seconds
thread 6: finished OK in 10.000013 seconds
thread 7: finished OK in 10.000052 seconds
thread 5: finished OK in 10.000266 seconds
TLS: disabled
bytes sent: 6436765696 (643.677 Mbytes/s)
bytes received: 6409781248 (640.978 Mbytes/s)
I/O requests: 784091 (78409.1 IOPS)
$ ./synch-parallel.sh
thread 1: finished OK in 10.000037 seconds
thread 0: finished OK in 10.000030 seconds
thread 7: finished OK in 10.000017 seconds
thread 2: finished OK in 10.000066 seconds
thread 5: finished OK in 10.000168 seconds
thread 4: finished OK in 10.000037 seconds
thread 6: finished OK in 10.000144 seconds
thread 3: finished OK in 10.000806 seconds
TLS: disabled
bytes sent: 6416236544 (641.624 Mbytes/s)
bytes received: 6434586624 (643.459 Mbytes/s)
I/O requests: 784352 (78435.2 IOPS)
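To see why the old time()-based check produced run times between 10 and 11 seconds, here is a simplified model in C. It is a sketch under stated assumptions, not code from the patch: both function names are hypothetical, the loop start time is used as if it were also the time at which main() computed the end time, and the duration of the last in-flight request is ignored.

```c
#include <assert.h>
#include <stdint.h>

#define MICROSECONDS 1000000

/* Model of the old check "time (&t); if (t > end_time) break;".
 * end_time is a whole number of seconds, so the loop only stops at the
 * first second boundary strictly after end_time.
 */
int64_t
old_loop_run_usec (int64_t start_usec, int run_time_sec)
{
  int64_t end_sec;

  assert (start_usec >= 0 && run_time_sec > 0);
  end_sec = start_usec / MICROSECONDS + run_time_sec;
  return (end_sec + 1) * MICROSECONDS - start_usec;
}

/* Model of the new check "if (now_usec >= stop_usec) break;" with a
 * per-thread deadline in microseconds.
 */
int64_t
new_loop_run_usec (int64_t start_usec, int run_time_sec)
{
  int64_t stop_usec;

  assert (start_usec >= 0 && run_time_sec > 0);
  stop_usec = start_usec + (int64_t) run_time_sec * MICROSECONDS;
  return stop_usec - start_usec;
}
```

In this model the old run time depends only on where inside a second the loop started, which is effectively random, while the new run time is always RUN_TIME plus at most the duration of one request. Note that gettimeofday() reads the wall clock; a monotonic clock such as clock_gettime() with CLOCK_MONOTONIC would additionally be immune to clock adjustments during the run.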
Signed-off-by: Nir Soffer <nsoffer at redhat.com>
---
tests/synch-parallel.c | 21 ++++++---------------
1 file changed, 6 insertions(+), 15 deletions(-)
diff --git a/tests/synch-parallel.c b/tests/synch-parallel.c
index 686e406..72402d7 100644
--- a/tests/synch-parallel.c
+++ b/tests/synch-parallel.c
@@ -43,98 +43,91 @@
/* We keep a shadow of the RAM disk so we can check integrity of the data. */
static char *ramdisk;
/* This is also defined in synch-parallel.sh and checked here. */
#define EXPORTSIZE (8*1024*1024)
/* How long (seconds) that the test will run for. */
#define RUN_TIME 10
/* Number of threads. */
#define NR_THREADS 8
#define MICROSECONDS 1000000
/* Unix socket. */
static const char *unixsocket;
struct thread_status {
size_t i; /* Thread index, 0 .. NR_THREADS-1 */
- time_t end_time; /* Threads run until this end time. */
uint64_t offset, length; /* Area assigned to this thread. */
int status; /* Return status. */
unsigned requests; /* Total number of requests made. */
uint64_t bytes_sent, bytes_received; /* Bytes sent and received by thread. */
};
static void *start_thread (void *arg);
static inline int64_t
microtime (void)
{
struct timeval tv;
gettimeofday(&tv, NULL);
return tv.tv_sec * MICROSECONDS + tv.tv_usec;
}
int
main (int argc, char *argv[])
{
pthread_t threads[NR_THREADS];
struct thread_status status[NR_THREADS];
size_t i;
- time_t t;
int err;
unsigned requests, errors;
uint64_t bytes_sent, bytes_received;
if (argc != 2) {
fprintf (stderr, "%s socket\n", argv[0]);
exit (EXIT_FAILURE);
}
unixsocket = argv[1];
- /* Get the current time and the end time. */
- time (&t);
- t += RUN_TIME;
-
- srand (t + getpid ());
+ srand ((microtime () / MICROSECONDS) + getpid ());
/* Initialize the RAM disk with the initial data from
* nbdkit-pattern-filter.
*/
ramdisk = malloc (EXPORTSIZE);
if (ramdisk == NULL) {
perror ("calloc");
exit (EXIT_FAILURE);
}
for (i = 0; i < EXPORTSIZE; i += 8) {
uint64_t d = htobe64 (i);
memcpy (&ramdisk[i], &d, sizeof d);
}
/* Start the worker threads. */
for (i = 0; i < NR_THREADS; ++i) {
status[i].i = i;
- status[i].end_time = t;
status[i].offset = i * EXPORTSIZE / NR_THREADS;
status[i].length = EXPORTSIZE / NR_THREADS;
status[i].status = 0;
status[i].requests = 0;
status[i].bytes_sent = status[i].bytes_received = 0;
err = pthread_create (&threads[i], NULL, start_thread, &status[i]);
if (err != 0) {
errno = err;
perror ("pthread_create");
exit (EXIT_FAILURE);
}
}
/* Wait for the threads to exit. */
errors = 0;
requests = 0;
bytes_sent = bytes_received = 0;
for (i = 0; i < NR_THREADS; ++i) {
err = pthread_join (threads[i], NULL);
if (err != 0) {
@@ -167,42 +160,41 @@ main (int argc, char *argv[])
bytes_sent, (double) bytes_sent / RUN_TIME / 1000000);
printf ("bytes received: %" PRIu64 " (%g Mbytes/s)\n",
bytes_received, (double) bytes_received / RUN_TIME / 1000000);
printf ("I/O requests: %u (%g IOPS)\n",
requests, (double) requests / RUN_TIME);
exit (errors == 0 ? EXIT_SUCCESS : EXIT_FAILURE);
}
#define BUFFER_SIZE 16384
static void *
start_thread (void *arg)
{
struct thread_status *status = arg;
struct nbd_handle *nbd;
char *buf;
int cmd;
uint64_t offset;
- time_t t;
- int64_t start_usec, run_usec;
+ int64_t start_usec, stop_usec, now_usec;
buf = calloc (BUFFER_SIZE, 1);
if (buf == NULL) {
perror ("calloc");
exit (EXIT_FAILURE);
}
nbd = nbd_create ();
if (nbd == NULL) {
fprintf (stderr, "%s\n", nbd_get_error ());
exit (EXIT_FAILURE);
}
#ifdef TLS
/* Require TLS on the handle and fail if not available or if the
* handshake fails.
*/
if (nbd_set_tls (nbd, LIBNBD_TLS_REQUIRE) == -1) {
fprintf (stderr, "%s\n", nbd_get_error ());
exit (EXIT_FAILURE);
@@ -213,77 +205,76 @@ start_thread (void *arg)
exit (EXIT_FAILURE);
}
if (nbd_set_tls_psk_file (nbd, "keys.psk") == -1) {
fprintf (stderr, "%s\n", nbd_get_error ());
exit (EXIT_FAILURE);
}
#endif
/* Connect to nbdkit. */
if (nbd_connect_unix (nbd, unixsocket) == -1) {
fprintf (stderr, "%s\n", nbd_get_error ());
exit (EXIT_FAILURE);
}
assert (nbd_get_size (nbd) == EXPORTSIZE);
assert (nbd_can_multi_conn (nbd) > 0);
assert (nbd_is_read_only (nbd) == 0);
start_usec = microtime ();
+ stop_usec = start_usec + RUN_TIME * MICROSECONDS;
/* Issue commands. */
while (1) {
/* Run until the timer expires. */
- time (&t);
- if (t > status->end_time)
+ now_usec = microtime ();
+ if (now_usec >= stop_usec)
break;
/* Issue a synchronous read or write command. */
offset = status->offset + (rand () % (status->length - BUFFER_SIZE));
cmd = rand () & 1;
if (cmd == 0) {
if (nbd_pwrite (nbd, buf, BUFFER_SIZE, offset, 0) == -1) {
fprintf (stderr, "%s\n", nbd_get_error ());
goto error;
}
status->bytes_sent += BUFFER_SIZE;
memcpy (&ramdisk[offset], buf, BUFFER_SIZE);
}
else {
if (nbd_pread (nbd, buf, BUFFER_SIZE, offset, 0) == -1) {
fprintf (stderr, "%s\n", nbd_get_error ());
goto error;
}
status->bytes_received += BUFFER_SIZE;
if (memcmp (&ramdisk[offset], buf, BUFFER_SIZE) != 0) {
fprintf (stderr, "thread %zu: DATA INTEGRITY ERROR!\n",
status->i);
goto error;
}
}
status->requests++;
}
- run_usec = microtime () - start_usec;
-
printf ("thread %zu: finished OK in %.6f seconds\n",
- status->i, (double) run_usec / MICROSECONDS);
+ status->i, (double) (now_usec - start_usec) / MICROSECONDS);
if (nbd_shutdown (nbd, 0) == -1) {
fprintf (stderr, "%s\n", nbd_get_error ());
goto error;
}
nbd_close (nbd);
free (buf);
status->status = 0;
pthread_exit (status);
error:
free (buf);
fprintf (stderr, "thread %zu: failed\n", status->i);
status->status = -1;
pthread_exit (status);
}
--
2.31.1
Nir Soffer
2021-Nov-14 07:21 UTC
[Libguestfs] [PATCH libnbd 3/6] tests/synch-parallel: Remove unneeded memcpy
The synch-parallel test was writing the thread buffer to the nbd server,
and then copying data from the ramdisk to the thread buffer. This looks
wrong for two reasons:
- The thread buffer is initialized to zeroes, and after every write, to
the contents of the ramdisk, so we always write data which does not
match the contents of the ramdisk. I guess this works since the pattern
filter drops the written data.
- For every write, we pay for an unneeded memcpy(), adding unwanted
noise to the results.
Simplify the code to either copy a block from the ramdisk to the nbd
server or copy a block from the nbd server to the thread buffer.
Testing shows no significant difference with or without the unneeded
memcpy().
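The scheme after this change can be illustrated without an nbd server. The sketch below is an assumption-laden stand-in, not the test code: fill_pattern() and block_matches() are hypothetical names, and the byte-by-byte fill replaces the htobe64() loop used in main() so that the example does not depend on byte-swapping.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill the shadow ramdisk so that every 8-byte word holds its own byte
 * offset in big-endian order, like the htobe64 (i) loop in main().
 */
void
fill_pattern (unsigned char *shadow, size_t size)
{
  size_t i;
  int b;

  assert (size % 8 == 0);
  for (i = 0; i < size; i += 8)
    for (b = 0; b < 8; b++)
      shadow[i + b] = (unsigned char) ((uint64_t) i >> (56 - 8 * b));
}

/* Return 1 if the block read into buf equals the shadow at offset. */
int
block_matches (const unsigned char *shadow, const unsigned char *buf,
               size_t offset, size_t len)
{
  return memcmp (&shadow[offset], buf, len) == 0;
}
```

Because a write sends the bytes of the shadow itself, the shadow never needs updating after a write, and any later read of the same range can be verified with a single memcmp().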
Signed-off-by: Nir Soffer <nsoffer at redhat.com>
---
tests/synch-parallel.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/tests/synch-parallel.c b/tests/synch-parallel.c
index 72402d7..cecfeae 100644
--- a/tests/synch-parallel.c
+++ b/tests/synch-parallel.c
@@ -218,48 +218,49 @@ start_thread (void *arg)
}
assert (nbd_get_size (nbd) == EXPORTSIZE);
assert (nbd_can_multi_conn (nbd) > 0);
assert (nbd_is_read_only (nbd) == 0);
start_usec = microtime ();
stop_usec = start_usec + RUN_TIME * MICROSECONDS;
/* Issue commands. */
while (1) {
/* Run until the timer expires. */
now_usec = microtime ();
if (now_usec >= stop_usec)
break;
/* Issue a synchronous read or write command. */
offset = status->offset + (rand () % (status->length - BUFFER_SIZE));
cmd = rand () & 1;
if (cmd == 0) {
- if (nbd_pwrite (nbd, buf, BUFFER_SIZE, offset, 0) == -1) {
+ /* Write block from ramdisk to nbd server. */
+ if (nbd_pwrite (nbd, &ramdisk[offset], BUFFER_SIZE, offset, 0) == -1) {
fprintf (stderr, "%s\n", nbd_get_error ());
goto error;
}
status->bytes_sent += BUFFER_SIZE;
- memcpy (&ramdisk[offset], buf, BUFFER_SIZE);
}
else {
+ /* Read block from nbd server to buf. */
if (nbd_pread (nbd, buf, BUFFER_SIZE, offset, 0) == -1) {
fprintf (stderr, "%s\n", nbd_get_error ());
goto error;
}
status->bytes_received += BUFFER_SIZE;
if (memcmp (&ramdisk[offset], buf, BUFFER_SIZE) != 0) {
fprintf (stderr, "thread %zu: DATA INTEGRITY ERROR!\n",
status->i);
goto error;
}
}
status->requests++;
}
printf ("thread %zu: finished OK in %.6f seconds\n",
status->i, (double) (now_usec - start_usec) / MICROSECONDS);
if (nbd_shutdown (nbd, 0) == -1) {
fprintf (stderr, "%s\n", nbd_get_error ());
goto error;
}
--
2.31.1
Nir Soffer
2021-Nov-14 07:21 UTC
[Libguestfs] [PATCH libnbd 4/6] tests/synch-parallel: Show throughput in MiB/s
Reporting Mbytes/s is less useful; people expect values in MiB/s.
Example output:
$ ./synch-parallel.sh
...
bytes sent: 12686721024 (1209.9 MiB/s)
bytes received: 12734693376 (1214.47 MiB/s)
I/O requests: 96975 (9697.5 IOPS)
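As a worked example of the unit change, the following sketch computes both units for the "bytes sent" value above. The helper names are hypothetical; the only difference between them is the divisor (1024*1024 versus 1000000).

```c
#include <assert.h>
#include <stdint.h>

#define MiB (1024*1024)

/* Binary unit: 1 MiB = 1048576 bytes. */
double
mib_per_sec (uint64_t bytes, double seconds)
{
  assert (seconds > 0);
  return (double) bytes / seconds / MiB;
}

/* Decimal unit used before this patch: 1 Mbyte = 1000000 bytes. */
double
mbytes_per_sec (uint64_t bytes, double seconds)
{
  assert (seconds > 0);
  return (double) bytes / seconds / 1000000;
}
```

For 12686721024 bytes in 10 seconds, mib_per_sec() gives 1209.9 MiB/s, matching the output above, while mbytes_per_sec() gives about 1268.67 Mbytes/s. The two units differ by about 4.9%, so results from before and after this patch are not directly comparable.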
Signed-off-by: Nir Soffer <nsoffer at redhat.com>
---
tests/synch-parallel.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/tests/synch-parallel.c b/tests/synch-parallel.c
index cecfeae..423e1f0 100644
--- a/tests/synch-parallel.c
+++ b/tests/synch-parallel.c
@@ -36,40 +36,42 @@
#include <sys/time.h>
#include <pthread.h>
#include <libnbd.h>
#include "byte-swapping.h"
/* We keep a shadow of the RAM disk so we can check integrity of the data. */
static char *ramdisk;
/* This is also defined in synch-parallel.sh and checked here. */
#define EXPORTSIZE (8*1024*1024)
/* How long (seconds) that the test will run for. */
#define RUN_TIME 10
/* Number of threads. */
#define NR_THREADS 8
+#define MiB (1024*1024)
+
#define MICROSECONDS 1000000
/* Unix socket. */
static const char *unixsocket;
struct thread_status {
size_t i; /* Thread index, 0 .. NR_THREADS-1 */
uint64_t offset, length; /* Area assigned to this thread. */
int status; /* Return status. */
unsigned requests; /* Total number of requests made. */
uint64_t bytes_sent, bytes_received; /* Bytes sent and received by thread. */
};
static void *start_thread (void *arg);
static inline int64_t
microtime (void)
{
struct timeval tv;
@@ -139,44 +141,44 @@ main (int argc, char *argv[])
fprintf (stderr, "thread %zu failed with status %d\n",
i, status[i].status);
errors++;
}
requests += status[i].requests;
bytes_sent += status[i].bytes_sent;
bytes_received += status[i].bytes_received;
}
free (ramdisk);
/* Print some stats. */
printf ("TLS: %s\n",
#ifdef TLS
"enabled"
#else
"disabled"
#endif
);
- printf ("bytes sent: %" PRIu64 " (%g Mbytes/s)\n",
- bytes_sent, (double) bytes_sent / RUN_TIME / 1000000);
- printf ("bytes received: %" PRIu64 " (%g Mbytes/s)\n",
- bytes_received, (double) bytes_received / RUN_TIME / 1000000);
+ printf ("bytes sent: %" PRIu64 " (%g MiB/s)\n",
+ bytes_sent, (double) bytes_sent / RUN_TIME / MiB);
+ printf ("bytes received: %" PRIu64 " (%g MiB/s)\n",
+ bytes_received, (double) bytes_received / RUN_TIME / MiB);
printf ("I/O requests: %u (%g IOPS)\n",
requests, (double) requests / RUN_TIME);
exit (errors == 0 ? EXIT_SUCCESS : EXIT_FAILURE);
}
#define BUFFER_SIZE 16384
static void *
start_thread (void *arg)
{
struct thread_status *status = arg;
struct nbd_handle *nbd;
char *buf;
int cmd;
uint64_t offset;
int64_t start_usec, stop_usec, now_usec;
buf = calloc (BUFFER_SIZE, 1);
--
2.31.1
Nir Soffer
2021-Nov-14 07:21 UTC
[Libguestfs] [PATCH libnbd 5/6] tests/synch-parallel: Test multiple request sizes
Rename BUFFER_SIZE to REQUEST_SIZE and get the value from an environment
variable. Change the test script to test multiple values.
An example run:
$ ./synch-parallel.sh
Request size: 4096
thread 0: finished OK in 10.000019 seconds
thread 4: finished OK in 10.000003 seconds
thread 2: finished OK in 10.000015 seconds
thread 1: finished OK in 10.000053 seconds
thread 6: finished OK in 10.000075 seconds
thread 7: finished OK in 10.000012 seconds
thread 3: finished OK in 10.000024 seconds
thread 5: finished OK in 10.000000 seconds
TLS: disabled
bytes sent: 2412867584 (230.109 MiB/s)
bytes received: 2419851264 (230.775 MiB/s)
I/O requests: 1179863 (117986 IOPS)
Request size: 262144
thread 2: finished OK in 10.000030 seconds
thread 5: finished OK in 10.000415 seconds
thread 1: finished OK in 10.000500 seconds
thread 3: finished OK in 10.000459 seconds
thread 6: finished OK in 10.000275 seconds
thread 4: finished OK in 10.000520 seconds
thread 7: finished OK in 10.000529 seconds
thread 0: finished OK in 10.000768 seconds
TLS: disabled
bytes sent: 12791840768 (1219.92 MiB/s)
bytes received: 12748324864 (1215.78 MiB/s)
I/O requests: 97428 (9742.8 IOPS)
4k is a good size to test IOPS, and 256k is a good size for getting
maximum throughput. I kept the current default 16k as is.
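The parsing and validation can be sketched as two small helpers. This is a variation on the patch, not the patch itself: parse_long() and valid_request_size() are hypothetical names, and the errno check for overflow is an addition that getenv_long() in the patch does not have.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

#define KiB 1024

/* Parse a decimal long. Return false for empty input, trailing
 * garbage, or a value that does not fit in a long.
 */
bool
parse_long (const char *value, long *res)
{
  char *end;

  assert (value != NULL && res != NULL);
  errno = 0;
  *res = strtol (value, &end, 10);
  return errno == 0 && end != value && *end == '\0';
}

/* Same limits as the patch: 4 KiB to 512 KiB, and the area assigned
 * to a thread must be a whole number of requests.
 */
bool
valid_request_size (long request_size, long area_size)
{
  return request_size >= 4*KiB &&
         request_size <= 512*KiB &&
         area_size % request_size == 0;
}
```

With 8 threads and an 8 MiB export, area_size is 1 MiB, so both 4096 and 262144 pass, while a value such as 12288 is rejected because it does not divide the area evenly.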
Signed-off-by: Nir Soffer <nsoffer at redhat.com>
---
tests/synch-parallel.c | 51 +++++++++++++++++++++++++++++++++--------
tests/synch-parallel.sh | 12 ++++++----
2 files changed, 49 insertions(+), 14 deletions(-)
diff --git a/tests/synch-parallel.c b/tests/synch-parallel.c
index 423e1f0..d6ab1df 100644
--- a/tests/synch-parallel.c
+++ b/tests/synch-parallel.c
@@ -36,82 +36,115 @@
#include <sys/time.h>
#include <pthread.h>
#include <libnbd.h>
#include "byte-swapping.h"
/* We keep a shadow of the RAM disk so we can check integrity of the data. */
static char *ramdisk;
/* This is also defined in synch-parallel.sh and checked here. */
#define EXPORTSIZE (8*1024*1024)
/* How long (seconds) that the test will run for. */
#define RUN_TIME 10
/* Number of threads. */
#define NR_THREADS 8
-#define MiB (1024*1024)
+#define KiB 1024
+#define MiB (1024*KiB)
#define MICROSECONDS 1000000
/* Unix socket. */
static const char *unixsocket;
+static long request_size;
+
struct thread_status {
size_t i; /* Thread index, 0 .. NR_THREADS-1 */
uint64_t offset, length; /* Area assigned to this thread. */
int status; /* Return status. */
unsigned requests; /* Total number of requests made. */
uint64_t bytes_sent, bytes_received; /* Bytes sent and received by thread. */
};
static void *start_thread (void *arg);
static inline int64_t
microtime (void)
{
struct timeval tv;
gettimeofday(&tv, NULL);
return tv.tv_sec * MICROSECONDS + tv.tv_usec;
}
+static long
+getenv_long(const char *name, long defval)
+{
+ const char *value;
+ char *end;
+ long res;
+
+ value = getenv (name);
+ if (value == NULL)
+ return defval;
+
+ res = strtol(value, &end, 10);
+ if (*end != '\0' || end == value) {
+ fprintf (stderr, "Invalid value for %s: '%s'\n", name, value);
+ exit (EXIT_FAILURE);
+ }
+
+ return res;
+}
+
int
main (int argc, char *argv[])
{
pthread_t threads[NR_THREADS];
struct thread_status status[NR_THREADS];
size_t i;
int err;
unsigned requests, errors;
uint64_t bytes_sent, bytes_received;
if (argc != 2) {
fprintf (stderr, "%s socket\n", argv[0]);
exit (EXIT_FAILURE);
}
unixsocket = argv[1];
+ request_size = getenv_long ("REQUEST_SIZE", 16*KiB);
+ if (request_size < 4*KiB ||
+ request_size > 512*KiB ||
+ (EXPORTSIZE / NR_THREADS) % request_size != 0) {
+ fprintf (stderr,
+ "Invalid REQUEST_SIZE environment variable: %ld\n",
+ request_size);
+ exit (EXIT_FAILURE);
+ }
+
srand ((microtime () / MICROSECONDS) + getpid ());
/* Initialize the RAM disk with the initial data from
* nbdkit-pattern-filter.
*/
ramdisk = malloc (EXPORTSIZE);
if (ramdisk == NULL) {
perror ("calloc");
exit (EXIT_FAILURE);
}
for (i = 0; i < EXPORTSIZE; i += 8) {
uint64_t d = htobe64 (i);
memcpy (&ramdisk[i], &d, sizeof d);
}
/* Start the worker threads. */
for (i = 0; i < NR_THREADS; ++i) {
status[i].i = i;
status[i].offset = i * EXPORTSIZE / NR_THREADS;
status[i].length = EXPORTSIZE / NR_THREADS;
@@ -152,53 +185,51 @@ main (int argc, char *argv[])
/* Print some stats. */
printf ("TLS: %s\n",
#ifdef TLS
"enabled"
#else
"disabled"
#endif
);
printf ("bytes sent: %" PRIu64 " (%g MiB/s)\n",
bytes_sent, (double) bytes_sent / RUN_TIME / MiB);
printf ("bytes received: %" PRIu64 " (%g MiB/s)\n",
bytes_received, (double) bytes_received / RUN_TIME / MiB);
printf ("I/O requests: %u (%g IOPS)\n",
requests, (double) requests / RUN_TIME);
exit (errors == 0 ? EXIT_SUCCESS : EXIT_FAILURE);
}
-#define BUFFER_SIZE 16384
-
static void *
start_thread (void *arg)
{
struct thread_status *status = arg;
struct nbd_handle *nbd;
char *buf;
int cmd;
uint64_t offset;
int64_t start_usec, stop_usec, now_usec;
- buf = calloc (BUFFER_SIZE, 1);
+ buf = calloc (request_size, 1);
if (buf == NULL) {
perror ("calloc");
exit (EXIT_FAILURE);
}
nbd = nbd_create ();
if (nbd == NULL) {
fprintf (stderr, "%s\n", nbd_get_error ());
exit (EXIT_FAILURE);
}
#ifdef TLS
/* Require TLS on the handle and fail if not available or if the
* handshake fails.
*/
if (nbd_set_tls (nbd, LIBNBD_TLS_REQUIRE) == -1) {
fprintf (stderr, "%s\n", nbd_get_error ());
exit (EXIT_FAILURE);
}
@@ -217,58 +248,58 @@ start_thread (void *arg)
if (nbd_connect_unix (nbd, unixsocket) == -1) {
fprintf (stderr, "%s\n", nbd_get_error ());
exit (EXIT_FAILURE);
}
assert (nbd_get_size (nbd) == EXPORTSIZE);
assert (nbd_can_multi_conn (nbd) > 0);
assert (nbd_is_read_only (nbd) == 0);
start_usec = microtime ();
stop_usec = start_usec + RUN_TIME * MICROSECONDS;
/* Issue commands. */
while (1) {
/* Run until the timer expires. */
now_usec = microtime ();
if (now_usec >= stop_usec)
break;
/* Issue a synchronous read or write command. */
- offset = status->offset + (rand () % (status->length - BUFFER_SIZE));
+ offset = status->offset + (rand () % (status->length - request_size));
cmd = rand () & 1;
if (cmd == 0) {
/* Write block from ramdisk to nbd server. */
- if (nbd_pwrite (nbd, &ramdisk[offset], BUFFER_SIZE, offset, 0) == -1) {
+ if (nbd_pwrite (nbd, &ramdisk[offset], request_size, offset, 0) == -1) {
fprintf (stderr, "%s\n", nbd_get_error ());
goto error;
}
- status->bytes_sent += BUFFER_SIZE;
+ status->bytes_sent += request_size;
}
else {
/* Read block from nbd server to buf. */
- if (nbd_pread (nbd, buf, BUFFER_SIZE, offset, 0) == -1) {
+ if (nbd_pread (nbd, buf, request_size, offset, 0) == -1) {
fprintf (stderr, "%s\n", nbd_get_error ());
goto error;
}
- status->bytes_received += BUFFER_SIZE;
- if (memcmp (&ramdisk[offset], buf, BUFFER_SIZE) != 0) {
+ status->bytes_received += request_size;
+ if (memcmp (&ramdisk[offset], buf, request_size) != 0) {
fprintf (stderr, "thread %zu: DATA INTEGRITY ERROR!\n",
status->i);
goto error;
}
}
status->requests++;
}
printf ("thread %zu: finished OK in %.6f seconds\n",
status->i, (double) (now_usec - start_usec) / MICROSECONDS);
if (nbd_shutdown (nbd, 0) == -1) {
fprintf (stderr, "%s\n", nbd_get_error ());
goto error;
}
nbd_close (nbd);
free (buf);
status->status = 0;
diff --git a/tests/synch-parallel.sh b/tests/synch-parallel.sh
index 84c00d8..0ca9060 100755
--- a/tests/synch-parallel.sh
+++ b/tests/synch-parallel.sh
@@ -1,24 +1,28 @@
#!/usr/bin/env bash
# nbd client library in userspace
# Copyright (C) 2019 Red Hat Inc.
#
# This library is free software; you can redistribute it and/or
# modify it under the terms of the GNU Lesser General Public
# License as published by the Free Software Foundation; either
# version 2 of the License, or (at your option) any later version.
#
# This library is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public
# License along with this library; if not, write to the Free Software
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
# Test synchronous parallel high level API requests.
-nbdkit -U - \
- --filter=cow \
- pattern size=8M \
- --run '$VG ./synch-parallel $unixsocket'
+for request_size in 4096 262144; do
+ echo "Request size: $request_size"
+ REQUEST_SIZE=$request_size nbdkit -U - \
+ --filter=cow \
+ pattern size=8M \
+ --run '$VG ./synch-parallel $unixsocket'
+ echo
+done
--
2.31.1
Nir Soffer
2021-Nov-14 07:21 UTC
[Libguestfs] [PATCH libnbd 6/6] tests/synch-parallel: Test multiple number of connections
Rename NR_THREADS to CONNECTIONS and get the value from an environment
variable. Change the test script to test multiple connection counts.
An example run:
$ ./synch-parallel.sh
Connections: 1
Request size: 4096
thread 0: finished OK in 10.000000 seconds
TLS: disabled
bytes sent: 996638720 (95.0469 MiB/s)
bytes received: 995004416 (94.891 MiB/s)
I/O requests: 486241 (48624.1 IOPS)
Connections: 1
Request size: 262144
thread 0: finished OK in 10.000047 seconds
TLS: disabled
bytes sent: 9216720896 (878.975 MiB/s)
bytes received: 9266528256 (883.725 MiB/s)
I/O requests: 70508 (7050.8 IOPS)
Connections: 2
Request size: 4096
thread 0: finished OK in 10.000015 seconds
thread 1: finished OK in 10.000012 seconds
TLS: disabled
bytes sent: 1681920000 (160.4 MiB/s)
bytes received: 1680896000 (160.303 MiB/s)
I/O requests: 821000 (82100 IOPS)
Connections: 2
Request size: 262144
thread 0: finished OK in 10.000048 seconds
thread 1: finished OK in 10.000060 seconds
TLS: disabled
bytes sent: 12331515904 (1176.03 MiB/s)
bytes received: 12310282240 (1174 MiB/s)
I/O requests: 94001 (9400.1 IOPS)
Connections: 4
Request size: 4096
thread 3: finished OK in 10.000004 seconds
thread 0: finished OK in 10.000029 seconds
thread 2: finished OK in 10.000011 seconds
thread 1: finished OK in 10.000025 seconds
TLS: disabled
bytes sent: 2024407040 (193.062 MiB/s)
bytes received: 2024652800 (193.086 MiB/s)
I/O requests: 988540 (98854 IOPS)
Connections: 4
Request size: 262144
thread 2: finished OK in 10.000300 seconds
thread 1: finished OK in 10.000360 seconds
thread 0: finished OK in 10.000340 seconds
thread 3: finished OK in 10.000325 seconds
TLS: disabled
bytes sent: 12098994176 (1153.85 MiB/s)
bytes received: 12016680960 (1146 MiB/s)
I/O requests: 91994 (9199.4 IOPS)
Connections: 8
Request size: 4096
thread 0: finished OK in 10.000002 seconds
thread 1: finished OK in 10.000050 seconds
thread 5: finished OK in 10.000098 seconds
thread 3: finished OK in 10.000013 seconds
thread 6: finished OK in 10.000239 seconds
thread 4: finished OK in 10.000126 seconds
thread 2: finished OK in 10.000068 seconds
thread 7: finished OK in 10.000215 seconds
TLS: disabled
bytes sent: 2110287872 (201.253 MiB/s)
bytes received: 2105614336 (200.807 MiB/s)
I/O requests: 1029273 (102927 IOPS)
Connections: 8
Request size: 262144
thread 7: finished OK in 10.000441 seconds
thread 6: finished OK in 10.000351 seconds
thread 2: finished OK in 10.000572 seconds
thread 0: finished OK in 10.000676 seconds
thread 3: finished OK in 10.000772 seconds
thread 5: finished OK in 10.000839 seconds
thread 4: finished OK in 10.000968 seconds
thread 1: finished OK in 10.000861 seconds
TLS: disabled
bytes sent: 11867783168 (1131.8 MiB/s)
bytes received: 11875647488 (1132.55 MiB/s)
I/O requests: 90574 (9057.4 IOPS)
Signed-off-by: Nir Soffer <nsoffer at redhat.com>
---
tests/synch-parallel.c | 30 ++++++++++++++++++++----------
tests/synch-parallel.sh | 19 ++++++++++++-------
2 files changed, 32 insertions(+), 17 deletions(-)
diff --git a/tests/synch-parallel.c b/tests/synch-parallel.c
index d6ab1df..099a906 100644
--- a/tests/synch-parallel.c
+++ b/tests/synch-parallel.c
@@ -33,154 +33,164 @@
#include <unistd.h>
#include <errno.h>
#include <assert.h>
#include <sys/time.h>
#include <pthread.h>
#include <libnbd.h>
#include "byte-swapping.h"
/* We keep a shadow of the RAM disk so we can check integrity of the data. */
static char *ramdisk;
/* This is also defined in synch-parallel.sh and checked here. */
#define EXPORTSIZE (8*1024*1024)
/* How long (seconds) that the test will run for. */
#define RUN_TIME 10
-/* Number of threads. */
-#define NR_THREADS 8
+#define MAX_CONNECTIONS 8
#define KiB 1024
#define MiB (1024*KiB)
#define MICROSECONDS 1000000
/* Unix socket. */
static const char *unixsocket;
static long request_size;
+static long connections;
struct thread_status {
- size_t i; /* Thread index, 0 .. NR_THREADS-1 */
+ size_t i; /* Thread index, 0 .. connections-1 */
uint64_t offset, length; /* Area assigned to this thread. */
int status; /* Return status. */
unsigned requests; /* Total number of requests made. */
uint64_t bytes_sent, bytes_received; /* Bytes sent and received by thread. */
};
static void *start_thread (void *arg);
static inline int64_t
microtime (void)
{
struct timeval tv;
gettimeofday(&tv, NULL);
return tv.tv_sec * MICROSECONDS + tv.tv_usec;
}
static long
getenv_long(const char *name, long defval)
{
const char *value;
char *end;
long res;
value = getenv (name);
if (value == NULL)
return defval;
res = strtol(value, &end, 10);
if (*end != '\0' || end == value) {
fprintf (stderr, "Invalid value for %s: '%s'\n", name,
value);
exit (EXIT_FAILURE);
}
return res;
}
int
main (int argc, char *argv[])
{
- pthread_t threads[NR_THREADS];
- struct thread_status status[NR_THREADS];
+ pthread_t threads[MAX_CONNECTIONS];
+ struct thread_status status[MAX_CONNECTIONS];
size_t i;
int err;
unsigned requests, errors;
uint64_t bytes_sent, bytes_received;
if (argc != 2) {
fprintf (stderr, "%s socket\n", argv[0]);
exit (EXIT_FAILURE);
}
unixsocket = argv[1];
+ connections = getenv_long ("CONNECTIONS", 8);
+ if (connections < 1 ||
+ connections > MAX_CONNECTIONS ||
+ EXPORTSIZE % connections != 0) {
+ fprintf (stderr,
+ "Invalid CONNECTIONS environment variable: %ld\n",
+ connections);
+ exit (EXIT_FAILURE);
+ }
+
request_size = getenv_long ("REQUEST_SIZE", 16*KiB);
if (request_size < 4*KiB ||
request_size > 512*KiB ||
- (EXPORTSIZE / NR_THREADS) % request_size != 0) {
+ (EXPORTSIZE / connections) % request_size != 0) {
fprintf (stderr,
"Invalid REQUEST_SIZE environment variable: %ld\n",
request_size);
exit (EXIT_FAILURE);
}
srand ((microtime () / MICROSECONDS) + getpid ());
/* Initialize the RAM disk with the initial data from
* nbdkit-pattern-filter.
*/
ramdisk = malloc (EXPORTSIZE);
if (ramdisk == NULL) {
perror ("calloc");
exit (EXIT_FAILURE);
}
for (i = 0; i < EXPORTSIZE; i += 8) {
uint64_t d = htobe64 (i);
memcpy (&ramdisk[i], &d, sizeof d);
}
/* Start the worker threads. */
- for (i = 0; i < NR_THREADS; ++i) {
+ for (i = 0; i < connections; ++i) {
status[i].i = i;
- status[i].offset = i * EXPORTSIZE / NR_THREADS;
- status[i].length = EXPORTSIZE / NR_THREADS;
+ status[i].offset = i * EXPORTSIZE / connections;
+ status[i].length = EXPORTSIZE / connections;
status[i].status = 0;
status[i].requests = 0;
status[i].bytes_sent = status[i].bytes_received = 0;
err = pthread_create (&threads[i], NULL, start_thread, &status[i]);
if (err != 0) {
errno = err;
perror ("pthread_create");
exit (EXIT_FAILURE);
}
}
/* Wait for the threads to exit. */
errors = 0;
requests = 0;
bytes_sent = bytes_received = 0;
- for (i = 0; i < NR_THREADS; ++i) {
+ for (i = 0; i < connections; ++i) {
err = pthread_join (threads[i], NULL);
if (err != 0) {
errno = err;
perror ("pthread_join");
exit (EXIT_FAILURE);
}
if (status[i].status != 0) {
fprintf (stderr, "thread %zu failed with status %d\n",
i, status[i].status);
errors++;
}
requests += status[i].requests;
bytes_sent += status[i].bytes_sent;
bytes_received += status[i].bytes_received;
}
free (ramdisk);
/* Print some stats. */
printf ("TLS: %s\n",
diff --git a/tests/synch-parallel.sh b/tests/synch-parallel.sh
index 0ca9060..ae35dd1 100755
--- a/tests/synch-parallel.sh
+++ b/tests/synch-parallel.sh
@@ -1,28 +1,33 @@
#!/usr/bin/env bash
# nbd client library in userspace
# Copyright (C) 2019 Red Hat Inc.
#
# This library is free software; you can redistribute it and/or
# modify it under the terms of the GNU Lesser General Public
# License as published by the Free Software Foundation; either
# version 2 of the License, or (at your option) any later version.
#
# This library is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public
# License along with this library; if not, write to the Free Software
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
# Test synchronous parallel high level API requests.
-for request_size in 4096 262144; do
- echo "Request size: $request_size"
- REQUEST_SIZE=$request_size nbdkit -U - \
- --filter=cow \
- pattern size=8M \
- --run '$VG ./synch-parallel $unixsocket'
- echo
+for connections in 1 2 4 8; do
+ for request_size in 4096 262144; do
+ echo "Connections: $connections"
+ echo "Request size: $request_size"
+ CONNECTIONS=$connections \
+ REQUEST_SIZE=$request_size \
+ nbdkit -U - \
+ --filter=cow \
+ pattern size=8M \
+ --run '$VG ./synch-parallel $unixsocket'
+ echo
+ done
done
--
2.31.1
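The two environment checks in the patch above interact: CONNECTIONS must divide the 8 MiB export evenly, and REQUEST_SIZE must then divide each thread's share. The following is a minimal sketch of that rule pulled out into one predicate. The helper name valid_combination is hypothetical and not part of the patch; the constants mirror those in tests/synch-parallel.c.

```c
#include <assert.h>
#include <stdbool.h>

#define KiB 1024L
#define EXPORTSIZE (8 * 1024 * KiB)   /* 8 MiB, same value as the test */
#define MAX_CONNECTIONS 8

/* The largest connection count must itself split the export evenly. */
static_assert (EXPORTSIZE % MAX_CONNECTIONS == 0,
               "export must divide evenly across MAX_CONNECTIONS");

/* Hypothetical helper (not in the patch): returns true if the pair
 * would pass both checks performed in main(). */
static bool
valid_combination (long connections, long request_size)
{
  /* Same conditions as the CONNECTIONS check. */
  if (connections < 1 ||
      connections > MAX_CONNECTIONS ||
      EXPORTSIZE % connections != 0)
    return false;

  /* Same conditions as the REQUEST_SIZE check: each thread's area
   * must be a whole number of requests. */
  if (request_size < 4 * KiB ||
      request_size > 512 * KiB ||
      (EXPORTSIZE / connections) % request_size != 0)
    return false;

  return true;
}
```

Under these rules only 1, 2, 4 and 8 connections are accepted, since 8 MiB is a power of two; a value such as CONNECTIONS=3 is rejected by the modulo check rather than by the range check.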
Richard W.M. Jones
2021-Nov-14 18:08 UTC
[Libguestfs] [PATCH libnbd 0/6] Enhance synch-parallel test
I'm lukewarm on this series, but I've got a few comments on it as well
as some questions below.
- In nbdkit we have a function tvdiff_usec (see
common/include/tvdiff.h). Maybe we should use that instead of
microtime?
- Rather than having a single test that does multiple runs (and so
runs for a very long time), it's better to have multiple tests
because they can run in parallel. This is how it could be done (but
see also my comment about benchmarks below).
* Copy original tests/synch-parallel.sh to
tests/synch-parallel-conn-1-request-4096.sh
tests/synch-parallel-conn-1-request-262144.sh
etc. (Or choose better names)
* Each test does:
CONNECTIONS=1 REQUEST_SIZE=4096 ./synch-parallel.sh (etc)
* Original synch-parallel.sh is unchanged, but remove it from TESTS
* Add the new scripts to TESTS
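For the first point above, a helper that takes the difference of two struct timeval values could look roughly like the sketch below. The name elapsed_usec is hypothetical and used only for illustration; the real name and signature of tvdiff_usec should be taken from common/include/tvdiff.h in nbdkit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical helper modelled on the idea of tvdiff_usec.
 * Returns the number of microseconds in (*end - *start); the result
 * is negative if end is earlier than start. */
static int64_t
elapsed_usec (const struct timeval *start, const struct timeval *end)
{
  assert (start != NULL && end != NULL);
  /* Widen the seconds before multiplying to avoid overflow where
   * time_t is 32 bits. */
  return ((int64_t) end->tv_sec - (int64_t) start->tv_sec) * 1000000
         + ((int64_t) end->tv_usec - (int64_t) start->tv_usec);
}
```

With a helper of this shape, the request loop can record one start time with gettimeofday and compare elapsed_usec against RUN_TIME * MICROSECONDS, instead of converting every sample to an absolute microsecond count.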
On Sun, Nov 14, 2021 at 09:21:39AM +0200, Nir Soffer wrote:
> I'm working on an application using the sync API. In my tests I see best read
> throughput with 2 nbd connections which is unexpected.
I'm interested in why you're using the synchronous API. I think it'll
inevitably be slower than using the asynch API because you can never
have multiple requests on a single TCP connection.
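One way to read the numbers from the example run in this light: if each synchronous connection has exactly one request in flight for the whole run, the mean wall-clock time per request is the number of connections times the run time, divided by the total request count. The sketch below computes that figure. The function name mean_request_usec is hypothetical, and the one-request-in-flight model is an assumption for illustration, not a measurement of where the time goes.

```c
#include <assert.h>

/* Mean wall-clock time per request in microseconds, assuming every
 * connection keeps exactly one request in flight for the whole run
 * (the synchronous API model described above). */
static double
mean_request_usec (long connections, double run_seconds,
                   unsigned long requests)
{
  assert (connections > 0 && requests > 0);
  return (double) connections * run_seconds * 1e6 / (double) requests;
}
```

Applied to the 4096-byte runs quoted in patch 6/6, mean_request_usec (1, 10.0, 486241) gives roughly 20.6 microseconds, while mean_request_usec (8, 10.0, 1029273) gives roughly 77.7 microseconds: eight connections issue about twice as many requests in total, but each individual request takes almost four times as long.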
...
> - The test scripts is much slower now (120 seconds instead of
> 10). We need to to separate the benchmark, running many
> combinations (synch-parallel-bench.sh) and the test, running one
> combination (sync-parallel.sh).
This is too long for tests. I think there are really two conflicting
requirements - you want to benchmark the synchronous API, which is a
worthwhile but separate goal from testing that things are working.
Should we consider having benchmarks in a separate directory or even a
separate git repo? (Separate repo is what I did for libguestfs).
Rich.
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW