Richard W.M. Jones
2017-Nov-21 21:21 UTC
Re: [Libguestfs] [nbdkit PATCH v2 0/4] enable parallel nbd forwarding
This works OK on x86_64, but fails on our fast new Amberwing (aarch64)
machine.  I've attached the test-suite.log file, but I'm not very sure
what's going wrong from that.

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/
Richard W.M. Jones
2017-Nov-21 21:32 UTC
Re: [Libguestfs] [nbdkit PATCH v2 0/4] enable parallel nbd forwarding
On Tue, Nov 21, 2017 at 09:21:59PM +0000, Richard W.M. Jones wrote:
> This works OK on x86_64, but fails on our fast new Amberwing (aarch64)
> machine.  I've attached the test-suite.log file, but I'm not very sure
> what's going wrong from that.

... although not every time.  It seems to be an intermittent race.

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org
Eric Blake
2017-Nov-21 21:48 UTC
Re: [Libguestfs] [nbdkit PATCH v2 0/4] enable parallel nbd forwarding
On 11/21/2017 03:21 PM, Richard W.M. Jones wrote:
> This works OK on x86_64, but fails on our fast new Amberwing (aarch64)
> machine.  I've attached the test-suite.log file, but I'm not very sure
> what's going wrong from that.

I'll see what I can spot...

> FAIL: test-parallel-file.sh
> ==========================
>
> read 1/1 bytes at offset 0
> 1 bytes, 1 ops; 0.0007 sec (1.340 KiB/sec and 1371.7421 ops/sec)
> wrote 1/1 bytes at offset 1
> 1 bytes, 1 ops; 0.0203 sec (49.193231 bytes/sec and 49.1932 ops/sec)
> /var/tmp/nbdkit-1.1.18/src/nbdkit -v -t 1 -U -
> /var/tmp/nbdkit-1.1.18/plugins/file/.libs/nbdkit-file-plugin.so file=file-data rdelay=2 wdelay=1 --run
> $QEMU_IO -f raw -c "aio_read 0 1" -c "aio_write -P 2 1 1" -c aio_flush $nbd
>
> nbdkit: file[1]: debug: handshake complete, processing requests serially

When run serially,

> read 1/1 bytes at offset 0
> 1 bytes, 1 ops; 0:00:02.00 (0.499933 bytes/sec and 0.4999 ops/sec)
> wrote 1/1 bytes at offset 1
> 1 bytes, 1 ops; 0:00:03.01 (0.331378 bytes/sec and 0.3314 ops/sec)

the read takes 2 seconds, and the write takes an additional second.

> nbdkit: debug: /var/tmp/nbdkit-1.1.18/plugins/file/.libs/nbdkit-file-plugin.so: unload
> /var/tmp/nbdkit-1.1.18/src/nbdkit -v -U -
> /var/tmp/nbdkit-1.1.18/plugins/file/.libs/nbdkit-file-plugin.so file=file-data rdelay=2 wdelay=1 --run
> $QEMU_IO -f raw -c "aio_read 0 1" -c "aio_write -P 2 1 1" -c aio_flush $nbd
>

When run with threads,

> nbdkit: file[1]: debug: handshake complete, processing requests with 16 threads
> nbdkit: debug: starting worker thread file.0
> nbdkit: debug: starting worker thread file.1
> nbdkit: file.0: debug: acquire unload prevention lock
> nbdkit: file.0: debug: pread count=1 offset=0
> nbdkit: debug: starting worker thread file.2
> nbdkit: file.1: debug: acquire unload prevention lock
> nbdkit: file.1: debug: pwrite count=1 offset=1

there are definitely two threads that both obtain the unload prevention
lock at once (so we do have parallel processing),

> nbdkit: file.1: debug: flush
> nbdkit: file.0: debug: release unload prevention lock
> nbdkit: file.1: debug: release unload prevention lock
> nbdkit: file.2: debug: acquire unload prevention lock
> nbdkit: file.2: debug: flush
> nbdkit: file.2: debug: release unload prevention lock
> nbdkit: file.4: debug: acquire unload prevention lock
> nbdkit: file.4: debug: flush
> nbdkit: file.4: debug: release unload prevention lock

and then, because we have not wired up nbdkit to expose FUA support to
the plugins, we are calling the flush callback in parallel through three
threads at once (probably way more flushing than we need, but doesn't
seem to be the cause of the failure).  I don't know how many of those
flushes were triggered by nbdkit connections.c and how many were
triggered by NBD_CMD_FLUSH from qemu-io.

> nbdkit: file.0: debug: client closed input socket, closing connection
> nbdkit: file.0: debug: exiting worker thread file.0
> read 1/1 bytes at offset 0
> 1 bytes, 1 ops; 0:00:02.47 (0.403374 bytes/sec and 0.4034 ops/sec)
> wrote 1/1 bytes at offset 1
> 1 bytes, 1 ops; 0:00:02.55 (0.391644 bytes/sec and 0.3916 ops/sec)

qemu-io finally records the replies it receives, and says that the read
took 2.4 seconds, then the write took an additional .08 seconds beyond
the read.  The two commands were obviously in flight in parallel
(because the wdelay is 1 second; as evidenced by the earlier results),
but there must be enough lock contention or other interaction between
the threads that we did not manage to get the write complete before the
read.
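For reference, the pass/fail decision here comes down to the order of
qemu-io's result lines.  A paraphrased sketch of the check (paths
illustrative, not the literal test script) would be:

  # Run the parallel (default) server and capture qemu-io's output.
  out=$(nbdkit -U - plugins/file/.libs/nbdkit-file-plugin.so \
          file=file-data rdelay=2 wdelay=1 \
          --run 'qemu-io -f raw -c "aio_read 0 1" -c "aio_write -P 2 1 1" -c aio_flush $nbd')

  # With parallel request processing the 1-second write should complete
  # before the 2-second read, so "wrote" must be printed before "read".
  case $out in
    *wrote*read*) : ;;   # PASS - the write overtook the read
    *) echo "requests appear to have completed serially"; exit 1 ;;
  esac

In other words, the test only passes if the 1-second write overtakes the
2-second read, which is exactly what did not happen above.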
When I run the same test locally, and get a PASS result, my qemu-io
results look more like:

  wrote 1/1 bytes at offset 1
  1 bytes, 1 ops; 0:00:01.00 (0.994919 bytes/sec and 0.9949 ops/sec)
  read 1/1 bytes at offset 0
  1 bytes, 1 ops; 0:00:02.00 (0.499855 bytes/sec and 0.4999 ops/sec)

Hmm - I wonder - could it be extra delay induced by the flush that
happens to make the write delay longer than the read?  Should I swap the
commands (have read be 2 seconds, write be 1 second, so that any
flushing tied to the write is more likely to still finish before the
read)?

Also, this makes it VERY obvious that without timestamps in our
nbdkit -v output, it's hard to tell WHEN things are happening, to see
where the bottlenecks are.

> FAIL: test-parallel-nbd.sh
> =========================
>
> nbdkit: file[1]: debug: release unload prevention lock
> nbdkit: file[1]: debug: handshake complete, processing requests with 16 threads
> nbdkit: nbd[1]: nbdkit: debug: debug: acquire unload prevention lockstarting worker thread file.0
>
> nbdkit: nbd[1]: nbdkit: debug: debug: get_sizestarting worker thread file.2
>
> nbdkit: nbd[1]: debug: can_write

Here, we have a nice demonstration that commit d02d9c9d works for
messages from one process (my debugging was worse without the mutex in
errors.c), but makes no difference when interleaving messages from
multiple processes.  If we're that worried about it, we could make
errors.c create the entire message into a single buffer (prologue, body,
and \n), and only then have a single stdio call on that giant buffer, to
make cross-process interleaving less likely, at the expense of more
memory manipulation to build up the error message into a temporary
buffer.  I'm not too fussed, though - you aren't often trying to debug
two nbdkit processes writing to the same file at once.

> /var/tmp/nbdkit-1.1.18/src/nbdkit -v -U -
> /var/tmp/nbdkit-1.1.18/plugins/nbd/.libs/nbdkit-nbd-plugin.so socket=test-parallel-nbd.sock --run
> $QEMU_IO -f raw -c "aio_read 0 1" -c "aio_write -P 2 1 1" -c aio_flush $nbd
>

And when we run the parallel nbd plugin over the parallel file plugin,
we see:

> nbdkit: nbd[1]: debug: handshake complete, processing requests with 16 threads
> nbdkit: debug: starting worker thread file.1
> nbdkit: debug: nbdkit: starting worker thread file.10debug:
> starting worker thread nbd.0
> nbdkit: debug: starting worker thread nbd.2
> nbdkit: debug: starting worker thread nbd.3
> nbdkit: nbdkit: debug: debug: starting worker thread file.4starting worker thread nbd.1
>
> nbdkit: nbd.3: debug: acquire unload prevention lock
> nbdkit: nbd.3nbdkit: : debug: debug: starting worker thread file.12pread count=1 offset=0
>
> nbdkit: nbd.3: debug: sending request with type 0 and cookie 0xa00000009

The nbd plugin sends the read request first,

> nbdkit: debug: nbdkit: starting worker thread file.6debug:
> starting worker thread nbd.4
> nbdkit: nbdkit: file.0debug: : starting worker thread nbd.6debug:
> acquire unload prevention lock
> nbdkit: file.0: debug: pread count=1 offset=0

which the file plugin starts processing;

> nbdkit: nbd.0: debug: acquire unload prevention lock
> nbdkit: nbdkit: debug: nbd.0starting worker thread file.15:
> debug: pwrite count=1 offset=1
> nbdkit: debug: starting worker thread nbd.5
> nbdkit: debug: starting worker thread file.8nbdkit:
> nbd.0: debug: sending request with type 1 and cookie 0xc0000000b

then in another thread nbd sends the write request,

> nbdkit: debug: starting worker thread file.14
> nbdkit: file.6: nbdkit: debug: debug: acquire unload prevention lockstarting worker thread
> nbd.7
>
> nbdkit: file.6: debug: pwrite count=1 offset=1

which the file plugin starts processing.  So both processes are indeed
interleaved, as desired.

> nbdkit: debug: starting worker thread nbd.8
> nbdkit: debug: starting worker thread nbd.9
> nbdkit: debug: starting worker thread nbd.10
> nbdkit: debug: starting worker thread nbd.12
> nbdkit: debug: starting worker thread nbd.13
> nbdkit: debug: starting worker thread nbd.14
> nbdkit: debug: starting worker thread nbd.11
> nbdkit: debug: starting worker thread nbd.15
> nbdkit: file.6: debug: release unload prevention lock
> nbdkit: debug: received reply for cookie 0xc0000000b

And the write completes first, as desired.

> nbdkit: nbd.0: debug: flush
> nbdkit: nbd.0: debug: sending request with type 3 and cookie 0xc0000000b
> nbdkit: file.5: debug: acquire unload prevention lock
> nbdkit: file.5: debug: flush
> nbdkit: file.0: debug: release unload prevention lock
> nbdkit: debug: received reply for cookie 0xa00000009
> nbdkit: nbd.3: debug: release unload prevention lock
> nbdkit: file.5: debug: release unload prevention lock
> nbdkit: debug: received reply for cookie 0xc0000000b
> nbdkit: nbd.0: debug: release unload prevention lock
> nbdkit: nbd.6: debug: acquire unload prevention lock
> nbdkit: nbd.6: debug: flush
> nbdkit: nbd.6: debug: sending request with type 3 and cookie 0xa00000009
> nbdkit: file.7: debug: acquire unload prevention lock
> nbdkit: file.7: debug: flush
> nbdkit: file.7: debug: release unload prevention lock
> nbdkit: debug: received reply for cookie 0xa00000009
> nbdkit: nbd.6: debug: release unload prevention lock
> nbdkit: nbd.2: debug: acquire unload prevention lock
> nbdkit: nbd.2: debug: flush
> nbdkit: nbd.2: debug: sending request with type 3 and cookie 0xa00000009
> nbdkit: file.9: debug: acquire unload prevention lock
> nbdkit: file.9: debug: flush
> nbdkit: file.9: debug: release unload prevention lock
> nbdkit: debug: received reply for cookie 0xa00000009

but we have a lot of traffic on those flush requests,

> nbdkit: nbd.2: debug: release unload prevention lock
> nbdkit: nbd.4: debug: client sent disconnect command, closing connection
> nbdkit: nbd.4: debug: exiting worker thread nbd.4
> nbdkit: nbd.1: debug: client closed input socket, closing connection
> read 1/1 bytes at offset 0
> 1 bytes, 1 ops; 0:00:02.47 (0.403408 bytes/sec and 0.4034 ops/sec)
> wrote 1/1 bytes at offset 1
> 1 bytes, 1 ops; 0:00:02.56 (0.390498 bytes/sec and 0.3905 ops/sec)

so that qemu-io sees the read take 2.4 seconds and the write take .09
seconds longer.

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
Eric Blake
2017-Nov-21 22:02 UTC
Re: [Libguestfs] [nbdkit PATCH v2 0/4] enable parallel nbd forwarding
On 11/21/2017 03:48 PM, Eric Blake wrote:
> On 11/21/2017 03:21 PM, Richard W.M. Jones wrote:
>> This works OK on x86_64, but fails on our fast new Amberwing (aarch64)
>> machine.  I've attached the test-suite.log file, but I'm not very sure
>> what's going wrong from that.
>
> I'll see what I can spot...
>
> Hmm - I wonder - could it be extra delay induced by the flush that
> happens to make the write delay longer than the read?  Should I swap the
> commands (have read be 2 seconds, write be 1 second, so that any
> flushing tied to the write is more likely to still finish before the
> read)?

That should read: if we rewrite the test to use

  file ... rdelay=1 wdelay=2

coupled with

  qemu-io -c 'aio_write ...' -c 'aio_read ...'

instead of the current

  file ... rdelay=2 wdelay=1
  qemu-io -c 'aio_read ...' -c 'aio_write ...'

then the read (which probably does not have to flush) is more likely to
complete more quickly than the write, but only if the read gets issued
in parallel.  Want me to submit a patch along those lines?

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
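For concreteness, a sketch of what the reworked test invocation could
look like (paths illustrative; flags follow the command lines quoted
earlier in the thread):

  nbdkit -U - plugins/file/.libs/nbdkit-file-plugin.so \
         file=file-data rdelay=1 wdelay=2 \
         --run 'qemu-io -f raw -c "aio_write -P 2 1 1" -c "aio_read 0 1" -c aio_flush $nbd'

  # Expected ordering with a parallel server: the 1-second read completes
  # (and its "read 1/1 bytes" line is printed) before the 2-second write
  # plus any flush overhead, so "read" should now precede "wrote" in the
  # qemu-io output.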