Eric Blake
2020-Aug-17 16:35 UTC
[Libguestfs] [nbdkit PATCH] sh: Prefer dd bs=1 over iflag=count_bytes
While iflag=count_bytes combined with bs > 1 allows for more efficient operation, it is a feature of GNU dd, and not present on other implementations such as BSD. Sticking to just POSIX features makes things more portable. Signed-off-by: Eric Blake <eblake@redhat.com> --- docs/nbdkit-loop.pod | 2 +- docs/nbdkit.pod | 4 ++-- plugins/eval/nbdkit-eval-plugin.pod | 6 +++--- plugins/sh/nbdkit-sh-plugin.pod | 2 +- plugins/sh/assemble.sh | 2 +- plugins/sh/example.sh | 7 +++---- tests/test-cacheextents.sh | 5 ++--- tests/test-eval-file.sh | 5 ++--- tests/test-eval.sh | 3 +-- tests/test-export-name.sh | 5 ++--- tests/test-parallel-sh.sh | 5 ++--- tests/test-readahead-test-plugin.sh | 2 +- tests/test-readahead.sh | 1 - tests/test-retry-reopen-fail.sh | 3 +-- tests/test-retry-size.sh | 3 +-- tests/test-retry.sh | 3 +-- tests/test-shell.sh | 5 ++--- tests/test-single-sh.sh | 3 +-- tests/test-tls-fallback.sh | 2 +- 19 files changed, 28 insertions(+), 40 deletions(-) diff --git a/docs/nbdkit-loop.pod b/docs/nbdkit-loop.pod index 055b5750..b21c2212 100644 --- a/docs/nbdkit-loop.pod +++ b/docs/nbdkit-loop.pod @@ -120,7 +120,7 @@ creates a disk which contains a bad sector: echo EIO Bad block >&2 exit 1 else - dd if=/dev/zero count=$3 iflag=count_bytes + dd bs=1 if=/dev/zero count=$3 fi ;; *) exit 2 ;; esac diff --git a/docs/nbdkit.pod b/docs/nbdkit.pod index 98f069f7..67653bc0 100644 --- a/docs/nbdkit.pod +++ b/docs/nbdkit.pod @@ -138,7 +138,7 @@ L<nbdkit-sh-plugin(3)>: nbdkit sh - <<'EOF' case "$1" in get_size) echo 1M ;; - pread) dd if=/dev/zero count=$3 iflag=count_bytes ;; + pread) dd bs=1 if=/dev/zero count=$3 ;; *) exit 2 ;; esac EOF @@ -149,7 +149,7 @@ The same example as above can be written entirely on the command line using L<nbdkit-eval-plugin(1)>: nbdkit eval get_size='echo 1M' \ - pread='dd if=/dev/zero count=$3 iflag=count_bytes' + pread='dd bs=1 if=/dev/zero count=$3' =back diff --git a/plugins/eval/nbdkit-eval-plugin.pod b/plugins/eval/nbdkit-eval-plugin.pod index 7126c6de..d5918e28 100644 --- a/plugins/eval/nbdkit-eval-plugin.pod +++ b/plugins/eval/nbdkit-eval-plugin.pod @@ -51,7 +51,7 @@ all other methods are identical). Create a 64M read-only disk of zeroes: nbdkit eval get_size=' echo 64M ' \ - pread=' dd if=/dev/zero count=$3 iflag=count_bytes ' + pread=' dd bs=1 if=/dev/zero count=$3 ' The following command is the eval plugin equivalent of L<nbdkit-file-plugin(1)> (except not as fast and missing many @@ -60,8 +60,8 @@ features): nbdkit eval \ config='ln -sf "$(realpath "$3")" $tmpdir/file' \ get_size='stat -Lc %s $tmpdir/file' \ - pread='dd if=$tmpdir/file skip=$4 count=$3 iflag=count_bytes,skip_bytes' \ - pwrite='dd of=$tmpdir/file seek=$4 conv=notrunc oflag=seek_bytes' \ + pread='dd bs=1 if=$tmpdir/file skip=$4 count=$3' \ + pwrite='dd bs=1 of=$tmpdir/file seek=$4 conv=notrunc' \ file=disk.img =head1 PARAMETERS diff --git a/plugins/sh/nbdkit-sh-plugin.pod b/plugins/sh/nbdkit-sh-plugin.pod index 07d90b57..70741139 100644 --- a/plugins/sh/nbdkit-sh-plugin.pod +++ b/plugins/sh/nbdkit-sh-plugin.pod @@ -42,7 +42,7 @@ as the name of the script, like this: nbdkit sh - <<'EOF' case "$1" in get_size) echo 1M ;; - pread) dd if=/dev/zero count=$3 iflag=count_bytes ;; + pread) dd bs=1 if=/dev/zero count=$3 ;; *) exit 2 ;; esac EOF diff --git a/plugins/sh/assemble.sh b/plugins/sh/assemble.sh index 5d994857..45b9b698 100755 --- a/plugins/sh/assemble.sh +++ b/plugins/sh/assemble.sh @@ -46,7 +46,7 @@ case "$1" in echo 512 ;; pread) - dd if=$b skip=$4 count=$3 iflag=count_bytes,skip_bytes + dd bs=1 if=$b skip=$4 count=$3 ;; can_write) # Default is yes to make it writable (but writes will fail). diff --git a/plugins/sh/example.sh b/plugins/sh/example.sh index 4f547db0..03249b27 100755 --- a/plugins/sh/example.sh +++ b/plugins/sh/example.sh @@ -117,12 +117,12 @@ case "$1" in pread) # Read the requested part of the disk and write to stdout. - dd iflag=skip_bytes,count_bytes skip=$4 count=$3 if=$f || exit 1 + dd bs=1 skip=$4 count=$3 if=$f || exit 1 ;; pwrite) # Copy data from stdin and write it to the disk. - dd oflag=seek_bytes conv=notrunc seek=$4 of=$f || exit 1 + dd bs=1 conv=notrunc seek=$4 of=$f || exit 1 ;; can_write) @@ -162,8 +162,7 @@ case "$1" in # cache) # Implement an efficient prefetch, if desired. # It is intentionally omitted from this example. - # dd iflag=skip_bytes,count_bytes skip=$4 count=$3 \ - # if=$f of=/dev/null || exit 1 + # dd bs=1 skip=$4 count=$3 if=$f of=/dev/null || exit 1 # ;; can_cache) diff --git a/tests/test-cacheextents.sh b/tests/test-cacheextents.sh index 9580348c..48de9365 100755 --- a/tests/test-cacheextents.sh +++ b/tests/test-cacheextents.sh @@ -1,6 +1,6 @@ #!/usr/bin/env bash # nbdkit -# Copyright (C) 2019 Red Hat Inc. +# Copyright (C) 2019-2020 Red Hat Inc. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are @@ -37,7 +37,6 @@ set -e requires grep --version requires qemu-img --version requires qemu-io --version -requires dd iflag=count_bytes </dev/null sock="$(mktemp -u)" sockurl="nbd+unix:///?socket=$sock" @@ -71,7 +70,7 @@ case "$1" in echo ${i}M $block_size $((i%4)) done ;; - pread) dd if=/dev/zero count=$3 iflag=count_bytes ;; + pread) dd bs=1 if=/dev/zero count=$3 ;; can_write) ;; pwrite) dd of=/dev/null ;; can_trim) ;; diff --git a/tests/test-eval-file.sh b/tests/test-eval-file.sh index 7d6f79d4..d291803f 100755 --- a/tests/test-eval-file.sh +++ b/tests/test-eval-file.sh @@ -39,7 +39,6 @@ set -x requires guestfish --version requires test -f disk -requires dd iflag=count_bytes </dev/null files="eval-file.img" rm -f $files @@ -50,8 +49,8 @@ cp disk eval-file.img nbdkit -fv -U - eval \ config='ln -sf "$(realpath "$3")" $tmpdir/file' \ get_size='stat -Lc %s $tmpdir/file' \ - pread='dd if=$tmpdir/file skip=$4 count=$3 iflag=count_bytes,skip_bytes' \ - pwrite='dd of=$tmpdir/file seek=$4 conv=notrunc oflag=seek_bytes' \ + pread='dd bs=1 if=$tmpdir/file skip=$4 count=$3' \ + pwrite='dd bs=1 of=$tmpdir/file seek=$4 conv=notrunc' \ file=eval-file.img \ --run ' guestfish \ diff --git a/tests/test-eval.sh b/tests/test-eval.sh index 07a9ea1e..212f2836 100755 --- a/tests/test-eval.sh +++ b/tests/test-eval.sh @@ -35,7 +35,6 @@ set -e set -x requires qemu-img --version -requires dd iflag=count_bytes </dev/null files="eval.out eval.missing" rm -f $files @@ -47,7 +46,7 @@ cleanup_fn rm -f $files # nbdkit, and nbdkit shuts down before the .close callback is called. nbdkit -U - eval \ get_size='echo 64M' \ - pread='dd if=/dev/zero count=$3 iflag=count_bytes' \ + pread='dd bs=1 if=/dev/zero count=$3' \ missing='echo "in missing: $@" >> eval.missing; exit 2' \ unload='' \ --run 'qemu-img info $nbd; sleep 10' > eval.out diff --git a/tests/test-export-name.sh b/tests/test-export-name.sh index 5ff0d088..40f2ce18 100755 --- a/tests/test-export-name.sh +++ b/tests/test-export-name.sh @@ -1,6 +1,6 @@ #!/usr/bin/env bash # nbdkit -# Copyright (C) 2019 Red Hat Inc. +# Copyright (C) 2019-2020 Red Hat Inc. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are @@ -35,7 +35,6 @@ set -e set -x requires nbdsh --version -requires dd iflag=count_bytes </dev/null sock=`mktemp -u` files="$sock exportname.pid" @@ -59,7 +58,7 @@ case "$1" in echo ${#2} ;; pread) - printf %s "$2" | dd skip=$4 count=$3 iflag=skip_bytes,count_bytes + printf %s "$2" | dd bs=1 skip=$4 count=$3 ;; *) exit 2 ;; esac diff --git a/tests/test-parallel-sh.sh b/tests/test-parallel-sh.sh index 8d8b9b19..4a26f0a8 100755 --- a/tests/test-parallel-sh.sh +++ b/tests/test-parallel-sh.sh @@ -36,7 +36,6 @@ source ./functions.sh requires test -f file-data requires qemu-io --version requires timeout --version -requires dd iflag=count_bytes </dev/null nbdkit --dump-plugin sh | grep -q ^thread_model=parallel || { echo "nbdkit lacks support for parallel requests"; exit 77; } @@ -87,8 +86,8 @@ fi case \$1 in thread_model) echo parallel ;; get_size) stat -L -c %s \$f || exit 1 ;; - pread) dd iflag=skip_bytes,count_bytes skip=\$4 count=\$3 if=\$f || exit 1 ;; - pwrite) dd oflag=seek_bytes conv=notrunc seek=\$4 of=\$f || exit 1 ;; + pread) dd bs=1 skip=\$4 count=\$3 if=\$f || exit 1 ;; + pwrite) dd bs=1 conv=notrunc seek=\$4 of=\$f || exit 1 ;; can_write) ;; *) exit 2 ;; esac diff --git a/tests/test-readahead-test-plugin.sh b/tests/test-readahead-test-plugin.sh index e8c57c5c..c2af913c 100755 --- a/tests/test-readahead-test-plugin.sh +++ b/tests/test-readahead-test-plugin.sh @@ -40,7 +40,7 @@ case "$1" in ;; pread) sleep 5 - dd if=/dev/zero count=$3 iflag=count_bytes + dd bs=1 if=/dev/zero count=$3 ;; *) exit 2 diff --git a/tests/test-readahead.sh b/tests/test-readahead.sh index b1aabd88..329bcd68 100755 --- a/tests/test-readahead.sh +++ b/tests/test-readahead.sh @@ -40,7 +40,6 @@ set -x requires python3 --version requires python3 -c 'import nbd' -requires dd iflag=count_bytes </dev/null files="readahead.img" rm -f $files diff --git a/tests/test-retry-reopen-fail.sh b/tests/test-retry-reopen-fail.sh index 55704f33..60e579e2 100755 --- a/tests/test-retry-reopen-fail.sh +++ b/tests/test-retry-reopen-fail.sh @@ -40,7 +40,6 @@ set -x fail=0 requires qemu-io --version -requires dd iflag=count_bytes </dev/null files="retry-reopen-fail-count retry-reopen-fail-open-count retry-reopen-fail-status" @@ -96,7 +95,7 @@ case "$1" in echo "EIO pread failed" >&2 exit 1 else - dd if=/dev/zero count=$3 iflag=count_bytes + dd bs=1 if=/dev/zero count=$3 fi ;; diff --git a/tests/test-retry-size.sh b/tests/test-retry-size.sh index ac3daf69..7ddc20ca 100755 --- a/tests/test-retry-size.sh +++ b/tests/test-retry-size.sh @@ -37,7 +37,6 @@ set -x fail=0 requires qemu-io --version -requires dd iflag=count_bytes </dev/null files="retry-size-open-count retry-size-fail" rm -f $files @@ -81,7 +80,7 @@ case "$1" in touch retry-size-fail fi ;; esac - dd if=/dev/zero count=$3 iflag=count_bytes + dd bs=1 if=/dev/zero count=$3 ;; *) exit 2 ;; esac diff --git a/tests/test-retry.sh b/tests/test-retry.sh index 173bd20a..fd2513f9 100755 --- a/tests/test-retry.sh +++ b/tests/test-retry.sh @@ -35,7 +35,6 @@ set -e set -x requires qemu-img --version -requires dd iflag=count_bytes </dev/null files="retry.img retry-count retry-open-count" rm -f $files @@ -65,7 +64,7 @@ case "$1" in echo "EIO pread failed" >&2 exit 1 else - dd if=/dev/zero count=$3 iflag=count_bytes + dd bs=1 if=/dev/zero count=$3 fi ;; diff --git a/tests/test-shell.sh b/tests/test-shell.sh index 028b8b9c..9bf00204 100755 --- a/tests/test-shell.sh +++ b/tests/test-shell.sh @@ -41,8 +41,7 @@ case "$1" in ;; pread) - dd iflag=skip_bytes,count_bytes skip=$4 count=$3 if=$f || \ - exit 1 + dd bs=1 skip=$4 count=$3 if=$f || exit 1 ;; pwrite) @@ -51,7 +50,7 @@ case "$1" in *) echo "garbage flags: '$5'" >&2 exit 1; esac - dd oflag=seek_bytes conv=notrunc seek=$4 of=$f || exit 1 + dd bs=1 conv=notrunc seek=$4 of=$f || exit 1 ;; can_write) ;; diff --git a/tests/test-single-sh.sh b/tests/test-single-sh.sh index 63f38dff..0342e712 100755 --- a/tests/test-single-sh.sh +++ b/tests/test-single-sh.sh @@ -35,7 +35,6 @@ set -e set -x requires nbdsh --version -requires dd iflag=count_bytes </dev/null files="single-sh.script single-sh.log" rm -f $files @@ -60,7 +59,7 @@ fi cat >single-sh.script <<\EOF case $1 in get_size) echo 1m ;; - pread) dd if=/dev/zero count=$3 iflag=count_bytes ;; + pread) dd bs=1 if=/dev/zero count=$3 ;; *) exit 2 ;; esac EOF diff --git a/tests/test-tls-fallback.sh b/tests/test-tls-fallback.sh index 4b48c1cb..ededae39 100755 --- a/tests/test-tls-fallback.sh +++ b/tests/test-tls-fallback.sh @@ -71,7 +71,7 @@ case $1 in fi echo $3 ;; get_size) echo "$2" | wc -c ;; - pread) echo "$2" | dd skip=$4 count=$3 iflag=skip_bytes,count_bytes ;; + pread) echo "$2" | dd bs=1 skip=$4 count=$3 ;; can_write | can_trim) exit 0 ;; *) exit 2 ;; esac -- 2.28.0
Eric Blake
2020-Aug-17 17:45 UTC
Re: [Libguestfs] [nbdkit PATCH] sh: Prefer dd bs=1 over iflag=count_bytes
On 8/17/20 11:35 AM, Eric Blake wrote:> While iflag=count_bytes combined with bs > 1 allows for more efficient > operation, it is a feature of GNU dd, and not present on other > implementations such as BSD. Sticking to just POSIX features makes > things more portable. > > Signed-off-by: Eric Blake <eblake@redhat.com> > --- > docs/nbdkit-loop.pod | 2 +- > docs/nbdkit.pod | 4 ++-- > plugins/eval/nbdkit-eval-plugin.pod | 6 +++--- > plugins/sh/nbdkit-sh-plugin.pod | 2 +- > plugins/sh/assemble.sh | 2 +- > plugins/sh/example.sh | 7 +++---- > tests/test-cacheextents.sh | 5 ++--- > tests/test-eval-file.sh | 5 ++--- > tests/test-eval.sh | 3 +-- > tests/test-export-name.sh | 5 ++--- > tests/test-parallel-sh.sh | 5 ++--- > tests/test-readahead-test-plugin.sh | 2 +- > tests/test-readahead.sh | 1 - > tests/test-retry-reopen-fail.sh | 3 +-- > tests/test-retry-size.sh | 3 +-- > tests/test-retry.sh | 3 +-- > tests/test-shell.sh | 5 ++--- > tests/test-single-sh.sh | 3 +-- > tests/test-tls-fallback.sh | 2 +- > 19 files changed, 28 insertions(+), 40 deletions(-)To see if this makes a noticeable impact on GNU systems, I compared: $ time make -C tests check TESTS='test-cacheextents.sh test-eval-file.sh test-eval.sh test-export-name.sh test-parallel-sh.sh test-readahead.sh test-retry-reopen-fail.sh test-retry-size.sh test-retry.sh test-shell test-single-sh.sh test-tls-fallback.sh' after first completing an entire 'make check' to prime the cache. Typical timings of a few runs were before: real 1m45.445s user 0m6.626s sys 0m4.181s after: real 1m53.366s user 0m13.510s sys 0m13.387s So it is indeed a noticeable slowdown (although it also points out that the bulk of our test time is spend sleep()ing, waiting for servers to start or for parallel tests to prove their merit). But not all the tests use the same amount of .pread calls; breaking it down into per-test timings, I see the worst degradation in test-eval-file.sh (8 seconds!), second worst in test-shell (~0.8s); but both make sense - those are the tests where we are hammering the file with the most .pread and .pwrite calls from libguestfs, rather than a one-off read or even testing something unrelated to read. Maybe instead of directly using this patch, we instead compromise by mentioning the speedup trick in the .pod files (list the portable spelling side-by-side with the efficient GNU-only spelling), and make those two tests use the efficient spelling. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Richard W.M. Jones
2020-Aug-18 14:58 UTC
Re: [Libguestfs] [nbdkit PATCH] sh: Prefer dd bs=1 over iflag=count_bytes
On Mon, Aug 17, 2020 at 11:35:39AM -0500, Eric Blake wrote:> While iflag=count_bytes combined with bs > 1 allows for more efficient > operation, it is a feature of GNU dd, and not present on other > implementations such as BSD. Sticking to just POSIX features makes > things more portable.I'm not very convinced by this. Maybe we should persuade the BSD folk to implement this useful feature instead? As one example:> diff --git a/docs/nbdkit-loop.pod b/docs/nbdkit-loop.pod > index 055b5750..b21c2212 100644 > --- a/docs/nbdkit-loop.pod > +++ b/docs/nbdkit-loop.pod > @@ -120,7 +120,7 @@ creates a disk which contains a bad sector: > echo EIO Bad block >&2 > exit 1 > else > - dd if=/dev/zero count=$3 iflag=count_bytes > + dd bs=1 if=/dev/zero count=$3 > fi ;; > *) exit 2 ;; > esacthis change could be a killer for performance since the suggested size of the disk is 64M. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-df lists disk usage of guests without needing to install any software inside the virtual machine. Supports Linux and Windows. http://people.redhat.com/~rjones/virt-df/
Eric Blake
2020-Aug-18 15:11 UTC
Re: [Libguestfs] [nbdkit PATCH] sh: Prefer dd bs=1 over iflag=count_bytes
On 8/18/20 9:58 AM, Richard W.M. Jones wrote:> On Mon, Aug 17, 2020 at 11:35:39AM -0500, Eric Blake wrote: >> While iflag=count_bytes combined with bs > 1 allows for more efficient >> operation, it is a feature of GNU dd, and not present on other >> implementations such as BSD. Sticking to just POSIX features makes >> things more portable. > > I'm not very convinced by this. Maybe we should persuade > the BSD folk to implement this useful feature instead?Wouldn't hurt to try.> > As one example: > >> diff --git a/docs/nbdkit-loop.pod b/docs/nbdkit-loop.pod >> index 055b5750..b21c2212 100644 >> --- a/docs/nbdkit-loop.pod >> +++ b/docs/nbdkit-loop.pod >> @@ -120,7 +120,7 @@ creates a disk which contains a bad sector: >> echo EIO Bad block >&2 >> exit 1 >> else >> - dd if=/dev/zero count=$3 iflag=count_bytes >> + dd bs=1 if=/dev/zero count=$3 >> fi ;; >> *) exit 2 ;; >> esac > > this change could be a killer for performance since the suggested size > of the disk is 64M.Another thing we could do: I've got a proposal out to let plugins advertise block size preferences. Once that is in place, a shell script that uses dd could advertise minimum block size of 512, leave bs=512 (the default), and compute count/skip/seek by $(($3 / 512)) when the script must be portable to BSD dd. Performance will then not be penalized, at the expense that the user would have to use the --filter=blocksize to get sub-block read/writes. But yeah, given the definite slowdown to the testsuite for heavy-hitters, I'm thinking the best we can do without a minimum block size advertisement is documentation of why we prefer GNU dd features, and the slower fallback possible on BSD. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Possibly Parallel Threads
- Re: [nbdkit PATCH] sh: Prefer dd bs=1 over iflag=count_bytes
- Re: [PATCH nbdkit v2 3/3] tests: Add a simple test of nbdkit_export_name.
- Re: [nbdkit PATCH v2] Introduce cacheextents filter
- [nbdkit PATCH v2 15/17] sh: Enable parallel thread model, when possible
- Re: [PATCH nbdkit 1/3] sh: Implement inline scripts.