Richard W.M. Jones
2022-Jun-29 13:36 UTC
[Libguestfs] [PATCH libnbd v3 3/3] copy/copy-nbd-error.sh: Make this test non-stochastic
Because the test previously used error rates of 50%, it could
sometimes "fail to fail".  This is noticeable if you run the test
repeatedly:

  $ while make -C copy check TESTS=copy-nbd-error.sh >& /tmp/log ; do echo -n . ; done

This now happens more often because of the larger requests made by the
new multi-threaded loop, resulting in fewer calls to the error filter,
so a greater chance that a series of 50% coin tosses will come up all
heads in the test.

Fix this by making the test non-stochastic.

Fixes: commit 8d444b41d09a700c7ee6f9182a649f3f2d325abb
---
 copy/copy-nbd-error.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/copy/copy-nbd-error.sh b/copy/copy-nbd-error.sh
index 0088807f54..01524a890c 100755
--- a/copy/copy-nbd-error.sh
+++ b/copy/copy-nbd-error.sh
@@ -40,7 +40,7 @@ $VG nbdcopy -- [ nbdkit --exit-with-parent -v --filter=error pattern 5M \
 # Failure to read should be fatal
 echo "Testing read failures on non-sparse source"
 $VG nbdcopy -- [ nbdkit --exit-with-parent -v --filter=error pattern 5M \
-    error-pread-rate=0.5 ] null: && fail=1
+    error-pread-rate=1 ] null: && fail=1
 
 # However, reliable block status on a sparse image can avoid the need to read
 echo "Testing read failures on sparse source"
@@ -51,7 +51,7 @@ $VG nbdcopy -- [ nbdkit --exit-with-parent -v --filter=error null 5M \
 echo "Testing write data failures on arbitrary destination"
 $VG nbdcopy -- [ nbdkit --exit-with-parent -v pattern 5M ] \
     [ nbdkit --exit-with-parent -v --filter=error --filter=noextents \
-      memory 5M error-pwrite-rate=0.5 ] && fail=1
+      memory 5M error-pwrite-rate=1 ] && fail=1
 
 # However, writing zeroes can bypass the need for normal writes
 echo "Testing write data failures from sparse source"
-- 
2.37.0.rc2
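
As an illustration of the "fail to fail" probability, a minimal sketch
(the request counts below are made-up values; the real number of pread
calls depends on the nbdcopy request size):

  #!/usr/bin/env python3
  # Chance that every request succeeds despite a 50% per-request error
  # rate, i.e. the chance that the test spuriously passes.
  error_rate = 0.5
  for n in (32, 8, 4):            # hypothetical request counts
      p_spurious_pass = (1 - error_rate) ** n
      print(f"{n:2d} requests -> P(no error injected) = {p_spurious_pass:.3g}")

With many small requests a spurious pass is vanishingly rare (0.5^32 is
about 2e-10), but with only a handful of large requests it is already
0.5^4 = 1/16, which is why the new multi-threaded loop exposes the
problem so much more often.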
Laszlo Ersek
2022-Jun-29 16:01 UTC
[Libguestfs] [PATCH libnbd v3 3/3] copy/copy-nbd-error.sh: Make this test non-stochastic
On 06/29/22 15:36, Richard W.M. Jones wrote:
> Because the test previously used error rates of 50%, it could
> sometimes "fail to fail".  This is noticeable if you run the test
> repeatedly:
> 
>   $ while make -C copy check TESTS=copy-nbd-error.sh >& /tmp/log ; do echo -n . ; done
> 
> This now happens more often because of the larger requests made by the
> new multi-threaded loop, resulting in fewer calls to the error filter,
> so a greater chance that a series of 50% coin tosses will come up all
> heads in the test.
> 
> Fix this by making the test non-stochastic.
> 
> Fixes: commit 8d444b41d09a700c7ee6f9182a649f3f2d325abb
> ---
>  copy/copy-nbd-error.sh | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/copy/copy-nbd-error.sh b/copy/copy-nbd-error.sh
> index 0088807f54..01524a890c 100755
> --- a/copy/copy-nbd-error.sh
> +++ b/copy/copy-nbd-error.sh
> @@ -40,7 +40,7 @@ $VG nbdcopy -- [ nbdkit --exit-with-parent -v --filter=error pattern 5M \
>  # Failure to read should be fatal
>  echo "Testing read failures on non-sparse source"
>  $VG nbdcopy -- [ nbdkit --exit-with-parent -v --filter=error pattern 5M \
> -    error-pread-rate=0.5 ] null: && fail=1
> +    error-pread-rate=1 ] null: && fail=1
> 
>  # However, reliable block status on a sparse image can avoid the need to read
>  echo "Testing read failures on sparse source"
> @@ -51,7 +51,7 @@ $VG nbdcopy -- [ nbdkit --exit-with-parent -v --filter=error null 5M \
>  echo "Testing write data failures on arbitrary destination"
>  $VG nbdcopy -- [ nbdkit --exit-with-parent -v pattern 5M ] \
>      [ nbdkit --exit-with-parent -v --filter=error --filter=noextents \
> -      memory 5M error-pwrite-rate=0.5 ] && fail=1
> +      memory 5M error-pwrite-rate=1 ] && fail=1
> 
>  # However, writing zeroes can bypass the need for normal writes
>  echo "Testing write data failures from sparse source"
> 

Wasn't the original intent of the 50% error rate that the first error
manifest usually at a different offset every time? If we change the
error rate to 1, the tests will fail upon the first access, which kind
of breaks the original intent.

I wonder if we could determine a random offset in advance, and make
sure that the read or write access fails 100%, but only if the request
covers that offset.

...

The probability that n subsequent accesses *don't* fail is
(1-error_rate)^n. (The probability that at least one access fails is
1-(1-error_rate)^n.) And n is given by (I think?)
image_size/request_size.

So, if we change the request_size, we can recalculate "n", for the test
not to fail with the same probability as before.

  (1-err1)^(imgsz/rsz1) = (1-err2)^(imgsz/rsz2)

Take the imgsz'th root of both sides:

  (1-err1)^(1/rsz1) = (1-err2)^(1/rsz2)

Raise both sides to the rsz2'nd power:

  (1-err1)^(rsz2/rsz1) = 1-err2

  err2 = 1 - (1-err1)^(rsz2/rsz1)

I know that err1=0.5, but don't know rsz2 and rsz1 (the request sizes
after, and before, the last patch in the series). Assuming (just
guessing!) we increased the request size 8-fold, we'd have to go from
error rate 0.5 to:

  err2 = 1 - (1-0.5)^8 = 1 - (1/2)^8 = 1 - (1/256) = 255/256 = 0.99609375

We basically group every eight coin tosses into one super-toss, and
want the latter to show "failure" with the same probability as *at
least one* of the original 8 tosses failing.

Laszlo
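
To make the rescaling above concrete, a minimal sketch of the same
calculation (the 8-fold ratio is still just the guess from above, and
err1/rsz1/rsz2 are placeholder names, not real test parameters):

  #!/usr/bin/env python3
  # Rescale the per-request error rate so that the whole run still hits
  # at least one injected error with the same probability after the
  # request size grows:  err2 = 1 - (1 - err1) ** (rsz2 / rsz1)
  err1 = 0.5            # old per-request error rate
  rsz1, rsz2 = 1, 8     # old/new request sizes; only the ratio matters
  err2 = 1 - (1 - err1) ** (rsz2 / rsz1)
  print(err2)           # 0.99609375, i.e. 255/256

On that (guessed) ratio, something like error-pread-rate=0.996 would
keep the chance of a spurious pass roughly where it was with the
smaller requests, while still varying the failing offset from run to
run.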