thr3ads.net - Libguestfs - [Libguestfs] [nbdkit PATCH] retry: Tweak error message [Mar 2020]

If this information is useful, please help other people find it:
Share via:

Eric Blake

2020-Mar-31 21:00 UTC

[Libguestfs] [nbdkit PATCH] retry: Tweak error message

The retry filter defaults to 5 retries, but when run with verbose
details produces some confusing output:

$ rm -f /tmp/inject; (sleep 5s; touch /tmp/inject)& ./nbdkit -f -v -U - \
  null 1G --filter=retry --filter=noextents --filter=error error-rate=100% \
  error-file=/tmp/inject --filter=delay rdelay=1 \
  --run 'qemu-img convert $nbd out.img'
...
nbdkit: null[1]: debug: noextents: pread count=2097152 offset=14680064
nbdkit: null[1]: debug: error: pread count=2097152 offset=14680064
nbdkit: null[1]: error: injecting EIO error into pread
nbdkit: null[1]: debug: retry 6: original errno = 5
nbdkit: null[1]: debug: could not recover after 5 retries

The code is correct (there were six attempts, but that corresponds to
5 retries after the initial attempt; comments mention that retry=0
turns the filter into a no-op but even that requires an initial
attempt).  So all we need to adjust is the output.

Fixes: f0f0ec49
Signed-off-by: Eric Blake <eblake@redhat.com>
---

I'm pushing this in response to bugzilla.redhat.com/1819240
At the same time, I'm looking at whether qemu-img should be tweaked
to have a mode where it gives up on the first EIO error, rather than
plowing on through the rest of the image even after the first error.

 filters/retry/retry.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/filters/retry/retry.c b/filters/retry/retry.c
index 11e5313b..db91d7ca 100644
--- a/filters/retry/retry.c
+++ b/filters/retry/retry.c
@@ -173,7 +173,7 @@ do_retry (struct retry_handle *h,

  again:
   /* Log the original errno since it will be lost when we retry. */
-  nbdkit_debug ("retry %d: original errno = %d", data->retry+1,
*err);
+  nbdkit_debug ("retry attempt %d: original errno = %d",
data->retry, *err);

   if (data->retry >= retries) {
     nbdkit_debug ("could not recover after %d retries", retries);
-- 
2.26.0.rc2

Richard W.M. Jones

2020-Apr-01 10:12 UTC

head link

[Libguestfs] [nbdkit PATCH] retry: Tweak error message

On Tue, Mar 31, 2020 at 04:00:22PM -0500, Eric Blake
wrote:> The retry filter defaults to 5 retries, but when run with verbose
> details produces some confusing output:
> 
> $ rm -f /tmp/inject; (sleep 5s; touch /tmp/inject)& ./nbdkit -f -v -U -
\
>   null 1G --filter=retry --filter=noextents --filter=error error-rate=100%
\
>   error-file=/tmp/inject --filter=delay rdelay=1 \
>   --run 'qemu-img convert $nbd out.img'
> ...
> nbdkit: null[1]: debug: noextents: pread count=2097152 offset=14680064
> nbdkit: null[1]: debug: error: pread count=2097152 offset=14680064
> nbdkit: null[1]: error: injecting EIO error into pread
> nbdkit: null[1]: debug: retry 6: original errno = 5
> nbdkit: null[1]: debug: could not recover after 5 retries
> 
> The code is correct (there were six attempts, but that corresponds to
> 5 retries after the initial attempt; comments mention that retry=0
> turns the filter into a no-op but even that requires an initial
> attempt).  So all we need to adjust is the output.
> 
> Fixes: f0f0ec49
> Signed-off-by: Eric Blake <eblake at redhat.com>
> ---
> 
> I'm pushing this in response to bugzilla.redhat.com/1819240
> At the same time, I'm looking at whether qemu-img should be tweaked
> to have a mode where it gives up on the first EIO error, rather than
> plowing on through the rest of the image even after the first error.
> 
>  filters/retry/retry.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/filters/retry/retry.c b/filters/retry/retry.c
> index 11e5313b..db91d7ca 100644
> --- a/filters/retry/retry.c
> +++ b/filters/retry/retry.c
> @@ -173,7 +173,7 @@ do_retry (struct retry_handle *h,
> 
>   again:
>    /* Log the original errno since it will be lost when we retry. */
> -  nbdkit_debug ("retry %d: original errno = %d",
data->retry+1, *err);
> +  nbdkit_debug ("retry attempt %d: original errno = %d",
data->retry, *err);
> 
>    if (data->retry >= retries) {
>      nbdkit_debug ("could not recover after %d retries",
retries);
I think adding 1 was correct, so that the retries count from 1.

How about moving the message to make it conditional instead,
like this?

again:
  /* Log the original errno since it will be lost when we retry. */
  if (data->retry < retries)
    nbdkit_debug ("retry %d: original errno = %d", data->retry+1,
*err);
  else {
     nbdkit_debug ("could not recover after %d retries", retries);
  ...

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat people.redhat.com/~rjones
Read my programming and virtualization blog: rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
people.redhat.com/~rjones/virt-top

Eric Blake

2020-Apr-01 13:46 UTC

head link

Re: [Libguestfs] [nbdkit PATCH] retry: Tweak error message

On 4/1/20 5:12 AM, Richard W.M. Jones wrote:> On Tue, Mar 31, 2020 at 04:00:22PM -0500, Eric Blake wrote:
>> The retry filter defaults to 5 retries, but when run with verbose
>> details produces some confusing output:
>>
>> $ rm -f /tmp/inject; (sleep 5s; touch /tmp/inject)& ./nbdkit -f -v
-U - \
>>    null 1G --filter=retry --filter=noextents --filter=error
error-rate=100% \
>>    error-file=/tmp/inject --filter=delay rdelay=1 \
>>    --run 'qemu-img convert $nbd out.img'
>> ...
>> nbdkit: null[1]: debug: noextents: pread count=2097152 offset=14680064
>> nbdkit: null[1]: debug: error: pread count=2097152 offset=14680064
>> nbdkit: null[1]: error: injecting EIO error into pread
>> nbdkit: null[1]: debug: retry 6: original errno = 5
>> nbdkit: null[1]: debug: could not recover after 5 retries
>>
>>    again:
>>     /* Log the original errno since it will be lost when we retry. */
>> -  nbdkit_debug ("retry %d: original errno = %d",
data->retry+1, *err);
>> +  nbdkit_debug ("retry attempt %d: original errno = %d",
data->retry, *err);
>>
>>     if (data->retry >= retries) {
>>       nbdkit_debug ("could not recover after %d retries",
retries);
> 
> I think adding 1 was correct, so that the retries count from 1.
But the problem is the timing of when the messages are printed.  And 
remember, this code is only reached after a failed attempt, not on 
success.  Pre-patch, we have:

initial try succeeds - no message

or (here, with retries limited to 2):

initial try fails:
  retry 1: original error
delay
retry 1 fails:
  retry 2: original error
delay
retry 2 fails:
  retry 3: original error
  could not recover after 2 retries
> 
> How about moving the message to make it conditional instead,
> like this?
> 
> again:
>    /* Log the original errno since it will be lost when we retry. */
>    if (data->retry < retries)
>      nbdkit_debug ("retry %d: original errno = %d",
data->retry+1, *err);
>    else {
>       nbdkit_debug ("could not recover after %d retries",
retries);
>    ...
with your suggestion, we'd have:

initial try fails:
  retry 1: original error
delay
retry 1 fails:
  retry 2: original error
delay
retry 2 fails:
  could not recover after 2 retries

where we lost the log of the errno of retry 2.  Maybe better would be:

initial try fails:
  attempt 0 failed with errno %d, starting delay for retry 1
delay
retry 1 fails:
  attempt 1 failed with errno %d, starting delay for retry 2
delay
retry 2 fails:
  attempt 2 failed with errno %d, could not recover

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Reasonably Related Threads

Search for more seemingly similar threads

Libguestfs - Mar 2020 - [nbdkit PATCH] retry: Tweak error message

[Libguestfs] [nbdkit PATCH] retry: Tweak error message

[Libguestfs] [nbdkit PATCH] retry: Tweak error message

Re: [Libguestfs] [nbdkit PATCH] retry: Tweak error message

Reasonably Related Threads