Richard W.M. Jones
2021-Oct-23 10:10 UTC
[Libguestfs] [PATCH libnbd] lib/poll.c: Retry poll after EINTR
On Sat, Oct 23, 2021 at 01:25:21AM +0300, Nir Soffer wrote:> I see a rare random failure when calling BlockStatus via Go binding: > > block_status: nbd_block_status: poll: Interrupted system call > > I could not reproduce this with "nbdinfo --map", even after modifying it > to call nbd_block_status() for every 128 MiB. > > Fixing this in nbd_unlock_poll() avoids this issue in the entire > library, when we wait for command completion. This seems more useful > that fixing it in all libnbd clients. > > Tested using a go client listing all extents in an image, calling > BlockStatus for every 128m with fedora 34 qcow2 image. Without this fix, > this was always failing. > > $ hyperfine -r1000 --show-output "./client nbd+unix://?socket=/tmp/nbd.sock > /dev/null" > Benchmark 1: ./client nbd+unix://?socket=/tmp/nbd.sock > /dev/null > Time (mean ? ?): 31.6 ms ? 3.1 ms [User: 8.8 ms, System: 7.2 ms] > Range (min ? max): 26.1 ms ? 52.3 ms 1000 runs > > Signed-off-by: Nir Soffer <nsoffer at redhat.com> > --- > lib/poll.c | 7 +++++-- > 1 file changed, 5 insertions(+), 2 deletions(-) > > diff --git a/lib/poll.c b/lib/poll.c > index edfcc59..df01d94 100644 > --- a/lib/poll.c > +++ b/lib/poll.c > @@ -57,8 +57,11 @@ nbd_unlocked_poll (struct nbd_handle *h, int timeout) > * would allow other threads to close file descriptors which we have > * passed to poll. > */ > - r = poll (fds, 1, timeout); > - debug (h, "poll end: r=%d revents=%x", r, fds[0].revents); > + do { > + r = poll (fds, 1, timeout); > + debug (h, "poll end: r=%d revents=%x", r, fds[0].revents); > + } while (r == -1 && errno == EINTR); > + > if (r == -1) { > set_error (errno, "poll"); > return -1;I think this is a perfectly reasonable solution, and better than modifying the language bindings. I guess Eric should have the final word here, but from my point of view ACK. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-builder quickly builds VMs from scratch http://libguestfs.org/virt-builder.1.html
Eric Blake
2021-Oct-28 12:40 UTC
[Libguestfs] [PATCH libnbd] lib/poll.c: Retry poll after EINTR
Sorry for a late reply... On Sat, Oct 23, 2021 at 11:10:08AM +0100, Richard W.M. Jones wrote:> > Fixing this in nbd_unlock_poll() avoids this issue in the entire > > library, when we wait for command completion. This seems more useful > > that fixing it in all libnbd clients. > > > > +++ b/lib/poll.c > > @@ -57,8 +57,11 @@ nbd_unlocked_poll (struct nbd_handle *h, int timeout) > > * would allow other threads to close file descriptors which we have > > * passed to poll. > > */ > > - r = poll (fds, 1, timeout); > > - debug (h, "poll end: r=%d revents=%x", r, fds[0].revents); > > + do { > > + r = poll (fds, 1, timeout); > > + debug (h, "poll end: r=%d revents=%x", r, fds[0].revents); > > + } while (r == -1 && errno == EINTR); > > + > > if (r == -1) { > > set_error (errno, "poll"); > > return -1; > > I think this is a perfectly reasonable solution, and better than > modifying the language bindings. I guess Eric should have the final > word here, but from my point of view ACK.As a library, libnbd has no idea if the overall program has installed signal handlers that may trigger EINTR, so being robust by retrying seems the sanest approach. ACK from me as well. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Eric Blake
2021-Oct-28 12:45 UTC
[Libguestfs] [PATCH libnbd] lib/poll.c: Retry poll after EINTR
On Sat, Oct 23, 2021 at 11:10:08AM +0100, Richard W.M. Jones wrote: Sorry for a delayed reply,> > > > Fixing this in nbd_unlock_poll() avoids this issue in the entire > > library, when we wait for command completion. This seems more useful > > that fixing it in all libnbd clients. > >> > - r = poll (fds, 1, timeout); > > - debug (h, "poll end: r=%d revents=%x", r, fds[0].revents); > > + do { > > + r = poll (fds, 1, timeout); > > + debug (h, "poll end: r=%d revents=%x", r, fds[0].revents); > > + } while (r == -1 && errno == EINTR); > > + > > if (r == -1) { > > set_error (errno, "poll"); > > return -1; > > I think this is a perfectly reasonable solution, and better than > modifying the language bindings. I guess Eric should have the final > word here, but from my point of view ACK.Whether a syscall fails with EINTR depends on which signal handlers the application has installed. But as the library can't control which signal handlers may or may not be modified by the rest of the application, having the library assume that EINTR is always possible is indeed more robust, and nicer than forcing every user of the library to deal with the fallout for the library choking on EINTR. ACK from me as well. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org