Tage Johansson
2023-Sep-04 16:07 UTC
[Libguestfs] [libnbd PATCH] rust: Use mio::poll instead of requiring epoll
From: Eric Blake <eblake at redhat.com>
CI shows our async handle fails to build on FreeBSD and MacOS (where
epoll() is not available as a syscall, and therefore not available as
a Rust crate). We can instead accomplish the same level-probing
effects by doing a zero-timeout poll with mio (which defers under the
hood to epoll on Linux, and kqueue on BSD).
Fixes: 223a9965 ("rust: async: Create an async friendly handle type")
CC: Tage Johansson <tage.j.lists at posteo.net>
Signed-off-by: Eric Blake <eblake at redhat.com>
---
rust/Cargo.toml | 3 ++-
rust/src/async_handle.rs | 40 +++++++++++++++++++++++-----------------
2 files changed, 25 insertions(+), 18 deletions(-)
diff --git a/rust/Cargo.toml b/rust/Cargo.toml
index 0879b34..678848a 100644
--- a/rust/Cargo.toml
+++ b/rust/Cargo.toml
@@ -49,9 +49,10 @@ thiserror = "1.0.40"
log = { version = "0.4.19", optional = true }
libc = "0.2.147"
tokio = { optional = true, version = "1.29.1", default-features = false, features = ["rt", "sync", "net"] }
-epoll = "4.3.3"
+mio = { optional = true, version = "0.8.0" }
[features]
+tokio = ["dep:tokio", "dep:mio"]
default = ["log", "tokio"]
[dev-dependencies]
diff --git a/rust/src/async_handle.rs b/rust/src/async_handle.rs
index 6793ce9..8f3e8df 100644
--- a/rust/src/async_handle.rs
+++ b/rust/src/async_handle.rs
@@ -35,9 +35,11 @@ use crate::sys;
use crate::Handle;
use crate::{Error, FatalErrorKind, Result};
use crate::{AIO_DIRECTION_BOTH, AIO_DIRECTION_READ, AIO_DIRECTION_WRITE};
-use epoll::Events;
+use mio::unix::SourceFd;
+use mio::{Events, Interest as MioInterest, Poll, Token};
use std::sync::Arc;
use std::sync::Mutex;
+use std::time::Duration;
use tokio::io::{unix::AsyncFd, Interest, Ready as IoReady};
use tokio::sync::Notify;
use tokio::task;
@@ -176,13 +178,8 @@ async fn polling_task(handle_data: &HandleData) -> Result<(), FatalErrorKind> {
} = handle_data;
let fd = handle.aio_get_fd().map_err(Error::to_fatal)?;
let tokio_fd = AsyncFd::new(fd)?;
- let epfd = epoll::create(false)?;
- epoll::ctl(
- epfd,
- epoll::ControlOptions::EPOLL_CTL_ADD,
- fd,
- epoll::Event::new(Events::EPOLLIN | Events::EPOLLOUT, 42),
- )?;
+ let mut events = Events::with_capacity(1);
+ let mut poll = Poll::new()?;
// The following loop does approximately the following things:
//
@@ -248,19 +245,28 @@ async fn polling_task(handle_data: &HandleData) -> Result<(), FatalErrorKind> {
}
drop(pending_cmds_lock);
- // Use epoll to check the current read/write availability on the fd.
+ // Use mio poll to check the current read/write availability on the fd.
// This is needed because Tokio supports only edge-triggered
// notifications but Libnbd requires level-triggered notifications.
- let mut revent = epoll::Event { data: 0, events: 0 };
// Setting timeout to 0 means that it will return immediately.
- epoll::wait(epfd, 0, std::slice::from_mut(&mut revent))?;
- let revents = Events::from_bits(revent.events).unwrap();
- if !revents.contains(Events::EPOLLIN) {
- ready_guard.clear_ready_matching(IoReady::READABLE);
- }
- if !revents.contains(Events::EPOLLOUT) {
- ready_guard.clear_ready_matching(IoReady::WRITABLE);
+ // mio states that it is OS-dependent on whether a single event
+ // can be both readable and writable, but we survive just fine
+ // if we only see one direction even when both are available.
+ poll.registry().register(
+ &mut SourceFd(&fd),
+ Token(0),
+ MioInterest::READABLE | MioInterest::WRITABLE,
+ )?;
+ poll.poll(&mut events, Some(Duration::ZERO))?;
+ for event in &events {
+ if !event.is_readable() {
+ ready_guard.clear_ready_matching(IoReady::READABLE);
+ }
+ if !event.is_writable() {
+ ready_guard.clear_ready_matching(IoReady::WRITABLE);
+ }
}
ready_guard.retain_ready();
+ poll.registry().deregister(&mut SourceFd(&fd))?;
}
}
base-commit: 9afb980d05d6144129c899285e44779757a380e8
prerequisite-patch-id: 8d1779610795021ed5a3d0973ddf9ef854cd6a24
prerequisite-patch-id: 1f0f333311f11ac9b1ff0be08f5aa0a9904ba4de
prerequisite-patch-id: dc4af7343c57a4f99dc82918c07470030e542747
prerequisite-patch-id: 8e8f7a043c80d6c24e883967f5bd952a64db1228
prerequisite-patch-id: ba7b3482e2e16f76b5f285daeeda30b31a841912
prerequisite-patch-id: 219a9595e550a8caf43d466dcb2b044114e1b7bf
prerequisite-patch-id: 3de46c9673221bff1d897970aa983b3f8e6cab74
prerequisite-patch-id: 4235d5e174fce05b9a947b3b838bebf968f0fa6a
prerequisite-patch-id: 07773355d5718e0593c4030a8f035fc11fea3715
prerequisite-patch-id: f023deea8b706706e3c2980403064d90a254af3c
prerequisite-patch-id: dd415a31fa127d90bcc7d993a2659e39d7d4cae8
prerequisite-patch-id: 618cc5f574207f36818ef803f2e586314c2458f5
prerequisite-patch-id: 8639c6cc4fec58f4761771c5d8a9476d538c6251
prerequisite-patch-id: 9bc660aed54a6266b014993ff0f388a26ac2982a
prerequisite-patch-id: a334eeb471214d13ea2f47604b0a5b6d86b869b6
--
2.42.0
Tage Johansson
2023-Sep-04 17:45 UTC
[Libguestfs] [libnbd PATCH 2/2] rust: Use mio::poll instead of requiring epoll
On 9/1/2023 3:41 PM, Eric Blake wrote:
> On Fri, Sep 01, 2023 at 08:35:58AM +0000, Tage Johansson wrote:
>> On 8/30/2023 10:11 PM, Eric Blake wrote:
>>> CI shows our async handle fails to build on FreeBSD and MacOS (where
>>> epoll() is not available as a syscall, and therefore not available as
>>> a Rust crate). We can instead accomplish the same level-probing
>>> effects by doing a zero-timeout poll with mio (which defers under the
>>> hood to epoll on Linux, and kqueue on BSD).
>>
>> The problem with mio is that it uses edge triggered notifications.
>> According to this page <https://docs.rs/mio/latest/mio/struct.Poll.html>:
>>
>>> Draining readiness
>>> Once a readiness event is received, the corresponding operation must be
>>> performed repeatedly until it returns variant
>>> std::io::error::ErrorKind::WouldBlock. Unless this is done, there is no
>>> guarantee that another readiness event will be delivered, even if
>>> further data is received for the event source.
>
> Would that still be true even if we deregister the interest after our
> one-shot poll, and reregister it immediately beforehand? In other
> words, even if registration sets up edge triggers for future
> notifications, the fact that we are only doing a one-shot query seems
> like it would be sufficient to use our one-shot as a level probe
> (assuming that the registration starts by learning the current state
> of the fd before waiting for any future edges).
>
> I agree that once we get a notification after a .poll(), the normal
> assumption is that we must loop and consume all data before calling
> .poll again (because the .poll would block until the next edge - but
> if we didn't consume to the blocking point, there is no next edge).
>
> But since we are NOT looping, nor doing any consumption work, our
> timeout of 0 is trying to behave as an instant level check, and if
> adding a reregister/deregister wrapper around the .poll makes it more
> reliable (by wiping out all prior events and re-priming the
> registration to start with the current level), that may be all the
> more we need.

It should work. I tried that, and it works on my machine at least. Although it may be a bit ugly, it could do as a short-term solution. I will send your patch with these modifications soon as a reply to this message.

>> The motivation behind using epoll in addition to tokio in the first
>> place was to check whether the fd is readable/writable without
>> necessarily knowing if the last read/write operation returned
>> [ErrorKind::WouldBlock]. And it doesn't seem that mio fulfills that
>> requirement.
>
> Do we need to expand the libnbd API itself to make it more obvious
> whether our state machine completed because it hit a EWOULDBLOCK
> scenario? In general, once you kick the state machine (by sending a
> new command, or calling one of the two aio_notify calls), the state
> machine tries to then process as much data as possible until it blocks
> again, rather than stopping after just processing a single
> transaction. That is, if libnbd isn't already designed to work with
> edge triggers, how much harder would it be to modify it (perhaps by
> adding a new API) so that it DOES work nicely with edge triggers?

This would be the cleanest and best solution, I think: adding two methods to the handle, aio_read_blocked() and aio_write_blocked(), which return true iff the last read/write operation blocked. They could be backed by two bool fields on the handle struct which are false by default and are reset to false whenever a read/write operation completes without blocking. It is very important that aio_read_blocked() returns true only if the last read operation blocked (and the same for aio_write_blocked()).

> If we can solve _that_ problem, then the Rust async handle wouldn't
> need to use epoll, mio, or anything else; tokio's edge trigger
> behavior would already be sufficient on its own.

Yes, this is true. It would be more efficient, easier to understand, and require fewer dependencies.

--
Best regards,
Tage