Tage Johansson
2023-Sep-04 16:07 UTC
[Libguestfs] [libnbd PATCH] rust: Use mio::poll instead of requiring epoll
From: Eric Blake <eblake at redhat.com> CI shows our async handle fails to build on FreeBSD and MacOS (where epoll() is not available as a syscall, and therefore not available as a Rust crate). We can instead accomplish the same level-probing effects by doing a zero-timeout poll with mio (which defers under the hood to epoll on Linux, and kqueue on BSD). Fixes: 223a9965 ("rust: async: Create an async friendly handle type") CC: Tage Johansson <tage.j.lists at posteo.net> Signed-off-by: Eric Blake <eblake at redhat.com> --- rust/Cargo.toml | 3 ++- rust/src/async_handle.rs | 40 +++++++++++++++++++++++----------------- 2 files changed, 25 insertions(+), 18 deletions(-) diff --git a/rust/Cargo.toml b/rust/Cargo.toml index 0879b34..678848a 100644 --- a/rust/Cargo.toml +++ b/rust/Cargo.toml @@ -49,9 +49,10 @@ thiserror = "1.0.40" log = { version = "0.4.19", optional = true } libc = "0.2.147" tokio = { optional = true, version = "1.29.1", default-features = false, features = ["rt", "sync", "net"] } -epoll = "4.3.3" +mio = { optional = true, version = "0.8.0" } [features] +tokio = ["dep:tokio", "dep:mio"] default = ["log", "tokio"] [dev-dependencies] diff --git a/rust/src/async_handle.rs b/rust/src/async_handle.rs index 6793ce9..8f3e8df 100644 --- a/rust/src/async_handle.rs +++ b/rust/src/async_handle.rs @@ -35,9 +35,11 @@ use crate::sys; use crate::Handle; use crate::{Error, FatalErrorKind, Result}; use crate::{AIO_DIRECTION_BOTH, AIO_DIRECTION_READ, AIO_DIRECTION_WRITE}; -use epoll::Events; +use mio::unix::SourceFd; +use mio::{Events, Interest as MioInterest, Poll, Token}; use std::sync::Arc; use std::sync::Mutex; +use std::time::Duration; use tokio::io::{unix::AsyncFd, Interest, Ready as IoReady}; use tokio::sync::Notify; use tokio::task; @@ -176,13 +178,8 @@ async fn polling_task(handle_data: &HandleData) -> Result<(), FatalErrorKind> { } = handle_data; let fd = handle.aio_get_fd().map_err(Error::to_fatal)?; let tokio_fd = AsyncFd::new(fd)?; - let epfd = epoll::create(false)?; - epoll::ctl( - epfd, - epoll::ControlOptions::EPOLL_CTL_ADD, - fd, - epoll::Event::new(Events::EPOLLIN | Events::EPOLLOUT, 42), - )?; + let mut events = Events::with_capacity(1); + let mut poll = Poll::new()?; // The following loop does approximately the following things: // @@ -248,19 +245,28 @@ async fn polling_task(handle_data: &HandleData) -> Result<(), FatalErrorKind> { } drop(pending_cmds_lock); - // Use epoll to check the current read/write availability on the fd. + // Use mio poll to check the current read/write availability on the fd. // This is needed because Tokio supports only edge-triggered // notifications but Libnbd requires level-triggered notifications. - let mut revent = epoll::Event { data: 0, events: 0 }; // Setting timeout to 0 means that it will return immediately. - epoll::wait(epfd, 0, std::slice::from_mut(&mut revent))?; - let revents = Events::from_bits(revent.events).unwrap(); - if !revents.contains(Events::EPOLLIN) { - ready_guard.clear_ready_matching(IoReady::READABLE); - } - if !revents.contains(Events::EPOLLOUT) { - ready_guard.clear_ready_matching(IoReady::WRITABLE); + // mio states that it is OS-dependent on whether a single event + // can be both readable and writable, but we survive just fine + // if we only see one direction even when both are available. + poll.registry().register( + &mut SourceFd(&fd), + Token(0), + MioInterest::READABLE | MioInterest::WRITABLE, + )?; + poll.poll(&mut events, Some(Duration::ZERO))?; + for event in &events { + if !event.is_readable() { + ready_guard.clear_ready_matching(IoReady::READABLE); + } + if !event.is_writable() { + ready_guard.clear_ready_matching(IoReady::WRITABLE); + } } ready_guard.retain_ready(); + poll.registry().deregister(&mut SourceFd(&fd))?; } } base-commit: 9afb980d05d6144129c899285e44779757a380e8 prerequisite-patch-id: 8d1779610795021ed5a3d0973ddf9ef854cd6a24 prerequisite-patch-id: 1f0f333311f11ac9b1ff0be08f5aa0a9904ba4de prerequisite-patch-id: dc4af7343c57a4f99dc82918c07470030e542747 prerequisite-patch-id: 8e8f7a043c80d6c24e883967f5bd952a64db1228 prerequisite-patch-id: ba7b3482e2e16f76b5f285daeeda30b31a841912 prerequisite-patch-id: 219a9595e550a8caf43d466dcb2b044114e1b7bf prerequisite-patch-id: 3de46c9673221bff1d897970aa983b3f8e6cab74 prerequisite-patch-id: 4235d5e174fce05b9a947b3b838bebf968f0fa6a prerequisite-patch-id: 07773355d5718e0593c4030a8f035fc11fea3715 prerequisite-patch-id: f023deea8b706706e3c2980403064d90a254af3c prerequisite-patch-id: dd415a31fa127d90bcc7d993a2659e39d7d4cae8 prerequisite-patch-id: 618cc5f574207f36818ef803f2e586314c2458f5 prerequisite-patch-id: 8639c6cc4fec58f4761771c5d8a9476d538c6251 prerequisite-patch-id: 9bc660aed54a6266b014993ff0f388a26ac2982a prerequisite-patch-id: a334eeb471214d13ea2f47604b0a5b6d86b869b6 -- 2.42.0
Tage Johansson
2023-Sep-04 17:45 UTC
[Libguestfs] [libnbd PATCH 2/2] rust: Use mio::poll instead of requiring epoll
On 9/1/2023 3:41 PM, Eric Blake wrote:> On Fri, Sep 01, 2023 at 08:35:58AM +0000, Tage Johansson wrote: >> On 8/30/2023 10:11 PM, Eric Blake wrote: >>> CI shows our async handle fails to build on FreeBSD and MacOS (where >>> epoll() is not available as a syscall, and therefore not available as >>> a Rust crate). We can instead accomplish the same level-probing >>> effects by doing a zero-timeout poll with mio (which defers under the >>> hood to epoll on Linux, and kqueue on BSD). >> >> The problem with mio is that it uses edge triggered notifications. According >> to this page <https://docs.rs/mio/latest/mio/struct.Poll.html>: >> >>> Draining readiness >>> Once a readiness event is received, the corresponding operation must be >>> performed repeatedly until it returns variant >>> std::io::error::ErrorKind::WouldBlock. Unless this is done, there is no >>> guarantee that another readiness event will be delivered, even if >>> further data is received for the event source. > Would that still be true even if we deregister the interest after our > one-shot poll, and reregister it immediately before-hand? In other > words, even if registration sets up edge triggers for future > notifications, the fact that we are only doing a one-shot query seems > like it would be sufficient to use our one-shot as a level probe > (assuming that the registration starts by learning the current state > of the fd before waiting for any future edges). > > I agree that once we get a notification after a .poll(), the normal > assumption is that we must loop and consume all data before calling > .poll again (because the .poll would block until the next edge - but > if we didn't consume to the blocking point, there is no next edge). > But since we are NOT looping, nor doing any consumption work, our > timeout of 0 is trying to behave as an instant level check, and if > adding a reregister/deregister wrapper around the .poll makes it more > reliable (by wiping out all prior events and re-priming the > registration to start with the current level), that may be all the > more we need.It should work. I tried to do that and it works on my machine at least. Although it might be a bit uggly, it may be a short term solution. I will send your patch with these modifications soon as reply to this message.>> The motivation behind using epoll in addition to tokio in the first place >> was to check if fd is readable/writable without necessarily knowing if the >> last read/write operation returned [ErrorKind::WouldBlock]. And it doesn't >> seems that mio full fills that requirenment. > Do we need to expand the libnbd API itself to make it more obvious > whether our state machine completed because it hit a EWOULDBLOCK > scenario? In general, once you kick the state machine (by sending a > new command, or calling one of the two aio_notify calls), the state > machine tries to then process as much data as possible until it blocks > again, rather than stopping after just processing a single > transaction. That is, if libnbd isn't already designed to work with > edge triggers, how much harder would it be to modify it (perhaps by > adding a new API) so that it DOES work nicely with edge triggers?This would be the cleanest and best solution I think. Adding two methods to the handle: aio_read_blocked() and aio_write_blocked() which return true iff the last read/write operation blocked. It could be two bool fields on the handle struct which are set to false by default and reset to false whenever a read/write operation is performed without blocking. It is very important that aio_read_blocked() will return true only if the last read operation blocked, (and the same thing for aio_write_blocked()).> If we can solve _that_ problem, then the Rust async handle wouldn't > need to use epoll, mio, or anything else; tokio's edge trigger > behavior would already be sufficient on its own.Yes, this is true. It would be more efficient, easier to understand, and require less dependencies. -- Best regards, Tage