Danilo Krummrich
2025-Mar-21 12:12 UTC
[RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation
On Fri, Mar 21, 2025 at 09:04:16AM -0300, Jason Gunthorpe wrote:> On Fri, Mar 21, 2025 at 11:35:40AM +0100, Simona Vetter wrote: > > On Wed, Mar 19, 2025 at 02:21:32PM -0300, Jason Gunthorpe wrote: > > > On Thu, Mar 13, 2025 at 03:32:14PM +0100, Simona Vetter wrote: > > > > > > > So I think you can still achieve that building on top of revocable and a > > > > few more abstractions that are internally unsafe. Or are you thinking of > > > > different runtime checks? > > > > > > I'm thinking on the access side of the revocable you don't have a > > > failure path. Instead you get the access or runtime violation if the > > > driver is buggy. This eliminates all the objectionable failure paths > > > and costs on the performance paths of the driver. > > > > > > And perhaps also on the remove path you have runtime checking if > > > "driver lifetime bound" objects have all been cleaned up. > > > > > > The point is to try to behave more like the standard fence pattern and > > > get some level of checking that can make r4l comfortable without > > > inventing new kernel lifecycle models. > > > > > > > Yeah maybe we're not that far really. But I'm still not clear how to do > > > > an entirely revoke-less world. > > > > > > Not entirely, you end up revoking big things. Like RDMA revokes the > > > driver ops callbacks using SRCU. It doesn't revoke individual > > > resources or DMA maps. > > > > > > I have the same feeling about this micro-revoke direction, I don't > > > know how to implement this. The DMA API is very challenging, > > > especially the performance use of DMA API. > > > > Ah I think we're in agreement, I think once we get to big subsystems we > > really want subsystem-level revokes like you describe here. And rust > > already has this concept of a "having one thing guarantess you access to > > another". For example an overall lock to a big datastructure gives you > > access to all the invidiual nodes, see LockedBy. So I think we're covered > > here. > > Make some sense if Rust can do that. > > > For me the basic Revocable really is more for all the odd-ball > > random pieces that aren't covered by subsystem constructs already. And > > maybe drm needs to rethink a bunch of things in this area in general, not > > just for rust. So maybe we should extend the rustdoc to explain that bare > > Revocable isn't how entire subsystems rust abstractions should be built? > > Then why provide it? Like why provide revoke for DMA API or MMIO as > mandatory part of the core kernel rust bindings if it isn't supposed > to be used and instead rely on this LockedBy sort of thing?Not all device resources are managed in the context of the subsystem, so subsystem-level revokes do not apply. For the DMA coherent allocations, please see my comment in [1]. Revoking the device resources associated with a DMA coherent allocation should hence never cause any overhead for accessing DMA memory. [1] https://github.com/Rust-for-Linux/linux/blob/rust-next/rust/kernel/dma.rs#L120
Jason Gunthorpe
2025-Mar-21 17:49 UTC
[RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation
On Fri, Mar 21, 2025 at 01:12:30PM +0100, Danilo Krummrich wrote:> Not all device resources are managed in the context of the subsystem, so > subsystem-level revokes do not apply.They could, you could say that these rust APIs are only safe to use for device drivers with C code providing a fence semantic, eg through a subsystem.> For the DMA coherent allocations, please see my comment in [1]. Revoking the > device resources associated with a DMA coherent allocation should hence never > cause any overhead for accessing DMA memory.> [1] https://github.com/Rust-for-Linux/linux/blob/rust-next/rust/kernel/dma.rs#L120I don't know what to make of this. You argued so much to support revocable for rust ideological reasons and in the end the proposal is to just completely gives up on all of that? Not even an optional runtime check? :( And I'm not sure about the comment written:> // However, it is neither desirable nor necessary to protect the allocated memory of the DMA > // allocation from surviving device unbind;There are alot of things on this path that depend on the struct device, there are two kinds of per-device coherent memory allocators and swiotlb as well. It looks like there are cases where the actual memory must not outlive the driver binding. So, I'd argue that it is necessary, and changing that on the C side looks like a big project. Jason