thr3ads.net - Nouveau - [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation [Mar 2025]

If this information is useful, please help other people find it:
Share via:

Danilo Krummrich

2025-Mar-21 12:12 UTC

[RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

On Fri, Mar 21, 2025 at 09:04:16AM -0300, Jason Gunthorpe
wrote:> On Fri, Mar 21, 2025 at 11:35:40AM +0100, Simona Vetter wrote:
> > On Wed, Mar 19, 2025 at 02:21:32PM -0300, Jason Gunthorpe wrote:
> > > On Thu, Mar 13, 2025 at 03:32:14PM +0100, Simona Vetter wrote:
> > > 
> > > > So I think you can still achieve that building on top of
revocable and a
> > > > few more abstractions that are internally unsafe. Or are you
thinking of
> > > > different runtime checks?
> > > 
> > > I'm thinking on the access side of the revocable you
don't have a
> > > failure path. Instead you get the access or runtime violation if
the
> > > driver is buggy. This eliminates all the objectionable failure
paths
> > > and costs on the performance paths of the driver.
> > > 
> > > And perhaps also on the remove path you have runtime checking if
> > > "driver lifetime bound" objects have all been cleaned
up.
> > > 
> > > The point is to try to behave more like the standard fence
pattern and
> > > get some level of checking that can make r4l comfortable without
> > > inventing new kernel lifecycle models.
> > > 
> > > > Yeah maybe we're not that far really. But I'm still
not clear how to do
> > > > an entirely revoke-less world.
> > > 
> > > Not entirely, you end up revoking big things. Like RDMA revokes
the
> > > driver ops callbacks using SRCU. It doesn't revoke individual
> > > resources or DMA maps.
> > > 
> > > I have the same feeling about this micro-revoke direction, I
don't
> > > know how to implement this. The DMA API is very challenging,
> > > especially the performance use of DMA API.
> > 
> > Ah I think we're in agreement, I think once we get to big
subsystems we
> > really want subsystem-level revokes like you describe here. And rust
> > already has this concept of a "having one thing guarantess you
access to
> > another". For example an overall lock to a big datastructure
gives you
> > access to all the invidiual nodes, see LockedBy. So I think we're
covered
> > here.
> 
> Make some sense if Rust can do that.
> 
> > For me the basic Revocable really is more for all the odd-ball
> > random pieces that aren't covered by subsystem constructs already.
And
> > maybe drm needs to rethink a bunch of things in this area in general,
not
> > just for rust. So maybe we should extend the rustdoc to explain that
bare
> > Revocable isn't how entire subsystems rust abstractions should be
built?
> 
> Then why provide it? Like why provide revoke for DMA API or MMIO as
> mandatory part of the core kernel rust bindings if it isn't supposed
> to be used and instead rely on this LockedBy sort of thing?
Not all device resources are managed in the context of the subsystem, so
subsystem-level revokes do not apply.

For the DMA coherent allocations, please see my comment in [1]. Revoking the
device resources associated with a DMA coherent allocation should hence never
cause any overhead for accessing DMA memory.

[1]
https://github.com/Rust-for-Linux/linux/blob/rust-next/rust/kernel/dma.rs#L120

Jason Gunthorpe

2025-Mar-21 17:49 UTC

head link

[RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

On Fri, Mar 21, 2025 at 01:12:30PM +0100, Danilo Krummrich wrote:
> Not all device resources are managed in the context of the subsystem, so
> subsystem-level revokes do not apply.
They could, you could say that these rust APIs are only safe to use
for device drivers with C code providing a fence semantic, eg through
a subsystem.
> For the DMA coherent allocations, please see my comment in [1]. Revoking
the
> device resources associated with a DMA coherent allocation should hence
never
> cause any overhead for accessing DMA memory.
> [1]
https://github.com/Rust-for-Linux/linux/blob/rust-next/rust/kernel/dma.rs#L120
I don't know what to make of this. You argued so much to support
revocable for rust ideological reasons and in the end the proposal is
to just completely gives up on all of that?

Not even an optional runtime check? :(

And I'm not sure about the comment written:
> // However, it is neither desirable nor necessary to protect the allocated
memory of the DMA
> // allocation from surviving device unbind;
There are alot of things on this path that depend on the struct
device, there are two kinds of per-device coherent memory allocators
and swiotlb as well.

It looks like there are cases where the actual memory must not outlive
the driver binding.

So, I'd argue that it is necessary, and changing that on the C side
looks like a big project.

Jason

Nouveau - Mar 2025 - [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

[RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

[RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation