thr3ads.net - Nouveau - [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation [Feb 2025]

If this information is useful, please help other people find it:
Share via:

Danilo Krummrich

2025-Feb-26 01:16 UTC

[RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

On Tue, Feb 25, 2025 at 08:49:16PM -0400, Jason Gunthorpe
wrote:> I'm pointing out the fundamental different in approachs. The typical
> widely used pattern results in __device_release_driver() completing
> with no concurrent driver code running.
Typically yes, but there are exceptions, such as DRM.
> 
> DRM achieves this, in part, by using drm_dev_unplug().
No, DRM can have concurrent driver code running, which is why drm_dev_enter()
returns whether the device is unplugged already, such that subsequent
operations, (e.g. I/O) can be omitted.
> 
> The Rust approach ends up with __device_release_driver() completing
> and leaving driver code still running in other threads.
No, this has nothing to do with Rust device / driver or I/O abstractions.

It entirely depends on the driver you implement. If you register a DRM device,
then yes, there may be concurrent driver code running after
__device_release_driver() completes. But this is specific to the DRM
implementation, *not* to Rust.

Again, the reason a pci::Bar needs to be revocable in Rust is that we can't
have
the driver potentially keep the pci::Bar alive (or even access it) after the
device is unbound.

A driver can also be unbound without the module being removed, and if the driver
would be able to keep the pci::Bar alive, it would mean that the resource region
is not freed and the MMIO mapping is not unmapped, because the resource region
and the MMIO mapping is bound to the lifetime of the pci::Bar object. This would
not be acceptable for a Rust driver.

That this also comes in handy for subsystems like DRM, where we could have
attempts to access to the pci::Bar object after the device is unbound by design,
can be seen as a nice side effect.

Jason Gunthorpe

2025-Feb-26 17:21 UTC

head link

[RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

On Wed, Feb 26, 2025 at 02:16:58AM +0100, Danilo Krummrich
wrote:> > DRM achieves this, in part, by using drm_dev_unplug().
> 
> No, DRM can have concurrent driver code running, which is why
drm_dev_enter()
> returns whether the device is unplugged already, such that subsequent
> operations, (e.g. I/O) can be omitted.
Ah, I did notice that the driver was the one providing the
file_operations struct so of course the core code can't protect the
driver ops. Yuk :\
> Again, the reason a pci::Bar needs to be revocable in Rust is that we
can't have
> the driver potentially keep the pci::Bar alive (or even access it) after
the
> device is unbound.
My impression is that nobody has yet come up with a Rust way to
implement the normal kernel design pattern of revoke threads then free
objects in safe rust.

Yes, this is a peculiar lifetime model, but it is pretty important in
the kernel. I'm not convinced you can just fully ignore it in Rust as
a design pattern. We use it pretty much everywhere a function pointer
is involved.

For instance, I'm looking at workqueue.rs and wondering why is it safe
against Execute After Free races. I see none of the C functions I
would expect to be used to prevent those races in the code.

Even the simple example:

//! fn print_later(val: Arc<MyStruct>) {
//!     let _ = workqueue::system().enqueue(val);
//! }

Seems to be missing the EAF prevention ie WorkItem::run() is in .text
of THIS_MODULE and I see nothing is preventing THIS_MODULE from being
unloaded.

The expectation of work queues is to follow the above revoke threads
then free pattern. A module should do that sequence in the driver
remove() or module __exit function.

Jason

Nouveau - Feb 2025 - [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

[RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

[RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation