thr3ads.net - Nouveau - [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation [Mar 2025]

Simona Vetter

2025-Mar-05 07:30 UTC

[RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

On Tue, Mar 04, 2025 at 12:42:01PM -0400, Jason Gunthorpe wrote:
> On Tue, Mar 04, 2025 at 05:10:45PM +0100, Simona Vetter wrote:
> > On Fri, Feb 28, 2025 at 02:40:13PM -0400, Jason Gunthorpe wrote:
> > > On Fri, Feb 28, 2025 at 11:52:57AM +0100, Simona Vetter wrote:
> > > 
> > > > - Nuke the driver binding manually through sysfs with the unbind files.
> > > > - Nuke all userspace that might be holding files and other resources open.
> > > > - At this point the module refcount should be zero and you can unload it.
> > > > 
> > > > Except developers really don't like the manual unbind step, and so we're
> > > > missing try_module_get() in a bunch of places where it really should be.
> > > 
> > > IMHO they are not missing, we just have a general rule that if a
> > > cleanup function, required to be called prior to module exit, revokes
> > > any .text pointers then you don't need to hold the module refcount.
> > > 
> > > file_operations doesn't have such a cleanup function which is why it
> > > takes the refcount.
> > > 
> > > hrtimer does have such a function which is why it doesn't take the
> > > refcount.
> > 
> > I was talking about a bunch of other places, where it works like
> > file_operations, except we don't bother with the module reference count.
> > I've seen patches fly by where people "fix" these things because module
> > unload is "broken".
> 
> Sure, but there are only two correct API approaches, either you
> require the user to make a cancel call that sanitizes the module
> references, or you manage them internally.
> 
> Hope and pray isn't an option :)
> 
> > gpu drivers can hog console_lock (yes we're trying to get away from that
> > as much as possible), at that point a cavalier attitude of "you can just
> > wait" isn't very appreciated.
> 
> What are you trying to solve here? If the system is already stuck
> infinitely on the console lock why is module remove even being
> considered?
> 
> module remove shouldn't be a remedy for a crashed driver...

I mean hotunplug here, and trying to make that correct.

This confusion is why this is so hard, because there are really two main
users for all this:

- developers who want to quickly test new driver versions without full
  reboot. They're often preferring convenience over correctness, like with
  the removal of module refcounting that's strictly needed but means they
  first have to unbind drivers in sysfs before they can unload the driver.

  Another one is that this use-case prefers that the hw is cleanly shut
  down, so that you can actually load the new driver from a well-known
  state. And it's entirely ok if this all fails occasionally, it's just
  for development and testing.

- hotunplug as an actual use-case. Bugs are not ok. The hw can go away at
  any moment. And it might happen while you're holding console_lock. You
  generally do not remove the actual module here, which is why for the
  actual production use-case getting that part right isn't really
  required. But getting the lifetimes of all the various
  structs/objects/resources perfectly right is required.

So the "stuck on console_lock" is the 2nd case, not the first. Module
unload doesn't even come into play on that one.

> > > But so is half removing the driver while it is doing *anything* and
> > > trying to mitigate that with a different kind of hard to do locking
> > > fix. *shrug*
> > 
> > The thing is that rust helps you enormously with implementing revocable
> > resources and making sure you're not cheating with all the bail-out paths.
> 
> Assuming a half alive driver with MMIO and interrupts ripped away
> doesn't lock up.

Rust's drop takes care of that for you. It's not guaranteed, but it's a
case of "the minimal amount of typing yields correct code", unlike C,
where that just blows up for sure.
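
As a rough illustration of the drop-plus-revocable idea, here is a minimal user-space sketch in plain Rust. It is not kernel code and not taken from the patch series: `FakeMmio`, `Revocable`, `try_access` and `revoke` are names assumed for illustration, and an `RwLock` stands in for the RCU read side discussed in this thread.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for a device resource such as an MMIO mapping.
pub struct FakeMmio {
    released: Arc<AtomicBool>,
}

impl FakeMmio {
    pub fn new(released: Arc<AtomicBool>) -> Self {
        Self { released }
    }

    pub fn read_reg(&self) -> u32 {
        0x1234
    }
}

impl Drop for FakeMmio {
    // Cleanup lives in one place and runs when the value is dropped.
    fn drop(&mut self) {
        self.released.store(true, Ordering::SeqCst);
    }
}

/// Hypothetical revocable wrapper. An RwLock replaces RCU here purely so
/// the sketch runs in user space.
pub struct Revocable<T> {
    inner: RwLock<Option<T>>,
}

impl<T> Revocable<T> {
    pub fn new(value: T) -> Self {
        Self { inner: RwLock::new(Some(value)) }
    }

    /// Runs `f` on the resource, or returns None once it has been revoked.
    /// The Option in the return type forces callers to handle the bail-out path.
    pub fn try_access<R>(&self, f: impl FnOnce(&T) -> R) -> Option<R> {
        let guard = self.inner.read().unwrap();
        (*guard).as_ref().map(f)
    }

    /// Waits for in-flight readers, then drops the wrapped value.
    pub fn revoke(&self) {
        let taken = self.inner.write().unwrap().take();
        drop(taken);
    }
}
```

In this sketch a caller cannot reach the resource without going through `try_access`, so forgetting the "device is gone" branch is a type error rather than a latent bug. It says nothing about sleeps that wait on the device, which is the part the compiler cannot check.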

> Assuming all your interrupt triggered sleeps have gained a shootdown
> mechanism.

Hence why I want revocable to only be rcu, not srcu.

> Assuming all the new extra error paths this creates don't corrupt the
> internal state of the driver and cause it to lockup.

Yeah this one is a bit scary. Corrupting the state is doable, locking up
is much less likely I think, it seems to be more leaks that you get if
rust goes wrong.

> Meh. It doesn't seem like such an obvious win to me. Personally I'm
> terrified of the idea of a zombie driver half sitting around in a
> totally untestable configuration working properly..

Yeah agreed. I might really badly regret this all. But I'm not sold that
switching to message passing design is really going to be better, while
it's definitely going to be a huge amount of work.

> > It cannot help you with making sure you have interruptible/abortable
> > sleeps in all the right places.
> 
> :(
> 
> > > Like, I see a THIS_MODULE in driver->fops == amdgpu_driver_kms_fops ?
> > 
> > Yeah it's there, except only for the userspace references and not for the
> > kernel internal ones. Because developers get a bit prickly about adding
> > those unfortunately due to "it breaks module unload". Maybe we just should
> > add them, at least for rust.
> 
> Yeah, I think such obviously wrong things should be pushed back
> against. We don't want EAF bugs in the kernel, we want security...

Maybe the two different use-cases above help explain why I'm a bit more
pragmatic here. As long as the hotunplug case does not gain bugs (or gets
some fixed) I'm fairly lax with hacks for the driver developer use-case of
reloading modules.

> > You've missed the "it will upset developers part". I've seen people remove
> > module references that are needed, to "fix" driver unloading.
> 
> When done properly the module can be unloaded. Most rdma driver
> modules are unloadable, live, while FDs are open.
> 
> > The third part is that I'm not aware of anything in rust that would
> > guarantee that the function pointer and the module reference actually
> > belong to each other. Which means another runtime check most likely, and
> > hence another thing that shouldn't fail which kinda can now.
> 
> I suspect it has to come from the C code API contracts, which leak
> into the binding design.
> 
> If the C API handles module refcounting internally then rust is fine
> so long as it enforces THIS_MODULE.

You could do contrived stuff and pass function pointers around, so that
THIS_MODULE doesn't actually match up with the function pointer. Sure it's
really stupid, but the idea with rust is that for memory safety stuff like
this, it's not just stupid, but impossible and the compiler will catch
you. So we need a tad more for rust.

> If the C API requires cancel then rust is fine so long as the binding
> guarantees cancel before module unload.

Yeah this is again where I think rust needs a bit more, because the
compiler can't always nicely prove this for you in all the "obvious"
cases.

-Sima
-- 
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Jason Gunthorpe

2025-Mar-05 15:10 UTC

[RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

On Wed, Mar 05, 2025 at 08:30:34AM +0100, Simona Vetter wrote:
> - developers who want to quickly test new driver versions without full
>   reboot. They're often preferring convenience over correctness, like with
>   the removal of module refcounting that's strictly needed but means they
>   first have to unbind drivers in sysfs before they can unload the driver.
> 
>   Another one is that this use-case prefers that the hw is cleanly shut
>   down, so that you can actually load the new driver from a well-known
>   state. And it's entirely ok if this all fails occasionally, it's just
>   for development and testing.

I've never catered to this because if you do this one:

> - hotunplug as an actual use-case. Bugs are not ok. The hw can go away at
>   any moment. And it might happen while you're holding console_lock. You
>   generally do not remove the actual module here, which is why for the
>   actual production use-case getting that part right isn't really
>   required. But getting the lifetimes of all the various
>   structs/objects/resources perfectly right is required.

Fully and properly then developers are happy too..

And we were always able to do this one..

> So the "stuck on console_lock" is the 2nd case, not the first. Module
> unload doesn't even come into play on that one.

I don't see reliable hot unplug if the driver can get stuck on a
lock :|

> > Assuming all your interrupt triggered sleeps have gained a shootdown
> > mechanism.
> 
> Hence why I want revocable to only be rcu, not srcu.

Sorry, I was not clear. You also have to make the PCI interrupt(s)
revocable. Just like the MMIO it cannot leak past the remove() as a
matter of driver-model correctness.

So, you end up disabling the interrupt while the driver is still
running and any sleeps in the driver that are waiting for an interrupt
still need to be shot down.

Further, I just remembered, (Danilo please notice!) there is another
related issue here that DMA mappings *may not* outlive remove()
either. netdev had a bug related to this recently and it was all
agreed that it is not allowed. The kernel can crash in a couple of
different ways if you try to do this.

https://lore.kernel.org/lkml/8067f204-1380-4d37-8ffd-007fc6f26738@kernel.org/T/#m0c7dda0fb5981240879c5ca489176987d688844c

 > a device with no driver bound should not be passed to the DMA API,
 > much less a dead device that's already been removed from its parent
 > bus.

So now we have a driver design that must have:
 1) Revocable MMIO
 2) Revocable Interrupts
 3) Revocable DMA mappings
 4) Revocable HW DMA - the HW MUST stop doing DMA before the DMA API
    is shut down. Failure is a correctness/UAF/security issue

Somehow the driver has to implement this, not get confused or lock up,
all while Rust doesn't help you guarantee much of any of the important
properties related to #2/#3/#4. And worse all this manual revocable
stuff is special and unique to hot-unplug. So it will all be untested
and broken.

Looks really hard to me. *Especially* the wild DMA thing.

This has clearly been missed here, as the current suggestion to
just revoke MMIO means the driver can't actually go out and shut down
its HW DMA after-the-fact since the MMIO is gone. Thus you are pretty
much guaranteed to fail #4, by design, which is a serious issue.
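
The ordering problem can be shown with a small user-space model in plain Rust. This is only a sketch under stated assumptions: `FakeDevice`, `Step` and all method names are hypothetical, and the model encodes just the two constraints named above (stopping HW DMA needs MMIO, and HW DMA must stop before the DMA mappings go away).

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Step {
    StopHwDma,
    UnmapDma,
    RevokeMmio,
}

/// Hypothetical model of a device; none of these names come from nova-core.
pub struct FakeDevice {
    mmio_alive: bool,
    hw_dma_running: bool,
    pub log: Vec<Step>,
}

impl FakeDevice {
    pub fn new() -> Self {
        Self { mmio_alive: true, hw_dma_running: true, log: Vec::new() }
    }

    /// Stopping the DMA engine needs a register write, so it fails once MMIO is gone.
    pub fn stop_hw_dma(&mut self) -> Result<(), &'static str> {
        if !self.mmio_alive {
            return Err("MMIO already revoked, cannot reach the DMA engine");
        }
        self.hw_dma_running = false;
        self.log.push(Step::StopHwDma);
        Ok(())
    }

    /// Tearing down mappings while the HW may still DMA is the UAF case.
    pub fn unmap_dma(&mut self) -> Result<(), &'static str> {
        if self.hw_dma_running {
            return Err("HW DMA still running, unmapping now would be unsafe");
        }
        self.log.push(Step::UnmapDma);
        Ok(())
    }

    pub fn revoke_mmio(&mut self) {
        self.mmio_alive = false;
        self.log.push(Step::RevokeMmio);
    }

    /// One ordering that satisfies both constraints in this model.
    pub fn teardown(&mut self) -> Result<(), &'static str> {
        self.stop_hw_dma()?;
        self.unmap_dma()?;
        self.revoke_mmio();
        Ok(())
    }
}
```

Calling `revoke_mmio()` first leaves both `stop_hw_dma()` and `unmap_dma()` returning errors, which is the dead end described here: once MMIO is revoked up front, nothing in the model can make the DMA shutdown safe.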

I'm sorry it has taken so many emails to reach this, I did know it,
but didn't put the pieces coherently together till just now :\

Compare that to how RDMA works, where we do a DMA shutdown by
destroying all the objects just the same as if the user closed a
FD. The normal destruction paths fence the HW DMA and we end up in
remove with cleanly shutdown HW and no DMA API open. The core code
manages all of this. Simple, correct, no buggy hotplug only paths.

> Yeah agreed. I might really badly regret this all. But I'm not sold that
> switching to message passing design is really going to be better, while
> it's definitely going to be a huge amount of work.

Yeah, I'd think from where DRM is now continuing trying to address the
sleeps is more tractable and achievable than a message passing
redesign..

> > If the C API handles module refcounting internally then rust is fine
> > so long as it enforces THIS_MODULE.
> 
> You could do contrived stuff and pass function pointers around, so that
> THIS_MODULE doesn't actually match up with the function pointer.

Ah.. I guess rust would have to validate the function pointers and the
THIS_MODULE are consistent at runtime before handing them off to
C to prevent this. Seems like a reasonable thing to put under some
CONFIG_DEBUG, also seems a bit hard to implement perhaps..
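
A very rough user-space sketch of what such a runtime consistency check might look like, in plain Rust. Everything here is assumed for illustration: `ModuleText` and `OwnedCallback` are hypothetical types, addresses are plain integers, and no real kernel symbol lookup is modelled.

```rust
/// Hypothetical description of the address range occupied by a module's code.
pub struct ModuleText {
    pub start: usize,
    pub len: usize,
}

impl ModuleText {
    /// True if `addr` falls inside [start, start + len).
    pub fn contains(&self, addr: usize) -> bool {
        addr >= self.start && addr - self.start < self.len
    }
}

/// Hypothetical callback that has been checked against its claimed owner.
#[derive(Debug)]
pub struct OwnedCallback {
    pub callback_addr: usize,
}

impl OwnedCallback {
    /// Refuses to build the pair unless the callback lies inside the owner's code.
    pub fn new(owner: &ModuleText, callback_addr: usize) -> Result<Self, &'static str> {
        if owner.contains(callback_addr) {
            Ok(Self { callback_addr })
        } else {
            Err("callback does not belong to the claimed module")
        }
    }
}
```

The check is fallible, so it is exactly the kind of "thing that shouldn't fail which kinda can now" mentioned earlier in the thread; gating it behind a debug option would keep that cost out of production builds.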

> > If the C API requires cancel then rust is fine so long as the binding
> > guarantees cancel before module unload.
> 
> Yeah this is again where I think rust needs a bit more, because the
> compiler can't always nicely prove this for you in all the "obvious"
> cases.

But in the discussion about the hrtimer it was asserted that Rust can :)

I believe it could be, so long as rust bindings are pretty restricted
and everything rolls up and cancels when things are destroyed. Nothing
should be able to leak out as a principle of all the binding
designs.
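
The "cancels when things are destroyed" principle can be sketched in user-space Rust as a handle whose `Drop` both requests cancellation and waits for the callback to finish. This is an illustration only: `PeriodicWork` is a hypothetical name, and a thread stands in for a kernel timer or work item.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical handle for periodic work; dropping it cancels the work.
pub struct PeriodicWork {
    stop: Arc<AtomicBool>,
    worker: Option<JoinHandle<()>>,
}

impl PeriodicWork {
    pub fn start(period: Duration, mut callback: impl FnMut() + Send + 'static) -> Self {
        let stop = Arc::new(AtomicBool::new(false));
        let stop_in_worker = Arc::clone(&stop);
        let worker = thread::spawn(move || {
            // Keep invoking the callback until cancellation is requested.
            while !stop_in_worker.load(Ordering::Acquire) {
                callback();
                thread::sleep(period);
            }
        });
        Self { stop, worker: Some(worker) }
    }
}

impl Drop for PeriodicWork {
    fn drop(&mut self) {
        // Request cancellation, then wait until the callback can no longer run.
        self.stop.store(true, Ordering::Release);
        if let Some(worker) = self.worker.take() {
            let _ = worker.join();
        }
    }
}
```

Because the only way to keep the work alive is to keep the handle alive, the callback cannot outlive its owner in this sketch. A binding that hands out anything detachable from the handle would break that property.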

Seems like a hard design to enforce across all bindings, e.g. workqueue
is already outside of it. Seems like something that should be written
down in a binding design document..

Jason
