John Hubbard
2025-Aug-13 23:28 UTC
[PATCH] gpu: nova-core: avoid probing non-display/compute PCI functions
This avoids a kernel crash that reliably happens 100% of the time on my particular hardware (x86 with Ampere or Blackwell GPUs installed). The problem is that NovaCore has so far been very sloppy about figuring out if probe() has found a supported PCI PF (Physical Function). By that I mean: probe() sets up BAR0 (which involves a lot of very careful devres and Device<Bound> details behind the scenes). And then if it is handling a non-supported device such as the .1 audio PF on many GPUs, it fails out due to an unexpected BAR0 size. We have been fortunate that the BAR0 sizes are different. Really, we should be filtering on PCI class ID instead. These days I think we can confidently pick out Nova's supported PF's via PCI class ID. And if not, then we'll revisit. There *might* also be a deeper problem involving devres_release_all() teardown, but on the other hand, it could also be a difference of opinion about how early it is supposed to be droppable. Because failing later in the probe() path works just fine. So instead of digging into devres lifetimes that are already correct when used carefully, just stop stress-testing that subsystem via inaccurate behavior in the first place: a) Expose the PCI Device class (available during probe()) to Rust. b) Use the PCI Device class to filter out non-display, non-compute PFs. Relevant excerpts of the crash are shown below. ... NovaCore 0000:c1:00.0: GPU instance built NovaCore 0000:c1:00.1: Probe Nova Core GPU driver. NovaCore 0000:c1:00.1: enabling device (0000 -> 0002) NovaCore 0000:c1:00.1: probe with driver NovaCore failed with error -22 ... Bad IO access at port 0x0 () WARNING: CPU: 26 PID: 748 at lib/iomap.c:45 pci_iounmap+0x3f/0x50 ... <kernel::devres::Devres<kernel::pci::Bar<16777216>>>::devres_callback+0x2c/0x70 [nova_core] devres_release_all+0xa8/0xf0 really_probe+0x30f/0x420 __driver_probe_device+0x77/0xf0 driver_probe_device+0x22/0x1b0 __driver_attach+0x118/0x250 bus_for_each_dev+0x105/0x130 bus_add_driver+0x163/0x2a0 driver_register+0x5d/0xf0 init_module+0x6d/0x1000 [nova_core] do_one_initcall+0xde/0x380 do_init_module+0x60/0x250 ...and then: BUG: kernel NULL pointer dereference, address: 0000000000000538 RIP: 0010:pci_release_region+0x10/0x60 ... <kernel::devres::Devres<kernel::pci::Bar<16777216>>>::devres_callback+0x36/0x70 [nova_core] devres_release_all+0xa8/0xf0 really_probe+0x30f/0x420 __driver_probe_device+0x77/0xf0 driver_probe_device+0x22/0x1b0 __driver_attach+0x118/0x250 bus_for_each_dev+0x105/0x130 bus_add_driver+0x163/0x2a0 driver_register+0x5d/0xf0 init_module+0x6d/0x1000 [nova_core] do_one_initcall+0xde/0x380 do_init_module+0x60/0x250 Signed-off-by: John Hubbard <jhubbard at nvidia.com> --- drivers/gpu/nova-core/driver.rs | 13 +++++++++++++ rust/kernel/pci.rs | 6 ++++++ 2 files changed, 19 insertions(+) diff --git a/drivers/gpu/nova-core/driver.rs b/drivers/gpu/nova-core/driver.rs index 274989ea1fb4..4e0e6f5338e9 100644 --- a/drivers/gpu/nova-core/driver.rs +++ b/drivers/gpu/nova-core/driver.rs @@ -31,6 +31,19 @@ impl pci::Driver for NovaCore { fn probe(pdev: &pci::Device<Core>, _info: &Self::IdInfo) -> Result<Pin<KBox<Self>>> { dev_dbg!(pdev.as_ref(), "Probe Nova Core GPU driver.\n"); + let class_code = pdev.class(); + + if class_code != bindings::PCI_CLASS_DISPLAY_VGA + && class_code != bindings::PCI_CLASS_DISPLAY_3D + { + dev_dbg!( + pdev.as_ref(), + "Skipping non-display NVIDIA device with class 0x{:04x}\n", + class_code + ); + return Err(kernel::error::code::ENODEV); + } + pdev.enable_device_mem()?; pdev.set_master(); diff --git a/rust/kernel/pci.rs b/rust/kernel/pci.rs index 887ee611b553..b6416fe7bdfd 100644 --- a/rust/kernel/pci.rs +++ b/rust/kernel/pci.rs @@ -399,6 +399,12 @@ pub fn device_id(&self) -> u16 { unsafe { (*self.as_raw()).device } } + /// Returns the PCI class code (class and subclass). + pub fn class(&self) -> u32 { + // SAFETY: `self.as_raw` is a valid pointer to a `struct pci_dev`. + unsafe { (*self.as_raw()).class >> 8 } + } + /// Returns the size of the given PCI bar resource. pub fn resource_len(&self, bar: u32) -> Result<bindings::resource_size_t> { if !Bar::index_is_valid(bar) { base-commit: dfc0f6373094dd88e1eaf76c44f2ff01b65db851 -- 2.50.1
Danilo Krummrich
2025-Aug-13 23:50 UTC
[PATCH] gpu: nova-core: avoid probing non-display/compute PCI functions
On Thu Aug 14, 2025 at 1:28 AM CEST, John Hubbard wrote:> NovaCore 0000:c1:00.0: GPU instance built > NovaCore 0000:c1:00.1: Probe Nova Core GPU driver. > NovaCore 0000:c1:00.1: enabling device (0000 -> 0002) > NovaCore 0000:c1:00.1: probe with driver NovaCore failed with error -22 > ... > Bad IO access at port 0x0 () > WARNING: CPU: 26 PID: 748 at lib/iomap.c:45 pci_iounmap+0x3f/0x50 > ... > <kernel::devres::Devres<kernel::pci::Bar<16777216>>>::devres_callback+0x2c/0x70 [nova_core] > devres_release_all+0xa8/0xf0 > really_probe+0x30f/0x420 > __driver_probe_device+0x77/0xf0 > driver_probe_device+0x22/0x1b0 > __driver_attach+0x118/0x250 > bus_for_each_dev+0x105/0x130 > bus_add_driver+0x163/0x2a0 > driver_register+0x5d/0xf0 > init_module+0x6d/0x1000 [nova_core] > do_one_initcall+0xde/0x380 > do_init_module+0x60/0x250 > > ...and then: > BUG: kernel NULL pointer dereference, address: 0000000000000538 > RIP: 0010:pci_release_region+0x10/0x60 > ... > <kernel::devres::Devres<kernel::pci::Bar<16777216>>>::devres_callback+0x36/0x70 [nova_core] > devres_release_all+0xa8/0xf0 > really_probe+0x30f/0x420 > __driver_probe_device+0x77/0xf0 > driver_probe_device+0x22/0x1b0 > __driver_attach+0x118/0x250 > bus_for_each_dev+0x105/0x130 > bus_add_driver+0x163/0x2a0 > driver_register+0x5d/0xf0 > init_module+0x6d/0x1000 [nova_core] > do_one_initcall+0xde/0x380 > do_init_module+0x60/0x250This is caused by a bug in Devres, which I already fixed in [1]. With the patch in [1] nova-core should gracefully fail probing for the non-supported device classes as expected. However, I think we still want to filter by PCI class, so the patch is fine in general. :) Few comments below. [1] https://lore.kernel.org/lkml/20250812130928.11075-1-dakr at kernel.org/> > Signed-off-by: John Hubbard <jhubbard at nvidia.com> > --- > drivers/gpu/nova-core/driver.rs | 13 +++++++++++++ > rust/kernel/pci.rs | 6 ++++++ > 2 files changed, 19 insertions(+) > > diff --git a/drivers/gpu/nova-core/driver.rs b/drivers/gpu/nova-core/driver.rs > index 274989ea1fb4..4e0e6f5338e9 100644 > --- a/drivers/gpu/nova-core/driver.rs > +++ b/drivers/gpu/nova-core/driver.rs > @@ -31,6 +31,19 @@ impl pci::Driver for NovaCore { > fn probe(pdev: &pci::Device<Core>, _info: &Self::IdInfo) -> Result<Pin<KBox<Self>>> { > dev_dbg!(pdev.as_ref(), "Probe Nova Core GPU driver.\n"); > > + let class_code = pdev.class(); > + > + if class_code != bindings::PCI_CLASS_DISPLAY_VGA > + && class_code != bindings::PCI_CLASS_DISPLAY_3DI think it would be nice if we could provide a Rust enum for PCI classes, such that this could be pci::Class::DISPLAY_VGA instead. Of course the same is true for PCI (sub)vendor, (sub)device IDs.> + { > + dev_dbg!( > + pdev.as_ref(), > + "Skipping non-display NVIDIA device with class 0x{:04x}\n", > + class_code > + ); > + return Err(kernel::error::code::ENODEV);With the prelude included you should be able to use ENODEV directly.> + } > + > pdev.enable_device_mem()?; > pdev.set_master(); > > diff --git a/rust/kernel/pci.rs b/rust/kernel/pci.rsPlease split the PCI part up into a separate patch.> index 887ee611b553..b6416fe7bdfd 100644 > --- a/rust/kernel/pci.rs > +++ b/rust/kernel/pci.rs > @@ -399,6 +399,12 @@ pub fn device_id(&self) -> u16 { > unsafe { (*self.as_raw()).device } > } > > + /// Returns the PCI class code (class and subclass). > + pub fn class(&self) -> u32 { > + // SAFETY: `self.as_raw` is a valid pointer to a `struct pci_dev`. > + unsafe { (*self.as_raw()).class >> 8 } > + } > + > /// Returns the size of the given PCI bar resource. > pub fn resource_len(&self, bar: u32) -> Result<bindings::resource_size_t> { > if !Bar::index_is_valid(bar) { > > base-commit: dfc0f6373094dd88e1eaf76c44f2ff01b65db851 > -- > 2.50.1