Alexandre Courbot
2025-May-07 13:52 UTC
[PATCH v3 00/19] nova-core: run FWSEC-FRTS to perform first stage of GSP initialization
Hi everyone, I have respinned this series due to the introduction of `Devres::access()` in nova-next - this addition seriously simplifies the way we access the BAR and justifies sending a new version, although there are also more minor changes. While this series is still far from bringing the GPU into a state where it can do anything useful, it sets up the basic layout of the driver upon which we can build in order to continue with the next steps of GSP booting, as well as supporting more chipsets. Upon successful probe, the driver will display the range of the WPR2 region constructed by FWSEC-FRTS: [ 95.436000] NovaCore 0000:01:00.0: WPR2: 0xffc00000-0xffce0000 [ 95.436002] NovaCore 0000:01:00.0: GPU instance built This code is based on nova-next. There are bits of documentation still missing, these are addressed by Joel in his own documentation patch series [1]. I'll also double-check and send follow-up patches if anything is still missing after that. I have also tried to look at ways to split the patch adding falcon support, but couldn't find any that would not be awkward. Starting review from `falcon.rs` and going down to the HAL should be the logical order for a smooth review. [1] https://lore.kernel.org/rust-for-linux/20250503040802.1411285-1-joelagnelf at nvidia.com/ Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- Changes in v3: - Rebased on top of latest nova-next. - Use the new Devres::access() and remove the now unneeded with_bar!() macro. - Dropped `rust: devres: allow to borrow a reference to the resource's Device` as it is not needed anymore. - Fixed more erroneous uses of `ERANGE` error. - Optimized alignment computations of the FB layout a bit. - Link to v2: https://lore.kernel.org/r/20250501-nova-frts-v2-0-b4a137175337 at nvidia.com Changes in v2: - Rebased on latest nova-next. - Fixed all clippy warnings. - Added `count` and `size` methods to `CoherentAllocation`. - Added method to obtain a reference to the `Device` from a `Devres` (this is super convenient). - Split `DmaObject` into its own patch and added `Deref` implementation. - Squashed field names from [3] into "extract FWSEC from BIOS". - Fixed erroneous use of `ERANGE` error. - Reworked `register!()` macro towards a more intuitive syntax, moved its helper macros into internal rules to avoid polluting the macro namespace. - Renamed all registers to capital snake case to better match OpenRM. - Removed declarations for registers that are not used yet. - Added more documentation for items not covered by Joel's documentation patches. - Removed timer device and replaced it with a helper function using `Ktime`. This also made [4] unneeded so it is dropped. - Unregister the sysmem flush page upon device destruction. - ... probably more that I forgot. >_< - Link to v1: https://lore.kernel.org/r/20250420-nova-frts-v1-0-ecd1cca23963 at nvidia.com [3] https://lore.kernel.org/all/20250423225405.139613-6-joelagnelf at nvidia.com/ [4] https://lore.kernel.org/lkml/20250420-nova-frts-v1-1-ecd1cca23963 at nvidia.com/ --- Alexandre Courbot (17): rust: dma: expose the count and size of CoherentAllocation gpu: nova-core: derive useful traits for Chipset gpu: nova-core: add missing GA100 definition gpu: nova-core: take bound device in Gpu::new gpu: nova-core: define registers layout using helper macro gpu: nova-core: fix layout of NV_PMC_BOOT_0 gpu: nova-core: move Firmware to firmware module rust: make ETIMEDOUT error available gpu: nova-core: wait for GFW_BOOT completion gpu: nova-core: add DMA object struct gpu: nova-core: register sysmem flush page gpu: nova-core: add helper function to wait on condition gpu: nova-core: add falcon register definitions and base code gpu: nova-core: firmware: add ucode descriptor used by FWSEC-FRTS gpu: nova-core: compute layout of the FRTS region gpu: nova-core: extract FWSEC from BIOS and patch it to run FWSEC-FRTS gpu: nova-core: load and run FWSEC-FRTS Joel Fernandes (2): rust: num: Add an upward alignment helper for usize nova-core: Add support for VBIOS ucode extraction for boot Documentation/gpu/nova/core/todo.rst | 6 + drivers/gpu/nova-core/devinit.rs | 38 + drivers/gpu/nova-core/dma.rs | 57 ++ drivers/gpu/nova-core/driver.rs | 2 +- drivers/gpu/nova-core/falcon.rs | 537 ++++++++++++++ drivers/gpu/nova-core/falcon/gsp.rs | 22 + drivers/gpu/nova-core/falcon/hal.rs | 56 ++ drivers/gpu/nova-core/falcon/hal/ga102.rs | 120 +++ drivers/gpu/nova-core/falcon/sec2.rs | 8 + drivers/gpu/nova-core/firmware.rs | 105 ++- drivers/gpu/nova-core/firmware/fwsec.rs | 359 +++++++++ drivers/gpu/nova-core/gpu.rs | 223 ++++-- drivers/gpu/nova-core/gsp.rs | 3 + drivers/gpu/nova-core/gsp/fb.rs | 108 +++ drivers/gpu/nova-core/nova_core.rs | 5 + drivers/gpu/nova-core/regs.rs | 259 +++++-- drivers/gpu/nova-core/regs/macros.rs | 380 ++++++++++ drivers/gpu/nova-core/util.rs | 29 + drivers/gpu/nova-core/vbios.rs | 1144 +++++++++++++++++++++++++++++ rust/kernel/dma.rs | 14 + rust/kernel/error.rs | 1 + rust/kernel/lib.rs | 1 + rust/kernel/num.rs | 44 ++ 23 files changed, 3427 insertions(+), 94 deletions(-) --- base-commit: b75a99e1077b12c5631fef7ac36970a89f6021f7 change-id: 20250417-nova-frts-96ef299abe2c Best regards, -- Alexandre Courbot <acourbot at nvidia.com>
Alexandre Courbot
2025-May-07 13:52 UTC
[PATCH v3 01/19] rust: dma: expose the count and size of CoherentAllocation
These properties are very useful to have and should be accessible. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- rust/kernel/dma.rs | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs index 605e01e35715667f93297fd9ec49d8e7032e0910..18602d771054fceb80c29278b1945254312ed7c6 100644 --- a/rust/kernel/dma.rs +++ b/rust/kernel/dma.rs @@ -201,6 +201,20 @@ pub fn alloc_coherent( CoherentAllocation::alloc_attrs(dev, count, gfp_flags, Attrs(0)) } + /// Returns the number of elements `T` in this allocation. + /// + /// Note that this is not the size of the allocation in bytes, which is provided by + /// [`Self::size`]. + pub fn count(&self) -> usize { + self.count + } + + /// Returns the size in bytes of this allocation. + pub fn size(&self) -> usize { + // This is guaranteed not to overflow as the same operation has been done in `alloc_attrs`. + self.count * core::mem::size_of::<T>() + } + /// Returns the base address to the allocated region in the CPU's virtual address space. pub fn start_ptr(&self) -> *const T { self.cpu_addr -- 2.49.0
Alexandre Courbot
2025-May-07 13:52 UTC
[PATCH v3 02/19] gpu: nova-core: derive useful traits for Chipset
We will commonly need to compare chipset versions, so derive the ordering traits to make that possible. Also derive Copy and Clone since passing Chipset by value will be more efficient than by reference. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/nova-core/gpu.rs | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs index 17c9660da45034762edaa78e372d8821144cdeb7..4de67a2dc16302c00530026156d7264cbc7e5b32 100644 --- a/drivers/gpu/nova-core/gpu.rs +++ b/drivers/gpu/nova-core/gpu.rs @@ -13,7 +13,7 @@ macro_rules! define_chipset { ({ $($variant:ident = $value:expr),* $(,)* }) => { /// Enum representation of the GPU chipset. - #[derive(fmt::Debug)] + #[derive(fmt::Debug, Copy, Clone, PartialOrd, Ord, PartialEq, Eq)] pub(crate) enum Chipset { $($variant = $value),*, } -- 2.49.0
Alexandre Courbot
2025-May-07 13:52 UTC
[PATCH v3 03/19] gpu: nova-core: add missing GA100 definition
linux-firmware contains a directory for GA100, and it is a defined chipset in Nouveau. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/nova-core/gpu.rs | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs index 4de67a2dc16302c00530026156d7264cbc7e5b32..9fe6aedaa9563799c2624d461d4e37ee9b094909 100644 --- a/drivers/gpu/nova-core/gpu.rs +++ b/drivers/gpu/nova-core/gpu.rs @@ -54,6 +54,7 @@ fn try_from(value: u32) -> Result<Self, Self::Error> { TU117 = 0x167, TU116 = 0x168, // Ampere + GA100 = 0x170, GA102 = 0x172, GA103 = 0x173, GA104 = 0x174, @@ -73,7 +74,7 @@ pub(crate) fn arch(&self) -> Architecture { Self::TU102 | Self::TU104 | Self::TU106 | Self::TU117 | Self::TU116 => { Architecture::Turing } - Self::GA102 | Self::GA103 | Self::GA104 | Self::GA106 | Self::GA107 => { + Self::GA100 | Self::GA102 | Self::GA103 | Self::GA104 | Self::GA106 | Self::GA107 => { Architecture::Ampere } Self::AD102 | Self::AD103 | Self::AD104 | Self::AD106 | Self::AD107 => { -- 2.49.0
Alexandre Courbot
2025-May-07 13:52 UTC
[PATCH v3 04/19] gpu: nova-core: take bound device in Gpu::new
We will need to perform things like allocating DMA memory during device creation, so make sure to take the device context that will allow us to perform these actions. This also allows us to use Devres::access to obtain the BAR without holding a RCU lock. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/nova-core/gpu.rs | 19 +++++++++++++------ 1 file changed, 13 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs index 9fe6aedaa9563799c2624d461d4e37ee9b094909..a64a306e0ec87f96e7aab47cfe00a5c280c95a64 100644 --- a/drivers/gpu/nova-core/gpu.rs +++ b/drivers/gpu/nova-core/gpu.rs @@ -134,9 +134,8 @@ pub(crate) struct Spec { } impl Spec { - fn new(bar: &Devres<Bar0>) -> Result<Spec> { - let bar = bar.try_access().ok_or(ENXIO)?; - let boot0 = regs::Boot0::read(&bar); + fn new(bar: &Bar0) -> Result<Spec> { + let boot0 = regs::Boot0::read(bar); Ok(Self { chipset: boot0.chipset().try_into()?, @@ -183,8 +182,12 @@ pub(crate) struct Gpu { } impl Gpu { - pub(crate) fn new(pdev: &pci::Device, bar: Devres<Bar0>) -> Result<impl PinInit<Self>> { - let spec = Spec::new(&bar)?; + pub(crate) fn new( + pdev: &pci::Device<device::Bound>, + devres_bar: Devres<Bar0>, + ) -> Result<impl PinInit<Self>> { + let bar = devres_bar.access(pdev.as_ref())?; + let spec = Spec::new(bar)?; let fw = Firmware::new(pdev.as_ref(), &spec, "535.113.01")?; dev_info!( @@ -195,6 +198,10 @@ pub(crate) fn new(pdev: &pci::Device, bar: Devres<Bar0>) -> Result<impl PinInit< spec.revision ); - Ok(pin_init!(Self { spec, bar, fw })) + Ok(pin_init!(Self { + spec, + bar: devres_bar, + fw + })) } } -- 2.49.0
Alexandre Courbot
2025-May-07 13:52 UTC
[PATCH v3 05/19] gpu: nova-core: define registers layout using helper macro
Add the register!() macro, which defines a given register's layout and provide bit-field accessors with a way to convert them to a given type. This macro will allow us to make clear definitions of the registers and manipulate their fields safely. The long-term goal is to eventually move it to the kernel crate so it can be used my other drivers as well, but it was agreed to first land it into nova-core and make it mature there. To illustrate its usage, use it to define the layout for the Boot0 (renamed to NV_PMC_BOOT_0 to match OpenRM's naming scheme) and take advantage of its accessors. Suggested-by: Danilo Krummrich <dakr at kernel.org> Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- Documentation/gpu/nova/core/todo.rst | 6 + drivers/gpu/nova-core/gpu.rs | 10 +- drivers/gpu/nova-core/regs.rs | 61 ++---- drivers/gpu/nova-core/regs/macros.rs | 380 +++++++++++++++++++++++++++++++++++ 4 files changed, 403 insertions(+), 54 deletions(-) diff --git a/Documentation/gpu/nova/core/todo.rst b/Documentation/gpu/nova/core/todo.rst index 234d753d3eacc709b928b1ccbfc9750ef36ec4ed..8a459fc088121f770bfcda5dfb4ef51c712793ce 100644 --- a/Documentation/gpu/nova/core/todo.rst +++ b/Documentation/gpu/nova/core/todo.rst @@ -102,7 +102,13 @@ Usage: let boot0 = Boot0::read(&bar); pr_info!("Revision: {}\n", boot0.revision()); +Note: a work-in-progress implementation currently resides in +`drivers/gpu/nova-core/regs/macros.rs` and is used in nova-core. It would be +nice to improve it (possibly using proc macros) and move it to the `kernel` +crate so it can be used by other components as well. + | Complexity: Advanced +| Contact: Alexandre Courbot Delay / Sleep abstractions -------------------------- diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs index a64a306e0ec87f96e7aab47cfe00a5c280c95a64..43139b527fac35f5f33facbc0be50b9715044245 100644 --- a/drivers/gpu/nova-core/gpu.rs +++ b/drivers/gpu/nova-core/gpu.rs @@ -112,10 +112,10 @@ pub(crate) struct Revision { } impl Revision { - fn from_boot0(boot0: regs::Boot0) -> Self { + fn from_boot0(boot0: regs::NV_PMC_BOOT_0) -> Self { Self { - major: boot0.major_rev(), - minor: boot0.minor_rev(), + major: boot0.major_revision(), + minor: boot0.minor_revision(), } } } @@ -135,10 +135,10 @@ pub(crate) struct Spec { impl Spec { fn new(bar: &Bar0) -> Result<Spec> { - let boot0 = regs::Boot0::read(bar); + let boot0 = regs::NV_PMC_BOOT_0::read(bar); Ok(Self { - chipset: boot0.chipset().try_into()?, + chipset: boot0.chipset()?, revision: Revision::from_boot0(boot0), }) } diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs index b1a25b86ef17a6710e6236d5e7f1f26cd4407ce3..498fefb52f33bf01518f19d32287962f1fdc3224 100644 --- a/drivers/gpu/nova-core/regs.rs +++ b/drivers/gpu/nova-core/regs.rs @@ -1,55 +1,18 @@ // SPDX-License-Identifier: GPL-2.0 -use crate::driver::Bar0; +// Required to retain the original register names used by OpenRM, which are all capital snake case +// but are mapped to types. +#![allow(non_camel_case_types)] -// TODO -// -// Create register definitions via generic macros. See task "Generic register -// abstraction" in Documentation/gpu/nova/core/todo.rst. +#[macro_use] +mod macros; -const BOOT0_OFFSET: usize = 0x00000000; +use crate::gpu::Chipset; -// 3:0 - chipset minor revision -const BOOT0_MINOR_REV_SHIFT: u8 = 0; -const BOOT0_MINOR_REV_MASK: u32 = 0x0000000f; +/* PMC */ -// 7:4 - chipset major revision -const BOOT0_MAJOR_REV_SHIFT: u8 = 4; -const BOOT0_MAJOR_REV_MASK: u32 = 0x000000f0; - -// 23:20 - chipset implementation Identifier (depends on architecture) -const BOOT0_IMPL_SHIFT: u8 = 20; -const BOOT0_IMPL_MASK: u32 = 0x00f00000; - -// 28:24 - chipset architecture identifier -const BOOT0_ARCH_MASK: u32 = 0x1f000000; - -// 28:20 - chipset identifier (virtual register field combining BOOT0_IMPL and -// BOOT0_ARCH) -const BOOT0_CHIPSET_SHIFT: u8 = BOOT0_IMPL_SHIFT; -const BOOT0_CHIPSET_MASK: u32 = BOOT0_IMPL_MASK | BOOT0_ARCH_MASK; - -#[derive(Copy, Clone)] -pub(crate) struct Boot0(u32); - -impl Boot0 { - #[inline] - pub(crate) fn read(bar: &Bar0) -> Self { - Self(bar.read32(BOOT0_OFFSET)) - } - - #[inline] - pub(crate) fn chipset(&self) -> u32 { - (self.0 & BOOT0_CHIPSET_MASK) >> BOOT0_CHIPSET_SHIFT - } - - #[inline] - pub(crate) fn minor_rev(&self) -> u8 { - ((self.0 & BOOT0_MINOR_REV_MASK) >> BOOT0_MINOR_REV_SHIFT) as u8 - } - - #[inline] - pub(crate) fn major_rev(&self) -> u8 { - ((self.0 & BOOT0_MAJOR_REV_MASK) >> BOOT0_MAJOR_REV_SHIFT) as u8 - } -} +register!(NV_PMC_BOOT_0 @ 0x00000000, "Basic revision information about the GPU" { + 3:0 minor_revision as u8, "Minor revision of the chip"; + 7:4 major_revision as u8, "Major revision of the chip"; + 28:20 chipset as u32 ?=> Chipset, "Chipset model"; +}); diff --git a/drivers/gpu/nova-core/regs/macros.rs b/drivers/gpu/nova-core/regs/macros.rs new file mode 100644 index 0000000000000000000000000000000000000000..7ecc70efb3cd723b673cd72915e72b8a4a009f06 --- /dev/null +++ b/drivers/gpu/nova-core/regs/macros.rs @@ -0,0 +1,380 @@ +// SPDX-License-Identifier: GPL-2.0 + +//! Macro to define register layout and accessors. +//! +//! A single register typically includes several fields, which are accessed through a combination +//! of bit-shift and mask operations that introduce a class of potential mistakes, notably because +//! not all possible field values are necessarily valid. +//! +//! The macro in this module allow to define, using an intruitive and readable syntax, a dedicated +//! type for each register with its own field accessors that can return an error is a field's value +//! is invalid. + +/// Defines a dedicated type for a register with an absolute offset, alongside with getter and +/// setter methods for its fields and methods to read and write it from an `Io` region. +/// +/// Example: +/// +/// ```no_run +/// register!(BOOT_0 @ 0x00000100, "Basic revision information about the GPU" { +/// 3:0 minor_revision as u8, "Minor revision of the chip"; +/// 7:4 major_revision as u8, "Major revision of the chip"; +/// 28:20 chipset as u32 ?=> Chipset, "Chipset model"; +/// }); +/// ``` +/// +/// This defines a `BOOT_0` type which can be read or written from offset `0x100` of an `Io` +/// region. It is composed of 3 fields, for instance `minor_revision` is made of the 4 less +/// significant bits of the register. Each field can be accessed and modified using accessor +/// methods: +/// +/// ```no_run +/// // Read from the register's defined offset (0x100). +/// let boot0 = BOOT_0::read(&bar); +/// pr_info!("chip revision: {}.{}", boot0.major_revision(), boot0.minor_revision()); +/// +/// // `Chipset::try_from` will be called with the value of the field and returns an error if the +/// // value is invalid. +/// let chipset = boot0.chipset()?; +/// +/// // Update some fields and write the value back. +/// boot0.set_major_revision(3).set_minor_revision(10).write(&bar); +/// +/// // Or just read and update the register in a single step: +/// BOOT_0::alter(&bar, |r| r.set_major_revision(3).set_minor_revision(10)); +/// ``` +/// +/// Fields can be defined as follows: +/// +/// - `as <type>` simply returns the field value casted as the requested integer type, typically +/// `u32`, `u16`, `u8` or `bool`. Note that `bool` fields must have a range of 1 bit. +/// - `as <type> => <into_type>` calls `<into_type>`'s `From::<<type>>` implementation and returns +/// the result. +/// - `as <type> ?=> <try_into_type>` calls `<try_into_type>`'s `TryFrom::<<type>>` implementation +/// and returns the result. This is useful on fields for which not all values are value. +/// +/// The documentation strings are optional. If present, they will be added to the type's +/// definition, or the field getter and setter methods they are attached to. +/// +/// Putting a `+` before the address of the register makes it relative to a base: the `read` and +/// `write` methods take a `base` argument that is added to the specified address before access, +/// and `try_read` and `try_write` methods are also created, allowing access with offsets unknown +/// at compile-time: +/// +/// ```no_run +/// register!(CPU_CTL @ +0x0000010, "CPU core control" { +/// 0:0 start as bool, "Start the CPU core"; +/// }); +/// +/// // Flip the `start` switch for the CPU core which base address is at `CPU_BASE`. +/// let cpuctl = CPU_CTL::read(&bar, CPU_BASE); +/// pr_info!("CPU CTL: {:#x}", cpuctl); +/// cpuctl.set_start(true).write(&bar, CPU_BASE); +/// ``` +macro_rules! register { + // Creates a register at a fixed offset of the MMIO space. + ( + $name:ident @ $offset:literal $(, $comment:literal)? { + $($fields:tt)* + } + ) => { + register!(@common $name $(, $comment)?); + register!(@field_accessors $name { $($fields)* }); + register!(@io $name @ $offset); + }; + + // Creates a register at a relative offset from a base address. + ( + $name:ident @ + $offset:literal $(, $comment:literal)? { + $($fields:tt)* + } + ) => { + register!(@common $name $(, $comment)?); + register!(@field_accessors $name { $($fields)* }); + register!(@io$name @ + $offset); + }; + + // Defines the wrapper `$name` type, as well as its relevant implementations (`Debug`, `BitOr`, + // and conversion to regular `u32`). + (@common $name:ident $(, $comment:literal)?) => { + $( + #[doc=$comment] + )? + #[repr(transparent)] + #[derive(Clone, Copy, Default)] + pub(crate) struct $name(u32); + + // TODO: display the raw hex value, then the value of all the fields. This requires + // matching the fields, which will complexify the syntax considerably... + impl ::core::fmt::Debug for $name { + fn fmt(&self, f: &mut ::core::fmt::Formatter<'_>) -> ::core::fmt::Result { + f.debug_tuple(stringify!($name)) + .field(&format_args!("0x{0:x}", &self.0)) + .finish() + } + } + + impl core::ops::BitOr for $name { + type Output = Self; + + fn bitor(self, rhs: Self) -> Self::Output { + Self(self.0 | rhs.0) + } + } + + impl ::core::convert::From<$name> for u32 { + fn from(reg: $name) -> u32 { + reg.0 + } + } + }; + + // Defines all the field getter/methods methods for `$name`. + ( + @field_accessors $name:ident { + $($hi:tt:$lo:tt $field:ident as $type:tt + $(?=> $try_into_type:ty)? + $(=> $into_type:ty)? + $(, $comment:literal)? + ; + )* + } + ) => { + $( + register!(@check_field_bounds $hi:$lo $field as $type); + )* + + #[allow(dead_code)] + impl $name { + $( + register!(@field_accessor $name $hi:$lo $field as $type + $(?=> $try_into_type)? + $(=> $into_type)? + $(, $comment)? + ; + ); + )* + } + }; + + // Boolean fields must have `$hi == $lo`. + (@check_field_bounds $hi:tt:$lo:tt $field:ident as bool) => { + #[allow(clippy::eq_op)] + const _: () = { + kernel::build_assert!( + $hi == $lo, + concat!("boolean field `", stringify!($field), "` covers more than one bit") + ); + }; + }; + + // Non-boolean fields must have `$hi >= $lo`. + (@check_field_bounds $hi:tt:$lo:tt $field:ident as $type:tt) => { + #[allow(clippy::eq_op)] + const _: () = { + kernel::build_assert!( + $hi >= $lo, + concat!("field `", stringify!($field), "`'s MSB is smaller than its LSB") + ); + }; + }; + + // Catches fields defined as `bool` and convert them into a boolean value. + ( + @field_accessor $name:ident $hi:tt:$lo:tt $field:ident as bool => $into_type:ty + $(, $comment:literal)?; + ) => { + register!( + @leaf_accessor $name $hi:$lo $field as bool + { |f| <$into_type>::from(if f != 0 { true } else { false }) } + $into_type => $into_type $(, $comment)?; + ); + }; + + // Shortcut for fields defined as `bool` without the `=>` syntax. + ( + @field_accessor $name:ident $hi:tt:$lo:tt $field:ident as bool $(, $comment:literal)?; + ) => { + register!(@field_accessor $name $hi:$lo $field as bool => bool $(, $comment)?;); + }; + + // Catches the `?=>` syntax for non-boolean fields. + ( + @field_accessor $name:ident $hi:tt:$lo:tt $field:ident as $type:tt ?=> $try_into_type:ty + $(, $comment:literal)?; + ) => { + register!(@leaf_accessor $name $hi:$lo $field as $type + { |f| <$try_into_type>::try_from(f as $type) } $try_into_type => + ::core::result::Result< + $try_into_type, + <$try_into_type as ::core::convert::TryFrom<$type>>::Error + > + $(, $comment)?;); + }; + + // Catches the `=>` syntax for non-boolean fields. + ( + @field_accessor $name:ident $hi:tt:$lo:tt $field:ident as $type:tt => $into_type:ty + $(, $comment:literal)?; + ) => { + register!(@leaf_accessor $name $hi:$lo $field as $type + { |f| <$into_type>::from(f as $type) } $into_type => $into_type $(, $comment)?;); + }; + + // Shortcut for fields defined as non-`bool` without the `=>` or `?=>` syntax. + ( + @field_accessor $name:ident $hi:tt:$lo:tt $field:ident as $type:tt + $(, $comment:literal)?; + ) => { + register!(@field_accessor $name $hi:$lo $field as $type => $type $(, $comment)?;); + }; + + // Generates the accessor methods for a single field. + ( + @leaf_accessor $name:ident $hi:tt:$lo:tt $field:ident as $type:ty + { $process:expr } $to_type:ty => $res_type:ty $(, $comment:literal)?; + ) => { + kernel::macros::paste!( + const [<$field:upper>]: ::core::ops::RangeInclusive<u8> = $lo..=$hi; + const [<$field:upper _MASK>]: u32 = ((((1 << $hi) - 1) << 1) + 1) - ((1 << $lo) - 1); + const [<$field:upper _SHIFT>]: u32 = Self::[<$field:upper _MASK>].trailing_zeros(); + ); + + $( + #[doc="Returns the value of this field:"] + #[doc=$comment] + )? + #[inline] + pub(crate) fn $field(self) -> $res_type { + kernel::macros::paste!( + const MASK: u32 = $name::[<$field:upper _MASK>]; + const SHIFT: u32 = $name::[<$field:upper _SHIFT>]; + ); + let field = ((self.0 & MASK) >> SHIFT); + + $process(field) + } + + kernel::macros::paste!( + $( + #[doc="Sets the value of this field:"] + #[doc=$comment] + )? + #[inline] + pub(crate) fn [<set_ $field>](mut self, value: $to_type) -> Self { + const MASK: u32 = $name::[<$field:upper _MASK>]; + const SHIFT: u32 = $name::[<$field:upper _SHIFT>]; + let value = ((value as u32) << SHIFT) & MASK; + self.0 = (self.0 & !MASK) | value; + + self + } + ); + }; + + // Creates the IO accessors for a fixed offset register. + (@io $name:ident @ $offset:literal) => { + #[allow(dead_code)] + impl $name { + #[inline] + pub(crate) fn read<const SIZE: usize, T>(io: &T) -> Self where + T: ::core::ops::Deref<Target = ::kernel::io::Io<SIZE>>, + { + Self(io.read32($offset)) + } + + #[inline] + pub(crate) fn write<const SIZE: usize, T>(self, io: &T) where + T: ::core::ops::Deref<Target = ::kernel::io::Io<SIZE>>, + { + io.write32(self.0, $offset) + } + + #[inline] + pub(crate) fn alter<const SIZE: usize, T, F>( + io: &T, + f: F, + ) where + T: ::core::ops::Deref<Target = ::kernel::io::Io<SIZE>>, + F: ::core::ops::FnOnce(Self) -> Self, + { + let reg = f(Self::read(io)); + reg.write(io); + } + } + }; + + // Create the IO accessors for a relative offset register. + (@io $name:ident @ + $offset:literal) => { + #[allow(dead_code)] + impl $name { + #[inline] + pub(crate) fn read<const SIZE: usize, T>( + io: &T, + base: usize, + ) -> Self where + T: ::core::ops::Deref<Target = ::kernel::io::Io<SIZE>>, + { + Self(io.read32(base + $offset)) + } + + #[inline] + pub(crate) fn write<const SIZE: usize, T>( + self, + io: &T, + base: usize, + ) where + T: ::core::ops::Deref<Target = ::kernel::io::Io<SIZE>>, + { + io.write32(self.0, base + $offset) + } + + #[inline] + pub(crate) fn alter<const SIZE: usize, T, F>( + io: &T, + base: usize, + f: F, + ) where + T: ::core::ops::Deref<Target = ::kernel::io::Io<SIZE>>, + F: ::core::ops::FnOnce(Self) -> Self, + { + let reg = f(Self::read(io, base)); + reg.write(io, base); + } + + #[inline] + pub(crate) fn try_read<const SIZE: usize, T>( + io: &T, + base: usize, + ) -> ::kernel::error::Result<Self> where + T: ::core::ops::Deref<Target = ::kernel::io::Io<SIZE>>, + { + io.try_read32(base + $offset).map(Self) + } + + #[inline] + pub(crate) fn try_write<const SIZE: usize, T>( + self, + io: &T, + base: usize, + ) -> ::kernel::error::Result<()> where + T: ::core::ops::Deref<Target = ::kernel::io::Io<SIZE>>, + { + io.try_write32(self.0, base + $offset) + } + + #[inline] + pub(crate) fn try_alter<const SIZE: usize, T, F>( + io: &T, + base: usize, + f: F, + ) -> ::kernel::error::Result<()> where + T: ::core::ops::Deref<Target = ::kernel::io::Io<SIZE>>, + F: ::core::ops::FnOnce(Self) -> Self, + { + let reg = f(Self::try_read(io, base)?); + reg.try_write(io, base) + } + } + }; +} -- 2.49.0
Alexandre Courbot
2025-May-07 13:52 UTC
[PATCH v3 06/19] gpu: nova-core: fix layout of NV_PMC_BOOT_0
The layout of NV_PMC_BOOT_0 has two small issues: - The "chipset" field, while useful to identify a chip, is actually an aggregate of two distinct fields named "architecture" and "implementation". - The "architecture" field is split, with its MSB being at a different location than the rest of its bits. Redefine the register layout to match its actual definition as provided by OpenRM and expose the fully-constructed "architecture" field through our own "Architecture" type. The "chipset" pseudo-field is also useful to have, so keep providing it. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/nova-core/gpu.rs | 19 ++++++++++++++++--- drivers/gpu/nova-core/regs.rs | 26 ++++++++++++++++++++++++-- 2 files changed, 40 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs index 43139b527fac35f5f33facbc0be50b9715044245..29db6726a1f5196bd2d1778117313c987ccf343b 100644 --- a/drivers/gpu/nova-core/gpu.rs +++ b/drivers/gpu/nova-core/gpu.rs @@ -101,9 +101,22 @@ fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { /// Enum representation of the GPU generation. #[derive(fmt::Debug)] pub(crate) enum Architecture { - Turing, - Ampere, - Ada, + Turing = 0x16, + Ampere = 0x17, + Ada = 0x19, +} + +impl TryFrom<u8> for Architecture { + type Error = Error; + + fn try_from(value: u8) -> core::result::Result<Self, Self::Error> { + match value { + 0x16 => Ok(Self::Turing), + 0x17 => Ok(Self::Ampere), + 0x19 => Ok(Self::Ada), + _ => Err(ENODEV), + } + } } pub(crate) struct Revision { diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs index 498fefb52f33bf01518f19d32287962f1fdc3224..bfb9555b203ff880c0fc373bb22e5ce6048015d4 100644 --- a/drivers/gpu/nova-core/regs.rs +++ b/drivers/gpu/nova-core/regs.rs @@ -4,15 +4,37 @@ // but are mapped to types. #![allow(non_camel_case_types)] +use kernel::error::Error; + #[macro_use] mod macros; -use crate::gpu::Chipset; +use crate::gpu::{Architecture, Chipset}; /* PMC */ register!(NV_PMC_BOOT_0 @ 0x00000000, "Basic revision information about the GPU" { 3:0 minor_revision as u8, "Minor revision of the chip"; 7:4 major_revision as u8, "Major revision of the chip"; - 28:20 chipset as u32 ?=> Chipset, "Chipset model"; + 8:8 architecture_1 as u8, "MSB of the architecture"; + 23:20 implementation as u8, "Implementation version of the architecture"; + 28:24 architecture_0 as u8, "Lower bits of the architecture"; }); + +impl NV_PMC_BOOT_0 { + /// Combines `architecture_0` and `architecture_1` to obtain the architecture of the chip. + pub(crate) fn architecture(self) -> Result<Architecture, Error> { + Architecture::try_from( + self.architecture_0() | (self.architecture_1() << Self::ARCHITECTURE_0.len()), + ) + } + + /// Combines `architecture` and `implementation` to obtain a code unique to the chipset. + pub(crate) fn chipset(self) -> Result<Chipset, Error> { + self.architecture() + .map(|arch| { + ((arch as u32) << Self::IMPLEMENTATION.len()) | self.implementation() as u32 + }) + .and_then(Chipset::try_from) + } +} -- 2.49.0
Alexandre Courbot
2025-May-07 13:52 UTC
[PATCH v3 07/19] gpu: nova-core: move Firmware to firmware module
We will extend the firmware methods, so move it to its own module instead to keep gpu.rs focused. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/nova-core/firmware.rs | 46 +++++++++++++++++++++++++++++++++++++-- drivers/gpu/nova-core/gpu.rs | 35 +++-------------------------- 2 files changed, 47 insertions(+), 34 deletions(-) diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs index 6e6361c59ca1ae9a52185e66e850ba1db93eb8ce..cb79d039948858e657c9a23a62ed27ff780ac169 100644 --- a/drivers/gpu/nova-core/firmware.rs +++ b/drivers/gpu/nova-core/firmware.rs @@ -1,12 +1,54 @@ // SPDX-License-Identifier: GPL-2.0 -use crate::gpu; +//! Contains structures and functions dedicated to the parsing, building and patching of firmwares +//! to be loaded into a given execution unit. + +use kernel::device; use kernel::firmware; +use kernel::prelude::*; +use kernel::str::CString; + +use crate::gpu; +use crate::gpu::Chipset; + +pub(crate) const FIRMWARE_VERSION: &str = "535.113.01"; + +/// Structure encapsulating the firmware blobs required for the GPU to operate. +#[expect(dead_code)] +pub(crate) struct Firmware { + pub booter_load: firmware::Firmware, + pub booter_unload: firmware::Firmware, + pub bootloader: firmware::Firmware, + pub gsp: firmware::Firmware, +} + +impl Firmware { + pub(crate) fn new( + dev: &device::Device<device::Bound>, + chipset: Chipset, + ver: &str, + ) -> Result<Firmware> { + let mut chip_name = CString::try_from_fmt(fmt!("{}", chipset))?; + chip_name.make_ascii_lowercase(); + + let request = |name_| { + CString::try_from_fmt(fmt!("nvidia/{}/gsp/{}-{}.bin", &*chip_name, name_, ver)) + .and_then(|path| firmware::Firmware::request(&path, dev)) + }; + + Ok(Firmware { + booter_load: request("booter_load")?, + booter_unload: request("booter_unload")?, + bootloader: request("bootloader")?, + gsp: request("gsp")?, + }) + } +} pub(crate) struct ModInfoBuilder<const N: usize>(firmware::ModInfoBuilder<N>); impl<const N: usize> ModInfoBuilder<N> { - const VERSION: &'static str = "535.113.01"; + const VERSION: &'static str = FIRMWARE_VERSION; const fn make_entry_file(self, chipset: &str, fw: &str) -> Self { ModInfoBuilder( diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs index 29db6726a1f5196bd2d1778117313c987ccf343b..9c6a9270ffa374bd386af352f6ad9b857f9c5f8d 100644 --- a/drivers/gpu/nova-core/gpu.rs +++ b/drivers/gpu/nova-core/gpu.rs @@ -1,10 +1,9 @@ // SPDX-License-Identifier: GPL-2.0 -use kernel::{ - device, devres::Devres, error::code::*, firmware, fmt, pci, prelude::*, str::CString, -}; +use kernel::{device, devres::Devres, error::code::*, pci, prelude::*}; use crate::driver::Bar0; +use crate::firmware::Firmware; use crate::regs; use crate::util; use core::fmt; @@ -157,34 +156,6 @@ fn new(bar: &Bar0) -> Result<Spec> { } } -/// Structure encapsulating the firmware blobs required for the GPU to operate. -#[expect(dead_code)] -pub(crate) struct Firmware { - booter_load: firmware::Firmware, - booter_unload: firmware::Firmware, - bootloader: firmware::Firmware, - gsp: firmware::Firmware, -} - -impl Firmware { - fn new(dev: &device::Device, spec: &Spec, ver: &str) -> Result<Firmware> { - let mut chip_name = CString::try_from_fmt(fmt!("{}", spec.chipset))?; - chip_name.make_ascii_lowercase(); - - let request = |name_| { - CString::try_from_fmt(fmt!("nvidia/{}/gsp/{}-{}.bin", &*chip_name, name_, ver)) - .and_then(|path| firmware::Firmware::request(&path, dev)) - }; - - Ok(Firmware { - booter_load: request("booter_load")?, - booter_unload: request("booter_unload")?, - bootloader: request("bootloader")?, - gsp: request("gsp")?, - }) - } -} - /// Structure holding the resources required to operate the GPU. #[pin_data] pub(crate) struct Gpu { @@ -201,7 +172,7 @@ pub(crate) fn new( ) -> Result<impl PinInit<Self>> { let bar = devres_bar.access(pdev.as_ref())?; let spec = Spec::new(bar)?; - let fw = Firmware::new(pdev.as_ref(), &spec, "535.113.01")?; + let fw = Firmware::new(pdev.as_ref(), spec.chipset, "535.113.01")?; dev_info!( pdev.as_ref(), -- 2.49.0
Alexandre Courbot
2025-May-07 13:52 UTC
[PATCH v3 08/19] rust: make ETIMEDOUT error available
We will use this error in the nova-core driver. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- rust/kernel/error.rs | 1 + 1 file changed, 1 insertion(+) diff --git a/rust/kernel/error.rs b/rust/kernel/error.rs index 3dee3139fcd4379b94748c0ba1965f4e1865b633..083c7b068cf4e185100de96e520c54437898ee72 100644 --- a/rust/kernel/error.rs +++ b/rust/kernel/error.rs @@ -65,6 +65,7 @@ macro_rules! declare_err { declare_err!(EDOM, "Math argument out of domain of func."); declare_err!(ERANGE, "Math result not representable."); declare_err!(EOVERFLOW, "Value too large for defined data type."); + declare_err!(ETIMEDOUT, "Connection timed out."); declare_err!(ERESTARTSYS, "Restart the system call."); declare_err!(ERESTARTNOINTR, "System call was interrupted by a signal and will be restarted."); declare_err!(ERESTARTNOHAND, "Restart if no handler."); -- 2.49.0
Alexandre Courbot
2025-May-07 13:52 UTC
[PATCH v3 09/19] gpu: nova-core: wait for GFW_BOOT completion
Upon reset, the GPU executes the GFW_BOOT firmware in order to initialize its base parameters such as clocks. The driver must ensure that this step is completed before using the hardware. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/nova-core/devinit.rs | 38 ++++++++++++++++++++++++++++++++++++++ drivers/gpu/nova-core/driver.rs | 2 +- drivers/gpu/nova-core/gpu.rs | 5 +++++ drivers/gpu/nova-core/nova_core.rs | 1 + drivers/gpu/nova-core/regs.rs | 11 +++++++++++ 5 files changed, 56 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/nova-core/devinit.rs b/drivers/gpu/nova-core/devinit.rs new file mode 100644 index 0000000000000000000000000000000000000000..5be2e0344fb651e5e53c9223aefeb5b2d95b8de1 --- /dev/null +++ b/drivers/gpu/nova-core/devinit.rs @@ -0,0 +1,38 @@ +// SPDX-License-Identifier: GPL-2.0 + +//! Methods for device initialization. + +use kernel::bindings; +use kernel::prelude::*; + +use crate::driver::Bar0; +use crate::regs; + +/// Wait for devinit FW completion. +/// +/// Upon reset, the GPU runs some firmware code to setup its core parameters. Most of the GPU is +/// considered unusable until this step is completed, so it must be waited on very early during +/// driver initialization. +pub(crate) fn wait_gfw_boot_completion(bar: &Bar0) -> Result<()> { + let mut timeout = 2000; + + loop { + let gfw_booted = regs::NV_PGC6_AON_SECURE_SCRATCH_GROUP_05_PRIV_LEVEL_MASK::read(bar) + .read_protection_level0() + && (regs::NV_PGC6_AON_SECURE_SCRATCH_GROUP_05::read(bar).value() & 0xff) == 0xff; + + if gfw_booted { + return Ok(()); + } + + if timeout == 0 { + return Err(ETIMEDOUT); + } + timeout -= 1; + + // TODO: use `read_poll_timeout` once it is available. + // (https://lore.kernel.org/lkml/20250220070611.214262-8-fujita.tomonori at gmail.com/) + // SAFETY: msleep should be safe to call with any parameter. + unsafe { bindings::msleep(2) }; + } +} diff --git a/drivers/gpu/nova-core/driver.rs b/drivers/gpu/nova-core/driver.rs index a08fb6599267a960f0e07b6efd0e3b6cdc296aa4..752ba4b0fcfe8d835d366570bb2f807840a196da 100644 --- a/drivers/gpu/nova-core/driver.rs +++ b/drivers/gpu/nova-core/driver.rs @@ -10,7 +10,7 @@ pub(crate) struct NovaCore { pub(crate) gpu: Gpu, } -const BAR0_SIZE: usize = 8; +const BAR0_SIZE: usize = 0x1000000; pub(crate) type Bar0 = pci::Bar<BAR0_SIZE>; kernel::pci_device_table!( diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs index 9c6a9270ffa374bd386af352f6ad9b857f9c5f8d..a9fcf74717791dc7e23678869bf84f61e51873e2 100644 --- a/drivers/gpu/nova-core/gpu.rs +++ b/drivers/gpu/nova-core/gpu.rs @@ -2,6 +2,7 @@ use kernel::{device, devres::Devres, error::code::*, pci, prelude::*}; +use crate::devinit; use crate::driver::Bar0; use crate::firmware::Firmware; use crate::regs; @@ -182,6 +183,10 @@ pub(crate) fn new( spec.revision ); + // We must wait for GFW_BOOT completion before doing any significant setup on the GPU. + devinit::wait_gfw_boot_completion(bar) + .inspect_err(|_| dev_err!(pdev.as_ref(), "GFW boot did not complete"))?; + Ok(pin_init!(Self { spec, bar: devres_bar, diff --git a/drivers/gpu/nova-core/nova_core.rs b/drivers/gpu/nova-core/nova_core.rs index a91cd924054b49966937a8db6aab9cd0614f10de..3d75e7acc6e0ddcfa142b1fb134cb5c966aaf472 100644 --- a/drivers/gpu/nova-core/nova_core.rs +++ b/drivers/gpu/nova-core/nova_core.rs @@ -2,6 +2,7 @@ //! Nova Core GPU Driver +mod devinit; mod driver; mod firmware; mod gpu; diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs index bfb9555b203ff880c0fc373bb22e5ce6048015d4..401d885539cee03cbe732102f5e2233785a7b284 100644 --- a/drivers/gpu/nova-core/regs.rs +++ b/drivers/gpu/nova-core/regs.rs @@ -38,3 +38,14 @@ pub(crate) fn chipset(self) -> Result<Chipset, Error> { .and_then(Chipset::try_from) } } + +/* PGC6 */ + +register!(NV_PGC6_AON_SECURE_SCRATCH_GROUP_05_PRIV_LEVEL_MASK @ 0x00118128 { + 0:0 read_protection_level0 as bool; +}); + +// TODO: This is an array of registers. +register!(NV_PGC6_AON_SECURE_SCRATCH_GROUP_05 @ 0x00118234 { + 31:0 value as u32; +}); -- 2.49.0
Alexandre Courbot
2025-May-07 13:52 UTC
[PATCH v3 10/19] gpu: nova-core: add DMA object struct
Since we will need to allocate lots of distinct memory chunks to be shared between GPU and CPU, introduce a type dedicated to that. It is a light wrapper around CoherentAllocation. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/nova-core/dma.rs | 60 ++++++++++++++++++++++++++++++++++++++ drivers/gpu/nova-core/nova_core.rs | 1 + 2 files changed, 61 insertions(+) diff --git a/drivers/gpu/nova-core/dma.rs b/drivers/gpu/nova-core/dma.rs new file mode 100644 index 0000000000000000000000000000000000000000..9d90ae01d0044eaab4ddbc3eba216741d7a623ef --- /dev/null +++ b/drivers/gpu/nova-core/dma.rs @@ -0,0 +1,60 @@ +// SPDX-License-Identifier: GPL-2.0 + +//! Simple DMA object wrapper. + +// To be removed when all code is used. +#![expect(dead_code)] + +use core::ops::{Deref, DerefMut}; + +use kernel::device; +use kernel::dma::CoherentAllocation; +use kernel::page::PAGE_SIZE; +use kernel::prelude::*; + +pub(crate) struct DmaObject { + dma: CoherentAllocation<u8>, +} + +impl DmaObject { + pub(crate) fn new(dev: &device::Device<device::Bound>, len: usize) -> Result<Self> { + let len = core::alloc::Layout::from_size_align(len, PAGE_SIZE) + .map_err(|_| EINVAL)? + .pad_to_align() + .size(); + let dma = CoherentAllocation::alloc_coherent(dev, len, GFP_KERNEL | __GFP_ZERO)?; + + Ok(Self { dma }) + } + + pub(crate) fn from_data(dev: &device::Device<device::Bound>, data: &[u8]) -> Result<Self> { + Self::new(dev, data.len()).map(|mut dma_obj| { + // SAFETY: + // - The copied data fits within the size of the allocated object. + // - We have just created this object and there is no other user at this stage. + unsafe { + core::ptr::copy_nonoverlapping( + data.as_ptr(), + dma_obj.dma.start_ptr_mut(), + data.len(), + ); + } + + dma_obj + }) + } +} + +impl Deref for DmaObject { + type Target = CoherentAllocation<u8>; + + fn deref(&self) -> &Self::Target { + &self.dma + } +} + +impl DerefMut for DmaObject { + fn deref_mut(&mut self) -> &mut Self::Target { + &mut self.dma + } +} diff --git a/drivers/gpu/nova-core/nova_core.rs b/drivers/gpu/nova-core/nova_core.rs index 3d75e7acc6e0ddcfa142b1fb134cb5c966aaf472..1c7333e9fabe357f3ecbc6944ca98b66fa17c9a5 100644 --- a/drivers/gpu/nova-core/nova_core.rs +++ b/drivers/gpu/nova-core/nova_core.rs @@ -3,6 +3,7 @@ //! Nova Core GPU Driver mod devinit; +mod dma; mod driver; mod firmware; mod gpu; -- 2.49.0
Alexandre Courbot
2025-May-07 13:52 UTC
[PATCH v3 11/19] gpu: nova-core: register sysmem flush page
A page of system memory is reserved so sysmembar can perform a read on it if a system write occurred since the last flush. Do this early as it can be required to e.g. reset the GPU falcons. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/nova-core/gpu.rs | 45 +++++++++++++++++++++++++++++++++++++++++-- drivers/gpu/nova-core/regs.rs | 10 ++++++++++ 2 files changed, 53 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs index a9fcf74717791dc7e23678869bf84f61e51873e2..c338da69ecbc2200f1ef3061a4d62971b021e3eb 100644 --- a/drivers/gpu/nova-core/gpu.rs +++ b/drivers/gpu/nova-core/gpu.rs @@ -3,6 +3,7 @@ use kernel::{device, devres::Devres, error::code::*, pci, prelude::*}; use crate::devinit; +use crate::dma::DmaObject; use crate::driver::Bar0; use crate::firmware::Firmware; use crate::regs; @@ -158,12 +159,32 @@ fn new(bar: &Bar0) -> Result<Spec> { } /// Structure holding the resources required to operate the GPU. -#[pin_data] +#[pin_data(PinnedDrop)] pub(crate) struct Gpu { spec: Spec, /// MMIO mapping of PCI BAR 0 bar: Devres<Bar0>, fw: Firmware, + // System memory page required for flushing all pending GPU-side memory writes done through + // PCIE into system memory. + sysmem_flush: DmaObject, +} + +#[pinned_drop] +impl PinnedDrop for Gpu { + fn drop(self: Pin<&mut Self>) { + // Unregister the sysmem flush page before we release it. + let _ = self.bar.try_access_with(|b| { + regs::NV_PFB_NISO_FLUSH_SYSMEM_ADDR::default() + .set_adr_39_08(0) + .write(b); + if self.spec.chipset >= Chipset::GA102 { + regs::NV_PFB_NISO_FLUSH_SYSMEM_ADDR_HI::default() + .set_adr_63_40(0) + .write(b); + } + }); + } } impl Gpu { @@ -187,10 +208,30 @@ pub(crate) fn new( devinit::wait_gfw_boot_completion(bar) .inspect_err(|_| dev_err!(pdev.as_ref(), "GFW boot did not complete"))?; + // System memory page required for sysmembar to properly flush into system memory. + let sysmem_flush = { + let page = DmaObject::new(pdev.as_ref(), kernel::bindings::PAGE_SIZE)?; + + // Register the sysmem flush page. + let handle = page.dma_handle(); + + regs::NV_PFB_NISO_FLUSH_SYSMEM_ADDR::default() + .set_adr_39_08((handle >> 8) as u32) + .write(bar); + if spec.chipset >= Chipset::GA102 { + regs::NV_PFB_NISO_FLUSH_SYSMEM_ADDR_HI::default() + .set_adr_63_40((handle >> 40) as u32) + .write(bar); + } + + page + }; + Ok(pin_init!(Self { spec, bar: devres_bar, - fw + fw, + sysmem_flush, })) } } diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs index 401d885539cee03cbe732102f5e2233785a7b284..218cb6441eb0e5c6e5b52eabba006163eec0c8b4 100644 --- a/drivers/gpu/nova-core/regs.rs +++ b/drivers/gpu/nova-core/regs.rs @@ -39,6 +39,16 @@ pub(crate) fn chipset(self) -> Result<Chipset, Error> { } } +/* PFB */ + +register!(NV_PFB_NISO_FLUSH_SYSMEM_ADDR @ 0x00100c10 { + 31:0 adr_39_08 as u32; +}); + +register!(NV_PFB_NISO_FLUSH_SYSMEM_ADDR_HI @ 0x00100c40 { + 23:0 adr_63_40 as u32; +}); + /* PGC6 */ register!(NV_PGC6_AON_SECURE_SCRATCH_GROUP_05_PRIV_LEVEL_MASK @ 0x00118128 { -- 2.49.0
Alexandre Courbot
2025-May-07 13:52 UTC
[PATCH v3 12/19] gpu: nova-core: add helper function to wait on condition
While programming the hardware, we frequently need to busy-wait until a condition (like a given bit of a register to switch value) happens. Add a basic `wait_on` helper function to wait on such conditions expressed as a closure, with a timeout argument. This is temporary as we will switch to `read_poll_timeout` [1] once it is available. [1] https://lore.kernel.org/lkml/20250220070611.214262-8-fujita.tomonori at gmail.com/ Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/nova-core/util.rs | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+) diff --git a/drivers/gpu/nova-core/util.rs b/drivers/gpu/nova-core/util.rs index 332a64cfc6a9d7d787fbdc228887c0be53a97160..afb525228431a2645afe7bb34988e9537757b1d7 100644 --- a/drivers/gpu/nova-core/util.rs +++ b/drivers/gpu/nova-core/util.rs @@ -1,5 +1,10 @@ // SPDX-License-Identifier: GPL-2.0 +use core::time::Duration; + +use kernel::prelude::*; +use kernel::time::Ktime; + pub(crate) const fn to_lowercase_bytes<const N: usize>(s: &str) -> [u8; N] { let src = s.as_bytes(); let mut dst = [0; N]; @@ -19,3 +24,28 @@ pub(crate) const fn const_bytes_to_str(bytes: &[u8]) -> &str { Err(_) => kernel::build_error!("Bytes are not valid UTF-8."), } } + +/// Wait until `cond` is true or `timeout` elapsed. +/// +/// When `cond` evaluates to `Some`, its return value is returned. +/// +/// `Err(ETIMEDOUT)` is returned if `timeout` has been reached without `cond` evaluating to +/// `Some`. +/// +/// TODO: replace with `read_poll_timeout` once it is available. +/// (https://lore.kernel.org/lkml/20250220070611.214262-8-fujita.tomonori at gmail.com/) +#[expect(dead_code)] +pub(crate) fn wait_on<R, F: Fn() -> Option<R>>(timeout: Duration, cond: F) -> Result<R> { + let start_time = Ktime::ktime_get(); + + loop { + if let Some(ret) = cond() { + return Ok(ret); + } + + let cur_time = Ktime::ktime_get(); + if (cur_time - start_time).to_ns() > timeout.as_nanos() as i64 { + return Err(ETIMEDOUT); + } + } +} -- 2.49.0
Alexandre Courbot
2025-May-07 13:52 UTC
[PATCH v3 13/19] gpu: nova-core: add falcon register definitions and base code
Add the common Falcon code and HAL for Ampere GPUs, and instantiate the GSP and SEC2 Falcons that will be required to boot the GSP. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/nova-core/falcon.rs | 540 ++++++++++++++++++++++++++++++ drivers/gpu/nova-core/falcon/gsp.rs | 22 ++ drivers/gpu/nova-core/falcon/hal.rs | 56 ++++ drivers/gpu/nova-core/falcon/hal/ga102.rs | 120 +++++++ drivers/gpu/nova-core/falcon/sec2.rs | 8 + drivers/gpu/nova-core/gpu.rs | 11 + drivers/gpu/nova-core/nova_core.rs | 1 + drivers/gpu/nova-core/regs.rs | 125 +++++++ drivers/gpu/nova-core/util.rs | 1 - 9 files changed, 883 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/nova-core/falcon.rs b/drivers/gpu/nova-core/falcon.rs new file mode 100644 index 0000000000000000000000000000000000000000..b6552496ad9c5f8eeda4beaa757c7393f495072b --- /dev/null +++ b/drivers/gpu/nova-core/falcon.rs @@ -0,0 +1,540 @@ +// SPDX-License-Identifier: GPL-2.0 + +//! Falcon microprocessor base support + +// To be removed when all code is used. +#![expect(dead_code)] + +use core::time::Duration; +use hal::FalconHal; +use kernel::bindings; +use kernel::device; +use kernel::prelude::*; +use kernel::sync::Arc; +use kernel::types::ARef; + +use crate::driver::Bar0; +use crate::gpu::Chipset; +use crate::regs; +use crate::util; + +pub(crate) mod gsp; +mod hal; +pub(crate) mod sec2; + +/// Revision number of a falcon core, used in the [`crate::regs::NV_PFALCON_FALCON_HWCFG1`] +/// register. +#[repr(u8)] +#[derive(Debug, Default, Copy, Clone, PartialEq, Eq, PartialOrd, Ord)] +pub(crate) enum FalconCoreRev { + #[default] + Rev1 = 1, + Rev2 = 2, + Rev3 = 3, + Rev4 = 4, + Rev5 = 5, + Rev6 = 6, + Rev7 = 7, +} + +impl TryFrom<u8> for FalconCoreRev { + type Error = Error; + + fn try_from(value: u8) -> core::result::Result<Self, Self::Error> { + use FalconCoreRev::*; + + let rev = match value { + 1 => Rev1, + 2 => Rev2, + 3 => Rev3, + 4 => Rev4, + 5 => Rev5, + 6 => Rev6, + 7 => Rev7, + _ => return Err(EINVAL), + }; + + Ok(rev) + } +} + +/// Revision subversion number of a falcon core, used in the +/// [`crate::regs::NV_PFALCON_FALCON_HWCFG1`] register. +#[repr(u8)] +#[derive(Debug, Default, Copy, Clone, PartialEq, Eq, PartialOrd, Ord)] +pub(crate) enum FalconCoreRevSubversion { + #[default] + Subversion0 = 0, + Subversion1 = 1, + Subversion2 = 2, + Subversion3 = 3, +} + +impl TryFrom<u8> for FalconCoreRevSubversion { + type Error = Error; + + fn try_from(value: u8) -> Result<Self> { + use FalconCoreRevSubversion::*; + + let sub_version = match value & 0b11 { + 0 => Subversion0, + 1 => Subversion1, + 2 => Subversion2, + 3 => Subversion3, + _ => return Err(EINVAL), + }; + + Ok(sub_version) + } +} + +/// Security model of a falcon core, used in the [`crate::regs::NV_PFALCON_FALCON_HWCFG1`] +/// register. +#[repr(u8)] +#[derive(Debug, Default, Copy, Clone)] +pub(crate) enum FalconSecurityModel { + /// Non-Secure: runs unsigned code without privileges. + #[default] + None = 0, + /// Low-secure: runs unsigned code with some privileges. Can only be entered from `Heavy` mode. + Light = 2, + /// High-Secure: runs signed code with full privileges. + Heavy = 3, +} + +impl TryFrom<u8> for FalconSecurityModel { + type Error = Error; + + fn try_from(value: u8) -> core::result::Result<Self, Self::Error> { + use FalconSecurityModel::*; + + let sec_model = match value { + 0 => None, + 2 => Light, + 3 => Heavy, + _ => return Err(EINVAL), + }; + + Ok(sec_model) + } +} + +/// Signing algorithm for a given firmware, used in the [`crate::regs::NV_PFALCON2_FALCON_MOD_SEL`] +/// register. +#[repr(u8)] +#[derive(Debug, Default, Copy, Clone, PartialEq, Eq)] +pub(crate) enum FalconModSelAlgo { + /// RSA3K. + #[default] + Rsa3k = 1, +} + +impl TryFrom<u8> for FalconModSelAlgo { + type Error = Error; + + fn try_from(value: u8) -> core::result::Result<Self, Self::Error> { + match value { + 1 => Ok(FalconModSelAlgo::Rsa3k), + _ => Err(EINVAL), + } + } +} + +/// Valid values for the `size` field of the [`crate::regs::NV_PFALCON_FALCON_DMATRFCMD`] register. +#[repr(u8)] +#[derive(Debug, Default, Copy, Clone, PartialEq, Eq)] +pub(crate) enum DmaTrfCmdSize { + /// 256 bytes transfer. + #[default] + Size256B = 0x6, +} + +impl TryFrom<u8> for DmaTrfCmdSize { + type Error = Error; + + fn try_from(value: u8) -> Result<Self> { + match value { + 0x6 => Ok(Self::Size256B), + _ => Err(EINVAL), + } + } +} + +/// Currently active core on a dual falcon/riscv (Peregrine) controller. +#[derive(Debug, Clone, Copy, PartialEq, Eq)] +pub(crate) enum PeregrineCoreSelect { + /// Falcon core is active. + Falcon = 0, + /// RISC-V core is active. + Riscv = 1, +} + +impl From<bool> for PeregrineCoreSelect { + fn from(value: bool) -> Self { + match value { + false => PeregrineCoreSelect::Falcon, + true => PeregrineCoreSelect::Riscv, + } + } +} + +/// Different types of memory present in a falcon core. +#[derive(Debug, Clone, Copy, PartialEq, Eq)] +pub(crate) enum FalconMem { + /// Instruction Memory. + Imem, + /// Data Memory. + Dmem, +} + +/// Target/source of a DMA transfer to/from falcon memory. +#[derive(Debug, Clone, Default)] +pub(crate) enum FalconFbifTarget { + /// VRAM. + #[default] + LocalFb = 0, + /// Coherent system memory. + CoherentSysmem = 1, + /// Non-coherent system memory. + NoncoherentSysmem = 2, +} + +impl TryFrom<u8> for FalconFbifTarget { + type Error = Error; + + fn try_from(value: u8) -> core::result::Result<Self, Self::Error> { + let res = match value { + 0 => Self::LocalFb, + 1 => Self::CoherentSysmem, + 2 => Self::NoncoherentSysmem, + _ => return Err(EINVAL), + }; + + Ok(res) + } +} + +/// Type of memory addresses to use. +#[derive(Debug, Clone, Default)] +pub(crate) enum FalconFbifMemType { + /// Physical memory addresses. + #[default] + Virtual = 0, + /// Virtual memory addresses. + Physical = 1, +} + +impl From<bool> for FalconFbifMemType { + fn from(value: bool) -> Self { + match value { + false => Self::Virtual, + true => Self::Physical, + } + } +} + +/// Trait defining the parameters of a given Falcon instance. +pub(crate) trait FalconEngine: Sync { + /// Base I/O address for the falcon, relative from which its registers are accessed. + const BASE: usize; +} + +/// Represents a portion of the firmware to be loaded into a particular memory (e.g. IMEM or DMEM). +#[derive(Debug)] +pub(crate) struct FalconLoadTarget { + /// Offset from the start of the source object to copy from. + pub(crate) src_start: u32, + /// Offset from the start of the destination memory to copy into. + pub(crate) dst_start: u32, + /// Number of bytes to copy. + pub(crate) len: u32, +} + +/// Parameters for the falcon boot ROM. +#[derive(Debug)] +pub(crate) struct FalconBromParams { + /// Offset in `DMEM`` of the firmware's signature. + pub(crate) pkc_data_offset: u32, + /// Mask of engines valid for this firmware. + pub(crate) engine_id_mask: u16, + /// ID of the ucode used to infer a fuse register to validate the signature. + pub(crate) ucode_id: u8, +} + +/// Trait for a falcon firmware. +pub(crate) trait FalconFirmware { + /// Engine on which this firmware is to be loaded. + type Target: FalconEngine; + + /// Returns the DMA handle of the object containing the firmware. + fn dma_handle(&self) -> bindings::dma_addr_t; + + /// Returns the load parameters for `IMEM`. + fn imem_load(&self) -> FalconLoadTarget; + + /// Returns the load parameters for `DMEM`. + fn dmem_load(&self) -> FalconLoadTarget; + + /// Returns the parameters to write into the BROM registers. + fn brom_params(&self) -> FalconBromParams; + + /// Returns the start address of the firmware. + fn boot_addr(&self) -> u32; +} + +/// Contains the base parameters common to all Falcon instances. +pub(crate) struct Falcon<E: FalconEngine> { + hal: Arc<dyn FalconHal<E>>, + dev: ARef<device::Device>, +} + +impl<E: FalconEngine + 'static> Falcon<E> { + /// Create a new falcon instance. + /// + /// `need_riscv` is set to `true` if the caller expects the falcon to be a dual falcon/riscv + /// controller. + pub(crate) fn new( + dev: &device::Device, + chipset: Chipset, + bar: &Bar0, + need_riscv: bool, + ) -> Result<Self> { + let hwcfg1 = regs::NV_PFALCON_FALCON_HWCFG1::read(bar, E::BASE); + // Ensure that the revision and security model contain valid values. + let _rev = hwcfg1.core_rev()?; + let _sec_model = hwcfg1.security_model()?; + + if need_riscv { + let hwcfg2 = regs::NV_PFALCON_FALCON_HWCFG2::read(bar, E::BASE); + if !hwcfg2.riscv() { + dev_err!( + dev, + "riscv support requested on a controller that does not support it\n" + ); + return Err(EINVAL); + } + } + + Ok(Self { + hal: hal::create_falcon_hal(chipset)?, + dev: dev.into(), + }) + } + + /// Wait for memory scrubbing to complete. + fn reset_wait_mem_scrubbing(&self, bar: &Bar0) -> Result<()> { + util::wait_on(Duration::from_millis(20), || { + let r = regs::NV_PFALCON_FALCON_HWCFG2::read(bar, E::BASE); + if r.mem_scrubbing() { + Some(()) + } else { + None + } + }) + } + + /// Reset the falcon engine. + fn reset_eng(&self, bar: &Bar0) -> Result<()> { + let _ = regs::NV_PFALCON_FALCON_HWCFG2::read(bar, E::BASE); + + // According to OpenRM's `kflcnPreResetWait_GA102` documentation, HW sometimes does not set + // RESET_READY so a non-failing timeout is used. + let _ = util::wait_on(Duration::from_micros(150), || { + let r = regs::NV_PFALCON_FALCON_HWCFG2::read(bar, E::BASE); + if r.reset_ready() { + Some(()) + } else { + None + } + }); + + regs::NV_PFALCON_FALCON_ENGINE::alter(bar, E::BASE, |v| v.set_reset(true)); + + let _: Result<()> = util::wait_on(Duration::from_micros(10), || None); + + regs::NV_PFALCON_FALCON_ENGINE::alter(bar, E::BASE, |v| v.set_reset(false)); + + self.reset_wait_mem_scrubbing(bar)?; + + Ok(()) + } + + /// Reset the controller, select the falcon core, and wait for memory scrubbing to complete. + pub(crate) fn reset(&self, bar: &Bar0) -> Result<()> { + self.reset_eng(bar)?; + self.hal.select_core(self, bar)?; + self.reset_wait_mem_scrubbing(bar)?; + + regs::NV_PFALCON_FALCON_RM::default() + .set_value(regs::NV_PMC_BOOT_0::read(bar).into()) + .write(bar, E::BASE); + + Ok(()) + } + + /// Perform a DMA write according to `load_offsets` from `dma_handle` into the falcon's + /// `target_mem`. + /// + /// `sec` is set if the loaded firmware is expected to run in secure mode. + fn dma_wr( + &self, + bar: &Bar0, + dma_handle: bindings::dma_addr_t, + target_mem: FalconMem, + load_offsets: FalconLoadTarget, + sec: bool, + ) -> Result<()> { + const DMA_LEN: u32 = 256; + + // For IMEM, we want to use the start offset as a virtual address tag for each page, since + // code addresses in the firmware (and the boot vector) are virtual. + // + // For DMEM we can fold the start offset into the DMA handle. + let (src_start, dma_start) = match target_mem { + FalconMem::Imem => (load_offsets.src_start, dma_handle), + FalconMem::Dmem => ( + 0, + dma_handle + load_offsets.src_start as bindings::dma_addr_t, + ), + }; + if dma_start % DMA_LEN as bindings::dma_addr_t > 0 { + dev_err!( + self.dev, + "DMA transfer start addresses must be a multiple of {}", + DMA_LEN + ); + return Err(EINVAL); + } + if load_offsets.len % DMA_LEN > 0 { + dev_err!( + self.dev, + "DMA transfer length must be a multiple of {}", + DMA_LEN + ); + return Err(EINVAL); + } + + // Set up the base source DMA address. + + regs::NV_PFALCON_FALCON_DMATRFBASE::default() + .set_base((dma_start >> 8) as u32) + .write(bar, E::BASE); + regs::NV_PFALCON_FALCON_DMATRFBASE1::default() + .set_base((dma_start >> 40) as u16) + .write(bar, E::BASE); + + let cmd = regs::NV_PFALCON_FALCON_DMATRFCMD::default() + .set_size(DmaTrfCmdSize::Size256B) + .set_imem(target_mem == FalconMem::Imem) + .set_sec(if sec { 1 } else { 0 }); + + for pos in (0..load_offsets.len).step_by(DMA_LEN as usize) { + // Perform a transfer of size `DMA_LEN`. + regs::NV_PFALCON_FALCON_DMATRFMOFFS::default() + .set_offs(load_offsets.dst_start + pos) + .write(bar, E::BASE); + regs::NV_PFALCON_FALCON_DMATRFFBOFFS::default() + .set_offs(src_start + pos) + .write(bar, E::BASE); + cmd.write(bar, E::BASE); + + // Wait for the transfer to complete. + util::wait_on(Duration::from_millis(2000), || { + let r = regs::NV_PFALCON_FALCON_DMATRFCMD::read(bar, E::BASE); + if r.idle() { + Some(()) + } else { + None + } + })?; + } + + Ok(()) + } + + /// Perform a DMA load into `IMEM` and `DMEM` of `fw`, and prepare the falcon to run it. + pub(crate) fn dma_load<F: FalconFirmware<Target = E>>(&self, bar: &Bar0, fw: &F) -> Result<()> { + let dma_handle = fw.dma_handle(); + + regs::NV_PFALCON_FBIF_CTL::alter(bar, E::BASE, |v| v.set_allow_phys_no_ctx(true)); + regs::NV_PFALCON_FALCON_DMACTL::default().write(bar, E::BASE); + regs::NV_PFALCON_FBIF_TRANSCFG::alter(bar, E::BASE, |v| { + v.set_target(FalconFbifTarget::CoherentSysmem) + .set_mem_type(FalconFbifMemType::Physical) + }); + + self.dma_wr(bar, dma_handle, FalconMem::Imem, fw.imem_load(), true)?; + self.dma_wr(bar, dma_handle, FalconMem::Dmem, fw.dmem_load(), true)?; + + self.hal.program_brom(self, bar, &fw.brom_params())?; + + // Set `BootVec` to start of non-secure code. + regs::NV_PFALCON_FALCON_BOOTVEC::default() + .set_value(fw.boot_addr()) + .write(bar, E::BASE); + + Ok(()) + } + + /// Start running the loaded firmware. + /// + /// `mbox0` and `mbox1` are optional parameters to write into the `MBOX0` and `MBOX1` registers + /// prior to running. + /// + /// Returns `MBOX0` and `MBOX1` after the firmware has stopped running. + pub(crate) fn boot( + &self, + bar: &Bar0, + mbox0: Option<u32>, + mbox1: Option<u32>, + ) -> Result<(u32, u32)> { + if let Some(mbox0) = mbox0 { + regs::NV_PFALCON_FALCON_MAILBOX0::default() + .set_value(mbox0) + .write(bar, E::BASE); + } + + if let Some(mbox1) = mbox1 { + regs::NV_PFALCON_FALCON_MAILBOX1::default() + .set_value(mbox1) + .write(bar, E::BASE); + } + + match regs::NV_PFALCON_FALCON_CPUCTL::read(bar, E::BASE).alias_en() { + true => regs::NV_PFALCON_FALCON_CPUCTL_ALIAS::default() + .set_startcpu(true) + .write(bar, E::BASE), + false => regs::NV_PFALCON_FALCON_CPUCTL::default() + .set_startcpu(true) + .write(bar, E::BASE), + } + + util::wait_on(Duration::from_secs(2), || { + let r = regs::NV_PFALCON_FALCON_CPUCTL::read(bar, E::BASE); + if r.halted() { + Some(()) + } else { + None + } + })?; + + let (mbox0, mbox1) = ( + regs::NV_PFALCON_FALCON_MAILBOX0::read(bar, E::BASE).value(), + regs::NV_PFALCON_FALCON_MAILBOX1::read(bar, E::BASE).value(), + ); + + Ok((mbox0, mbox1)) + } + + /// Returns the fused version of the signature to use in order to run a HS firmware on this + /// falcon instance. `engine_id_mask` and `ucode_id` are obtained from the firmware header. + pub(crate) fn get_signature_reg_fuse_version( + &self, + bar: &Bar0, + engine_id_mask: u16, + ucode_id: u8, + ) -> Result<u32> { + self.hal + .get_signature_reg_fuse_version(self, bar, engine_id_mask, ucode_id) + } +} diff --git a/drivers/gpu/nova-core/falcon/gsp.rs b/drivers/gpu/nova-core/falcon/gsp.rs new file mode 100644 index 0000000000000000000000000000000000000000..f74aeadaee9ae96bb1961d3c55b2cf1999943377 --- /dev/null +++ b/drivers/gpu/nova-core/falcon/gsp.rs @@ -0,0 +1,22 @@ +// SPDX-License-Identifier: GPL-2.0 + +use crate::{ + driver::Bar0, + falcon::{Falcon, FalconEngine}, + regs, +}; + +pub(crate) struct Gsp; +impl FalconEngine for Gsp { + const BASE: usize = 0x00110000; +} + +impl Falcon<Gsp> { + /// Clears the SWGEN0 bit in the Falcon's IRQ status clear register to + /// allow GSP to signal CPU for processing new messages in message queue. + pub(crate) fn clear_swgen0_intr(&self, bar: &Bar0) { + regs::NV_PFALCON_FALCON_IRQSCLR::default() + .set_swgen0(true) + .write(bar, Gsp::BASE); + } +} diff --git a/drivers/gpu/nova-core/falcon/hal.rs b/drivers/gpu/nova-core/falcon/hal.rs new file mode 100644 index 0000000000000000000000000000000000000000..0b0ab8174caafeec3a2a2ba8211a740b7feadb75 --- /dev/null +++ b/drivers/gpu/nova-core/falcon/hal.rs @@ -0,0 +1,56 @@ +// SPDX-License-Identifier: GPL-2.0 + +use kernel::prelude::*; +use kernel::sync::Arc; + +use crate::driver::Bar0; +use crate::falcon::{Falcon, FalconBromParams, FalconEngine}; +use crate::gpu::Chipset; + +mod ga102; + +/// Hardware Abstraction Layer for Falcon cores. +/// +/// Implements chipset-specific low-level operations. The trait is generic against [`FalconEngine`] +/// so its `BASE` parameter can be used in order to avoid runtime bound checks when accessing +/// registers. +pub(crate) trait FalconHal<E: FalconEngine>: Sync { + // Activates the Falcon core if the engine is a risvc/falcon dual engine. + fn select_core(&self, _falcon: &Falcon<E>, _bar: &Bar0) -> Result<()> { + Ok(()) + } + + /// Returns the fused version of the signature to use in order to run a HS firmware on this + /// falcon instance. `engine_id_mask` and `ucode_id` are obtained from the firmware header. + fn get_signature_reg_fuse_version( + &self, + falcon: &Falcon<E>, + bar: &Bar0, + engine_id_mask: u16, + ucode_id: u8, + ) -> Result<u32>; + + // Program the boot ROM registers prior to starting a secure firmware. + fn program_brom(&self, falcon: &Falcon<E>, bar: &Bar0, params: &FalconBromParams) + -> Result<()>; +} + +/// Returns a boxed falcon HAL adequate for the passed `chipset`. +/// +/// We use this function and a heap-allocated trait object instead of statically defined trait +/// objects because of the two-dimensional (Chipset, Engine) lookup required to return the +/// requested HAL. +/// +/// TODO: replace the return type with `KBox` once it gains the ability to host trait objects. +pub(crate) fn create_falcon_hal<E: FalconEngine + 'static>( + chipset: Chipset, +) -> Result<Arc<dyn FalconHal<E>>> { + let hal = match chipset { + Chipset::GA102 | Chipset::GA103 | Chipset::GA104 | Chipset::GA106 | Chipset::GA107 => { + Arc::new(ga102::Ga102::<E>::new(), GFP_KERNEL)? as Arc<dyn FalconHal<E>> + } + _ => return Err(ENOTSUPP), + }; + + Ok(hal) +} diff --git a/drivers/gpu/nova-core/falcon/hal/ga102.rs b/drivers/gpu/nova-core/falcon/hal/ga102.rs new file mode 100644 index 0000000000000000000000000000000000000000..68d807c192a16a908f40b65f9b03d107f4042b63 --- /dev/null +++ b/drivers/gpu/nova-core/falcon/hal/ga102.rs @@ -0,0 +1,120 @@ +// SPDX-License-Identifier: GPL-2.0 + +use core::marker::PhantomData; +use core::time::Duration; + +use kernel::{device, prelude::*}; + +use crate::driver::Bar0; +use crate::falcon::{ + Falcon, FalconBromParams, FalconEngine, FalconModSelAlgo, PeregrineCoreSelect, +}; +use crate::regs; +use crate::util; + +use super::FalconHal; + +fn select_core_ga102<E: FalconEngine>(bar: &Bar0) -> Result<()> { + let bcr_ctrl = regs::NV_PRISCV_RISCV_BCR_CTRL::read(bar, E::BASE); + if bcr_ctrl.core_select() != PeregrineCoreSelect::Falcon { + regs::NV_PRISCV_RISCV_BCR_CTRL::default() + .set_core_select(PeregrineCoreSelect::Falcon) + .write(bar, E::BASE); + + util::wait_on(Duration::from_millis(10), || { + let r = regs::NV_PRISCV_RISCV_BCR_CTRL::read(bar, E::BASE); + if r.valid() { + Some(()) + } else { + None + } + })?; + } + + Ok(()) +} + +fn get_signature_reg_fuse_version_ga102( + dev: &device::Device, + bar: &Bar0, + engine_id_mask: u16, + ucode_id: u8, +) -> Result<u32> { + // The ucode fuse versions are contained in the FUSE_OPT_FPF_<ENGINE>_UCODE<X>_VERSION + // registers, which are an array. Our register definition macros do not allow us to manage them + // properly, so we need to hardcode their addresses for now. + + // Each engine has 16 ucode version registers numbered from 1 to 16. + if ucode_id == 0 || ucode_id > 16 { + dev_warn!(dev, "invalid ucode id {:#x}", ucode_id); + return Err(EINVAL); + } + let reg_fuse = if engine_id_mask & 0x0001 != 0 { + // NV_FUSE_OPT_FPF_SEC2_UCODE1_VERSION + 0x824140 + } else if engine_id_mask & 0x0004 != 0 { + // NV_FUSE_OPT_FPF_NVDEC_UCODE1_VERSION + 0x824100 + } else if engine_id_mask & 0x0400 != 0 { + // NV_FUSE_OPT_FPF_GSP_UCODE1_VERSION + 0x8241c0 + } else { + dev_warn!(dev, "unexpected engine_id_mask {:#x}", engine_id_mask); + return Err(EINVAL); + } + ((ucode_id - 1) as usize * core::mem::size_of::<u32>()); + + let reg_fuse_version = bar.read32(reg_fuse); + + // Equivalent of Find Last Set bit. + Ok(u32::BITS - reg_fuse_version.leading_zeros()) +} + +fn program_brom_ga102<E: FalconEngine>(bar: &Bar0, params: &FalconBromParams) -> Result<()> { + regs::NV_PFALCON2_FALCON_BROM_PARAADDR::default() + .set_value(params.pkc_data_offset) + .write(bar, E::BASE); + regs::NV_PFALCON2_FALCON_BROM_ENGIDMASK::default() + .set_value(params.engine_id_mask as u32) + .write(bar, E::BASE); + regs::NV_PFALCON2_FALCON_BROM_CURR_UCODE_ID::default() + .set_ucode_id(params.ucode_id) + .write(bar, E::BASE); + regs::NV_PFALCON2_FALCON_MOD_SEL::default() + .set_algo(FalconModSelAlgo::Rsa3k) + .write(bar, E::BASE); + + Ok(()) +} + +pub(super) struct Ga102<E: FalconEngine>(PhantomData<E>); + +impl<E: FalconEngine> Ga102<E> { + pub(super) fn new() -> Self { + Self(PhantomData) + } +} + +impl<E: FalconEngine> FalconHal<E> for Ga102<E> { + fn select_core(&self, _falcon: &Falcon<E>, bar: &Bar0) -> Result<()> { + select_core_ga102::<E>(bar) + } + + fn get_signature_reg_fuse_version( + &self, + falcon: &Falcon<E>, + bar: &Bar0, + engine_id_mask: u16, + ucode_id: u8, + ) -> Result<u32> { + get_signature_reg_fuse_version_ga102(&falcon.dev, bar, engine_id_mask, ucode_id) + } + + fn program_brom( + &self, + _falcon: &Falcon<E>, + bar: &Bar0, + params: &FalconBromParams, + ) -> Result<()> { + program_brom_ga102::<E>(bar, params) + } +} diff --git a/drivers/gpu/nova-core/falcon/sec2.rs b/drivers/gpu/nova-core/falcon/sec2.rs new file mode 100644 index 0000000000000000000000000000000000000000..c1efdaa7c4e1b8c04c4e041aae3b61a8b65f656b --- /dev/null +++ b/drivers/gpu/nova-core/falcon/sec2.rs @@ -0,0 +1,8 @@ +// SPDX-License-Identifier: GPL-2.0 + +use crate::falcon::FalconEngine; + +pub(crate) struct Sec2; +impl FalconEngine for Sec2 { + const BASE: usize = 0x00840000; +} diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs index c338da69ecbc2200f1ef3061a4d62971b021e3eb..ece13594fba687f3f714e255b5436e72d80dece3 100644 --- a/drivers/gpu/nova-core/gpu.rs +++ b/drivers/gpu/nova-core/gpu.rs @@ -5,6 +5,7 @@ use crate::devinit; use crate::dma::DmaObject; use crate::driver::Bar0; +use crate::falcon::{gsp::Gsp, sec2::Sec2, Falcon}; use crate::firmware::Firmware; use crate::regs; use crate::util; @@ -227,6 +228,16 @@ pub(crate) fn new( page }; + let gsp_falcon = Falcon::<Gsp>::new( + pdev.as_ref(), + spec.chipset, + bar, + spec.chipset > Chipset::GA100, + )?; + gsp_falcon.clear_swgen0_intr(bar); + + let _sec2_falcon = Falcon::<Sec2>::new(pdev.as_ref(), spec.chipset, bar, true)?; + Ok(pin_init!(Self { spec, bar: devres_bar, diff --git a/drivers/gpu/nova-core/nova_core.rs b/drivers/gpu/nova-core/nova_core.rs index 1c7333e9fabe357f3ecbc6944ca98b66fa17c9a5..8342482a1aa16da2e69f7d99143c1549a82c969e 100644 --- a/drivers/gpu/nova-core/nova_core.rs +++ b/drivers/gpu/nova-core/nova_core.rs @@ -5,6 +5,7 @@ mod devinit; mod dma; mod driver; +mod falcon; mod firmware; mod gpu; mod regs; diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs index 218cb6441eb0e5c6e5b52eabba006163eec0c8b4..b5c6eeb6ed873a06b4aefcb375f4944eb0b20597 100644 --- a/drivers/gpu/nova-core/regs.rs +++ b/drivers/gpu/nova-core/regs.rs @@ -9,6 +9,10 @@ #[macro_use] mod macros; +use crate::falcon::{ + DmaTrfCmdSize, FalconCoreRev, FalconCoreRevSubversion, FalconFbifMemType, FalconFbifTarget, + FalconModSelAlgo, FalconSecurityModel, PeregrineCoreSelect, +}; use crate::gpu::{Architecture, Chipset}; /* PMC */ @@ -59,3 +63,124 @@ pub(crate) fn chipset(self) -> Result<Chipset, Error> { register!(NV_PGC6_AON_SECURE_SCRATCH_GROUP_05 @ 0x00118234 { 31:0 value as u32; }); + +/* PFALCON */ + +register!(NV_PFALCON_FALCON_IRQSCLR @ +0x00000004 { + 4:4 halt as bool; + 6:6 swgen0 as bool; +}); + +register!(NV_PFALCON_FALCON_MAILBOX0 @ +0x00000040 { + 31:0 value as u32; +}); + +register!(NV_PFALCON_FALCON_MAILBOX1 @ +0x00000044 { + 31:0 value as u32; +}); + +register!(NV_PFALCON_FALCON_RM @ +0x00000084 { + 31:0 value as u32; +}); + +register!(NV_PFALCON_FALCON_HWCFG2 @ +0x000000f4 { + 10:10 riscv as bool; + 12:12 mem_scrubbing as bool; + 31:31 reset_ready as bool, "Signal indicating that reset is completed (GA102+)"; +}); + +register!(NV_PFALCON_FALCON_CPUCTL @ +0x00000100 { + 1:1 startcpu as bool; + 4:4 halted as bool; + 6:6 alias_en as bool; +}); + +register!(NV_PFALCON_FALCON_BOOTVEC @ +0x00000104 { + 31:0 value as u32; +}); + +register!(NV_PFALCON_FALCON_DMACTL @ +0x0000010c { + 0:0 require_ctx as bool; + 1:1 dmem_scrubbing as bool; + 2:2 imem_scrubbing as bool; + 6:3 dmaq_num as u8; + 7:7 secure_stat as bool; +}); + +register!(NV_PFALCON_FALCON_DMATRFBASE @ +0x00000110 { + 31:0 base as u32; +}); + +register!(NV_PFALCON_FALCON_DMATRFMOFFS @ +0x00000114 { + 23:0 offs as u32; +}); + +register!(NV_PFALCON_FALCON_DMATRFCMD @ +0x00000118 { + 0:0 full as bool; + 1:1 idle as bool; + 3:2 sec as u8; + 4:4 imem as bool; + 5:5 is_write as bool; + 10:8 size as u8 ?=> DmaTrfCmdSize; + 14:12 ctxdma as u8; + 16:16 set_dmtag as u8; +}); + +register!(NV_PFALCON_FALCON_DMATRFFBOFFS @ +0x0000011c { + 31:0 offs as u32; +}); + +register!(NV_PFALCON_FALCON_DMATRFBASE1 @ +0x00000128 { + 8:0 base as u16; +}); + +register!(NV_PFALCON_FALCON_HWCFG1 @ +0x0000012c { + 3:0 core_rev as u8 ?=> FalconCoreRev, "Core revision"; + 5:4 security_model as u8 ?=> FalconSecurityModel, "Security model"; + 7:6 core_rev_subversion as u8 ?=> FalconCoreRevSubversion, "Core revision subversion"; +}); + +register!(NV_PFALCON_FALCON_CPUCTL_ALIAS @ +0x00000130 { + 1:1 startcpu as bool; +}); + +// Actually known as `NV_PSEC_FALCON_ENGINE` and `NV_PGSP_FALCON_ENGINE` depending on the falcon +// instance. +register!(NV_PFALCON_FALCON_ENGINE @ +0x000003c0 { + 0:0 reset as bool; +}); + +// TODO: this is an array of registers. +register!(NV_PFALCON_FBIF_TRANSCFG @ +0x00000600 { + 1:0 target as u8 ?=> FalconFbifTarget; + 2:2 mem_type as bool => FalconFbifMemType; +}); + +register!(NV_PFALCON_FBIF_CTL @ +0x00000624 { + 7:7 allow_phys_no_ctx as bool; +}); + +register!(NV_PFALCON2_FALCON_MOD_SEL @ +0x00001180 { + 7:0 algo as u8 ?=> FalconModSelAlgo; +}); + +register!(NV_PFALCON2_FALCON_BROM_CURR_UCODE_ID @ +0x00001198 { + 7:0 ucode_id as u8; +}); + +register!(NV_PFALCON2_FALCON_BROM_ENGIDMASK @ +0x0000119c { + 31:0 value as u32; +}); + +// TODO: this is an array of registers. +register!(NV_PFALCON2_FALCON_BROM_PARAADDR @ +0x00001210 { + 31:0 value as u32; +}); + +/* PRISCV */ + +register!(NV_PRISCV_RISCV_BCR_CTRL @ +0x00001668 { + 0:0 valid as bool; + 4:4 core_select as bool => PeregrineCoreSelect; + 8:8 br_fetch as bool; +}); diff --git a/drivers/gpu/nova-core/util.rs b/drivers/gpu/nova-core/util.rs index afb525228431a2645afe7bb34988e9537757b1d7..81fcfff1f6f437d2f6a2130ce2249fbf4c1501be 100644 --- a/drivers/gpu/nova-core/util.rs +++ b/drivers/gpu/nova-core/util.rs @@ -34,7 +34,6 @@ pub(crate) const fn const_bytes_to_str(bytes: &[u8]) -> &str { /// /// TODO: replace with `read_poll_timeout` once it is available. /// (https://lore.kernel.org/lkml/20250220070611.214262-8-fujita.tomonori at gmail.com/) -#[expect(dead_code)] pub(crate) fn wait_on<R, F: Fn() -> Option<R>>(timeout: Duration, cond: F) -> Result<R> { let start_time = Ktime::ktime_get(); -- 2.49.0
Alexandre Courbot
2025-May-07 13:52 UTC
[PATCH v3 14/19] gpu: nova-core: firmware: add ucode descriptor used by FWSEC-FRTS
FWSEC-FRTS is the first firmware we need to run on the GSP falcon in order to initiate the GSP boot process. Introduce the structure that describes it. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/nova-core/firmware.rs | 43 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 43 insertions(+) diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs index cb79d039948858e657c9a23a62ed27ff780ac169..1eb216307cd01d975b3d5beda1dc516f34b4b3f2 100644 --- a/drivers/gpu/nova-core/firmware.rs +++ b/drivers/gpu/nova-core/firmware.rs @@ -45,6 +45,49 @@ pub(crate) fn new( } } +/// Structure used to describe some firmwares, notably FWSEC-FRTS. +#[repr(C)] +#[derive(Debug, Clone)] +pub(crate) struct FalconUCodeDescV3 { + /// Header defined by `NV_BIT_FALCON_UCODE_DESC_HEADER_VDESC*` in OpenRM. + /// + /// Bits `31:16` contain the size of the header, after which the actual ucode data starts. + hdr: u32, + /// Stored size of the ucode after the header. + stored_size: u32, + /// Offset in `DMEM` at which the signature is expected to be found. + pub(crate) pkc_data_offset: u32, + /// Offset after the code segment at which the app headers are located. + pub(crate) interface_offset: u32, + /// Base address at which to load the code segment into `IMEM`. + pub(crate) imem_phys_base: u32, + /// Size in bytes of the code to copy into `IMEM`. + pub(crate) imem_load_size: u32, + /// Virtual `IMEM` address (i.e. `tag`) at which the code should start. + pub(crate) imem_virt_base: u32, + /// Base address at which to load the data segment into `DMEM`. + pub(crate) dmem_phys_base: u32, + /// Size in bytes of the data to copy into `DMEM`. + pub(crate) dmem_load_size: u32, + /// Mask of the falcon engines on which this firmware can run. + pub(crate) engine_id_mask: u16, + /// ID of the ucode used to infer a fuse register to validate the signature. + pub(crate) ucode_id: u8, + /// Number of signatures in this firmware. + pub(crate) signature_count: u8, + /// Versions of the signatures, used to infer a valid signature to use. + pub(crate) signature_versions: u16, + _reserved: u16, +} + +// To be removed once that code is used. +#[expect(dead_code)] +impl FalconUCodeDescV3 { + pub(crate) fn size(&self) -> usize { + ((self.hdr & 0xffff0000) >> 16) as usize + } +} + pub(crate) struct ModInfoBuilder<const N: usize>(firmware::ModInfoBuilder<N>); impl<const N: usize> ModInfoBuilder<N> { -- 2.49.0
Alexandre Courbot
2025-May-07 13:52 UTC
[PATCH v3 15/19] rust: num: Add an upward alignment helper for usize
From: Joel Fernandes <joelagnelf at nvidia.com> This will be used in the nova-core driver where we need to upward-align the image size to get to the next image in the VBIOS ROM. [acourbot at nvidia.com: handled conflicts due to removal of patch creating num.rs] Signed-off-by: Joel Fernandes <joelagnelf at nvidia.com> Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- rust/kernel/lib.rs | 1 + rust/kernel/num.rs | 44 ++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 45 insertions(+) diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs index ab0286857061d2de1be0279cbd2cd3490e5a48c3..be75b196aa7a29cf3eed7c902ed8fb98689bbb50 100644 --- a/rust/kernel/lib.rs +++ b/rust/kernel/lib.rs @@ -67,6 +67,7 @@ pub mod miscdevice; #[cfg(CONFIG_NET)] pub mod net; +pub mod num; pub mod of; pub mod page; #[cfg(CONFIG_PCI)] diff --git a/rust/kernel/num.rs b/rust/kernel/num.rs new file mode 100644 index 0000000000000000000000000000000000000000..7d69cbd957546ab75fcf677b851f102c3a5e5c39 --- /dev/null +++ b/rust/kernel/num.rs @@ -0,0 +1,44 @@ +// SPDX-License-Identifier: GPL-2.0 + +//! Numerical and binary utilities for primitive types. + +/// A trait providing alignment operations for `usize`. +use core::ops::{Add, BitAnd, BitOr, Not, Sub}; + +/// Traits for unsigned integers +pub trait Unsigned: + Copy + + BitAnd<Output = Self> + + BitOr<Output = Self> + + Not<Output = Self> + + Add<Output = Self> + + Sub<Output = Self> + + From<u8> +{ +} + +macro_rules! unsigned_trait_impl { + ($($t:ty),+) => { + $( + impl Unsigned for $t {} + )+ + }; +} +unsigned_trait_impl!(usize, u8, u16, u32, u64, u128); + +/// Trait for unsigned integer alignment +pub trait NumAlign { + /// Implement upward power-of-2 alignment for unsigned ints + fn align_up(self, alignment: Self) -> Self; +} + +impl<T> NumAlign for T +where + T: Unsigned, +{ + #[inline] + fn align_up(self, alignment: Self) -> Self { + let one = T::from(1u8); + (self + alignment - one) & !(alignment - one) + } +} -- 2.49.0
Alexandre Courbot
2025-May-07 13:52 UTC
[PATCH v3 16/19] nova-core: Add support for VBIOS ucode extraction for boot
From: Joel Fernandes <joelagnelf at nvidia.com> Add support for navigating and setting up vBIOS ucode data required for GSP to boot. The main data extracted from the vBIOS is the FWSEC-FRTS firmware which runs on the GSP processor. This firmware runs in high secure mode, and sets up the WPR2 (Write protected region) before the Booter runs on the SEC2 processor. Also add log messages to show the BIOS images. [102141.013287] NovaCore: Found BIOS image at offset 0x0, size: 0xfe00, type: PciAt [102141.080692] NovaCore: Found BIOS image at offset 0xfe00, size: 0x14800, type: Efi [102141.098443] NovaCore: Found BIOS image at offset 0x24600, size: 0x5600, type: FwSec [102141.415095] NovaCore: Found BIOS image at offset 0x29c00, size: 0x60800, type: FwSec Tested on my Ampere GA102 and boot is successful. [applied changes by Alex Courbot for fwsec signatures] [applied feedback from Alex Courbot and Timur Tabi] [applied changes related to code reorg, prints etc from Danilo Krummrich] [acourbot at nvidia.com: fix clippy warnings] [acourbot at nvidia.com: remove now-unneeded Devres acquisition] [acourbot at nvidia.com: fix read_more to read `len` bytes, not u32s] Cc: Alexandre Courbot <acourbot at nvidia.com> Cc: John Hubbard <jhubbard at nvidia.com> Cc: Shirish Baskaran <sbaskaran at nvidia.com> Cc: Alistair Popple <apopple at nvidia.com> Cc: Timur Tabi <ttabi at nvidia.com> Cc: Ben Skeggs <bskeggs at nvidia.com> Signed-off-by: Joel Fernandes <joelagnelf at nvidia.com> Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/nova-core/firmware.rs | 2 - drivers/gpu/nova-core/gpu.rs | 3 + drivers/gpu/nova-core/nova_core.rs | 1 + drivers/gpu/nova-core/vbios.rs | 1147 ++++++++++++++++++++++++++++++++++++ 4 files changed, 1151 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs index 1eb216307cd01d975b3d5beda1dc516f34b4b3f2..960982174d834c7c66a47ecfb3a15bf47116b2c5 100644 --- a/drivers/gpu/nova-core/firmware.rs +++ b/drivers/gpu/nova-core/firmware.rs @@ -80,8 +80,6 @@ pub(crate) struct FalconUCodeDescV3 { _reserved: u16, } -// To be removed once that code is used. -#[expect(dead_code)] impl FalconUCodeDescV3 { pub(crate) fn size(&self) -> usize { ((self.hdr & 0xffff0000) >> 16) as usize diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs index ece13594fba687f3f714e255b5436e72d80dece3..4bf7f72247e5320935a517270b5a0e1ec2becfec 100644 --- a/drivers/gpu/nova-core/gpu.rs +++ b/drivers/gpu/nova-core/gpu.rs @@ -9,6 +9,7 @@ use crate::firmware::Firmware; use crate::regs; use crate::util; +use crate::vbios::Vbios; use core::fmt; macro_rules! define_chipset { @@ -238,6 +239,8 @@ pub(crate) fn new( let _sec2_falcon = Falcon::<Sec2>::new(pdev.as_ref(), spec.chipset, bar, true)?; + let _bios = Vbios::new(pdev, bar)?; + Ok(pin_init!(Self { spec, bar: devres_bar, diff --git a/drivers/gpu/nova-core/nova_core.rs b/drivers/gpu/nova-core/nova_core.rs index 8342482a1aa16da2e69f7d99143c1549a82c969e..ff6d0b40c18f36af4c7e2d5c839fdf77dba23321 100644 --- a/drivers/gpu/nova-core/nova_core.rs +++ b/drivers/gpu/nova-core/nova_core.rs @@ -10,6 +10,7 @@ mod gpu; mod regs; mod util; +mod vbios; kernel::module_pci_driver! { type: driver::NovaCore, diff --git a/drivers/gpu/nova-core/vbios.rs b/drivers/gpu/nova-core/vbios.rs new file mode 100644 index 0000000000000000000000000000000000000000..cd55d8dbf8e12d532f776d7544c7e5f2a865d6f8 --- /dev/null +++ b/drivers/gpu/nova-core/vbios.rs @@ -0,0 +1,1147 @@ +// SPDX-License-Identifier: GPL-2.0 + +//! VBIOS extraction and parsing. + +// To be removed when all code is used. +#![expect(dead_code)] + +use crate::driver::Bar0; +use crate::firmware::FalconUCodeDescV3; +use core::convert::TryFrom; +use kernel::device; +use kernel::error::Result; +use kernel::num::NumAlign; +use kernel::pci; +use kernel::prelude::*; + +/// The offset of the VBIOS ROM in the BAR0 space. +const ROM_OFFSET: usize = 0x300000; +/// The maximum length of the VBIOS ROM to scan into. +const BIOS_MAX_SCAN_LEN: usize = 0x100000; +/// The size to read ahead when parsing initial BIOS image headers. +const BIOS_READ_AHEAD_SIZE: usize = 1024; +/// The bit in the last image indicator byte for the PCI Data Structure that +/// indicates the last image. Bit 0-6 are reserved, bit 7 is last image bit. +const LAST_IMAGE_BIT_MASK: u8 = 0x80; + +// PMU lookup table entry types. Used to locate PMU table entries +// in the Fwsec image, corresponding to falcon ucodes. +#[expect(dead_code)] +const FALCON_UCODE_ENTRY_APPID_FIRMWARE_SEC_LIC: u8 = 0x05; +#[expect(dead_code)] +const FALCON_UCODE_ENTRY_APPID_FWSEC_DBG: u8 = 0x45; +const FALCON_UCODE_ENTRY_APPID_FWSEC_PROD: u8 = 0x85; + +/// Vbios Reader for constructing the VBIOS data +struct VbiosIterator<'a> { + pdev: &'a pci::Device, + bar0: &'a Bar0, + // VBIOS data vector: As BIOS images are scanned, they are added to this vector + // for reference or copying into other data structures. It is the entire + // scanned contents of the VBIOS which progressively extends. It is used + // so that we do not re-read any contents that are already read as we use + // the cumulative length read so far, and re-read any gaps as we extend + // the length. + data: KVec<u8>, + current_offset: usize, // Current offset for iterator + last_found: bool, // Whether the last image has been found +} + +impl<'a> VbiosIterator<'a> { + fn new(pdev: &'a pci::Device, bar0: &'a Bar0) -> Result<Self> { + Ok(Self { + pdev, + bar0, + data: KVec::new(), + current_offset: 0, + last_found: false, + }) + } + + /// Read bytes from the ROM at the current end of the data vector + fn read_more(&mut self, len: usize) -> Result { + let current_len = self.data.len(); + let start = ROM_OFFSET + current_len; + + // Ensure length is a multiple of 4 for 32-bit reads + if len % core::mem::size_of::<u32>() != 0 { + dev_err!( + self.pdev.as_ref(), + "VBIOS read length {} is not a multiple of 4\n", + len + ); + return Err(EINVAL); + } + + self.data.reserve(len, GFP_KERNEL)?; + // Read ROM data bytes and push directly to vector + for i in (0..len).step_by(core::mem::size_of::<u32>()) { + // Read 32-bit word from the VBIOS ROM + let word = self.bar0.try_read32(start + i)?; + + // Convert the u32 to a 4 byte array and push each byte + word.to_ne_bytes() + .iter() + .try_for_each(|&b| self.data.push(b, GFP_KERNEL))?; + } + + Ok(()) + } + + /// Read bytes at a specific offset, filling any gap + fn read_more_at_offset(&mut self, offset: usize, len: usize) -> Result { + if offset > BIOS_MAX_SCAN_LEN { + dev_err!(self.pdev.as_ref(), "Error: exceeded BIOS scan limit.\n"); + return Err(EINVAL); + } + + // If offset is beyond current data size, fill the gap first + let current_len = self.data.len(); + let gap_bytes = offset.saturating_sub(current_len); + + // Now read the requested bytes at the offset + self.read_more(gap_bytes + len) + } + + /// Read a BIOS image at a specific offset and create a BiosImage from it. + /// self.data is extended as needed and a new BiosImage is returned. + /// @context is a string describing the operation for error reporting + fn read_bios_image_at_offset( + &mut self, + offset: usize, + len: usize, + context: &str, + ) -> Result<BiosImage> { + let data_len = self.data.len(); + if offset + len > data_len { + self.read_more_at_offset(offset, len).inspect_err(|e| { + dev_err!( + self.pdev.as_ref(), + "Failed to read more at offset {:#x}: {:?}\n", + offset, + e + ) + })?; + } + + BiosImage::new(self.pdev, &self.data[offset..offset + len]).inspect_err(|err| { + dev_err!( + self.pdev.as_ref(), + "Failed to {} at offset {:#x}: {:?}\n", + context, + offset, + err + ) + }) + } +} + +impl<'a> Iterator for VbiosIterator<'a> { + type Item = Result<BiosImage>; + + /// Iterate over all VBIOS images until the last image is detected or offset + /// exceeds scan limit. + fn next(&mut self) -> Option<Self::Item> { + if self.last_found { + return None; + } + + if self.current_offset > BIOS_MAX_SCAN_LEN { + dev_err!( + self.pdev.as_ref(), + "Error: exceeded BIOS scan limit, stopping scan\n" + ); + return None; + } + + // Parse image headers first to get image size + let image_size = match self + .read_bios_image_at_offset( + self.current_offset, + BIOS_READ_AHEAD_SIZE, + "parse initial BIOS image headers", + ) + .and_then(|image| image.image_size_bytes()) + { + Ok(size) => size, + Err(e) => return Some(Err(e)), + }; + + // Now create a new BiosImage with the full image data + let full_image = match self.read_bios_image_at_offset( + self.current_offset, + image_size, + "parse full BIOS image", + ) { + Ok(image) => image, + Err(e) => return Some(Err(e)), + }; + + self.last_found = full_image.is_last(); + + // Advance to next image (aligned to 512 bytes) + self.current_offset += image_size; + self.current_offset = self.current_offset.align_up(512); + + Some(Ok(full_image)) + } +} + +pub(crate) struct Vbios { + pub fwsec_image: Option<FwSecBiosImage>, +} + +impl Vbios { + /// Probe for VBIOS extraction + /// Once the VBIOS object is built, bar0 is not read for vbios purposes anymore. + pub(crate) fn new(pdev: &pci::Device, bar0: &Bar0) -> Result<Vbios> { + // Images to extract from iteration + let mut pci_at_image: Option<PciAtBiosImage> = None; + let mut first_fwsec_image: Option<FwSecBiosImage> = None; + let mut second_fwsec_image: Option<FwSecBiosImage> = None; + + // Parse all VBIOS images in the ROM + for image_result in VbiosIterator::new(pdev, bar0)? { + let full_image = image_result?; + + dev_info!( + pdev.as_ref(), + "Found BIOS image: size: {:#x}, type: {}, last: {}\n", + full_image.image_size_bytes()?, + full_image.image_type_str(), + full_image.is_last() + ); + + // Get references to images we will need after the loop, in order to + // setup the falcon data offset. + match full_image { + BiosImage::PciAt(image) => { + pci_at_image = Some(image); + } + BiosImage::FwSec(image) => { + if first_fwsec_image.is_none() { + first_fwsec_image = Some(image); + } else { + second_fwsec_image = Some(image); + } + } + // For now we don't need to handle these + BiosImage::Efi(_image) => {} + BiosImage::Nbsi(_image) => {} + } + } + + // Using all the images, setup the falcon data pointer in Fwsec. + // We need mutable access here, so we handle the Option manually. + let final_fwsec_image = { + let mut second = second_fwsec_image; // Take ownership of the option + + if let (Some(second), Some(first), Some(pci_at)) + (second.as_mut(), first_fwsec_image, pci_at_image) + { + second + .setup_falcon_data(pdev, &pci_at, &first) + .inspect_err(|e| { + dev_err!(pdev.as_ref(), "Falcon data setup failed: {:?}\n", e) + })?; + } else { + dev_err!( + pdev.as_ref(), + "Missing required images for falcon data setup, skipping\n" + ); + return Err(EINVAL); + } + second + }; + + Ok(Vbios { + fwsec_image: final_fwsec_image, + }) + } + + pub(crate) fn fwsec_header(&self, pdev: &device::Device) -> Result<&FalconUCodeDescV3> { + let image = self.fwsec_image.as_ref().ok_or(EINVAL)?; + image.fwsec_header(pdev) + } + + pub(crate) fn fwsec_ucode(&self, pdev: &device::Device) -> Result<&[u8]> { + let image = self.fwsec_image.as_ref().ok_or(EINVAL)?; + image.fwsec_ucode(pdev, image.fwsec_header(pdev)?) + } + + pub(crate) fn fwsec_sigs(&self, pdev: &device::Device) -> Result<&[u8]> { + let image = self.fwsec_image.as_ref().ok_or(EINVAL)?; + image.fwsec_sigs(pdev, image.fwsec_header(pdev)?) + } +} + +/// PCI Data Structure as defined in PCI Firmware Specification +#[derive(Debug, Clone)] +#[repr(C)] +struct PcirStruct { + /// PCI Data Structure signature ("PCIR" or "NPDS") + pub signature: [u8; 4], + /// PCI Vendor ID (e.g., 0x10DE for NVIDIA) + pub vendor_id: u16, + /// PCI Device ID + pub device_id: u16, + /// Device List Pointer + pub device_list_ptr: u16, + /// PCI Data Structure Length + pub pci_data_struct_len: u16, + /// PCI Data Structure Revision + pub pci_data_struct_rev: u8, + /// Class code (3 bytes, 0x03 for display controller) + pub class_code: [u8; 3], + /// Size of this image in 512-byte blocks + pub image_len: u16, + /// Revision Level of the Vendor's ROM + pub vendor_rom_rev: u16, + /// ROM image type (0x00 = PC-AT compatible, 0x03 = EFI, 0x70 = NBSI) + pub code_type: u8, + /// Last image indicator (0x00 = Not last image, 0x80 = Last image) + pub last_image: u8, + /// Maximum Run-time Image Length (units of 512 bytes) + pub max_runtime_image_len: u16, +} + +impl PcirStruct { + fn new(pdev: &pci::Device, data: &[u8]) -> Result<Self> { + if data.len() < core::mem::size_of::<PcirStruct>() { + dev_err!(pdev.as_ref(), "Not enough data for PcirStruct\n"); + return Err(EINVAL); + } + + let mut signature = [0u8; 4]; + signature.copy_from_slice(&data[0..4]); + + // Signature should be "PCIR" (0x52494350) or "NPDS" (0x5344504e) + if &signature != b"PCIR" && &signature != b"NPDS" { + dev_err!( + pdev.as_ref(), + "Invalid signature for PcirStruct: {:?}\n", + signature + ); + return Err(EINVAL); + } + + let mut class_code = [0u8; 3]; + class_code.copy_from_slice(&data[13..16]); + + Ok(PcirStruct { + signature, + vendor_id: u16::from_le_bytes([data[4], data[5]]), + device_id: u16::from_le_bytes([data[6], data[7]]), + device_list_ptr: u16::from_le_bytes([data[8], data[9]]), + pci_data_struct_len: u16::from_le_bytes([data[10], data[11]]), + pci_data_struct_rev: data[12], + class_code, + image_len: u16::from_le_bytes([data[16], data[17]]), + vendor_rom_rev: u16::from_le_bytes([data[18], data[19]]), + code_type: data[20], + last_image: data[21], + max_runtime_image_len: u16::from_le_bytes([data[22], data[23]]), + }) + } + + /// Check if this is the last image in the ROM + fn is_last(&self) -> bool { + self.last_image & LAST_IMAGE_BIT_MASK != 0 + } + + /// Calculate image size in bytes + fn image_size_bytes(&self) -> Result<usize> { + if self.image_len > 0 { + // Image size is in 512-byte blocks + Ok(self.image_len as usize * 512) + } else { + Err(EINVAL) + } + } +} + +/// BIOS Information Table (BIT) Header +/// This is the head of the BIT table, that is used to locate the Falcon data. +/// The BIT table (with its header) is in the PciAtBiosImage and the falcon data +/// it is pointing to is in the FwSecBiosImage. +#[derive(Debug, Clone, Copy)] +#[expect(dead_code)] +struct BitHeader { + /// 0h: BIT Header Identifier (BMP=0x7FFF/BIT=0xB8FF) + pub id: u16, + /// 2h: BIT Header Signature ("BIT\0") + pub signature: [u8; 4], + /// 6h: Binary Coded Decimal Version, ex: 0x0100 is 1.00. + pub bcd_version: u16, + /// 8h: Size of BIT Header (in bytes) + pub header_size: u8, + /// 9h: Size of BIT Tokens (in bytes) + pub token_size: u8, + /// 10h: Number of token entries that follow + pub token_entries: u8, + /// 11h: BIT Header Checksum + pub checksum: u8, +} + +impl BitHeader { + fn new(data: &[u8]) -> Result<Self> { + if data.len() < 12 { + return Err(EINVAL); + } + + let mut signature = [0u8; 4]; + signature.copy_from_slice(&data[2..6]); + + // Check header ID and signature + let id = u16::from_le_bytes([data[0], data[1]]); + if id != 0xB8FF || &signature != b"BIT\0" { + return Err(EINVAL); + } + + Ok(BitHeader { + id, + signature, + bcd_version: u16::from_le_bytes([data[6], data[7]]), + header_size: data[8], + token_size: data[9], + token_entries: data[10], + checksum: data[11], + }) + } +} + +/// BIT Token Entry: Records in the BIT table followed by the BIT header +#[derive(Debug, Clone, Copy)] +#[expect(dead_code)] +struct BitToken { + /// 00h: Token identifier + pub id: u8, + /// 01h: Version of the token data + pub data_version: u8, + /// 02h: Size of token data in bytes + pub data_size: u16, + /// 04h: Offset to the token data + pub data_offset: u16, +} + +// Define the token ID for the Falcon data +pub(in crate::vbios) const BIT_TOKEN_ID_FALCON_DATA: u8 = 0x70; + +impl BitToken { + /// Find a BIT token entry by BIT ID in a PciAtBiosImage + pub(in crate::vbios) fn from_id(image: &PciAtBiosImage, token_id: u8) -> Result<Self> { + let header = image.bit_header.as_ref().ok_or(EINVAL)?; + + // Offset to the first token entry + let tokens_start = image.bit_offset.ok_or(EINVAL)? + header.header_size as usize; + + for i in 0..header.token_entries as usize { + let entry_offset = tokens_start + (i * header.token_size as usize); + + // Make sure we don't go out of bounds + if entry_offset + header.token_size as usize > image.base.data.len() { + return Err(EINVAL); + } + + // Check if this token has the requested ID + if image.base.data[entry_offset] == token_id { + return Ok(BitToken { + id: image.base.data[entry_offset], + data_version: image.base.data[entry_offset + 1], + data_size: u16::from_le_bytes([ + image.base.data[entry_offset + 2], + image.base.data[entry_offset + 3], + ]), + data_offset: u16::from_le_bytes([ + image.base.data[entry_offset + 4], + image.base.data[entry_offset + 5], + ]), + }); + } + } + + // Token not found + Err(ENOENT) + } +} + +/// PCI ROM Expansion Header as defined in PCI Firmware Specification. +/// This is header is at the beginning of every image in the set of +/// images in the ROM. It contains a pointer to the PCI Data Structure +/// which describes the image. +/// For "NBSI" images (NoteBook System Information), the ROM +/// header deviates from the standard and contains an offset to the +/// NBSI image however we do not yet parse that in this module and keep +/// it for future reference. +#[derive(Debug, Clone, Copy)] +#[expect(dead_code)] +struct PciRomHeader { + /// 00h: Signature (0xAA55) + pub signature: u16, + /// 02h: Reserved bytes for processor architecture unique data (20 bytes) + pub reserved: [u8; 20], + /// 16h: NBSI Data Offset (NBSI-specific, offset from header to NBSI image) + pub nbsi_data_offset: Option<u16>, + /// 18h: Pointer to PCI Data Structure (offset from start of ROM image) + pub pci_data_struct_offset: u16, + /// 1Ah: Size of block (this is NBSI-specific) + pub size_of_block: Option<u32>, +} + +impl PciRomHeader { + fn new(pdev: &pci::Device, data: &[u8]) -> Result<Self> { + if data.len() < 26 { + // Need at least 26 bytes to read pciDataStrucPtr and sizeOfBlock + return Err(EINVAL); + } + + let signature = u16::from_le_bytes([data[0], data[1]]); + + // Check for valid ROM signatures + match signature { + 0xAA55 | 0xBB77 | 0x4E56 => {} + _ => { + dev_err!(pdev.as_ref(), "ROM signature unknown {:#x}\n", signature); + return Err(EINVAL); + } + } + + // Read the pointer to the PCI Data Structure at offset 0x18 + let pci_data_struct_ptr = u16::from_le_bytes([data[24], data[25]]); + + // Try to read optional fields if enough data + let mut size_of_block = None; + let mut nbsi_data_offset = None; + + if data.len() >= 30 { + // Read size_of_block at offset 0x1A + size_of_block = Some( + (data[29] as u32) << 24 + | (data[28] as u32) << 16 + | (data[27] as u32) << 8 + | (data[26] as u32), + ); + } + + // For NBSI images, try to read the nbsiDataOffset at offset 0x16 + if data.len() >= 24 { + nbsi_data_offset = Some(u16::from_le_bytes([data[22], data[23]])); + } + + Ok(PciRomHeader { + signature, + reserved: [0u8; 20], + pci_data_struct_offset: pci_data_struct_ptr, + size_of_block, + nbsi_data_offset, + }) + } +} + +/// NVIDIA PCI Data Extension Structure. This is similar to the +/// PCI Data Structure, but is Nvidia-specific and is placed right after +/// the PCI Data Structure. It contains some fields that are redundant +/// with the PCI Data Structure, but are needed for traversing the +/// BIOS images. It is expected to be present in all BIOS images except +/// for NBSI images. +#[derive(Debug, Clone)] +#[expect(dead_code)] +struct NpdeStruct { + /// 00h: Signature ("NPDE") + pub signature: [u8; 4], + /// 04h: NVIDIA PCI Data Extension Revision + pub npci_data_ext_rev: u16, + /// 06h: NVIDIA PCI Data Extension Length + pub npci_data_ext_len: u16, + /// 08h: Sub-image Length (in 512-byte units) + pub subimage_len: u16, + /// 0Ah: Last image indicator flag + pub last_image: u8, +} + +impl NpdeStruct { + fn new(pdev: &pci::Device, data: &[u8]) -> Result<Self> { + if data.len() < 11 { + dev_err!(pdev.as_ref(), "Not enough data for NpdeStruct\n"); + return Err(EINVAL); + } + + let mut signature = [0u8; 4]; + signature.copy_from_slice(&data[0..4]); + + // Signature should be "NPDE" (0x4544504E) + if &signature != b"NPDE" { + dev_err!( + pdev.as_ref(), + "Invalid signature for NpdeStruct: {:?}\n", + signature + ); + return Err(EINVAL); + } + + Ok(NpdeStruct { + signature, + npci_data_ext_rev: u16::from_le_bytes([data[4], data[5]]), + npci_data_ext_len: u16::from_le_bytes([data[6], data[7]]), + subimage_len: u16::from_le_bytes([data[8], data[9]]), + last_image: data[10], + }) + } + + /// Check if this is the last image in the ROM + fn is_last(&self) -> bool { + self.last_image & LAST_IMAGE_BIT_MASK != 0 + } + + /// Calculate image size in bytes + fn image_size_bytes(&self) -> Result<usize> { + if self.subimage_len > 0 { + // Image size is in 512-byte blocks + Ok(self.subimage_len as usize * 512) + } else { + Err(EINVAL) + } + } + + /// Try to find NPDE in the data, the NPDE is right after the PCIR. + fn find_in_data( + pdev: &pci::Device, + data: &[u8], + rom_header: &PciRomHeader, + pcir: &PcirStruct, + ) -> Option<Self> { + // Calculate the offset where NPDE might be located + // NPDE should be right after the PCIR structure, aligned to 16 bytes + let pcir_offset = rom_header.pci_data_struct_offset as usize; + let npde_start = (pcir_offset + pcir.pci_data_struct_len as usize + 0x0F) & !0x0F; + + // Check if we have enough data + if npde_start + 11 > data.len() { + dev_err!(pdev.as_ref(), "Not enough data for NPDE\n"); + return None; + } + + // Try to create NPDE from the data + NpdeStruct::new(pdev, &data[npde_start..]) + .inspect_err(|e| { + dev_err!(pdev.as_ref(), "Error creating NpdeStruct: {:?}\n", e); + }) + .ok() + } +} + +// Use a macro to implement BiosImage enum and methods. This avoids having to +// repeat each enum type when implementing functions like base() in BiosImage. +macro_rules! bios_image { + ( + $($variant:ident $class:ident),* $(,)? + ) => { + // BiosImage enum with variants for each image type + enum BiosImage { + $($variant($class)),* + } + + impl BiosImage { + /// Get a reference to the common BIOS image data regardless of type + fn base(&self) -> &BiosImageBase { + match self { + $(Self::$variant(img) => &img.base),* + } + } + + /// Returns a string representing the type of BIOS image + fn image_type_str(&self) -> &'static str { + match self { + $(Self::$variant(_) => stringify!($variant)),* + } + } + } + } +} + +impl BiosImage { + /// Check if this is the last image + fn is_last(&self) -> bool { + let base = self.base(); + + // For NBSI images (type == 0x70), return true as they're + // considered the last image + if matches!(self, Self::Nbsi(_)) { + return true; + } + + // For other image types, check the NPDE first if available + if let Some(ref npde) = base.npde { + return npde.is_last(); + } + + // Otherwise, fall back to checking the PCIR last_image flag + base.pcir.is_last() + } + + /// Get the image size in bytes + fn image_size_bytes(&self) -> Result<usize> { + let base = self.base(); + + // Prefer NPDE image size if available + if let Some(ref npde) = base.npde { + return npde.image_size_bytes(); + } + + // Otherwise, fall back to the PCIR image size + base.pcir.image_size_bytes() + } + + /// Create a BiosImageBase from a byte slice and convert it to a BiosImage + /// which triggers the constructor of the specific BiosImage enum variant. + fn new(pdev: &pci::Device, data: &[u8]) -> Result<Self> { + let base = BiosImageBase::new(pdev, data)?; + let image = base.into_image().inspect_err(|e| { + dev_err!(pdev.as_ref(), "Failed to create BiosImage: {:?}\n", e); + })?; + + image.image_size_bytes().inspect_err(|_| { + dev_err!( + pdev.as_ref(), + "Invalid image size computed during BiosImage creation\n" + ) + })?; + + Ok(image) + } +} + +bios_image! { + PciAt PciAtBiosImage, // PCI-AT compatible BIOS image + Efi EfiBiosImage, // EFI (Extensible Firmware Interface) + Nbsi NbsiBiosImage, // NBSI (Nvidia Bios System Interface) + FwSec FwSecBiosImage // FWSEC (Firmware Security) +} + +struct PciAtBiosImage { + base: BiosImageBase, + bit_header: Option<BitHeader>, + bit_offset: Option<usize>, +} + +struct EfiBiosImage { + base: BiosImageBase, + // EFI-specific fields can be added here in the future. +} + +struct NbsiBiosImage { + base: BiosImageBase, + // NBSI-specific fields can be added here in the future. +} + +pub(crate) struct FwSecBiosImage { + base: BiosImageBase, + // FWSEC-specific fields + // The offset of the Falcon data from the start of Fwsec image + falcon_data_offset: Option<usize>, + // The PmuLookupTable starts at the offset of the falcon data pointer + pmu_lookup_table: Option<PmuLookupTable>, + // The offset of the Falcon ucode + falcon_ucode_offset: Option<usize>, +} + +// Convert from BiosImageBase to BiosImage +impl TryFrom<BiosImageBase> for BiosImage { + type Error = Error; + + fn try_from(base: BiosImageBase) -> Result<Self> { + match base.pcir.code_type { + 0x00 => Ok(BiosImage::PciAt(base.try_into()?)), + 0x03 => Ok(BiosImage::Efi(EfiBiosImage { base })), + 0x70 => Ok(BiosImage::Nbsi(NbsiBiosImage { base })), + 0xE0 => Ok(BiosImage::FwSec(FwSecBiosImage { + base, + falcon_data_offset: None, + pmu_lookup_table: None, + falcon_ucode_offset: None, + })), + _ => Err(EINVAL), + } + } +} + +/// BIOS Image structure containing various headers and references +/// fields base to all BIOS images. Each BiosImage type has a +/// BiosImageBase type along with other image-specific fields. +/// Note that Rust favors composition of types over inheritance. +#[derive(Debug)] +#[expect(dead_code)] +struct BiosImageBase { + /// PCI ROM Expansion Header + rom_header: PciRomHeader, + /// PCI Data Structure + pcir: PcirStruct, + /// NVIDIA PCI Data Extension (optional) + npde: Option<NpdeStruct>, + /// Image data (includes ROM header and PCIR) + data: KVec<u8>, +} + +impl BiosImageBase { + fn into_image(self) -> Result<BiosImage> { + BiosImage::try_from(self) + } + + /// Creates a new BiosImageBase from raw byte data. + fn new(pdev: &pci::Device, data: &[u8]) -> Result<Self> { + // Ensure we have enough data for the ROM header + if data.len() < 26 { + dev_err!(pdev.as_ref(), "Not enough data for ROM header\n"); + return Err(EINVAL); + } + + // Parse the ROM header + let rom_header = PciRomHeader::new(pdev, &data[0..26]) + .inspect_err(|e| dev_err!(pdev.as_ref(), "Failed to create PciRomHeader: {:?}\n", e))?; + + // Get the PCI Data Structure using the pointer from the ROM header + let pcir_offset = rom_header.pci_data_struct_offset as usize; + let pcir_data = data + .get(pcir_offset..pcir_offset + core::mem::size_of::<PcirStruct>()) + .ok_or(EINVAL) + .inspect_err(|_| { + dev_err!( + pdev.as_ref(), + "PCIR offset {:#x} out of bounds (data length: {})\n", + pcir_offset, + data.len() + ); + dev_err!( + pdev.as_ref(), + "Consider reading more data for construction of BiosImage\n" + ); + })?; + + let pcir = PcirStruct::new(pdev, pcir_data) + .inspect_err(|e| dev_err!(pdev.as_ref(), "Failed to create PcirStruct: {:?}\n", e))?; + + // Look for NPDE structure if this is not an NBSI image (type != 0x70) + let npde = NpdeStruct::find_in_data(pdev, data, &rom_header, &pcir); + + // Create a copy of the data + let mut data_copy = KVec::new(); + data_copy.extend_with(data.len(), 0, GFP_KERNEL)?; + data_copy.copy_from_slice(data); + + Ok(BiosImageBase { + rom_header, + pcir, + npde, + data: data_copy, + }) + } +} + +/// The PciAt BIOS image is typically the first BIOS image type found in the +/// BIOS image chain. It contains the BIT header and the BIT tokens. +impl PciAtBiosImage { + /// Find a byte pattern in a slice + fn find_byte_pattern(haystack: &[u8], needle: &[u8]) -> Option<usize> { + haystack + .windows(needle.len()) + .position(|window| window == needle) + } + + /// Find the BIT header in the PciAtBiosImage + fn find_bit_header(data: &[u8]) -> Result<(BitHeader, usize)> { + let bit_pattern = [0xff, 0xb8, b'B', b'I', b'T', 0x00]; + let bit_offset = Self::find_byte_pattern(data, &bit_pattern); + if bit_offset.is_none() { + return Err(EINVAL); + } + + let bit_header = BitHeader::new(&data[bit_offset.ok_or(EINVAL)?..])?; + Ok((bit_header, bit_offset.ok_or(EINVAL)?)) + } + + /// Get a BIT token entry from the BIT table in the PciAtBiosImage + fn get_bit_token(&self, token_id: u8) -> Result<BitToken> { + BitToken::from_id(self, token_id) + } + + /// Find the Falcon data pointer structure in the PciAtBiosImage + /// This is just a 4 byte structure that contains a pointer to the + /// Falcon data in the FWSEC image. + fn falcon_data_ptr(&self, pdev: &pci::Device) -> Result<u32> { + let token = self.get_bit_token(BIT_TOKEN_ID_FALCON_DATA)?; + + // Make sure we don't go out of bounds + if token.data_offset as usize + 4 > self.base.data.len() { + return Err(EINVAL); + } + + // read the 4 bytes at the offset specified in the token + let offset = token.data_offset as usize; + let bytes: [u8; 4] = self.base.data[offset..offset + 4].try_into().map_err(|_| { + dev_err!(pdev.as_ref(), "Failed to convert data slice to array"); + EINVAL + })?; + + let data_ptr = u32::from_le_bytes(bytes); + + if (data_ptr as usize) < self.base.data.len() { + dev_err!(pdev.as_ref(), "Falcon data pointer out of bounds\n"); + return Err(EINVAL); + } + + Ok(data_ptr) + } +} + +impl TryFrom<BiosImageBase> for PciAtBiosImage { + type Error = Error; + + fn try_from(base: BiosImageBase) -> Result<Self> { + let data_slice = &base.data; + let (bit_header, bit_offset) = PciAtBiosImage::find_bit_header(data_slice)?; + + Ok(PciAtBiosImage { + base, + bit_header: Some(bit_header), + bit_offset: Some(bit_offset), + }) + } +} + +/// The PmuLookupTableEntry structure is a single entry in the PmuLookupTable. +/// See the PmuLookupTable description for more information. +#[expect(dead_code)] +struct PmuLookupTableEntry { + application_id: u8, + target_id: u8, + data: u32, +} + +impl PmuLookupTableEntry { + fn new(data: &[u8]) -> Result<Self> { + if data.len() < 5 { + return Err(EINVAL); + } + + Ok(PmuLookupTableEntry { + application_id: data[0], + target_id: data[1], + data: u32::from_le_bytes(data[2..6].try_into().map_err(|_| EINVAL)?), + }) + } +} + +/// The PmuLookupTableEntry structure is used to find the PmuLookupTableEntry +/// for a given application ID. The table of entries is pointed to by the falcon +/// data pointer in the BIT table, and is used to locate the Falcon Ucode. +#[expect(dead_code)] +struct PmuLookupTable { + version: u8, + header_len: u8, + entry_len: u8, + entry_count: u8, + table_data: KVec<u8>, +} + +impl PmuLookupTable { + fn new(data: &[u8]) -> Result<Self> { + if data.len() < 4 { + return Err(EINVAL); + } + + let header_len = data[1] as usize; + let entry_len = data[2] as usize; + let entry_count = data[3] as usize; + + let required_bytes = header_len + (entry_count * entry_len); + + if data.len() < required_bytes { + return Err(EINVAL); + } + + // Create a copy of only the table data + let mut table_data = KVec::new(); + + // "last_entry_bytes" is a debugging aid. + let mut last_entry_bytes: Option<KVec<u8>> = if cfg!(debug_assertions) { + Some(KVec::new()) + } else { + None + }; + + for &byte in &data[header_len..required_bytes] { + table_data.push(byte, GFP_KERNEL)?; + + if cfg!(debug_assertions) { + // Debugging (dumps the table data to dmesg): + if let Some(ref mut last_entry_bytes) = last_entry_bytes { + last_entry_bytes.push(byte, GFP_KERNEL)?; + + let last_entry_bytes_len = last_entry_bytes.len(); + if last_entry_bytes_len == entry_len { + pr_info!("Last entry bytes: {:02x?}\n", &last_entry_bytes[..]); + *last_entry_bytes = KVec::new(); + } + } + } + } + + Ok(PmuLookupTable { + version: data[0], + header_len: header_len as u8, + entry_len: entry_len as u8, + entry_count: entry_count as u8, + table_data, + }) + } + + fn lookup_index(&self, idx: u8) -> Result<PmuLookupTableEntry> { + if idx >= self.entry_count { + return Err(EINVAL); + } + + let index = (idx as usize) * self.entry_len as usize; + PmuLookupTableEntry::new(&self.table_data[index..]) + } + + // find entry by type value + fn find_entry_by_type(&self, entry_type: u8) -> Result<PmuLookupTableEntry> { + for i in 0..self.entry_count { + let entry = self.lookup_index(i)?; + if entry.application_id == entry_type { + return Ok(entry); + } + } + + Err(EINVAL) + } +} + +/// The FwSecBiosImage structure contains the PMU table and the Falcon Ucode. +/// The PMU table contains voltage/frequency tables as well as a pointer to the +/// Falcon Ucode. +impl FwSecBiosImage { + fn setup_falcon_data( + &mut self, + pdev: &pci::Device, + pci_at_image: &PciAtBiosImage, + first_fwsec_image: &FwSecBiosImage, + ) -> Result<()> { + let mut offset = pci_at_image.falcon_data_ptr(pdev)? as usize; + + // The falcon data pointer assumes that the PciAt and FWSEC images + // are contiguous in memory. However, testing shows the EFI image sits in + // between them. So calculate the offset from the end of the PciAt image + // rather than the start of it. Compensate. + offset -= pci_at_image.base.data.len(); + + // The offset is now from the start of the first Fwsec image, however + // the offset points to a location in the second Fwsec image. Since + // the fwsec images are contiguous, subtract the length of the first Fwsec + // image from the offset to get the offset to the start of the second + // Fwsec image. + offset -= first_fwsec_image.base.data.len(); + + self.falcon_data_offset = Some(offset); + + // The PmuLookupTable starts at the offset of the falcon data pointer + self.pmu_lookup_table = Some(PmuLookupTable::new(&self.base.data[offset..])?); + + match self + .pmu_lookup_table + .as_ref() + .ok_or(EINVAL)? + .find_entry_by_type(FALCON_UCODE_ENTRY_APPID_FWSEC_PROD) + { + Ok(entry) => { + let mut ucode_offset = entry.data as usize; + ucode_offset -= pci_at_image.base.data.len(); + ucode_offset -= first_fwsec_image.base.data.len(); + self.falcon_ucode_offset = Some(ucode_offset); + if cfg!(debug_assertions) { + // Print the v3_desc header for debugging + let v3_desc = self.fwsec_header(pdev.as_ref())?; + pr_info!("PmuLookupTableEntry v3_desc: {:#?}\n", v3_desc); + } + } + Err(e) => { + dev_err!( + pdev.as_ref(), + "PmuLookupTableEntry not found, error: {:?}\n", + e + ); + } + } + Ok(()) + } + + /// TODO: These were borrowed from the old code for integrating this module + /// with the outside world. They should be cleaned up and integrated properly. + /// + /// Get the FwSec header (FalconUCodeDescV3) + fn fwsec_header(&self, dev: &device::Device) -> Result<&FalconUCodeDescV3> { + // Get the falcon ucode offset that was found in setup_falcon_data + let falcon_ucode_offset = self.falcon_ucode_offset.ok_or(EINVAL)?; + + // Make sure the offset is within the data bounds + if falcon_ucode_offset + core::mem::size_of::<FalconUCodeDescV3>() > self.base.data.len() { + dev_err!(dev, "fwsec-frts header not contained within BIOS bounds\n"); + return Err(ERANGE); + } + + // Read the first 4 bytes to get the version + let hdr_bytes: [u8; 4] = self.base.data[falcon_ucode_offset..falcon_ucode_offset + 4] + .try_into() + .map_err(|_| EINVAL)?; + let hdr = u32::from_le_bytes(hdr_bytes); + let ver = (hdr & 0xff00) >> 8; + + if ver != 3 { + dev_err!(dev, "invalid fwsec firmware version\n"); + return Err(EINVAL); + } + + // Return a reference to the FalconUCodeDescV3 structure SAFETY: we have checked that + // `falcon_ucode_offset + size_of::<FalconUCodeDescV3` is within the bounds of `data.` + Ok(unsafe { + &*(self.base.data.as_ptr().add(falcon_ucode_offset) as *const FalconUCodeDescV3) + }) + } + /// Get the ucode data as a byte slice + fn fwsec_ucode(&self, dev: &device::Device, v3_desc: &FalconUCodeDescV3) -> Result<&[u8]> { + let falcon_ucode_offset = self.falcon_ucode_offset.ok_or(EINVAL)?; + + // The ucode data follows the descriptor + let ucode_data_offset = falcon_ucode_offset + v3_desc.size(); + let size = (v3_desc.imem_load_size + v3_desc.dmem_load_size) as usize; + + // Get the data slice, checking bounds in a single operation + self.base + .data + .get(ucode_data_offset..ucode_data_offset + size) + .ok_or(ERANGE) + .inspect_err(|_| dev_err!(dev, "fwsec ucode data not contained within BIOS bounds\n")) + } + + /// Get the signatures as a byte slice + fn fwsec_sigs(&self, dev: &device::Device, v3_desc: &FalconUCodeDescV3) -> Result<&[u8]> { + const SIG_SIZE: usize = 96 * 4; + + let falcon_ucode_offset = self.falcon_ucode_offset.ok_or(EINVAL)?; + + // The signatures data follows the descriptor + let sigs_data_offset = falcon_ucode_offset + core::mem::size_of::<FalconUCodeDescV3>(); + let size = v3_desc.signature_count as usize * SIG_SIZE; + + // Make sure the data is within bounds + if sigs_data_offset + size > self.base.data.len() { + dev_err!( + dev, + "fwsec signatures data not contained within BIOS bounds\n" + ); + return Err(ERANGE); + } + + Ok(&self.base.data[sigs_data_offset..sigs_data_offset + size]) + } +} -- 2.49.0
Alexandre Courbot
2025-May-07 13:52 UTC
[PATCH v3 17/19] gpu: nova-core: compute layout of the FRTS region
FWSEC-FRTS is run with the desired address of the FRTS region as parameter, which we need to compute depending on some hardware parameters. Do this in a `FbLayout` structure, that will be later extended to describe more memory regions used to boot the GSP. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/nova-core/gpu.rs | 4 ++ drivers/gpu/nova-core/gsp.rs | 3 ++ drivers/gpu/nova-core/gsp/fb.rs | 108 +++++++++++++++++++++++++++++++++++++ drivers/gpu/nova-core/nova_core.rs | 1 + drivers/gpu/nova-core/regs.rs | 27 ++++++++++ 5 files changed, 143 insertions(+) diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs index 4bf7f72247e5320935a517270b5a0e1ec2becfec..a3d96639706e808305cce66416778d2bf6e7e683 100644 --- a/drivers/gpu/nova-core/gpu.rs +++ b/drivers/gpu/nova-core/gpu.rs @@ -7,6 +7,7 @@ use crate::driver::Bar0; use crate::falcon::{gsp::Gsp, sec2::Sec2, Falcon}; use crate::firmware::Firmware; +use crate::gsp::fb::FbLayout; use crate::regs; use crate::util; use crate::vbios::Vbios; @@ -239,6 +240,9 @@ pub(crate) fn new( let _sec2_falcon = Falcon::<Sec2>::new(pdev.as_ref(), spec.chipset, bar, true)?; + let fb_layout = FbLayout::new(spec.chipset, bar)?; + dev_dbg!(pdev.as_ref(), "{:#x?}\n", fb_layout); + let _bios = Vbios::new(pdev, bar)?; Ok(pin_init!(Self { diff --git a/drivers/gpu/nova-core/gsp.rs b/drivers/gpu/nova-core/gsp.rs new file mode 100644 index 0000000000000000000000000000000000000000..27616a9d2b7069b18661fc97811fa1cac285b8f8 --- /dev/null +++ b/drivers/gpu/nova-core/gsp.rs @@ -0,0 +1,3 @@ +// SPDX-License-Identifier: GPL-2.0 + +pub(crate) mod fb; diff --git a/drivers/gpu/nova-core/gsp/fb.rs b/drivers/gpu/nova-core/gsp/fb.rs new file mode 100644 index 0000000000000000000000000000000000000000..f28ded59469d52daf39e5d19c09efd7bf08fee92 --- /dev/null +++ b/drivers/gpu/nova-core/gsp/fb.rs @@ -0,0 +1,108 @@ +// SPDX-License-Identifier: GPL-2.0 + +use core::ops::Range; + +use kernel::prelude::*; + +use crate::driver::Bar0; +use crate::gpu::Chipset; +use crate::regs; + +fn align_down(value: u64, align: u64) -> u64 { + value & !(align - 1) +} + +/// Layout of the GPU framebuffer memory. +/// +/// Contains ranges of GPU memory reserved for a given purpose during the GSP bootup process. +#[derive(Debug)] +#[expect(dead_code)] +pub(crate) struct FbLayout { + pub fb: Range<u64>, + + pub vga_workspace: Range<u64>, + pub bios: Range<u64>, + + pub frts: Range<u64>, +} + +impl FbLayout { + pub(crate) fn new(chipset: Chipset, bar: &Bar0) -> Result<Self> { + let fb = { + let fb_size = vidmem_size(bar, chipset); + + 0..fb_size + }; + let fb_len = fb.end - fb.start; + + let vga_workspace = { + let vga_base = vga_workspace_addr(bar, fb_len, chipset); + + vga_base..fb.end + }; + + let bios = vga_workspace.clone(); + + let frts = { + const FRTS_DOWN_ALIGN: u64 = 0x20000; + const FRTS_SIZE: u64 = 0x100000; + let frts_base = align_down(vga_workspace.start, FRTS_DOWN_ALIGN) - FRTS_SIZE; + + frts_base..frts_base + FRTS_SIZE + }; + + Ok(Self { + fb, + vga_workspace, + bios, + frts, + }) + } +} + +/// Returns `true` if the display is disabled. +fn display_disabled(bar: &Bar0, chipset: Chipset) -> bool { + if chipset >= Chipset::GA100 { + regs::NV_FUSE_STATUS_OPT_DISPLAY_MAXWELL::read(bar).display_disabled() + } else { + regs::NV_FUSE_STATUS_OPT_DISPLAY_AMPERE::read(bar).display_disabled() + } +} + +/// Returns the video memory size in bytes. +fn vidmem_size(bar: &Bar0, chipset: Chipset) -> u64 { + if chipset >= Chipset::GA102 { + (regs::NV_PGC6_AON_SECURE_SCRATCH_GROUP_42::read(bar).value() as u64) << 20 + } else { + let local_mem_range = regs::NV_PFB_PRI_MMU_LOCAL_MEMORY_RANGE::read(bar); + let size + (local_mem_range.lower_mag() as u64) << ((local_mem_range.lower_scale() as u64) + 20); + + if local_mem_range.ecc_mode_enabled() { + size / 16 * 15 + } else { + size + } + } +} + +/// Returns the vga workspace address. +fn vga_workspace_addr(bar: &Bar0, fb_size: u64, chipset: Chipset) -> u64 { + let base = fb_size - 0x100000; + let vga_workspace_base = if display_disabled(bar, chipset) { + regs::NV_PDISP_VGA_WORKSPACE_BASE::read(bar) + } else { + return base; + }; + + if !vga_workspace_base.status_valid() { + return base; + } + + let addr = (vga_workspace_base.addr() as u64) << 16; + if addr < base { + fb_size - 0x20000 + } else { + addr + } +} diff --git a/drivers/gpu/nova-core/nova_core.rs b/drivers/gpu/nova-core/nova_core.rs index ff6d0b40c18f36af4c7e2d5c839fdf77dba23321..202e978e56f024de3ae8b178e65b63c2cea244e1 100644 --- a/drivers/gpu/nova-core/nova_core.rs +++ b/drivers/gpu/nova-core/nova_core.rs @@ -8,6 +8,7 @@ mod falcon; mod firmware; mod gpu; +mod gsp; mod regs; mod util; mod vbios; diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs index b5c6eeb6ed873a06b4aefcb375f4944eb0b20597..15ec9b7e69694ff198b5353d562fc1aff5eefd3f 100644 --- a/drivers/gpu/nova-core/regs.rs +++ b/drivers/gpu/nova-core/regs.rs @@ -53,6 +53,12 @@ pub(crate) fn chipset(self) -> Result<Chipset, Error> { 23:0 adr_63_40 as u32; }); +register!(NV_PFB_PRI_MMU_LOCAL_MEMORY_RANGE @ 0x00100ce0 { + 3:0 lower_scale as u8; + 9:4 lower_mag as u8; + 30:30 ecc_mode_enabled as bool; +}); + /* PGC6 */ register!(NV_PGC6_AON_SECURE_SCRATCH_GROUP_05_PRIV_LEVEL_MASK @ 0x00118128 { @@ -64,6 +70,27 @@ pub(crate) fn chipset(self) -> Result<Chipset, Error> { 31:0 value as u32; }); +register!(NV_PGC6_AON_SECURE_SCRATCH_GROUP_42 @ 0x001183a4 { + 31:0 value as u32; +}); + +/* PDISP */ + +register!(NV_PDISP_VGA_WORKSPACE_BASE @ 0x00625f04 { + 3:3 status_valid as bool; + 31:8 addr as u32; +}); + +/* FUSE */ + +register!(NV_FUSE_STATUS_OPT_DISPLAY_MAXWELL @ 0x00021c04 { + 0:0 display_disabled as bool; +}); + +register!(NV_FUSE_STATUS_OPT_DISPLAY_AMPERE @ 0x00820c04 { + 0:0 display_disabled as bool; +}); + /* PFALCON */ register!(NV_PFALCON_FALCON_IRQSCLR @ +0x00000004 { -- 2.49.0
Alexandre Courbot
2025-May-07 13:52 UTC
[PATCH v3 18/19] gpu: nova-core: extract FWSEC from BIOS and patch it to run FWSEC-FRTS
The FWSEC firmware needs to be extracted from the VBIOS and patched with the desired command, as well as the right signature. Do this so we are ready to load and run this firmware into the GSP falcon and create the FRTS region. [joelagnelf at nvidia.com: give better names to FalconAppifHdrV1's fields] Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/nova-core/dma.rs | 3 - drivers/gpu/nova-core/firmware.rs | 18 ++ drivers/gpu/nova-core/firmware/fwsec.rs | 359 ++++++++++++++++++++++++++++++++ drivers/gpu/nova-core/gpu.rs | 20 +- drivers/gpu/nova-core/vbios.rs | 3 - 5 files changed, 395 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/nova-core/dma.rs b/drivers/gpu/nova-core/dma.rs index 9d90ae01d0044eaab4ddbc3eba216741d7a623ef..a12d0dff574aa38fb5eb8f4d759611af2f8ba3ec 100644 --- a/drivers/gpu/nova-core/dma.rs +++ b/drivers/gpu/nova-core/dma.rs @@ -2,9 +2,6 @@ //! Simple DMA object wrapper. -// To be removed when all code is used. -#![expect(dead_code)] - use core::ops::{Deref, DerefMut}; use kernel::device; diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs index 960982174d834c7c66a47ecfb3a15bf47116b2c5..3945fd18499555ddd6fb2e0ea69535b40fcc4b08 100644 --- a/drivers/gpu/nova-core/firmware.rs +++ b/drivers/gpu/nova-core/firmware.rs @@ -8,9 +8,12 @@ use kernel::prelude::*; use kernel::str::CString; +use crate::dma::DmaObject; use crate::gpu; use crate::gpu::Chipset; +pub(crate) mod fwsec; + pub(crate) const FIRMWARE_VERSION: &str = "535.113.01"; /// Structure encapsulating the firmware blobs required for the GPU to operate. @@ -86,6 +89,21 @@ pub(crate) fn size(&self) -> usize { } } +/// Patch the `ucode_dma` firmware at offset `sig_base_img` with `signature`. +fn patch_signature(ucode_dma: &mut DmaObject, signature: &[u8], sig_base_img: usize) -> Result<()> { + if sig_base_img + signature.len() > ucode_dma.size() { + return Err(EINVAL); + } + + // SAFETY: we are the only user of this object, so there cannot be any race. + let dst = unsafe { ucode_dma.start_ptr_mut().add(sig_base_img) }; + + // SAFETY: `signature` and `dst` are valid, properly aligned, and do not overlap. + unsafe { core::ptr::copy_nonoverlapping(signature.as_ptr(), dst, signature.len()) }; + + Ok(()) +} + pub(crate) struct ModInfoBuilder<const N: usize>(firmware::ModInfoBuilder<N>); impl<const N: usize> ModInfoBuilder<N> { diff --git a/drivers/gpu/nova-core/firmware/fwsec.rs b/drivers/gpu/nova-core/firmware/fwsec.rs new file mode 100644 index 0000000000000000000000000000000000000000..d7e3418b49ab4ad478e73c0d447a832ed68ecd0a --- /dev/null +++ b/drivers/gpu/nova-core/firmware/fwsec.rs @@ -0,0 +1,359 @@ +// SPDX-License-Identifier: GPL-2.0 + +//! FWSEC is a High Secure firmware that is extracted from the BIOS and performs the first step of +//! the GSP startup by creating the WPR2 memory region and copying critical areas of the VBIOS into +//! it after authenticating them, ensuring they haven't been tampered with. It runs on the GSP +//! falcon. +//! +//! Before being run, it needs to be patched in two areas: +//! +//! - The command to be run, as this firmware can perform several tasks ; +//! - The ucode signature, so the GSP falcon can run FWSEC in HS mode. + +use core::alloc::Layout; + +use kernel::bindings; +use kernel::device::{self, Device}; +use kernel::prelude::*; +use kernel::transmute::FromBytes; + +use crate::dma::DmaObject; +use crate::driver::Bar0; +use crate::falcon::gsp::Gsp; +use crate::falcon::{Falcon, FalconBromParams, FalconFirmware, FalconLoadTarget}; +use crate::firmware::FalconUCodeDescV3; +use crate::vbios::Vbios; + +const NVFW_FALCON_APPIF_ID_DMEMMAPPER: u32 = 0x4; + +#[repr(C)] +#[derive(Debug)] +struct FalconAppifHdrV1 { + version: u8, + header_size: u8, + entry_size: u8, + entry_count: u8, +} +// SAFETY: any byte sequence is valid for this struct. +unsafe impl FromBytes for FalconAppifHdrV1 {} + +#[repr(C, packed)] +#[derive(Debug)] +struct FalconAppifV1 { + id: u32, + dmem_base: u32, +} +// SAFETY: any byte sequence is valid for this struct. +unsafe impl FromBytes for FalconAppifV1 {} + +#[derive(Debug)] +#[repr(C, packed)] +struct FalconAppifDmemmapperV3 { + signature: u32, + version: u16, + size: u16, + cmd_in_buffer_offset: u32, + cmd_in_buffer_size: u32, + cmd_out_buffer_offset: u32, + cmd_out_buffer_size: u32, + nvf_img_data_buffer_offset: u32, + nvf_img_data_buffer_size: u32, + printf_buffer_hdr: u32, + ucode_build_time_stamp: u32, + ucode_signature: u32, + init_cmd: u32, + ucode_feature: u32, + ucode_cmd_mask0: u32, + ucode_cmd_mask1: u32, + multi_tgt_tbl: u32, +} +// SAFETY: any byte sequence is valid for this struct. +unsafe impl FromBytes for FalconAppifDmemmapperV3 {} + +#[derive(Debug)] +#[repr(C, packed)] +struct ReadVbios { + ver: u32, + hdr: u32, + addr: u64, + size: u32, + flags: u32, +} +// SAFETY: any byte sequence is valid for this struct. +unsafe impl FromBytes for ReadVbios {} + +#[derive(Debug)] +#[repr(C, packed)] +struct FrtsRegion { + ver: u32, + hdr: u32, + addr: u32, + size: u32, + ftype: u32, +} +// SAFETY: any byte sequence is valid for this struct. +unsafe impl FromBytes for FrtsRegion {} + +const NVFW_FRTS_CMD_REGION_TYPE_FB: u32 = 2; + +#[repr(C, packed)] +struct FrtsCmd { + read_vbios: ReadVbios, + frts_region: FrtsRegion, +} +// SAFETY: any byte sequence is valid for this struct. +unsafe impl FromBytes for FrtsCmd {} + +const NVFW_FALCON_APPIF_DMEMMAPPER_CMD_FRTS: u32 = 0x15; +const NVFW_FALCON_APPIF_DMEMMAPPER_CMD_SB: u32 = 0x19; + +/// Command for the [`FwsecFirmware`] to execute. +pub(crate) enum FwsecCommand { + /// Asks [`FwsecFirmware`] to carve out the WPR2 area and place a verified copy of the VBIOS + /// image into it. + Frts { frts_addr: u64, frts_size: u64 }, + /// Asks [`FwsecFirmware`] to load pre-OS apps on the PMU. + #[expect(dead_code)] + Sb, +} + +/// Reinterpret the area starting from `offset` in `fw` as an instance of `T` (which must implement +/// [`FromBytes`]) and return a reference to it. +/// +/// # Safety +/// +/// Callers must ensure that the region of memory returned is not written for as long as the +/// returned reference is alive. +/// +/// TODO: Remove this and `transmute_mut` once we have a way to transmute objects implementing +/// FromBytes, e.g.: +/// https://lore.kernel.org/lkml/20250330234039.29814-1-christiansantoslima21 at gmail.com/ +unsafe fn transmute<'a, 'b, T: Sized + FromBytes>( + fw: &'a DmaObject, + offset: usize, +) -> Result<&'b T> { + if offset + core::mem::size_of::<T>() > fw.size() { + return Err(EINVAL); + } + if (fw.start_ptr() as usize + offset) % core::mem::align_of::<T>() != 0 { + return Err(EINVAL); + } + + // SAFETY: we have checked that the pointer is properly aligned that its pointed memory is + // large enough the contains an instance of `T`, which implements `FromBytes`. + Ok(unsafe { &*(fw.start_ptr().add(offset) as *const T) }) +} + +/// Reinterpret the area starting from `offset` in `fw` as a mutable instance of `T` (which must +/// implement [`FromBytes`]) and return a reference to it. +/// +/// # Safety +/// +/// Callers must ensure that the region of memory returned is not read or written for as long as +/// the returned reference is alive. +unsafe fn transmute_mut<'a, 'b, T: Sized + FromBytes>( + fw: &'a mut DmaObject, + offset: usize, +) -> Result<&'b mut T> { + if offset + core::mem::size_of::<T>() > fw.size() { + return Err(EINVAL); + } + if (fw.start_ptr_mut() as usize + offset) % core::mem::align_of::<T>() != 0 { + return Err(EINVAL); + } + + // SAFETY: we have checked that the pointer is properly aligned that its pointed memory is + // large enough the contains an instance of `T`, which implements `FromBytes`. + Ok(unsafe { &mut *(fw.start_ptr_mut().add(offset) as *mut T) }) +} + +/// Patch the Fwsec firmware image in `fw` to run the command `cmd`. +fn patch_command(fw: &mut DmaObject, v3_desc: &FalconUCodeDescV3, cmd: FwsecCommand) -> Result<()> { + let hdr_offset = (v3_desc.imem_load_size + v3_desc.interface_offset) as usize; + // SAFETY: we have an exclusive reference to `fw`, and no caller should have shared `fw` with + // the hardware yet. + let hdr: &FalconAppifHdrV1 = unsafe { transmute(fw, hdr_offset) }?; + + if hdr.version != 1 { + return Err(EINVAL); + } + + // Find the DMEM mapper section in the firmware. + for i in 0..hdr.entry_count as usize { + let app: &FalconAppifV1 + // SAFETY: we have an exclusive reference to `fw`, and no caller should have shared + // `fw` with the hardware yet. + unsafe { + transmute( + fw, + hdr_offset + hdr.header_size as usize + i * hdr.entry_size as usize + ) + }?; + + if app.id != NVFW_FALCON_APPIF_ID_DMEMMAPPER { + continue; + } + + let dmem_mapper: &mut FalconAppifDmemmapperV3 + // SAFETY: we have an exclusive reference to `fw`, and no caller should have shared + // `fw` with the hardware yet. + unsafe { transmute_mut(fw, (v3_desc.imem_load_size + app.dmem_base) as usize) }?; + + // SAFETY: we have an exclusive reference to `fw`, and no caller should have shared `fw` + // with the hardware yet. + let frts_cmd: &mut FrtsCmd = unsafe { + transmute_mut( + fw, + (v3_desc.imem_load_size + dmem_mapper.cmd_in_buffer_offset) as usize, + ) + }?; + + frts_cmd.read_vbios = ReadVbios { + ver: 1, + hdr: core::mem::size_of::<ReadVbios>() as u32, + addr: 0, + size: 0, + flags: 2, + }; + + dmem_mapper.init_cmd = match cmd { + FwsecCommand::Frts { + frts_addr, + frts_size, + } => { + frts_cmd.frts_region = FrtsRegion { + ver: 1, + hdr: core::mem::size_of::<FrtsRegion>() as u32, + addr: (frts_addr >> 12) as u32, + size: (frts_size >> 12) as u32, + ftype: NVFW_FRTS_CMD_REGION_TYPE_FB, + }; + + NVFW_FALCON_APPIF_DMEMMAPPER_CMD_FRTS + } + FwsecCommand::Sb => NVFW_FALCON_APPIF_DMEMMAPPER_CMD_SB, + }; + + // Return early as we found and patched the DMEMMAPPER region. + return Ok(()); + } + + Err(ENOTSUPP) +} + +/// Firmware extracted from the VBIOS and responsible for e.g. carving out the WPR2 region as the +/// first step of the GSP bootflow. +pub(crate) struct FwsecFirmware { + desc: FalconUCodeDescV3, + ucode: DmaObject, +} + +impl FalconFirmware for FwsecFirmware { + type Target = Gsp; + + fn dma_handle(&self) -> bindings::dma_addr_t { + self.ucode.dma_handle() + } + + fn imem_load(&self) -> FalconLoadTarget { + FalconLoadTarget { + src_start: 0, + dst_start: self.desc.imem_phys_base, + len: self.desc.imem_load_size, + } + } + + fn dmem_load(&self) -> FalconLoadTarget { + FalconLoadTarget { + src_start: self.desc.imem_load_size, + dst_start: self.desc.dmem_phys_base, + len: Layout::from_size_align(self.desc.dmem_load_size as usize, 256) + // Cannot panic, as 256 is non-zero and a power of 2. + .unwrap() + .pad_to_align() + .size() as u32, + } + } + + fn brom_params(&self) -> FalconBromParams { + FalconBromParams { + pkc_data_offset: self.desc.pkc_data_offset, + engine_id_mask: self.desc.engine_id_mask, + ucode_id: self.desc.ucode_id, + } + } + + fn boot_addr(&self) -> u32 { + 0 + } +} + +impl FwsecFirmware { + /// Extract the Fwsec firmware from `bios` and patch it to run with the `cmd` command. + pub(crate) fn new( + falcon: &Falcon<Gsp>, + dev: &Device<device::Bound>, + bar: &Bar0, + bios: &Vbios, + cmd: FwsecCommand, + ) -> Result<Self> { + let v3_desc = bios.fwsec_header(dev)?; + let ucode = bios.fwsec_ucode(dev)?; + + let mut ucode_dma = DmaObject::from_data(dev, ucode)?; + patch_command(&mut ucode_dma, v3_desc, cmd)?; + + const SIG_SIZE: usize = 96 * 4; + let signatures = bios.fwsec_sigs(dev)?; + let sig_base_img = (v3_desc.imem_load_size + v3_desc.pkc_data_offset) as usize; + + if v3_desc.signature_count != 0 { + // Patch signature. + let desc_sig_versions = v3_desc.signature_versions as u32; + let reg_fuse_version = falcon.get_signature_reg_fuse_version( + bar, + v3_desc.engine_id_mask, + v3_desc.ucode_id, + )?; + dev_dbg!( + dev, + "desc_sig_versions: {:#x}, reg_fuse_version: {}\n", + desc_sig_versions, + reg_fuse_version + ); + let signature_idx = { + let reg_fuse_version_bit = 1 << reg_fuse_version; + + // Check if the fuse version is supported by the firmware. + if desc_sig_versions & reg_fuse_version_bit == 0 { + dev_warn!( + dev, + "no matching signature: {:#x} {:#x}\n", + reg_fuse_version_bit, + v3_desc.signature_versions + ); + return Err(EINVAL); + } + + // `desc_sig_versions` has one bit set per included signature. Thus, the index of + // the signature to patch is the number of bits in `desc_sig_versions` set to `1` + // before `reg_fuse_version_bit`. + + // Mask of the bits of `desc_sig_versions` to preserve. + let reg_fuse_version_mask = reg_fuse_version_bit.wrapping_sub(1); + + (desc_sig_versions & reg_fuse_version_mask).count_ones() + }; + + dev_dbg!(dev, "patching signature with index {}\n", signature_idx); + let signature_start = signature_idx as usize * SIG_SIZE; + let signature = &signatures[signature_start..signature_start + SIG_SIZE]; + super::patch_signature(&mut ucode_dma, signature, sig_base_img)?; + } + + Ok(FwsecFirmware { + desc: v3_desc.clone(), + ucode: ucode_dma, + }) + } +} diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs index a3d96639706e808305cce66416778d2bf6e7e683..ad71402dbfced2aaa84800a32dbac276b9706803 100644 --- a/drivers/gpu/nova-core/gpu.rs +++ b/drivers/gpu/nova-core/gpu.rs @@ -6,6 +6,7 @@ use crate::dma::DmaObject; use crate::driver::Bar0; use crate::falcon::{gsp::Gsp, sec2::Sec2, Falcon}; +use crate::firmware::fwsec::{FwsecCommand, FwsecFirmware}; use crate::firmware::Firmware; use crate::gsp::fb::FbLayout; use crate::regs; @@ -197,7 +198,11 @@ pub(crate) fn new( ) -> Result<impl PinInit<Self>> { let bar = devres_bar.access(pdev.as_ref())?; let spec = Spec::new(bar)?; - let fw = Firmware::new(pdev.as_ref(), spec.chipset, "535.113.01")?; + let fw = Firmware::new( + pdev.as_ref(), + spec.chipset, + crate::firmware::FIRMWARE_VERSION, + )?; dev_info!( pdev.as_ref(), @@ -243,7 +248,18 @@ pub(crate) fn new( let fb_layout = FbLayout::new(spec.chipset, bar)?; dev_dbg!(pdev.as_ref(), "{:#x?}\n", fb_layout); - let _bios = Vbios::new(pdev, bar)?; + let bios = Vbios::new(pdev, bar)?; + + let _fwsec_frts = FwsecFirmware::new( + &gsp_falcon, + pdev.as_ref(), + bar, + &bios, + FwsecCommand::Frts { + frts_addr: fb_layout.frts.start, + frts_size: fb_layout.frts.end - fb_layout.frts.start, + }, + )?; Ok(pin_init!(Self { spec, diff --git a/drivers/gpu/nova-core/vbios.rs b/drivers/gpu/nova-core/vbios.rs index cd55d8dbf8e12d532f776d7544c7e5f2a865d6f8..71846dbbd512b6996d9f3a3c76aecde9d62460c0 100644 --- a/drivers/gpu/nova-core/vbios.rs +++ b/drivers/gpu/nova-core/vbios.rs @@ -2,9 +2,6 @@ //! VBIOS extraction and parsing. -// To be removed when all code is used. -#![expect(dead_code)] - use crate::driver::Bar0; use crate::firmware::FalconUCodeDescV3; use core::convert::TryFrom; -- 2.49.0
Alexandre Courbot
2025-May-07 13:52 UTC
[PATCH v3 19/19] gpu: nova-core: load and run FWSEC-FRTS
With all the required pieces in place, load FWSEC-FRTS onto the GSP falcon, run it, and check that it successfully carved out the WPR2 region out of framebuffer memory. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/nova-core/falcon.rs | 3 --- drivers/gpu/nova-core/gpu.rs | 57 ++++++++++++++++++++++++++++++++++++++++- drivers/gpu/nova-core/regs.rs | 15 +++++++++++ 3 files changed, 71 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/nova-core/falcon.rs b/drivers/gpu/nova-core/falcon.rs index b6552496ad9c5f8eeda4beaa757c7393f495072b..810a7b4ef0d55fb7bd4bfb7833d5da7df091203c 100644 --- a/drivers/gpu/nova-core/falcon.rs +++ b/drivers/gpu/nova-core/falcon.rs @@ -2,9 +2,6 @@ //! Falcon microprocessor base support -// To be removed when all code is used. -#![expect(dead_code)] - use core::time::Duration; use hal::FalconHal; use kernel::bindings; diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs index ad71402dbfced2aaa84800a32dbac276b9706803..896986333b76d1192c07093289eddab98fe3f26d 100644 --- a/drivers/gpu/nova-core/gpu.rs +++ b/drivers/gpu/nova-core/gpu.rs @@ -250,7 +250,7 @@ pub(crate) fn new( let bios = Vbios::new(pdev, bar)?; - let _fwsec_frts = FwsecFirmware::new( + let fwsec_frts = FwsecFirmware::new( &gsp_falcon, pdev.as_ref(), bar, @@ -261,6 +261,61 @@ pub(crate) fn new( }, )?; + // Check that the WPR2 region does not already exists - if it does, the GPU needs to be + // reset. + if regs::NV_PFB_PRI_MMU_WPR2_ADDR_HI::read(bar).hi_val() != 0 { + dev_err!( + pdev.as_ref(), + "WPR2 region already exists - GPU needs to be reset to proceed\n" + ); + return Err(EBUSY); + } + + // Reset falcon, load FWSEC-FRTS, and run it. + gsp_falcon.reset(bar)?; + gsp_falcon.dma_load(bar, &fwsec_frts)?; + let (mbox0, _) = gsp_falcon.boot(bar, Some(0), None)?; + if mbox0 != 0 { + dev_err!(pdev.as_ref(), "FWSEC firmware returned error {}\n", mbox0); + return Err(EINVAL); + } + + // SCRATCH_E contains FWSEC-FRTS' error code, if any. + let frts_status = regs::NV_PBUS_SW_SCRATCH_0E::read(bar).frts_err_code(); + if frts_status != 0 { + dev_err!( + pdev.as_ref(), + "FWSEC-FRTS returned with error code {:#x}", + frts_status + ); + return Err(EINVAL); + } + + // Check the WPR2 has been created as we requested. + let (wpr2_lo, wpr2_hi) = ( + (regs::NV_PFB_PRI_MMU_WPR2_ADDR_LO::read(bar).lo_val() as u64) << 12, + (regs::NV_PFB_PRI_MMU_WPR2_ADDR_HI::read(bar).hi_val() as u64) << 12, + ); + if wpr2_hi == 0 { + dev_err!( + pdev.as_ref(), + "WPR2 region not created after running FWSEC-FRTS\n" + ); + + return Err(ENOTTY); + } else if wpr2_lo != fb_layout.frts.start { + dev_err!( + pdev.as_ref(), + "WPR2 region created at unexpected address {:#x} ; expected {:#x}\n", + wpr2_lo, + fb_layout.frts.start, + ); + return Err(EINVAL); + } + + dev_info!(pdev.as_ref(), "WPR2: {:#x}-{:#x}\n", wpr2_lo, wpr2_hi); + dev_info!(pdev.as_ref(), "GPU instance built\n"); + Ok(pin_init!(Self { spec, bar: devres_bar, diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs index 15ec9b7e69694ff198b5353d562fc1aff5eefd3f..3acec36f2d5701af4a752808e86d71d5f200359b 100644 --- a/drivers/gpu/nova-core/regs.rs +++ b/drivers/gpu/nova-core/regs.rs @@ -43,6 +43,13 @@ pub(crate) fn chipset(self) -> Result<Chipset, Error> { } } +/* PBUS */ + +// TODO: this is an array of registers. +register!(NV_PBUS_SW_SCRATCH_0E at 0x00001438 { + 31:16 frts_err_code as u16; +}); + /* PFB */ register!(NV_PFB_NISO_FLUSH_SYSMEM_ADDR @ 0x00100c10 { @@ -59,6 +66,14 @@ pub(crate) fn chipset(self) -> Result<Chipset, Error> { 30:30 ecc_mode_enabled as bool; }); +register!(NV_PFB_PRI_MMU_WPR2_ADDR_LO at 0x001fa824 { + 31:4 lo_val as u32; +}); + +register!(NV_PFB_PRI_MMU_WPR2_ADDR_HI at 0x001fa828 { + 31:4 hi_val as u32; +}); + /* PGC6 */ register!(NV_PGC6_AON_SECURE_SCRATCH_GROUP_05_PRIV_LEVEL_MASK @ 0x00118128 { -- 2.49.0
Danilo Krummrich
2025-May-13 13:10 UTC
[PATCH v3 00/19] nova-core: run FWSEC-FRTS to perform first stage of GSP initialization
On Wed, May 07, 2025 at 10:52:27PM +0900, Alexandre Courbot wrote:> gpu: nova-core: derive useful traits for Chipset > gpu: nova-core: add missing GA100 definition > gpu: nova-core: take bound device in Gpu::new > gpu: nova-core: define registers layout using helper macro[ Fix typo in commit message. - Danilo ]> gpu: nova-core: fix layout of NV_PMC_BOOT_0[ Use Result from kernel::prelude. - Danilo ]> gpu: nova-core: move Firmware to firmware module[ Don't require a bound device, remove pub visibility from Firmware fields, use FIRMWARE_VERSION consistently. - Danilo ] Applied the patches above to nova-next, thanks!