Alexandre Courbot
2025-Apr-20 12:19 UTC
[PATCH 00/16] nova-core: run FWSEC-FRTS to perform first stage of GSP initialization
Hi everyone,

This series is a continuation of my previous RFCs [1] to complete the first step of GSP booting (running the FWSEC-FRTS firmware extracted from the BIOS) on Ampere devices.

While it is still far from bringing the GPU into a state where it can do anything useful, it sets up the basic layout of the driver upon which we can build in order to continue with the next steps of GSP booting, as well as supporting more chipsets.

Upon successful probe, the driver will display the range of the WPR2 region constructed by FWSEC-FRTS:

[ 95.436000] NovaCore 0000:01:00.0: WPR2: 0xffc00000-0xffce0000
[ 95.436002] NovaCore 0000:01:00.0: GPU instance built

This code is based on nova-next with the try_access_with patch [2].

There is still a bit of unsafe code where it is not desired, notably to transmute byte slices into types that implement FromBytes - this is because support for doing such transmute operations safely is not in the kernel crate yet.

[1] https://lore.kernel.org/rust-for-linux/20250320-nova_timer-v3-0-79aa2ad25a79 at nvidia.com/
[2] https://lore.kernel.org/rust-for-linux/20250411-try_with-v4-0-f470ac79e2e2 at nvidia.com/

Signed-off-by: Alexandre Courbot <acourbot at nvidia.com>
---
Alexandre Courbot (15):
      rust: add useful ops for u64
      rust: make ETIMEDOUT error available
      gpu: nova-core: derive useful traits for Chipset
      gpu: nova-core: add missing GA100 definition
      gpu: nova-core: take bound device in Gpu::new
      gpu: nova-core: define registers layout using helper macro
      gpu: nova-core: move Firmware to firmware module
      gpu: nova-core: wait for GFW_BOOT completion
      gpu: nova-core: register sysmem flush page
      gpu: nova-core: add basic timer device
      gpu: nova-core: add falcon register definitions and base code
      gpu: nova-core: firmware: add ucode descriptor used by FWSEC-FRTS
      gpu: nova-core: compute layout of the FRTS region
      gpu: nova-core: extract FWSEC from BIOS and patch it to run FWSEC-FRTS
      gpu: nova-core: load and run FWSEC-FRTS

Joel Fernandes (1):
      gpu: nova-core: Add support for VBIOS ucode extraction for boot

 Documentation/gpu/nova/core/todo.rst | 6 +
 drivers/gpu/nova-core/devinit.rs | 40 ++
 drivers/gpu/nova-core/dma.rs | 54 ++
 drivers/gpu/nova-core/driver.rs | 2 +-
 drivers/gpu/nova-core/falcon.rs | 466 ++++++++++++
 drivers/gpu/nova-core/falcon/gsp.rs | 27 +
 drivers/gpu/nova-core/falcon/hal.rs | 54 ++
 drivers/gpu/nova-core/falcon/hal/ga102.rs | 111 +++
 drivers/gpu/nova-core/falcon/sec2.rs | 9 +
 drivers/gpu/nova-core/firmware.rs | 90 ++-
 drivers/gpu/nova-core/firmware/fwsec.rs | 340 +++++++++
 drivers/gpu/nova-core/gpu.rs | 211 ++++--
 drivers/gpu/nova-core/gsp.rs | 3 +
 drivers/gpu/nova-core/gsp/fb.rs | 109 +++
 drivers/gpu/nova-core/nova_core.rs | 24 +
 drivers/gpu/nova-core/regs.rs | 304 ++++++--
 drivers/gpu/nova-core/regs/macros.rs | 297 ++++++++
 drivers/gpu/nova-core/timer.rs | 130 ++++
 drivers/gpu/nova-core/vbios.rs | 1100 +++++++++++++++++++++++++++++
 rust/kernel/error.rs | 1 +
 rust/kernel/lib.rs | 1 +
 rust/kernel/num.rs | 52 ++
 22 files changed, 3347 insertions(+), 84 deletions(-)
---
base-commit: 96609a1969f4ade45351ec368c65580c77592e8b
change-id: 20250417-nova-frts-96ef299abe2c
prerequisite-change-id: 20250313-try_with-cc9f91dd3b60:v4
prerequisite-patch-id: b0c2d08bdea8193307c43c04aa9ff96baf6b00e1
prerequisite-patch-id: b6d1232c2dfef24e4d3f8753a198eb6c427c3486

Best regards,
-- 
Alexandre Courbot <acourbot at nvidia.com>
It is common to build a u64 from its high and low parts obtained from two 32-bit registers. Conversely, it is also common to split a u64 into two u32s to write them into registers. Add an extension trait for u64 that implements these methods in a new `num` module. It is expected that this trait will be extended with other useful operations, and similar extension traits implemented for other types. Reviewed-by: Sergio González Collado <sergio.collado at gmail.com> Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- rust/kernel/lib.rs | 1 + rust/kernel/num.rs | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 53 insertions(+) diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs index 55a8dfeece0b27f188456a9eaebd1045c4cafbcb..e30d2c075a3607f6ea40c901b3281e8798e81260 100644 --- a/rust/kernel/lib.rs +++ b/rust/kernel/lib.rs @@ -65,6 +65,7 @@ pub mod miscdevice; #[cfg(CONFIG_NET)] pub mod net; +pub mod num; pub mod of; pub mod page; #[cfg(CONFIG_PCI)] diff --git a/rust/kernel/num.rs b/rust/kernel/num.rs new file mode 100644 index 0000000000000000000000000000000000000000..9b93db6528eef131fb74c1289f1e152cc2a13168 --- /dev/null +++ b/rust/kernel/num.rs @@ -0,0 +1,52 @@ +// SPDX-License-Identifier: GPL-2.0 + +//! Numerical and binary utilities for primitive types. + +/// Useful operations for `u64`. +pub trait U64Ext { + /// Build a `u64` by combining its `high` and `low` parts. + /// + /// ``` + /// use kernel::num::U64Ext; + /// assert_eq!(u64::from_u32s(0x01234567, 0x89abcdef), 0x01234567_89abcdef); + /// ``` + fn from_u32s(high: u32, low: u32) -> Self; + + /// Returns the upper 32 bits of `self`. + fn upper_32_bits(self) -> u32; + + /// Returns the lower 32 bits of `self`. 
+ fn lower_32_bits(self) -> u32; +} + +impl U64Ext for u64 { + fn from_u32s(high: u32, low: u32) -> Self { + ((high as u64) << u32::BITS) | low as u64 + } + + fn upper_32_bits(self) -> u32 { + (self >> u32::BITS) as u32 + } + + fn lower_32_bits(self) -> u32 { + self as u32 + } +} + +/// Same as [`U64Ext::upper_32_bits`], but defined outside of the trait so it can be used in a +/// `const` context. +pub const fn upper_32_bits(v: u64) -> u32 { + (v >> u32::BITS) as u32 +} + +/// Same as [`U64Ext::lower_32_bits`], but defined outside of the trait so it can be used in a +/// `const` context. +pub const fn lower_32_bits(v: u64) -> u32 { + v as u32 +} + +/// Same as [`U64Ext::from_u32s`], but defined outside of the trait so it can be used in a `const` +/// context. +pub const fn u64_from_u32s(high: u32, low: u32) -> u64 { + ((high as u64) << u32::BITS) | low as u64 +} -- 2.49.0
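For readers who want to try the split/combine helpers outside the kernel tree, here is a minimal userspace sketch. It restates the trait from the patch in plain Rust; the `AddrRegs` struct and the `write_addr`/`read_addr` helpers are hypothetical and only stand in for a pair of 32-bit registers holding a 64-bit address.

```rust
// Userspace restatement of the helpers from the patch, for illustration only.
trait U64Ext {
    fn from_u32s(high: u32, low: u32) -> Self;
    fn upper_32_bits(self) -> u32;
    fn lower_32_bits(self) -> u32;
}

impl U64Ext for u64 {
    fn from_u32s(high: u32, low: u32) -> Self {
        ((high as u64) << u32::BITS) | low as u64
    }

    fn upper_32_bits(self) -> u32 {
        (self >> u32::BITS) as u32
    }

    fn lower_32_bits(self) -> u32 {
        // Truncating cast: keeps only the low 32 bits.
        self as u32
    }
}

/// Hypothetical pair of 32-bit registers that together hold a 64-bit address.
struct AddrRegs {
    hi: u32,
    lo: u32,
}

/// Split `addr` and store each half in its register.
fn write_addr(regs: &mut AddrRegs, addr: u64) {
    regs.hi = addr.upper_32_bits();
    regs.lo = addr.lower_32_bits();
}

/// Rebuild the 64-bit address from the two registers.
fn read_addr(regs: &AddrRegs) -> u64 {
    u64::from_u32s(regs.hi, regs.lo)
}

fn main() {
    let mut regs = AddrRegs { hi: 0, lo: 0 };
    write_addr(&mut regs, 0x0123_4567_89ab_cdef);
    assert_eq!(regs.hi, 0x0123_4567);
    assert_eq!(regs.lo, 0x89ab_cdef);
    assert_eq!(read_addr(&regs), 0x0123_4567_89ab_cdef);
}
```

The round trip through `write_addr` and `read_addr` is lossless because the two halves never overlap.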
Alexandre Courbot
2025-Apr-20 12:19 UTC
[PATCH 02/16] rust: make ETIMEDOUT error available
We will use this error in the nova-core driver. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- rust/kernel/error.rs | 1 + 1 file changed, 1 insertion(+) diff --git a/rust/kernel/error.rs b/rust/kernel/error.rs index 3dee3139fcd4379b94748c0ba1965f4e1865b633..083c7b068cf4e185100de96e520c54437898ee72 100644 --- a/rust/kernel/error.rs +++ b/rust/kernel/error.rs @@ -65,6 +65,7 @@ macro_rules! declare_err { declare_err!(EDOM, "Math argument out of domain of func."); declare_err!(ERANGE, "Math result not representable."); declare_err!(EOVERFLOW, "Value too large for defined data type."); + declare_err!(ETIMEDOUT, "Connection timed out."); declare_err!(ERESTARTSYS, "Restart the system call."); declare_err!(ERESTARTNOINTR, "System call was interrupted by a signal and will be restarted."); declare_err!(ERESTARTNOHAND, "Restart if no handler."); -- 2.49.0
Alexandre Courbot
2025-Apr-20 12:19 UTC
[PATCH 03/16] gpu: nova-core: derive useful traits for Chipset
We will commonly need to compare chipset versions, so derive the ordering traits to make that possible. Also derive Copy and Clone since passing Chipset by value will be more efficient than by reference. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/nova-core/gpu.rs | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs index 17c9660da45034762edaa78e372d8821144cdeb7..4de67a2dc16302c00530026156d7264cbc7e5b32 100644 --- a/drivers/gpu/nova-core/gpu.rs +++ b/drivers/gpu/nova-core/gpu.rs @@ -13,7 +13,7 @@ macro_rules! define_chipset { ({ $($variant:ident = $value:expr),* $(,)* }) => { /// Enum representation of the GPU chipset. - #[derive(fmt::Debug)] + #[derive(fmt::Debug, Copy, Clone, PartialOrd, Ord, PartialEq, Eq)] pub(crate) enum Chipset { $($variant = $value),*, } -- 2.49.0
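As a quick illustration of what the derived traits allow, the sketch below uses a cut-down copy of the enum with a few of the discriminants that appear in this series. The `is_ga102_or_newer` helper is hypothetical and not part of the driver.

```rust
// Cut-down copy of the enum, keeping only a few variants for illustration.
#[allow(dead_code)]
#[derive(Debug, Copy, Clone, PartialOrd, Ord, PartialEq, Eq)]
enum Chipset {
    TU117 = 0x167,
    TU116 = 0x168,
    GA102 = 0x172,
    GA104 = 0x174,
}

/// Hypothetical helper: true for GA102 and every chipset ordered after it.
fn is_ga102_or_newer(chipset: Chipset) -> bool {
    chipset >= Chipset::GA102
}

fn main() {
    let chipset = Chipset::GA104;
    // `Copy`: `chipset` remains usable after being assigned or passed by value.
    let copy = chipset;
    assert!(is_ga102_or_newer(chipset));
    assert!(is_ga102_or_newer(copy));
    assert!(!is_ga102_or_newer(Chipset::TU116));

    // The derived ordering follows the discriminants, not the marketing
    // numbers: TU117 (0x167) sorts before TU116 (0x168).
    assert!(Chipset::TU117 < Chipset::TU116);

    // `Ord` also brings `max`/`min` and sorting.
    assert_eq!(chipset.max(Chipset::GA102), Chipset::GA104);
}
```

Note that comparisons such as "this chipset or later" are only meaningful as long as the discriminants keep increasing with hardware generation.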
Alexandre Courbot
2025-Apr-20 12:19 UTC
[PATCH 04/16] gpu: nova-core: add missing GA100 definition
linux-firmware contains a directory for GA100, and it is a defined chipset in Nouveau. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/nova-core/gpu.rs | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs index 4de67a2dc16302c00530026156d7264cbc7e5b32..9fe6aedaa9563799c2624d461d4e37ee9b094909 100644 --- a/drivers/gpu/nova-core/gpu.rs +++ b/drivers/gpu/nova-core/gpu.rs @@ -54,6 +54,7 @@ fn try_from(value: u32) -> Result<Self, Self::Error> { TU117 = 0x167, TU116 = 0x168, // Ampere + GA100 = 0x170, GA102 = 0x172, GA103 = 0x173, GA104 = 0x174, @@ -73,7 +74,7 @@ pub(crate) fn arch(&self) -> Architecture { Self::TU102 | Self::TU104 | Self::TU106 | Self::TU117 | Self::TU116 => { Architecture::Turing } - Self::GA102 | Self::GA103 | Self::GA104 | Self::GA106 | Self::GA107 => { + Self::GA100 | Self::GA102 | Self::GA103 | Self::GA104 | Self::GA106 | Self::GA107 => { Architecture::Ampere } Self::AD102 | Self::AD103 | Self::AD104 | Self::AD106 | Self::AD107 => { -- 2.49.0
Alexandre Courbot
2025-Apr-20 12:19 UTC
[PATCH 05/16] gpu: nova-core: take bound device in Gpu::new
We will need to perform things like allocating DMA memory during device creation, so make sure to take the device context that will allow us to perform these actions. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/nova-core/gpu.rs | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs index 9fe6aedaa9563799c2624d461d4e37ee9b094909..19a17cdc204b013482c0d307c5838cf3044c8cc8 100644 --- a/drivers/gpu/nova-core/gpu.rs +++ b/drivers/gpu/nova-core/gpu.rs @@ -183,7 +183,10 @@ pub(crate) struct Gpu { } impl Gpu { - pub(crate) fn new(pdev: &pci::Device, bar: Devres<Bar0>) -> Result<impl PinInit<Self>> { + pub(crate) fn new( + pdev: &pci::Device<device::Bound>, + bar: Devres<Bar0>, + ) -> Result<impl PinInit<Self>> { let spec = Spec::new(&bar)?; let fw = Firmware::new(pdev.as_ref(), &spec, "535.113.01")?; -- 2.49.0
Alexandre Courbot
2025-Apr-20 12:19 UTC
[PATCH 06/16] gpu: nova-core: define registers layout using helper macro
Add the register!() macro, which defines a given register's layout and provides bit-field accessors with a way to convert them to a given type. This macro will allow us to make clear definitions of the registers and manipulate their fields safely. The long-term goal is to eventually move it to the kernel crate so it can be used by other drivers as well, but it was agreed to first land it into nova-core and make it mature there. To illustrate its usage, use it to define the layout for the Boot0 register and use its accessors through the use of the convenience with_bar!() macro, which uses Revocable::try_access() and converts its returned Option into the proper error as needed. Suggested-by: Danilo Krummrich <dakr at kernel.org> Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- Documentation/gpu/nova/core/todo.rst | 6 + drivers/gpu/nova-core/gpu.rs | 5 +- drivers/gpu/nova-core/nova_core.rs | 18 +++ drivers/gpu/nova-core/regs.rs | 60 ++----- drivers/gpu/nova-core/regs/macros.rs | 297 +++++++++++++++++++++++++++++++++++ 5 files changed, 333 insertions(+), 53 deletions(-) diff --git a/Documentation/gpu/nova/core/todo.rst b/Documentation/gpu/nova/core/todo.rst index 234d753d3eacc709b928b1ccbfc9750ef36ec4ed..8a459fc088121f770bfcda5dfb4ef51c712793ce 100644 --- a/Documentation/gpu/nova/core/todo.rst +++ b/Documentation/gpu/nova/core/todo.rst @@ -102,7 +102,13 @@ Usage: let boot0 = Boot0::read(&bar); pr_info!("Revision: {}\n", boot0.revision()); +Note: a work-in-progress implementation currently resides in +`drivers/gpu/nova-core/regs/macros.rs` and is used in nova-core. It would be +nice to improve it (possibly using proc macros) and move it to the `kernel` +crate so it can be used by other components as well. 
+ | Complexity: Advanced +| Contact: Alexandre Courbot Delay / Sleep abstractions -------------------------- diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs index 19a17cdc204b013482c0d307c5838cf3044c8cc8..891b59fe7255b3951962e30819145e686253706a 100644 --- a/drivers/gpu/nova-core/gpu.rs +++ b/drivers/gpu/nova-core/gpu.rs @@ -135,11 +135,10 @@ pub(crate) struct Spec { impl Spec { fn new(bar: &Devres<Bar0>) -> Result<Spec> { - let bar = bar.try_access().ok_or(ENXIO)?; - let boot0 = regs::Boot0::read(&bar); + let boot0 = with_bar!(bar, |b| regs::Boot0::read(b))?; Ok(Self { - chipset: boot0.chipset().try_into()?, + chipset: boot0.chipset()?, revision: Revision::from_boot0(boot0), }) } diff --git a/drivers/gpu/nova-core/nova_core.rs b/drivers/gpu/nova-core/nova_core.rs index a91cd924054b49966937a8db6aab9cd0614f10de..0eecd612e34efc046dad852e6239de6ffa5fdd62 100644 --- a/drivers/gpu/nova-core/nova_core.rs +++ b/drivers/gpu/nova-core/nova_core.rs @@ -2,6 +2,24 @@ //! Nova Core GPU Driver +#[macro_use] +mod macros { + /// Convenience macro to run a closure while holding [`crate::driver::Bar0`]. + /// + /// If the bar cannot be acquired, then `ENXIO` is returned. + /// + /// If a `?` is present before the `bar` argument, then the `Result` returned by the closure is + /// merged into the `Result` of the macro itself to avoid having a `Result<Result<>>`. + macro_rules! with_bar { + ($bar:expr, $closure:expr) => { + $bar.try_access_with($closure).ok_or(ENXIO) + }; + (? 
$bar:expr, $closure:expr) => { + with_bar!($bar, $closure).and_then(|r| r) + }; + } +} + mod driver; mod firmware; mod gpu; diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs index b1a25b86ef17a6710e6236d5e7f1f26cd4407ce3..e315a3011660df7f18c0a3e0582b5845545b36e2 100644 --- a/drivers/gpu/nova-core/regs.rs +++ b/drivers/gpu/nova-core/regs.rs @@ -1,55 +1,15 @@ // SPDX-License-Identifier: GPL-2.0 -use crate::driver::Bar0; +use core::ops::Deref; +use kernel::io::Io; -// TODO -// -// Create register definitions via generic macros. See task "Generic register -// abstraction" in Documentation/gpu/nova/core/todo.rst. +#[macro_use] +mod macros; -const BOOT0_OFFSET: usize = 0x00000000; +use crate::gpu::Chipset; -// 3:0 - chipset minor revision -const BOOT0_MINOR_REV_SHIFT: u8 = 0; -const BOOT0_MINOR_REV_MASK: u32 = 0x0000000f; - -// 7:4 - chipset major revision -const BOOT0_MAJOR_REV_SHIFT: u8 = 4; -const BOOT0_MAJOR_REV_MASK: u32 = 0x000000f0; - -// 23:20 - chipset implementation Identifier (depends on architecture) -const BOOT0_IMPL_SHIFT: u8 = 20; -const BOOT0_IMPL_MASK: u32 = 0x00f00000; - -// 28:24 - chipset architecture identifier -const BOOT0_ARCH_MASK: u32 = 0x1f000000; - -// 28:20 - chipset identifier (virtual register field combining BOOT0_IMPL and -// BOOT0_ARCH) -const BOOT0_CHIPSET_SHIFT: u8 = BOOT0_IMPL_SHIFT; -const BOOT0_CHIPSET_MASK: u32 = BOOT0_IMPL_MASK | BOOT0_ARCH_MASK; - -#[derive(Copy, Clone)] -pub(crate) struct Boot0(u32); - -impl Boot0 { - #[inline] - pub(crate) fn read(bar: &Bar0) -> Self { - Self(bar.read32(BOOT0_OFFSET)) - } - - #[inline] - pub(crate) fn chipset(&self) -> u32 { - (self.0 & BOOT0_CHIPSET_MASK) >> BOOT0_CHIPSET_SHIFT - } - - #[inline] - pub(crate) fn minor_rev(&self) -> u8 { - ((self.0 & BOOT0_MINOR_REV_MASK) >> BOOT0_MINOR_REV_SHIFT) as u8 - } - - #[inline] - pub(crate) fn major_rev(&self) -> u8 { - ((self.0 & BOOT0_MAJOR_REV_MASK) >> BOOT0_MAJOR_REV_SHIFT) as u8 - } -} +register!(Boot0@0x00000000, "Basic 
revision information about the GPU"; + 3:0 minor_rev => as u8, "minor revision of the chip"; + 7:4 major_rev => as u8, "major revision of the chip"; + 28:20 chipset => try_into Chipset, "chipset model" +); diff --git a/drivers/gpu/nova-core/regs/macros.rs b/drivers/gpu/nova-core/regs/macros.rs new file mode 100644 index 0000000000000000000000000000000000000000..fa9bd6b932048113de997658b112885666e694c9 --- /dev/null +++ b/drivers/gpu/nova-core/regs/macros.rs @@ -0,0 +1,297 @@ +// SPDX-License-Identifier: GPL-2.0 + +//! Types and macros to define register layout and accessors. +//! +//! A single register typically includes several fields, which are accessed through a combination +//! of bit-shift and mask operations that introduce a class of potential mistakes, notably because +//! not all possible field values are necessarily valid. +//! +//! The macros in this module allow to define, using an intuitive and readable syntax, a dedicated +//! type for each register with its own field accessors that can return an error if a field's value +//! is invalid. They also provide a builder type allowing to construct a register value to be +//! written by combining valid values for its fields. + +/// Helper macro for the `register` macro. +/// +/// Defines the wrapper `$name` type, as well as its relevant implementations (`Debug`, `BitOr`, +/// and conversion to regular `u32`). +macro_rules! __reg_def_common { + ($name:ident $(, $type_comment:expr)?) => { + $( + #[doc=$type_comment] + )? + #[repr(transparent)] + #[derive(Clone, Copy, Default)] + pub(crate) struct $name(u32); + + // TODO: should we display the raw hex value, then the value of all its fields? 
+ impl ::core::fmt::Debug for $name { + fn fmt(&self, f: &mut core::fmt::Formatter<'_>) -> core::fmt::Result { + f.debug_tuple(stringify!($name)) + .field(&format_args!("0x{0:x}", &self.0)) + .finish() + } + } + + impl core::ops::BitOr for $name { + type Output = Self; + + fn bitor(self, rhs: Self) -> Self::Output { + Self(self.0 | rhs.0) + } + } + + impl From<$name> for u32 { + fn from(reg: $name) -> u32 { + reg.0 + } + } + }; +} + +/// Helper macro for the `register` macro. +/// +/// Defines the getter method for $field. +macro_rules! __reg_def_field_getter { + ( + $hi:tt:$lo:tt $field:ident + $(=> as $as_type:ty)? + $(=> as_bit $bit_type:ty)? + $(=> into $type:ty)? + $(=> try_into $try_type:ty)? + $(, $comment:expr)? + ) => { + $( + #[doc=concat!("Returns the ", $comment)] + )? + #[inline] + pub(crate) fn $field(self) -> $( $as_type )? $( $bit_type )? $( $type )? $( core::result::Result<$try_type, <$try_type as TryFrom<u32>>::Error> )? { + const MASK: u32 = ((((1 << $hi) - 1) << 1) + 1) - ((1 << $lo) - 1); + const SHIFT: u32 = MASK.trailing_zeros(); + let field = (self.0 & MASK) >> SHIFT; + + $( field as $as_type )? + $( + // TODO: it would be nice to throw a compile-time error if $hi != $lo as this means we + // are considering more than one bit but returning a bool... + <$bit_type>::from(if field != 0 { true } else { false }) as $bit_type + )? + $( <$type>::from(field) )? + $( <$try_type>::try_from(field) )? + } + } +} + +/// Helper macro for the `register` macro. +/// +/// Defines all the field getter methods for `$name`. +macro_rules! __reg_def_getters { + ( + $name:ident + $(; $hi:tt:$lo:tt $field:ident + $(=> as $as_type:ty)? + $(=> as_bit $bit_type:ty)? + $(=> into $type:ty)? + $(=> try_into $try_type:ty)? + $(, $field_comment:expr)?)* $(;)? + ) => { + #[allow(dead_code)] + impl $name { + $( + __reg_def_field_getter!($hi:$lo $field $(=> as $as_type)? $(=> as_bit $bit_type)? $(=> into $type)? $(=> try_into $try_type)? 
$(, $field_comment)?); + )* + } + }; +} + +/// Helper macro for the `register` macro. +/// +/// Defines the setter method for $field. +macro_rules! __reg_def_field_setter { + ( + $hi:tt:$lo:tt $field:ident + $(=> as $as_type:ty)? + $(=> as_bit $bit_type:ty)? + $(=> into $type:ty)? + $(=> try_into $try_type:ty)? + $(, $comment:expr)? + ) => { + kernel::macros::paste! { + $( + #[doc=concat!("Sets the ", $comment)] + )? + #[inline] + pub(crate) fn [<set_ $field>](mut self, value: $( $as_type)? $( $bit_type )? $( $type )? $( $try_type)? ) -> Self { + const MASK: u32 = ((((1 << $hi) - 1) << 1) + 1) - ((1 << $lo) - 1); + const SHIFT: u32 = MASK.trailing_zeros(); + + let value = ((value as u32) << SHIFT) & MASK; + self.0 = (self.0 & !MASK) | value; + self + } + } + }; +} + +/// Helper macro for the `register` macro. +/// +/// Defines all the field setter methods for `$name`. +macro_rules! __reg_def_setters { + ( + $name:ident + $(; $hi:tt:$lo:tt $field:ident + $(=> as $as_type:ty)? + $(=> as_bit $bit_type:ty)? + $(=> into $type:ty)? + $(=> try_into $try_type:ty)? + $(, $field_comment:expr)?)* $(;)? + ) => { + #[allow(dead_code)] + impl $name { + $( + __reg_def_field_setter!($hi:$lo $field $(=> as $as_type)? $(=> as_bit $bit_type)? $(=> into $type)? $(=> try_into $try_type)? $(, $field_comment)?); + )* + } + }; +} + +/// Defines a dedicated type for a register with an absolute offset, alongside with getter and +/// setter methods for its fields and methods to read and write it from an `Io` region. +/// +/// Example: +/// +/// ```no_run +/// register!(Boot0@0x00000100, "Basic revision information about the chip"; +/// 3:0 minor_rev => as u8, "minor revision of the chip"; +/// 7:4 major_rev => as u8, "major revision of the chip"; +/// 28:20 chipset => try_into Chipset, "chipset model" +/// ); +/// ``` +/// +/// This defines a `Boot0` type which can be read or written from offset `0x100` of an `Io` region. 
+/// It is composed of 3 fields, for instance `minor_rev` is made of the 4 least significant bits of +/// the register. Each field can be accessed and modified using helper methods: +/// +/// ```no_run +/// // Read from offset 0x100. +/// let boot0 = Boot0::read(&bar); +/// pr_info!("chip revision: {}.{}", boot0.major_rev(), boot0.minor_rev()); +/// +/// // `Chipset::try_from` will be called with the value of the field and returns an error if the +/// // value is invalid. +/// let chipset = boot0.chipset()?; +/// +/// // Update some fields and write the value back. +/// boot0.set_major_rev(3).set_minor_rev(10).write(&bar); +/// +/// // Or just update the register value in a single step: +/// Boot0::alter(&bar, |r| r.set_major_rev(3).set_minor_rev(10)); +/// ``` +/// +/// Fields are made accessible using one of the following strategies: +/// +/// - `as <type>` simply casts the field value to the requested type. +/// - `as_bit <type>` turns the field into a boolean and calls `<type>::from()` with the obtained +/// value. To be used with single-bit fields. +/// - `into <type>` calls `<type>::from()` on the value of the field. It is expected to handle all +/// the possible values for the bit range selected. +/// - `try_into <type>` calls `<type>::try_from()` on the value of the field and returns its +/// result. +/// +/// The documentation strings are optional. If present, they will be added to the type or the field +/// getter and setter methods they are attached to. +/// +/// Putting a `+` before the address of the register makes it relative to a base: the `read` and +/// `write` methods take a `base` argument that is added to the specified address before access, +/// and `try_read` and `try_write` methods are added to allow access with offsets unknown at +/// compile-time. +/// +macro_rules! register { + // Create a register at a fixed offset of the MMIO space. + ( + $name:ident@$offset:expr $(, $type_comment:expr)? + $(; $hi:tt:$lo:tt $field:ident + $(=> as $as_type:ty)? 
+ $(=> as_bit $bit_type:ty)? + $(=> into $type:ty)? + $(=> try_into $try_type:ty)? + $(, $field_comment:expr)?)* $(;)? + ) => { + __reg_def_common!($name); + + #[allow(dead_code)] + impl $name { + #[inline] + pub(crate) fn read<const SIZE: usize, T: Deref<Target=Io<SIZE>>>(bar: &T) -> Self { + Self(bar.read32($offset)) + } + + #[inline] + pub(crate) fn write<const SIZE: usize, T: Deref<Target=Io<SIZE>>>(self, bar: &T) { + bar.write32(self.0, $offset) + } + + #[inline] + pub(crate) fn alter<const SIZE: usize, T: Deref<Target=Io<SIZE>>, F: FnOnce(Self) -> Self>(bar: &T, f: F) { + let reg = f(Self::read(bar)); + reg.write(bar); + } + } + + __reg_def_getters!($name; $( $hi:$lo $field $(=> as $as_type)? $(=> as_bit $bit_type)? $(=> into $type)? $(=> try_into $try_type)? $(, $field_comment)? );*); + + __reg_def_setters!($name; $( $hi:$lo $field $(=> as $as_type)? $(=> as_bit $bit_type)? $(=> into $type)? $(=> try_into $try_type)? $(, $field_comment)? );*); + }; + + // Create a register at a relative offset from a base address. + ( + $name:ident@+$offset:expr $(, $type_comment:expr)? + $(; $hi:tt:$lo:tt $field:ident + $(=> as $as_type:ty)? + $(=> as_bit $bit_type:ty)? + $(=> into $type:ty)? + $(=> try_into $try_type:ty)? + $(, $field_comment:expr)?)* $(;)? 
+ ) => { + __reg_def_common!($name); + + #[allow(dead_code)] + impl $name { + #[inline] + pub(crate) fn read<const SIZE: usize, T: Deref<Target=Io<SIZE>>>(bar: &T, base: usize) -> Self { + Self(bar.read32(base + $offset)) + } + + #[inline] + pub(crate) fn write<const SIZE: usize, T: Deref<Target=Io<SIZE>>>(self, bar: &T, base: usize) { + bar.write32(self.0, base + $offset) + } + + #[inline] + pub(crate) fn alter<const SIZE: usize, T: Deref<Target=Io<SIZE>>, F: FnOnce(Self) -> Self>(bar: &T, base: usize, f: F) { + let reg = f(Self::read(bar, base)); + reg.write(bar, base); + } + + #[inline] + pub(crate) fn try_read<const SIZE: usize, T: Deref<Target=Io<SIZE>>>(bar: &T, base: usize) -> ::kernel::error::Result<Self> { + bar.try_read32(base + $offset).map(Self) + } + + #[inline] + pub(crate) fn try_write<const SIZE: usize, T: Deref<Target=Io<SIZE>>>(self, bar: &T, base: usize) -> ::kernel::error::Result<()> { + bar.try_write32(self.0, base + $offset) + } + + #[inline] + pub(crate) fn try_alter<const SIZE: usize, T: Deref<Target=Io<SIZE>>, F: FnOnce(Self) -> Self>(bar: &T, base: usize, f: F) -> ::kernel::error::Result<()> { + let reg = f(Self::try_read(bar, base)?); + reg.try_write(bar, base) + } + } + + __reg_def_getters!($name; $( $hi:$lo $field $(=> as $as_type)? $(=> as_bit $bit_type)? $(=> into $type)? $(=> try_into $try_type)? $(, $field_comment)? );*); + + __reg_def_setters!($name; $( $hi:$lo $field $(=> as $as_type)? $(=> as_bit $bit_type)? $(=> into $type)? $(=> try_into $try_type)? $(, $field_comment)? );*); + }; +} -- 2.49.0
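To make the mask and shift arithmetic used by the generated getters and setters easier to check by hand, here is a small standalone sketch. The `field_mask`, `get_field` and `set_field` functions are hypothetical stand-ins for the code the macro expands to, and the raw register value is made up for the example.

```rust
/// Mask covering bits `lo..=hi`, using the same expression as the `MASK`
/// constant in the macro. The `(.. << 1) + 1` form avoids overflowing when
/// `hi` is 31.
const fn field_mask(hi: u32, lo: u32) -> u32 {
    ((((1u32 << hi) - 1) << 1) + 1) - ((1u32 << lo) - 1)
}

/// Extract the field `hi:lo` from `raw`, shifted down to bit 0.
const fn get_field(raw: u32, hi: u32, lo: u32) -> u32 {
    let mask = field_mask(hi, lo);
    (raw & mask) >> mask.trailing_zeros()
}

/// Return `raw` with the field `hi:lo` replaced by `value`; bits of `value`
/// that do not fit in the field are dropped.
const fn set_field(raw: u32, hi: u32, lo: u32, value: u32) -> u32 {
    let mask = field_mask(hi, lo);
    (raw & !mask) | ((value << mask.trailing_zeros()) & mask)
}

fn main() {
    // 28:20 matches the old hand-written BOOT0_CHIPSET_MASK.
    assert_eq!(field_mask(28, 20), 0x1ff0_0000);
    assert_eq!(field_mask(7, 4), 0x0000_00f0);
    assert_eq!(field_mask(31, 0), 0xffff_ffff);

    // Made-up Boot0 value: chipset 0x172, major revision 0xa, minor revision 0x1.
    let raw = 0x1720_00a1;
    assert_eq!(get_field(raw, 28, 20), 0x172);
    assert_eq!(get_field(raw, 7, 4), 0xa);
    assert_eq!(get_field(raw, 3, 0), 0x1);

    // Updating one field leaves the others untouched.
    assert_eq!(set_field(raw, 7, 4, 3), 0x1720_0031);
}
```

The shift is derived from the mask with `trailing_zeros()`, so only the `hi:lo` pair has to be written down for each field.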
Alexandre Courbot
2025-Apr-20 12:19 UTC
[PATCH 07/16] gpu: nova-core: move Firmware to firmware module
We will extend the firmware methods, so move it to its own module instead to keep gpu.rs focused. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/nova-core/firmware.rs | 42 ++++++++++++++++++++++++++++++++++++++- drivers/gpu/nova-core/gpu.rs | 35 +++----------------------------- 2 files changed, 44 insertions(+), 33 deletions(-) diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs index 6e6361c59ca1ae9a52185e66e850ba1db93eb8ce..9bad7a86382af7917b3dce7bf3087d0002bd5971 100644 --- a/drivers/gpu/nova-core/firmware.rs +++ b/drivers/gpu/nova-core/firmware.rs @@ -1,7 +1,47 @@ // SPDX-License-Identifier: GPL-2.0 -use crate::gpu; +//! Contains structures and functions dedicated to the parsing, building and patching of firmwares +//! to be loaded into a given execution unit. + +use kernel::device; use kernel::firmware; +use kernel::prelude::*; +use kernel::str::CString; + +use crate::gpu; +use crate::gpu::Chipset; + +/// Structure encapsulating the firmware blobs required for the GPU to operate. 
+#[expect(dead_code)] +pub(crate) struct Firmware { + pub booter_load: firmware::Firmware, + pub booter_unload: firmware::Firmware, + pub bootloader: firmware::Firmware, + pub gsp: firmware::Firmware, +} + +impl Firmware { + pub(crate) fn new( + dev: &device::Device<device::Bound>, + chipset: Chipset, + ver: &str, + ) -> Result<Firmware> { + let mut chip_name = CString::try_from_fmt(fmt!("{}", chipset))?; + chip_name.make_ascii_lowercase(); + + let request = |name_| { + CString::try_from_fmt(fmt!("nvidia/{}/gsp/{}-{}.bin", &*chip_name, name_, ver)) + .and_then(|path| firmware::Firmware::request(&path, dev)) + }; + + Ok(Firmware { + booter_load: request("booter_load")?, + booter_unload: request("booter_unload")?, + bootloader: request("bootloader")?, + gsp: request("gsp")?, + }) + } +} pub(crate) struct ModInfoBuilder<const N: usize>(firmware::ModInfoBuilder<N>); diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs index 891b59fe7255b3951962e30819145e686253706a..866c5992b9eb27735975bb4948e522bc01fadaa2 100644 --- a/drivers/gpu/nova-core/gpu.rs +++ b/drivers/gpu/nova-core/gpu.rs @@ -1,10 +1,9 @@ // SPDX-License-Identifier: GPL-2.0 -use kernel::{ - device, devres::Devres, error::code::*, firmware, fmt, pci, prelude::*, str::CString, -}; +use kernel::{device, devres::Devres, error::code::*, pci, prelude::*}; use crate::driver::Bar0; +use crate::firmware::Firmware; use crate::regs; use crate::util; use core::fmt; @@ -144,34 +143,6 @@ fn new(bar: &Devres<Bar0>) -> Result<Spec> { } } -/// Structure encapsulating the firmware blobs required for the GPU to operate. 
-#[expect(dead_code)] -pub(crate) struct Firmware { - booter_load: firmware::Firmware, - booter_unload: firmware::Firmware, - bootloader: firmware::Firmware, - gsp: firmware::Firmware, -} - -impl Firmware { - fn new(dev: &device::Device, spec: &Spec, ver: &str) -> Result<Firmware> { - let mut chip_name = CString::try_from_fmt(fmt!("{}", spec.chipset))?; - chip_name.make_ascii_lowercase(); - - let request = |name_| { - CString::try_from_fmt(fmt!("nvidia/{}/gsp/{}-{}.bin", &*chip_name, name_, ver)) - .and_then(|path| firmware::Firmware::request(&path, dev)) - }; - - Ok(Firmware { - booter_load: request("booter_load")?, - booter_unload: request("booter_unload")?, - bootloader: request("bootloader")?, - gsp: request("gsp")?, - }) - } -} - /// Structure holding the resources required to operate the GPU. #[pin_data] pub(crate) struct Gpu { @@ -187,7 +158,7 @@ pub(crate) fn new( bar: Devres<Bar0>, ) -> Result<impl PinInit<Self>> { let spec = Spec::new(&bar)?; - let fw = Firmware::new(pdev.as_ref(), &spec, "535.113.01")?; + let fw = Firmware::new(pdev.as_ref(), spec.chipset, "535.113.01")?; dev_info!( pdev.as_ref(), -- 2.49.0
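The path construction done by the `request` closure can be tried in isolation. The sketch below uses `std::String` instead of the kernel's `CString`, and the `firmware_path` function name is hypothetical; it only mirrors the format string from the patch.

```rust
/// Build the firmware path the same way the `request` closure does:
/// lowercase chipset name, then `<name>-<version>.bin` under `gsp/`.
fn firmware_path(chipset: &str, name: &str, ver: &str) -> String {
    format!(
        "nvidia/{}/gsp/{}-{}.bin",
        chipset.to_ascii_lowercase(),
        name,
        ver
    )
}

fn main() {
    let names = ["booter_load", "booter_unload", "bootloader", "gsp"];
    for name in names {
        println!("{}", firmware_path("GA102", name, "535.113.01"));
    }

    assert_eq!(
        firmware_path("GA102", "booter_load", "535.113.01"),
        "nvidia/ga102/gsp/booter_load-535.113.01.bin"
    );
}
```

Taking the chipset by value rather than the whole `Spec`, as the patch does, keeps this helper independent of the rest of the GPU state.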
Alexandre Courbot
2025-Apr-20 12:19 UTC
[PATCH 08/16] gpu: nova-core: wait for GFW_BOOT completion
Upon reset, the GPU executes the GFW_BOOT firmware in order to initialize its base parameters such as clocks. The driver must ensure that this step is completed before using the hardware. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/nova-core/devinit.rs | 40 ++++++++++++++++++++++++++++++++++++++ drivers/gpu/nova-core/driver.rs | 2 +- drivers/gpu/nova-core/gpu.rs | 5 +++++ drivers/gpu/nova-core/nova_core.rs | 1 + drivers/gpu/nova-core/regs.rs | 11 +++++++++++ 5 files changed, 58 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/nova-core/devinit.rs b/drivers/gpu/nova-core/devinit.rs new file mode 100644 index 0000000000000000000000000000000000000000..ee5685aff845aa97d6b0fbe9528df9a7ba274b2c --- /dev/null +++ b/drivers/gpu/nova-core/devinit.rs @@ -0,0 +1,40 @@ +// SPDX-License-Identifier: GPL-2.0 + +//! Methods for device initialization. + +use kernel::bindings; +use kernel::devres::Devres; +use kernel::prelude::*; + +use crate::driver::Bar0; +use crate::regs; + +/// Wait for devinit FW completion. +/// +/// Upon reset, the GPU runs some firmware code to setup its core parameters. Most of the GPU is +/// considered unusable until this step is completed, so it must be waited on very early during +/// driver initialization. +pub(crate) fn wait_gfw_boot_completion(bar: &Devres<Bar0>) -> Result<()> { + let mut timeout = 2000; + + loop { + let gfw_booted = with_bar!( + bar, + |b| regs::Pgc6AonSecureScratchGroup05PrivLevelMask::read(b) + .read_protection_level0_enabled() + && (regs::Pgc6AonSecureScratchGroup05::read(b).value() & 0xff) == 0xff + )?; + + if gfw_booted { + return Ok(()); + } + + if timeout == 0 { + return Err(ETIMEDOUT); + } + timeout -= 1; + + // SAFETY: msleep should be safe to call with any parameter. 
+ unsafe { bindings::msleep(2) }; + } +} diff --git a/drivers/gpu/nova-core/driver.rs b/drivers/gpu/nova-core/driver.rs index a08fb6599267a960f0e07b6efd0e3b6cdc296aa4..752ba4b0fcfe8d835d366570bb2f807840a196da 100644 --- a/drivers/gpu/nova-core/driver.rs +++ b/drivers/gpu/nova-core/driver.rs @@ -10,7 +10,7 @@ pub(crate) struct NovaCore { pub(crate) gpu: Gpu, } -const BAR0_SIZE: usize = 8; +const BAR0_SIZE: usize = 0x1000000; pub(crate) type Bar0 = pci::Bar<BAR0_SIZE>; kernel::pci_device_table!( diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs index 866c5992b9eb27735975bb4948e522bc01fadaa2..1f7799692a0ab042f2540e01414f5ca347ae9ecc 100644 --- a/drivers/gpu/nova-core/gpu.rs +++ b/drivers/gpu/nova-core/gpu.rs @@ -2,6 +2,7 @@ use kernel::{device, devres::Devres, error::code::*, pci, prelude::*}; +use crate::devinit; use crate::driver::Bar0; use crate::firmware::Firmware; use crate::regs; @@ -168,6 +169,10 @@ pub(crate) fn new( spec.revision ); + // We must wait for GFW_BOOT completion before doing any significant setup on the GPU. + devinit::wait_gfw_boot_completion(&bar) + .inspect_err(|_| pr_err!("GFW boot did not complete"))?; + Ok(pin_init!(Self { spec, bar, fw })) } } diff --git a/drivers/gpu/nova-core/nova_core.rs b/drivers/gpu/nova-core/nova_core.rs index 0eecd612e34efc046dad852e6239de6ffa5fdd62..878161e060f54da7738c656f6098936a62dcaa93 100644 --- a/drivers/gpu/nova-core/nova_core.rs +++ b/drivers/gpu/nova-core/nova_core.rs @@ -20,6 +20,7 @@ macro_rules! 
with_bar { } } +mod devinit; mod driver; mod firmware; mod gpu; diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs index e315a3011660df7f18c0a3e0582b5845545b36e2..fd7096f0ddd4af90114dd1119d9715d2cd3aa2ac 100644 --- a/drivers/gpu/nova-core/regs.rs +++ b/drivers/gpu/nova-core/regs.rs @@ -13,3 +13,14 @@ 7:4 major_rev => as u8, "major revision of the chip"; 28:20 chipset => try_into Chipset, "chipset model" ); + +/* GC6 */ + +register!(Pgc6AonSecureScratchGroup05PrivLevelMask at 0x00118128; + 0:0 read_protection_level0_enabled => as_bit bool +); + +/* TODO: This is an array of registers. */ +register!(Pgc6AonSecureScratchGroup05 at 0x00118234; + 31:0 value => as u32 +); -- 2.49.0
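As a reading aid for the wait loop in this patch, here is a minimal userspace sketch of the same bounded-polling pattern. It is not the driver code: `gfw_boot_done`, `poll_until` and `TimedOut` are hypothetical names, the two register values are passed in as plain integers instead of being read through `with_bar!`, and `std::thread::sleep` stands in for `msleep`. With 2000 retries and a 2 ms interval, the loop gives up after roughly 4 seconds of sleeping at the earliest.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Stand-in for the kernel's `ETIMEDOUT` error code (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub struct TimedOut;

/// Mirrors the completion test in the patch: bit 0 of the priv-level-mask
/// register must be set, and the low byte of the scratch register must read 0xff.
pub fn gfw_boot_done(priv_level_mask: u32, scratch: u32) -> bool {
    let read_protection_level0_enabled = priv_level_mask & 0x1 != 0;
    read_protection_level0_enabled && (scratch & 0xff) == 0xff
}

/// Calls `done` until it returns `true`, sleeping `interval` between attempts.
///
/// The condition is evaluated `retries + 1` times at most, matching the
/// check-then-decrement order of the loop in the patch. On success, returns
/// the number of failed attempts that preceded it.
pub fn poll_until<F: FnMut() -> bool>(
    mut done: F,
    retries: u32,
    interval: Duration,
) -> Result<u32, TimedOut> {
    for attempt in 0..=retries {
        if done() {
            return Ok(attempt);
        }
        // No point sleeping after the final failed attempt.
        if attempt < retries {
            sleep(interval);
        }
    }
    Err(TimedOut)
}
```

A caller modelling the patch would pass a closure that re-reads both registers on every attempt, with `retries = 2000` and `interval = Duration::from_millis(2)`.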
Alexandre Courbot
2025-Apr-20 12:19 UTC
[PATCH 09/16] gpu: nova-core: register sysmem flush page
A page of system memory is reserved so sysmembar can perform a read on it if a system write occurred since the last flush. Do this early as it can be required to e.g. reset the GPU falcons. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/nova-core/dma.rs | 54 ++++++++++++++++++++++++++++++++++++++ drivers/gpu/nova-core/gpu.rs | 53 +++++++++++++++++++++++++++++++++++-- drivers/gpu/nova-core/nova_core.rs | 1 + drivers/gpu/nova-core/regs.rs | 10 +++++++ 4 files changed, 116 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/nova-core/dma.rs b/drivers/gpu/nova-core/dma.rs new file mode 100644 index 0000000000000000000000000000000000000000..a4162bff597132a04e002b2b910a4537bbabc287 --- /dev/null +++ b/drivers/gpu/nova-core/dma.rs @@ -0,0 +1,54 @@ +// SPDX-License-Identifier: GPL-2.0 + +//! Simple DMA object wrapper. + +// To be removed when all code is used. +#![allow(dead_code)] + +use kernel::device; +use kernel::dma::CoherentAllocation; +use kernel::page::PAGE_SIZE; +use kernel::prelude::*; + +pub(crate) struct DmaObject { + pub dma: CoherentAllocation<u8>, + pub len: usize, + #[allow(dead_code)] + pub name: &'static str, +} + +impl DmaObject { + pub(crate) fn new( + dev: &device::Device<device::Bound>, + len: usize, + name: &'static str, + ) -> Result<Self> { + let len = core::alloc::Layout::from_size_align(len, PAGE_SIZE) + .map_err(|_| EINVAL)? + .pad_to_align() + .size(); + let dma = CoherentAllocation::alloc_coherent(dev, len, GFP_KERNEL | __GFP_ZERO)?; + + Ok(Self { dma, len, name }) + } + + pub(crate) fn from_data( + dev: &device::Device<device::Bound>, + data: &[u8], + name: &'static str, + ) -> Result<Self> { + Self::new(dev, data.len(), name).and_then(|mut dma_obj| { + // SAFETY: + // - The copied data fits within the size of the allocated object. + // - We have just created this object and there is no other user at this stage. 
+ unsafe { + core::ptr::copy_nonoverlapping( + data.as_ptr(), + dma_obj.dma.start_ptr_mut(), + data.len(), + ); + } + Ok(dma_obj) + }) + } +} diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs index 1f7799692a0ab042f2540e01414f5ca347ae9ecc..d43e710cc983d51f053dacbd77cbbfb79fa882c3 100644 --- a/drivers/gpu/nova-core/gpu.rs +++ b/drivers/gpu/nova-core/gpu.rs @@ -3,6 +3,7 @@ use kernel::{device, devres::Devres, error::code::*, pci, prelude::*}; use crate::devinit; +use crate::dma::DmaObject; use crate::driver::Bar0; use crate::firmware::Firmware; use crate::regs; @@ -145,12 +146,30 @@ fn new(bar: &Devres<Bar0>) -> Result<Spec> { } /// Structure holding the resources required to operate the GPU. -#[pin_data] +#[pin_data(PinnedDrop)] pub(crate) struct Gpu { spec: Spec, /// MMIO mapping of PCI BAR 0 bar: Devres<Bar0>, fw: Firmware, + sysmem_flush: DmaObject, +} + +#[pinned_drop] +impl PinnedDrop for Gpu { + fn drop(self: Pin<&mut Self>) { + // Unregister the sysmem flush page before we release it. + let _ = with_bar!(&self.bar, |b| { + regs::PfbNisoFlushSysmemAddr::default() + .set_adr_39_08(0) + .write(b); + if self.spec.chipset >= Chipset::GA102 { + regs::PfbNisoFlushSysmemAddrHi::default() + .set_adr_63_40(0) + .write(b); + } + }); + } } impl Gpu { @@ -173,6 +192,36 @@ pub(crate) fn new( devinit::wait_gfw_boot_completion(&bar) .inspect_err(|_| pr_err!("GFW boot did not complete"))?; - Ok(pin_init!(Self { spec, bar, fw })) + // System memory page required for sysmembar to properly flush into system memory. + let sysmem_flush = { + let page = DmaObject::new( + pdev.as_ref(), + kernel::bindings::PAGE_SIZE, + "sysmem flush page", + )?; + + // Register the sysmem flush page. 
+ with_bar!(bar, |b| { + let handle = page.dma.dma_handle(); + + regs::PfbNisoFlushSysmemAddr::default() + .set_adr_39_08((handle >> 8) as u32) + .write(b); + if spec.chipset >= Chipset::GA102 { + regs::PfbNisoFlushSysmemAddrHi::default() + .set_adr_63_40((handle >> 40) as u32) + .write(b); + } + })?; + + page + }; + + Ok(pin_init!(Self { + spec, + bar, + fw, + sysmem_flush, + })) } } diff --git a/drivers/gpu/nova-core/nova_core.rs b/drivers/gpu/nova-core/nova_core.rs index 878161e060f54da7738c656f6098936a62dcaa93..37c7eb0ea7a926bee4e3c661028847291bf07fa2 100644 --- a/drivers/gpu/nova-core/nova_core.rs +++ b/drivers/gpu/nova-core/nova_core.rs @@ -21,6 +21,7 @@ macro_rules! with_bar { } mod devinit; +mod dma; mod driver; mod firmware; mod gpu; diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs index fd7096f0ddd4af90114dd1119d9715d2cd3aa2ac..1e24787c4b5f432ac25fe399c8cb38b7350e44ae 100644 --- a/drivers/gpu/nova-core/regs.rs +++ b/drivers/gpu/nova-core/regs.rs @@ -14,6 +14,16 @@ 28:20 chipset => try_into Chipset, "chipset model" ); +/* PFB */ + +register!(PfbNisoFlushSysmemAddr at 0x00100c10; + 31:0 adr_39_08 => as u32 +); + +register!(PfbNisoFlushSysmemAddrHi at 0x00100c40; + 23:0 adr_63_40 => as u32 +); + /* GC6 */ register!(Pgc6AonSecureScratchGroup05PrivLevelMask at 0x00118128; -- 2.49.0
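The address programming in this patch can be illustrated with a small standalone sketch. It only models the arithmetic: `split_sysmem_flush_addr`, `join_sysmem_flush_addr` and `page_aligned_len` are hypothetical helpers, `PAGE_SIZE` is assumed to be 4096, and a plain `u64` stands in for the DMA handle. The low 8 bits of the handle are dropped by the shift, which is harmless for a page-aligned allocation.

```rust
use std::alloc::Layout;

/// Assumed page size for this sketch; the kernel provides the real value.
pub const PAGE_SIZE: usize = 4096;

/// Rounds `len` up to a multiple of `PAGE_SIZE`, the way `DmaObject::new`
/// does in the patch via `Layout::from_size_align(..).pad_to_align()`.
pub fn page_aligned_len(len: usize) -> Option<usize> {
    Layout::from_size_align(len, PAGE_SIZE)
        .ok()
        .map(|layout| layout.pad_to_align().size())
}

/// Splits a DMA address into the two register fields used by the patch:
/// bits 39:8 (`adr_39_08`) and bits 63:40 (`adr_63_40`).
pub fn split_sysmem_flush_addr(handle: u64) -> (u32, u32) {
    // The `as u32` cast keeps the low 32 bits of the shifted value.
    let adr_39_08 = (handle >> 8) as u32;
    // At most 24 bits remain after shifting a 64-bit value right by 40.
    let adr_63_40 = (handle >> 40) as u32;
    (adr_39_08, adr_63_40)
}

/// Inverse of `split_sysmem_flush_addr`, for addresses aligned to 256 bytes.
pub fn join_sysmem_flush_addr(adr_39_08: u32, adr_63_40: u32) -> u64 {
    (u64::from(adr_63_40) << 40) | (u64::from(adr_39_08) << 8)
}
```

For example, a handle of `0x00ab_cdef_1234_5600` splits into `0xef12_3456` for the low register and `0xabcd` for the high one; in the patch the high register is only written on GA102 and later chipsets.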
Alexandre Courbot
2025-Apr-20 12:19 UTC
[PATCH 10/16] gpu: nova-core: add basic timer device
Add a timer that works with GPU time and provides the ability to wait on a condition with a specific timeout. The `Duration` Rust type is used to keep track of differences between timestamps; this will be replaced by the equivalent kernel type once it lands. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/nova-core/gpu.rs | 5 ++ drivers/gpu/nova-core/nova_core.rs | 1 + drivers/gpu/nova-core/regs.rs | 10 +++ drivers/gpu/nova-core/timer.rs | 133 +++++++++++++++++++++++++++++++++++++ 4 files changed, 149 insertions(+) diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs index d43e710cc983d51f053dacbd77cbbfb79fa882c3..1b3e43e0412e2a2ea178c7404ea647c9e38d4e04 100644 --- a/drivers/gpu/nova-core/gpu.rs +++ b/drivers/gpu/nova-core/gpu.rs @@ -7,6 +7,7 @@ use crate::driver::Bar0; use crate::firmware::Firmware; use crate::regs; +use crate::timer::Timer; use crate::util; use core::fmt; @@ -153,6 +154,7 @@ pub(crate) struct Gpu { bar: Devres<Bar0>, fw: Firmware, sysmem_flush: DmaObject, + timer: Timer, } #[pinned_drop] @@ -217,11 +219,14 @@ pub(crate) fn new( page }; + let timer = Timer::new(); + Ok(pin_init!(Self { spec, bar, fw, sysmem_flush, + timer, })) } } diff --git a/drivers/gpu/nova-core/nova_core.rs b/drivers/gpu/nova-core/nova_core.rs index 37c7eb0ea7a926bee4e3c661028847291bf07fa2..df3468c92c6081b3e2db218d92fbe1c40a0a75c3 100644 --- a/drivers/gpu/nova-core/nova_core.rs +++ b/drivers/gpu/nova-core/nova_core.rs @@ -26,6 +26,7 @@ macro_rules! with_bar { mod firmware; mod gpu; mod regs; +mod timer; mod util; kernel::module_pci_driver! 
{ diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs index 1e24787c4b5f432ac25fe399c8cb38b7350e44ae..f191cf4eb44c2b950e5cfcc6d04f95c122ce29d3 100644 --- a/drivers/gpu/nova-core/regs.rs +++ b/drivers/gpu/nova-core/regs.rs @@ -14,6 +14,16 @@ 28:20 chipset => try_into Chipset, "chipset model" ); +/* PTIMER */ + +register!(PtimerTime0 at 0x00009400; + 31:0 lo => as u32, "low 32-bits of the timer" +); + +register!(PtimerTime1 at 0x00009410; + 31:0 hi => as u32, "high 32 bits of the timer" +); + /* PFB */ register!(PfbNisoFlushSysmemAddr at 0x00100c10; diff --git a/drivers/gpu/nova-core/timer.rs b/drivers/gpu/nova-core/timer.rs new file mode 100644 index 0000000000000000000000000000000000000000..8987352f4192bc9b4b2fc0fb5f2e8e62ff27be68 --- /dev/null +++ b/drivers/gpu/nova-core/timer.rs @@ -0,0 +1,133 @@ +// SPDX-License-Identifier: GPL-2.0 + +//! Nova Core Timer subdevice + +// To be removed when all code is used. +#![allow(dead_code)] + +use core::fmt::Display; +use core::ops::{Add, Sub}; +use core::time::Duration; + +use kernel::devres::Devres; +use kernel::num::U64Ext; +use kernel::prelude::*; + +use crate::driver::Bar0; +use crate::regs; + +/// A timestamp with nanosecond granularity obtained from the GPU timer. +/// +/// A timestamp can also be subtracted from another in order to obtain a [`Duration`]. 
+#[derive(Debug, Copy, Clone, PartialEq, Eq, PartialOrd, Ord)] +pub(crate) struct Timestamp(u64); + +impl Display for Timestamp { + fn fmt(&self, f: &mut core::fmt::Formatter<'_>) -> core::fmt::Result { + write!(f, "{}", self.0) + } +} + +impl Add<Duration> for Timestamp { + type Output = Self; + + fn add(mut self, rhs: Duration) -> Self::Output { + let mut nanos = rhs.as_nanos(); + while nanos > u64::MAX as u128 { + self.0 = self.0.wrapping_add(nanos as u64); + nanos -= u64::MAX as u128; + } + + Timestamp(self.0.wrapping_add(nanos as u64)) + } +} + +impl Sub for Timestamp { + type Output = Duration; + + fn sub(self, rhs: Self) -> Self::Output { + Duration::from_nanos(self.0.wrapping_sub(rhs.0)) + } +} + +pub(crate) struct Timer {} + +impl Timer { + pub(crate) fn new() -> Self { + Self {} + } + + /// Read the current timer timestamp. + pub(crate) fn read(&self, bar: &Bar0) -> Timestamp { + loop { + let hi = regs::PtimerTime1::read(bar); + let lo = regs::PtimerTime0::read(bar); + + if hi.hi() == regs::PtimerTime1::read(bar).hi() { + return Timestamp(u64::from_u32s(hi.hi(), lo.lo())); + } + } + } + + #[allow(dead_code)] + pub(crate) fn time(bar: &Bar0, time: u64) { + regs::PtimerTime1::default() + .set_hi(time.upper_32_bits()) + .write(bar); + regs::PtimerTime0::default() + .set_lo(time.lower_32_bits()) + .write(bar); + } + + /// Wait until `cond` is true or `timeout` elapsed, based on GPU time. + /// + /// When `cond` evaluates to `Some`, its return value is returned. + /// + /// `Err(ETIMEDOUT)` is returned if `timeout` has been reached without `cond` evaluating to + /// `Some`, or if the timer device is stuck for some reason. + pub(crate) fn wait_on<R, F: Fn() -> Option<R>>( + &self, + bar: &Devres<Bar0>, + timeout: Duration, + cond: F, + ) -> Result<R> { + // Number of consecutive time reads after which we consider the timer frozen if it hasn't + // moved forward. 
+ const MAX_STALLED_READS: usize = 16; + + let (mut cur_time, mut prev_time, deadline) = { + let cur_time = with_bar!(bar, |b| self.read(b))?; + let deadline = cur_time + timeout; + + (cur_time, cur_time, deadline) + }; + let mut num_reads = 0; + + loop { + if let Some(ret) = cond() { + return Ok(ret); + } + + (|| { + cur_time = with_bar!(bar, |b| self.read(b))?; + + /* Check if the timer is frozen for some reason. */ + if cur_time == prev_time { + if num_reads >= MAX_STALLED_READS { + return Err(ETIMEDOUT); + } + num_reads += 1; + } else { + if cur_time >= deadline { + return Err(ETIMEDOUT); + } + + num_reads = 0; + prev_time = cur_time; + } + + Ok(()) + })()?; + } + } +} -- 2.49.0
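Two ideas from this patch can be shown in isolation: reading a 64-bit counter that is exposed as two 32-bit registers, and doing timestamp arithmetic modulo 2^64. The sketch below is a userspace illustration under stated assumptions, not the driver code: the register accessors are replaced by closures, and `read_u64_from_halves`, `add_wrapping` and `elapsed` are hypothetical names. Note that `add_wrapping` is a simplification and not the patch's `Add<Duration>` implementation; it relies on the fact that truncating the nanosecond count to `u64` already reduces it modulo 2^64.

```rust
use std::time::Duration;

/// Reads a 64-bit counter split across a high and a low 32-bit register.
///
/// The high half is read before and after the low half; if it changed, the
/// low half wrapped in between and the whole read is retried. This is the
/// same high/low/high sequence used by `Timer::read` in the patch.
pub fn read_u64_from_halves<H, L>(mut read_hi: H, mut read_lo: L) -> u64
where
    H: FnMut() -> u32,
    L: FnMut() -> u32,
{
    loop {
        let hi = read_hi();
        let lo = read_lo();
        if hi == read_hi() {
            return (u64::from(hi) << 32) | u64::from(lo);
        }
    }
}

/// Adds a duration to a nanosecond timestamp, wrapping modulo 2^64.
pub fn add_wrapping(timestamp: u64, delta: Duration) -> u64 {
    // `as u64` keeps the low 64 bits of the 128-bit nanosecond count.
    timestamp.wrapping_add(delta.as_nanos() as u64)
}

/// Time elapsed between two timestamps, correct across a single wrap-around.
pub fn elapsed(later: u64, earlier: u64) -> Duration {
    Duration::from_nanos(later.wrapping_sub(earlier))
}
```

Because subtraction wraps as well, a deadline computed with `add_wrapping` near `u64::MAX` still yields the expected `Duration` when passed through `elapsed`.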
Alexandre Courbot
2025-Apr-20 12:19 UTC
[PATCH 11/16] gpu: nova-core: add falcon register definitions and base code
Add the common Falcon code and HAL for Ampere GPUs, and instantiate the GSP and SEC2 Falcons that will be required to boot the GSP. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/nova-core/falcon.rs | 469 ++++++++++++++++++++++++++++++ drivers/gpu/nova-core/falcon/gsp.rs | 27 ++ drivers/gpu/nova-core/falcon/hal.rs | 54 ++++ drivers/gpu/nova-core/falcon/hal/ga102.rs | 111 +++++++ drivers/gpu/nova-core/falcon/sec2.rs | 9 + drivers/gpu/nova-core/gpu.rs | 16 + drivers/gpu/nova-core/nova_core.rs | 1 + drivers/gpu/nova-core/regs.rs | 189 ++++++++++++ drivers/gpu/nova-core/timer.rs | 3 - 9 files changed, 876 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/nova-core/falcon.rs b/drivers/gpu/nova-core/falcon.rs new file mode 100644 index 0000000000000000000000000000000000000000..71f374445ff3277eac628e183942c79f557366d5 --- /dev/null +++ b/drivers/gpu/nova-core/falcon.rs @@ -0,0 +1,469 @@ +// SPDX-License-Identifier: GPL-2.0 + +//! Falcon microprocessor base support + +// To be removed when all code is used. 
+#![allow(dead_code)] + +use core::hint::unreachable_unchecked; +use core::time::Duration; +use hal::FalconHal; +use kernel::bindings; +use kernel::devres::Devres; +use kernel::sync::Arc; +use kernel::{pci, prelude::*}; + +use crate::driver::Bar0; +use crate::gpu::Chipset; +use crate::regs; +use crate::timer::Timer; + +pub(crate) mod gsp; +mod hal; +pub(crate) mod sec2; + +#[repr(u8)] +#[derive(Debug, Default, Copy, Clone, PartialEq, Eq, PartialOrd, Ord)] +pub(crate) enum FalconCoreRev { + #[default] + Rev1 = 1, + Rev2 = 2, + Rev3 = 3, + Rev4 = 4, + Rev5 = 5, + Rev6 = 6, + Rev7 = 7, +} + +impl TryFrom<u32> for FalconCoreRev { + type Error = Error; + + fn try_from(value: u32) -> core::result::Result<Self, Self::Error> { + use FalconCoreRev::*; + + let rev = match value { + 1 => Rev1, + 2 => Rev2, + 3 => Rev3, + 4 => Rev4, + 5 => Rev5, + 6 => Rev6, + 7 => Rev7, + _ => return Err(EINVAL), + }; + + Ok(rev) + } +} + +#[repr(u8)] +#[derive(Debug, Default, Copy, Clone)] +pub(crate) enum FalconSecurityModel { + #[default] + None = 0, + Light = 2, + Heavy = 3, +} + +impl TryFrom<u32> for FalconSecurityModel { + type Error = Error; + + fn try_from(value: u32) -> core::result::Result<Self, Self::Error> { + use FalconSecurityModel::*; + + let sec_model = match value { + 0 => None, + 2 => Light, + 3 => Heavy, + _ => return Err(EINVAL), + }; + + Ok(sec_model) + } +} + +#[repr(u8)] +#[derive(Debug, Default, Copy, Clone, PartialEq, Eq, PartialOrd, Ord)] +pub(crate) enum FalconCoreRevSubversion { + #[default] + Subversion0 = 0, + Subversion1 = 1, + Subversion2 = 2, + Subversion3 = 3, +} + +impl From<u32> for FalconCoreRevSubversion { + fn from(value: u32) -> Self { + use FalconCoreRevSubversion::*; + + match value & 0b11 { + 0 => Subversion0, + 1 => Subversion1, + 2 => Subversion2, + 3 => Subversion3, + // SAFETY: the `0b11` mask limits the possible values to `0..=3`. 
+ 4..=u32::MAX => unsafe { unreachable_unchecked() }, + } + } +} + +#[repr(u8)] +#[derive(Debug, Default, Copy, Clone, PartialEq, Eq)] +pub(crate) enum FalconModSelAlgo { + #[default] + Rsa3k = 1, +} + +impl TryFrom<u32> for FalconModSelAlgo { + type Error = Error; + + fn try_from(value: u32) -> core::result::Result<Self, Self::Error> { + match value { + 1 => Ok(FalconModSelAlgo::Rsa3k), + _ => Err(EINVAL), + } + } +} + +#[derive(Debug, Clone, Copy, PartialEq, Eq)] +pub(crate) enum RiscvCoreSelect { + Falcon = 0, + Riscv = 1, +} + +impl From<bool> for RiscvCoreSelect { + fn from(value: bool) -> Self { + match value { + false => RiscvCoreSelect::Falcon, + true => RiscvCoreSelect::Riscv, + } + } +} + +#[derive(Debug, Clone, Copy, PartialEq, Eq)] +pub(crate) enum FalconMem { + Imem, + Dmem, +} + +#[derive(Debug, Clone, Default)] +pub(crate) enum FalconFbifTarget { + #[default] + LocalFb = 0, + CoherentSysmem = 1, + NoncoherentSysmem = 2, +} + +impl TryFrom<u32> for FalconFbifTarget { + type Error = Error; + + fn try_from(value: u32) -> core::result::Result<Self, Self::Error> { + let res = match value { + 0 => Self::LocalFb, + 1 => Self::CoherentSysmem, + 2 => Self::NoncoherentSysmem, + _ => return Err(EINVAL), + }; + + Ok(res) + } +} + +#[derive(Debug, Clone, Default)] +pub(crate) enum FalconFbifMemType { + #[default] + Virtual = 0, + Physical = 1, +} + +impl From<bool> for FalconFbifMemType { + fn from(value: bool) -> Self { + match value { + false => Self::Virtual, + true => Self::Physical, + } + } +} + +/// Trait defining the parameters of a given Falcon instance. +pub(crate) trait FalconEngine: Sync { + /// Base I/O address for the falcon, relative from which its registers are accessed. + const BASE: usize; +} + +/// Represents a portion of the firmware to be loaded into a particular memory (e.g. IMEM or DMEM). +#[derive(Debug)] +pub(crate) struct FalconLoadTarget { + /// Offset from the start of the source object to copy from. 
+ pub(crate) src_start: u32, + /// Offset from the start of the destination memory to copy into. + pub(crate) dst_start: u32, + /// Number of bytes to copy. + pub(crate) len: u32, +} + +#[derive(Debug)] +pub(crate) struct FalconBromParams { + pub(crate) pkc_data_offset: u32, + pub(crate) engine_id_mask: u16, + pub(crate) ucode_id: u8, +} + +pub(crate) trait FalconFirmware { + type Target: FalconEngine; + + /// Returns the DMA handle of the object containing the firmware. + fn dma_handle(&self) -> bindings::dma_addr_t; + + /// Returns the load parameters for `IMEM`. + fn imem_load(&self) -> FalconLoadTarget; + + /// Returns the load parameters for `DMEM`. + fn dmem_load(&self) -> FalconLoadTarget; + + /// Returns the parameters to write into the BROM registers. + fn brom_params(&self) -> FalconBromParams; + + /// Returns the start address of the firmware. + fn boot_addr(&self) -> u32; +} + +/// Contains the base parameters common to all Falcon instances. +pub(crate) struct Falcon<E: FalconEngine> { + pub hal: Arc<dyn FalconHal<E>>, +} + +impl<E: FalconEngine + 'static> Falcon<E> { + pub(crate) fn new( + pdev: &pci::Device, + chipset: Chipset, + bar: &Devres<Bar0>, + need_riscv: bool, + ) -> Result<Self> { + let hwcfg1 = with_bar!(bar, |b| regs::FalconHwcfg1::read(b, E::BASE))?; + // Ensure that the revision and security model contain valid values. 
+ let _rev = hwcfg1.core_rev()?; + let _sec_model = hwcfg1.security_model()?; + + if need_riscv { + let hwcfg2 = with_bar!(bar, |b| regs::FalconHwcfg2::read(b, E::BASE))?; + if !hwcfg2.riscv() { + dev_err!( + pdev.as_ref(), + "riscv support requested on falcon that does not support it\n" + ); + return Err(EINVAL); + } + } + + Ok(Self { + hal: hal::create_falcon_hal(chipset)?, + }) + } + + fn reset_wait_mem_scrubbing(&self, bar: &Devres<Bar0>, timer: &Timer) -> Result<()> { + timer.wait_on(bar, Duration::from_millis(20), || { + bar.try_access_with(|b| regs::FalconHwcfg2::read(b, E::BASE)) + .and_then(|r| if r.mem_scrubbing() { Some(()) } else { None }) + }) + } + + fn reset_eng(&self, bar: &Devres<Bar0>, timer: &Timer) -> Result<()> { + let _ = with_bar!(bar, |b| regs::FalconHwcfg2::read(b, E::BASE))?; + + // According to OpenRM's `kflcnPreResetWait_GA102` documentation, HW sometimes does not set + // RESET_READY so a non-failing timeout is used. + let _ = timer.wait_on(bar, Duration::from_micros(150), || { + bar.try_access_with(|b| regs::FalconHwcfg2::read(b, E::BASE)) + .and_then(|r| if r.reset_ready() { Some(()) } else { None }) + }); + + with_bar!(bar, |b| regs::FalconEngine::alter(b, E::BASE, |v| v + .set_reset(true)))?; + + let _: Result<()> = timer.wait_on(bar, Duration::from_micros(10), || None); + + with_bar!(bar, |b| regs::FalconEngine::alter(b, E::BASE, |v| v + .set_reset(false)))?; + + self.reset_wait_mem_scrubbing(bar, timer)?; + + Ok(()) + } + + pub(crate) fn reset(&self, bar: &Devres<Bar0>, timer: &Timer) -> Result<()> { + self.reset_eng(bar, timer)?; + self.hal.select_core(bar, timer)?; + self.reset_wait_mem_scrubbing(bar, timer)?; + + with_bar!(bar, |b| { + regs::FalconRm::default() + .set_val(regs::Boot0::read(b).into()) + .write(b, E::BASE) + }) + } + + fn dma_wr( + &self, + bar: &Devres<Bar0>, + timer: &Timer, + dma_handle: bindings::dma_addr_t, + target_mem: FalconMem, + load_offsets: FalconLoadTarget, + sec: bool, + ) -> Result<()> { + const 
DMA_LEN: u32 = 256; + const DMA_LEN_ILOG2_MINUS2: u8 = (DMA_LEN.ilog2() - 2) as u8; + + // For IMEM, we want to use the start offset as a virtual address tag for each page, since + // code addresses in the firmware (and the boot vector) are virtual. + // + // For DMEM we can fold the start offset into the DMA handle. + let (src_start, dma_start) = match target_mem { + FalconMem::Imem => (load_offsets.src_start, dma_handle), + FalconMem::Dmem => ( + 0, + dma_handle + load_offsets.src_start as bindings::dma_addr_t, + ), + }; + if dma_start % DMA_LEN as bindings::dma_addr_t > 0 { + pr_err!( + "DMA transfer start addresses must be a multiple of {}", + DMA_LEN + ); + return Err(EINVAL); + } + if load_offsets.len % DMA_LEN > 0 { + pr_err!("DMA transfer length must be a multiple of {}", DMA_LEN); + return Err(EINVAL); + } + + // Set up the base source DMA address. + with_bar!(bar, |b| { + regs::FalconDmaTrfBase::default() + .set_base((dma_start >> 8) as u32) + .write(b, E::BASE); + regs::FalconDmaTrfBase1::default() + .set_base((dma_start >> 40) as u16) + .write(b, E::BASE) + })?; + + let cmd = regs::FalconDmaTrfCmd::default() + .set_size(DMA_LEN_ILOG2_MINUS2) + .set_imem(target_mem == FalconMem::Imem) + .set_sec(if sec { 1 } else { 0 }); + + for pos in (0..load_offsets.len).step_by(DMA_LEN as usize) { + // Perform a transfer of size `DMA_LEN`. + with_bar!(bar, |b| { + regs::FalconDmaTrfMOffs::default() + .set_offs(load_offsets.dst_start + pos) + .write(b, E::BASE); + regs::FalconDmaTrfBOffs::default() + .set_offs(src_start + pos) + .write(b, E::BASE); + cmd.write(b, E::BASE) + })?; + + // Wait for the transfer to complete. 
+ timer.wait_on(bar, Duration::from_millis(2000), || { + bar.try_access_with(|b| regs::FalconDmaTrfCmd::read(b, E::BASE)) + .and_then(|v| if v.idle() { Some(()) } else { None }) + })?; + } + + Ok(()) + } + + pub(crate) fn dma_load<F: FalconFirmware<Target = E>>( + &self, + bar: &Devres<Bar0>, + timer: &Timer, + fw: &F, + ) -> Result<()> { + let dma_handle = fw.dma_handle(); + + with_bar!(bar, |b| { + regs::FalconFbifCtl::alter(b, E::BASE, |v| v.set_allow_phys_no_ctx(true)); + regs::FalconDmaCtl::default().write(b, E::BASE); + regs::FalconFbifTranscfg::alter(b, E::BASE, |v| { + v.set_target(FalconFbifTarget::CoherentSysmem) + .set_mem_type(FalconFbifMemType::Physical) + }); + })?; + + self.dma_wr( + bar, + timer, + dma_handle, + FalconMem::Imem, + fw.imem_load(), + true, + )?; + self.dma_wr( + bar, + timer, + dma_handle, + FalconMem::Dmem, + fw.dmem_load(), + true, + )?; + + self.hal.program_brom(bar, &fw.brom_params())?; + + with_bar!(bar, |b| { + // Set `BootVec` to start of non-secure code. 
+ regs::FalconBootVec::default() + .set_boot_vec(fw.boot_addr()) + .write(b, E::BASE); + })?; + + Ok(()) + } + + pub(crate) fn boot( + &self, + bar: &Devres<Bar0>, + timer: &Timer, + mbox0: Option<u32>, + mbox1: Option<u32>, + ) -> Result<(u32, u32)> { + with_bar!(bar, |b| { + if let Some(mbox0) = mbox0 { + regs::FalconMailbox0::default() + .set_mailbox0(mbox0) + .write(b, E::BASE); + } + + if let Some(mbox1) = mbox1 { + regs::FalconMailbox1::default() + .set_mailbox1(mbox1) + .write(b, E::BASE); + } + + match regs::FalconCpuCtl::read(b, E::BASE).alias_en() { + true => regs::FalconCpuCtlAlias::default() + .set_start_cpu(true) + .write(b, E::BASE), + false => regs::FalconCpuCtl::default() + .set_start_cpu(true) + .write(b, E::BASE), + } + })?; + + timer.wait_on(bar, Duration::from_secs(2), || { + bar.try_access() + .map(|b| regs::FalconCpuCtl::read(&*b, E::BASE)) + .and_then(|v| if v.halted() { Some(()) } else { None }) + })?; + + let (mbox0, mbox1) = with_bar!(bar, |b| { + let mbox0 = regs::FalconMailbox0::read(b, E::BASE).mailbox0(); + let mbox1 = regs::FalconMailbox1::read(b, E::BASE).mailbox1(); + + (mbox0, mbox1) + })?; + + Ok((mbox0, mbox1)) + } +} diff --git a/drivers/gpu/nova-core/falcon/gsp.rs b/drivers/gpu/nova-core/falcon/gsp.rs new file mode 100644 index 0000000000000000000000000000000000000000..44b8dc118eda1263eaede466efd55408c6e7cded --- /dev/null +++ b/drivers/gpu/nova-core/falcon/gsp.rs @@ -0,0 +1,27 @@ +// SPDX-License-Identifier: GPL-2.0 + +use kernel::devres::Devres; +use kernel::prelude::*; + +use crate::{ + driver::Bar0, + falcon::{Falcon, FalconEngine}, + regs, +}; + +pub(crate) struct Gsp; +impl FalconEngine for Gsp { + const BASE: usize = 0x00110000; +} + +pub(crate) type GspFalcon = Falcon<Gsp>; + +impl Falcon<Gsp> { + /// Clears the SWGEN0 bit in the Falcon's IRQ status clear register to + /// allow GSP to signal CPU for processing new messages in message queue. 
+ pub(crate) fn clear_swgen0_intr(&self, bar: &Devres<Bar0>) -> Result<()> { + with_bar!(bar, |b| regs::FalconIrqsclr::default() + .set_swgen0(true) + .write(b, Gsp::BASE)) + } +} diff --git a/drivers/gpu/nova-core/falcon/hal.rs b/drivers/gpu/nova-core/falcon/hal.rs new file mode 100644 index 0000000000000000000000000000000000000000..5ebf4e88f1f25a13cf47859a53507be53e795d34 --- /dev/null +++ b/drivers/gpu/nova-core/falcon/hal.rs @@ -0,0 +1,54 @@ +// SPDX-License-Identifier: GPL-2.0 + +use kernel::devres::Devres; +use kernel::prelude::*; +use kernel::sync::Arc; + +use crate::driver::Bar0; +use crate::falcon::{FalconBromParams, FalconEngine}; +use crate::gpu::Chipset; +use crate::timer::Timer; + +mod ga102; + +/// Hardware Abstraction Layer for Falcon cores. +/// +/// Implements chipset-specific low-level operations. The trait is generic against [`FalconEngine`] +/// so its `BASE` parameter can be used in order to avoid runtime bound checks when accessing +/// registers. +pub(crate) trait FalconHal<E: FalconEngine>: Sync { + // Activates the Falcon core if the engine is a risvc/falcon dual engine. + fn select_core(&self, _bar: &Devres<Bar0>, _timer: &Timer) -> Result<()> { + Ok(()) + } + + fn get_signature_reg_fuse_version( + &self, + bar: &Devres<Bar0>, + engine_id_mask: u16, + ucode_id: u8, + ) -> Result<u32>; + + // Program the BROM registers prior to starting a secure firmware. + fn program_brom(&self, bar: &Devres<Bar0>, params: &FalconBromParams) -> Result<()>; +} + +/// Returns a boxed falcon HAL adequate for the passed `chipset`. +/// +/// We use this function and a heap-allocated trait object instead of statically defined trait +/// objects because of the two-dimensional (Chipset, Engine) lookup required to return the +/// requested HAL. +/// +/// TODO: replace the return type with `KBox` once it gains the ability to host trait objects. 
+pub(crate) fn create_falcon_hal<E: FalconEngine + 'static>( + chipset: Chipset, +) -> Result<Arc<dyn FalconHal<E>>> { + let hal = match chipset { + Chipset::GA102 | Chipset::GA103 | Chipset::GA104 | Chipset::GA106 | Chipset::GA107 => { + Arc::new(ga102::Ga102::<E>::new(), GFP_KERNEL)? as Arc<dyn FalconHal<E>> + } + _ => return Err(ENOTSUPP), + }; + + Ok(hal) +} diff --git a/drivers/gpu/nova-core/falcon/hal/ga102.rs b/drivers/gpu/nova-core/falcon/hal/ga102.rs new file mode 100644 index 0000000000000000000000000000000000000000..747b02ca671f7d4a97142665a9ba64807c87391e --- /dev/null +++ b/drivers/gpu/nova-core/falcon/hal/ga102.rs @@ -0,0 +1,111 @@ +// SPDX-License-Identifier: GPL-2.0 + +use core::marker::PhantomData; +use core::time::Duration; + +use kernel::devres::Devres; +use kernel::prelude::*; + +use crate::driver::Bar0; +use crate::falcon::{FalconBromParams, FalconEngine, FalconModSelAlgo, RiscvCoreSelect}; +use crate::regs; +use crate::timer::Timer; + +use super::FalconHal; + +fn select_core_ga102<E: FalconEngine>(bar: &Devres<Bar0>, timer: &Timer) -> Result<()> { + let bcr_ctrl = with_bar!(bar, |b| regs::RiscvBcrCtrl::read(b, E::BASE))?; + if bcr_ctrl.core_select() != RiscvCoreSelect::Falcon { + with_bar!(bar, |b| regs::RiscvBcrCtrl::default() + .set_core_select(RiscvCoreSelect::Falcon) + .write(b, E::BASE))?; + + timer.wait_on(bar, Duration::from_millis(10), || { + bar.try_access_with(|b| regs::RiscvBcrCtrl::read(b, E::BASE)) + .and_then(|v| if v.valid() { Some(()) } else { None }) + })?; + } + + Ok(()) +} + +fn get_signature_reg_fuse_version_ga102( + bar: &Devres<Bar0>, + engine_id_mask: u16, + ucode_id: u8, +) -> Result<u32> { + // The ucode fuse versions are contained in the FUSE_OPT_FPF_<ENGINE>_UCODE<X>_VERSION + // registers, which are an array. Our register definition macros do not allow us to manage them + // properly, so we need to hardcode their addresses for now. + + // Each engine has 16 ucode version registers numbered from 1 to 16. 
+ if ucode_id == 0 || ucode_id > 16 { + pr_warn!("invalid ucode id {:#x}", ucode_id); + return Err(EINVAL); + } + let reg_fuse = if engine_id_mask & 0x0001 != 0 { + // NV_FUSE_OPT_FPF_SEC2_UCODE1_VERSION + 0x824140 + } else if engine_id_mask & 0x0004 != 0 { + // NV_FUSE_OPT_FPF_NVDEC_UCODE1_VERSION + 0x824100 + } else if engine_id_mask & 0x0400 != 0 { + // NV_FUSE_OPT_FPF_GSP_UCODE1_VERSION + 0x8241c0 + } else { + pr_warn!("unexpected engine_id_mask {:#x}", engine_id_mask); + return Err(EINVAL); + } + ((ucode_id - 1) as usize * core::mem::size_of::<u32>()); + + let reg_fuse_version = with_bar!(bar, |b| { b.read32(reg_fuse) })?; + + // Equivalent of Find Last Set bit. + Ok(u32::BITS - reg_fuse_version.leading_zeros()) +} + +fn program_brom_ga102<E: FalconEngine>( + bar: &Devres<Bar0>, + params: &FalconBromParams, +) -> Result<()> { + with_bar!(bar, |b| { + regs::FalconBromParaaddr0::default() + .set_addr(params.pkc_data_offset) + .write(b, E::BASE); + regs::FalconBromEngidmask::default() + .set_mask(params.engine_id_mask as u32) + .write(b, E::BASE); + regs::FalconBromCurrUcodeId::default() + .set_ucode_id(params.ucode_id as u32) + .write(b, E::BASE); + regs::FalconModSel::default() + .set_algo(FalconModSelAlgo::Rsa3k) + .write(b, E::BASE) + }) +} + +pub(super) struct Ga102<E: FalconEngine>(PhantomData<E>); + +impl<E: FalconEngine> Ga102<E> { + pub(super) fn new() -> Self { + Self(PhantomData) + } +} + +impl<E: FalconEngine> FalconHal<E> for Ga102<E> { + fn select_core(&self, bar: &Devres<Bar0>, timer: &Timer) -> Result<()> { + select_core_ga102::<E>(bar, timer) + } + + fn get_signature_reg_fuse_version( + &self, + bar: &Devres<Bar0>, + engine_id_mask: u16, + ucode_id: u8, + ) -> Result<u32> { + get_signature_reg_fuse_version_ga102(bar, engine_id_mask, ucode_id) + } + + fn program_brom(&self, bar: &Devres<Bar0>, params: &FalconBromParams) -> Result<()> { + program_brom_ga102::<E>(bar, params) + } +} diff --git a/drivers/gpu/nova-core/falcon/sec2.rs 
b/drivers/gpu/nova-core/falcon/sec2.rs new file mode 100644 index 0000000000000000000000000000000000000000..85dda3e8380a3d31d34c92c4236c6f81c63ce772 --- /dev/null +++ b/drivers/gpu/nova-core/falcon/sec2.rs @@ -0,0 +1,9 @@ +// SPDX-License-Identifier: GPL-2.0 + +use crate::falcon::{Falcon, FalconEngine}; + +pub(crate) struct Sec2; +impl FalconEngine for Sec2 { + const BASE: usize = 0x00840000; +} +pub(crate) type Sec2Falcon = Falcon<Sec2>; diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs index 1b3e43e0412e2a2ea178c7404ea647c9e38d4e04..ec4c648c6e8b4aa7d06c627ed59c0e66a08c679e 100644 --- a/drivers/gpu/nova-core/gpu.rs +++ b/drivers/gpu/nova-core/gpu.rs @@ -5,6 +5,8 @@ use crate::devinit; use crate::dma::DmaObject; use crate::driver::Bar0; +use crate::falcon::gsp::GspFalcon; +use crate::falcon::sec2::Sec2Falcon; use crate::firmware::Firmware; use crate::regs; use crate::timer::Timer; @@ -221,6 +223,20 @@ pub(crate) fn new( let timer = Timer::new(); + let gsp_falcon = GspFalcon::new( + pdev, + spec.chipset, + &bar, + if spec.chipset > Chipset::GA100 { + true + } else { + false + }, + )?; + gsp_falcon.clear_swgen0_intr(&bar)?; + + let _sec2_falcon = Sec2Falcon::new(pdev, spec.chipset, &bar, true)?; + Ok(pin_init!(Self { spec, bar, diff --git a/drivers/gpu/nova-core/nova_core.rs b/drivers/gpu/nova-core/nova_core.rs index df3468c92c6081b3e2db218d92fbe1c40a0a75c3..4dde8004d24882c60669b5acd6af9d6988c66a9c 100644 --- a/drivers/gpu/nova-core/nova_core.rs +++ b/drivers/gpu/nova-core/nova_core.rs @@ -23,6 +23,7 @@ macro_rules! 
with_bar { mod devinit; mod dma; mod driver; +mod falcon; mod firmware; mod gpu; mod regs; diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs index f191cf4eb44c2b950e5cfcc6d04f95c122ce29d3..c76a16dc8e7267a4eb54cb71e1cca6fb9e00188f 100644 --- a/drivers/gpu/nova-core/regs.rs +++ b/drivers/gpu/nova-core/regs.rs @@ -6,6 +6,10 @@ #[macro_use] mod macros; +use crate::falcon::{ + FalconCoreRev, FalconCoreRevSubversion, FalconFbifMemType, FalconFbifTarget, FalconModSelAlgo, + FalconSecurityModel, RiscvCoreSelect, +}; use crate::gpu::Chipset; register!(Boot0 at 0x00000000, "Basic revision information about the GPU"; @@ -44,3 +48,188 @@ register!(Pgc6AonSecureScratchGroup05 at 0x00118234; 31:0 value => as u32 ); + +/* PFALCON */ + +register!(FalconIrqsclr at +0x00000004; + 4:4 halt => as_bit bool; + 6:6 swgen0 => as_bit bool; +); + +register!(FalconIrqstat at +0x00000008; + 4:4 halt => as_bit bool; + 6:6 swgen0 => as_bit bool; +); + +register!(FalconIrqmclr at +0x00000014; + 31:0 val => as u32 +); + +register!(FalconIrqmask at +0x00000018; + 31:0 val => as u32 +); + +register!(FalconRm at +0x00000084; + 31:0 val => as u32 +); + +register!(FalconIrqdest at +0x0000001c; + 31:0 val => as u32 +); + +register!(FalconMailbox0 at +0x00000040; + 31:0 mailbox0 => as u32 +); +register!(FalconMailbox1 at +0x00000044; + 31:0 mailbox1 => as u32 +); + +register!(FalconHwcfg2 at +0x000000f4; + 10:10 riscv => as_bit bool; + 12:12 mem_scrubbing => as_bit bool; + 31:31 reset_ready => as_bit bool; +); + +register!(FalconCpuCtl at +0x00000100; + 1:1 start_cpu => as_bit bool; + 4:4 halted => as_bit bool; + 6:6 alias_en => as_bit bool; +); + +register!(FalconBootVec at +0x00000104; + 31:0 boot_vec => as u32 +); + +register!(FalconHwCfg at +0x00000108; + 8:0 imem_size => as u32; + 17:9 dmem_size => as u32; +); + +register!(FalconDmaCtl at +0x0000010c; + 0:0 require_ctx => as_bit bool; + 1:1 dmem_scrubbing => as_bit bool; + 2:2 imem_scrubbing => as_bit bool; + 6:3 dmaq_num 
=> as_bit u8; + 7:7 secure_stat => as_bit bool; +); + +register!(FalconDmaTrfBase@+0x00000110; + 31:0 base => as u32; +); + +register!(FalconDmaTrfMOffs@+0x00000114; + 23:0 offs => as u32; +); + +register!(FalconDmaTrfCmd@+0x00000118; + 0:0 full => as_bit bool; + 1:1 idle => as_bit bool; + 3:2 sec => as_bit u8; + 4:4 imem => as_bit bool; + 5:5 is_write => as_bit bool; + 10:8 size => as u8; + 14:12 ctxdma => as u8; + 16:16 set_dmtag => as u8; +); + +register!(FalconDmaTrfBOffs@+0x0000011c; + 31:0 offs => as u32; +); + +register!(FalconDmaTrfBase1@+0x00000128; + 8:0 base => as u16; +); + +register!(FalconHwcfg1@+0x0000012c; + 3:0 core_rev => try_into FalconCoreRev, "core revision of the falcon"; + 5:4 security_model => try_into FalconSecurityModel, "security model of the falcon"; + 7:6 core_rev_subversion => into FalconCoreRevSubversion; + 11:8 imem_ports => as u8; + 15:12 dmem_ports => as u8; +); + +register!(FalconCpuCtlAlias@+0x00000130; + 1:1 start_cpu => as_bit bool; +); + +/* TODO: this is an array of registers */ +register!(FalconImemC@+0x00000180; + 7:2 offs => as u8; + 23:8 blk => as u8; + 24:24 aincw => as_bit bool; + 25:25 aincr => as_bit bool; + 28:28 secure => as_bit bool; + 29:29 sec_atomic => as_bit bool; +); + +register!(FalconImemD@+0x00000184; + 31:0 data => as u32; +); + +register!(FalconImemT@+0x00000188; + 15:0 data => as u16; +); + +register!(FalconDmemC@+0x000001c0; + 7:2 offs => as u8; + 23:0 addr => as u32; + 23:8 blk => as u8; + 24:24 aincw => as_bit bool; + 25:25 aincr => as_bit bool; + 26:26 settag => as_bit bool; + 27:27 setlvl => as_bit bool; + 28:28 va => as_bit bool; + 29:29 miss => as_bit bool; +); + +register!(FalconDmemD@+0x000001c4; + 31:0 data => as u32; +); + +register!(FalconModSel@+0x00001180; + 7:0 algo => try_into FalconModSelAlgo; +); +register!(FalconBromCurrUcodeId@+0x00001198; + 31:0 ucode_id => as u32; +); +register!(FalconBromEngidmask@+0x0000119c; + 31:0 mask => as u32; +); 
+register!(FalconBromParaaddr0@+0x00001210; + 31:0 addr => as u32; +); + +register!(RiscvCpuctl@+0x00000388; + 0:0 startcpu => as_bit bool; + 4:4 halted => as_bit bool; + 5:5 stopped => as_bit bool; + 7:7 active_stat => as_bit bool; +); + +register!(FalconEngine@+0x000003c0; + 0:0 reset => as_bit bool; +); + +register!(RiscvIrqmask@+0x00000528; + 31:0 mask => as u32; +); + +register!(RiscvIrqdest@+0x0000052c; + 31:0 dest => as u32; +); + +/* TODO: this is an array of registers */ +register!(FalconFbifTranscfg@+0x00000600; + 1:0 target => try_into FalconFbifTarget; + 2:2 mem_type => as_bit FalconFbifMemType; +); + +register!(FalconFbifCtl@+0x00000624; + 7:7 allow_phys_no_ctx => as_bit bool; +); + +register!(RiscvBcrCtrl@+0x00001668; + 0:0 valid => as_bit bool; + 4:4 core_select => as_bit RiscvCoreSelect; + 8:8 br_fetch => as_bit bool; +); diff --git a/drivers/gpu/nova-core/timer.rs b/drivers/gpu/nova-core/timer.rs index 8987352f4192bc9b4b2fc0fb5f2e8e62ff27be68..c03a5c36d1230dfbf2bd6e02a793264280c6d509 100644 --- a/drivers/gpu/nova-core/timer.rs +++ b/drivers/gpu/nova-core/timer.rs @@ -2,9 +2,6 @@ //! Nova Core Timer subdevice -// To be removed when all code is used. -#![allow(dead_code)] - use core::fmt::Display; use core::ops::{Add, Sub}; use core::time::Duration; -- 2.49.0
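Editor's note on the register definitions in this patch: each field line gives an inclusive `hi:lo` bit range within a 32-bit register. The sketch below is not the `register!` macro and uses no kernel APIs. It is a stand-alone illustration, under the assumption that a field spec such as `5:4 security_model` amounts to a shift and a mask. The `field` and `decode_hwcfg1` helpers, the `Hwcfg1` struct and the raw value are hypothetical; only the bit ranges are taken from the FalconHwcfg1 definition above.

```rust
/// Extract the inclusive bit range `hi:lo` from `value`.
fn field(value: u32, hi: u32, lo: u32) -> u32 {
    assert!(lo <= hi && hi < 32);
    let width = hi - lo + 1;
    // A full 32-bit field needs its own case: `1u32 << 32` would overflow.
    let mask = if width == 32 { u32::MAX } else { (1u32 << width) - 1 };
    (value >> lo) & mask
}

/// Hypothetical decoded view of FalconHwcfg1 (bit ranges from the patch).
#[derive(Debug, PartialEq)]
struct Hwcfg1 {
    core_rev: u32,
    security_model: u32,
    core_rev_subversion: u32,
    imem_ports: u32,
    dmem_ports: u32,
}

fn decode_hwcfg1(raw: u32) -> Hwcfg1 {
    Hwcfg1 {
        core_rev: field(raw, 3, 0),
        security_model: field(raw, 5, 4),
        core_rev_subversion: field(raw, 7, 6),
        imem_ports: field(raw, 11, 8),
        dmem_ports: field(raw, 15, 12),
    }
}

fn main() {
    // Made-up raw value for illustration; not read from hardware.
    let cfg = decode_hwcfg1(0x0000_1236);
    println!("{:?}", cfg);
}
```

In the driver itself the macro additionally converts fields into typed values (`try_into FalconCoreRev`, `as_bit bool`, and so on); the sketch stops at the raw integer.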
Alexandre Courbot
2025-Apr-20 12:19 UTC
[PATCH 12/16] gpu: nova-core: firmware: add ucode descriptor used by FWSEC-FRTS
FWSEC-FRTS is the first firmware we need to run on the GSP falcon in order to initiate the GSP boot process. Introduce the structure that describes it. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/nova-core/firmware.rs | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs index 9bad7a86382af7917b3dce7bf3087d0002bd5971..4ef5ba934b9d255635aa9a902e1d3a732d6e5568 100644 --- a/drivers/gpu/nova-core/firmware.rs +++ b/drivers/gpu/nova-core/firmware.rs @@ -43,6 +43,34 @@ pub(crate) fn new( } } +/// Structure used to describe some firmwares, notable fwsec-frts. +#[allow(dead_code)] +#[repr(C)] +#[derive(Debug, Clone)] +pub(crate) struct FalconUCodeDescV3 { + pub(crate) hdr: u32, + pub(crate) stored_size: u32, + pub(crate) pkc_data_offset: u32, + pub(crate) interface_offset: u32, + pub(crate) imem_phys_base: u32, + pub(crate) imem_load_size: u32, + pub(crate) imem_virt_base: u32, + pub(crate) dmem_phys_base: u32, + pub(crate) dmem_load_size: u32, + pub(crate) engine_id_mask: u16, + pub(crate) ucode_id: u8, + pub(crate) signature_count: u8, + pub(crate) signature_versions: u16, + _reserved: u16, +} + +#[allow(dead_code)] +impl FalconUCodeDescV3 { + pub(crate) fn size(&self) -> usize { + ((self.hdr & 0xffff0000) >> 16) as usize + } +} + pub(crate) struct ModInfoBuilder<const N: usize>(firmware::ModInfoBuilder<N>); impl<const N: usize> ModInfoBuilder<N> { -- 2.49.0
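Editor's note on the descriptor in this patch: `size()` returns the upper 16 bits of `hdr`. The stand-alone sketch below copies the field layout from the patch (visibility modifiers dropped, `Default` derived for convenience) and checks two things: what `size()` yields for a given header word, and how many bytes the fixed part of the struct occupies under `#[repr(C)]`. The header value `0x002c_0301` is hypothetical and chosen only so that the upper half reads 0x2c. The patch does not state what real firmware stores there.

```rust
use std::mem::size_of;

/// Stand-alone copy of the descriptor layout from the patch.
#[repr(C)]
#[derive(Debug, Clone, Default)]
struct FalconUCodeDescV3 {
    hdr: u32,
    stored_size: u32,
    pkc_data_offset: u32,
    interface_offset: u32,
    imem_phys_base: u32,
    imem_load_size: u32,
    imem_virt_base: u32,
    dmem_phys_base: u32,
    dmem_load_size: u32,
    engine_id_mask: u16,
    ucode_id: u8,
    signature_count: u8,
    signature_versions: u16,
    _reserved: u16,
}

impl FalconUCodeDescV3 {
    /// Same expression as in the patch: the upper 16 bits of `hdr`.
    fn size(&self) -> usize {
        ((self.hdr & 0xffff0000) >> 16) as usize
    }
}

fn main() {
    // Hypothetical header word: 0x002c in the upper half, arbitrary lower half.
    let desc = FalconUCodeDescV3 {
        hdr: 0x002c_0301,
        ..Default::default()
    };
    println!("size() from hdr: {}", desc.size());
    // Nine u32 fields (36 bytes) plus u16 + u8 + u8 + u16 + u16 (8 bytes).
    println!("size_of struct:  {}", size_of::<FalconUCodeDescV3>());
}
```

The layout has no padding holes: the two `u8` fields sit between `u16` fields at even offsets, so the 44-byte total follows directly from the field sizes.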
Alexandre Courbot
2025-Apr-20 12:19 UTC
[PATCH 13/16] gpu: nova-core: Add support for VBIOS ucode extraction for boot
From: Joel Fernandes <joelagnelf at nvidia.com> Add support for navigating and setting up vBIOS ucode data required for GSP to boot. The main data extracted from the vBIOS is the FWSEC-FRTS firmware which runs on the GSP processor. This firmware runs in high secure mode, and sets up the WPR2 (Write protected region) before the Booter runs on the SEC2 processor. Also add log messages to show the BIOS images. [102141.013287] NovaCore: Found BIOS image at offset 0x0, size: 0xfe00, type: PciAt [102141.080692] NovaCore: Found BIOS image at offset 0xfe00, size: 0x14800, type: Efi [102141.098443] NovaCore: Found BIOS image at offset 0x24600, size: 0x5600, type: FwSec [102141.415095] NovaCore: Found BIOS image at offset 0x29c00, size: 0x60800, type: FwSec Tested on my Ampere GA102 and boot is successful. [applied changes by Alex Courbot for fwsec signatures] [applied feedback from Alex Courbot and Timur Tabi] Signed-off-by: Joel Fernandes <joelagnelf at nvidia.com> Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/nova-core/firmware.rs | 2 - drivers/gpu/nova-core/gpu.rs | 5 + drivers/gpu/nova-core/nova_core.rs | 1 + drivers/gpu/nova-core/vbios.rs | 1103 ++++++++++++++++++++++++++++++++++++ 4 files changed, 1109 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs index 4ef5ba934b9d255635aa9a902e1d3a732d6e5568..58c0513d49e9a0cef36917c8e2b25c414f6fc596 100644 --- a/drivers/gpu/nova-core/firmware.rs +++ b/drivers/gpu/nova-core/firmware.rs @@ -44,7 +44,6 @@ pub(crate) fn new( } /// Structure used to describe some firmwares, notable fwsec-frts. 
-#[allow(dead_code)] #[repr(C)] #[derive(Debug, Clone)] pub(crate) struct FalconUCodeDescV3 { @@ -64,7 +63,6 @@ pub(crate) struct FalconUCodeDescV3 { _reserved: u16, } -#[allow(dead_code)] impl FalconUCodeDescV3 { pub(crate) fn size(&self) -> usize { ((self.hdr & 0xffff0000) >> 16) as usize diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs index ec4c648c6e8b4aa7d06c627ed59c0e66a08c679e..2344dfc69fe4246644437d70572680a4450b5bd7 100644 --- a/drivers/gpu/nova-core/gpu.rs +++ b/drivers/gpu/nova-core/gpu.rs @@ -11,6 +11,7 @@ use crate::regs; use crate::timer::Timer; use crate::util; +use crate::vbios::Vbios; use core::fmt; macro_rules! define_chipset { @@ -157,6 +158,7 @@ pub(crate) struct Gpu { fw: Firmware, sysmem_flush: DmaObject, timer: Timer, + bios: Vbios, } #[pinned_drop] @@ -237,12 +239,15 @@ pub(crate) fn new( let _sec2_falcon = Sec2Falcon::new(pdev, spec.chipset, &bar, true)?; + let bios = Vbios::probe(&bar)?; + Ok(pin_init!(Self { spec, bar, fw, sysmem_flush, timer, + bios, })) } } diff --git a/drivers/gpu/nova-core/nova_core.rs b/drivers/gpu/nova-core/nova_core.rs index 4dde8004d24882c60669b5acd6af9d6988c66a9c..2858f4a0dc35eb9d6547d5cbd81de44c8fc47bae 100644 --- a/drivers/gpu/nova-core/nova_core.rs +++ b/drivers/gpu/nova-core/nova_core.rs @@ -29,6 +29,7 @@ macro_rules! with_bar { mod regs; mod timer; mod util; +mod vbios; kernel::module_pci_driver! { type: driver::NovaCore, diff --git a/drivers/gpu/nova-core/vbios.rs b/drivers/gpu/nova-core/vbios.rs new file mode 100644 index 0000000000000000000000000000000000000000..534107b708cab0eb8d0accf7daa5718edf030358 --- /dev/null +++ b/drivers/gpu/nova-core/vbios.rs @@ -0,0 +1,1103 @@ +// SPDX-License-Identifier: GPL-2.0 + +// To be removed when all code is used. +#![allow(dead_code)] + +//! VBIOS extraction and parsing. 
+ +use crate::driver::Bar0; +use crate::firmware::FalconUCodeDescV3; +use core::convert::TryFrom; +use kernel::devres::Devres; +use kernel::error::Result; +use kernel::prelude::*; + +/// The offset of the VBIOS ROM in the BAR0 space. +const ROM_OFFSET: usize = 0x300000; +/// The maximum length of the VBIOS ROM to scan into. +const BIOS_MAX_SCAN_LEN: usize = 0x100000; +/// The size to read ahead when parsing initial BIOS image headers. +const BIOS_READ_AHEAD_SIZE: usize = 1024; + +// PMU lookup table entry types. Used to locate PMU table entries +// in the Fwsec image, corresponding to falcon ucodes. +#[allow(dead_code)] +const FALCON_UCODE_ENTRY_APPID_FIRMWARE_SEC_LIC: u8 = 0x05; +#[allow(dead_code)] +const FALCON_UCODE_ENTRY_APPID_FWSEC_DBG: u8 = 0x45; +const FALCON_UCODE_ENTRY_APPID_FWSEC_PROD: u8 = 0x85; + +pub(crate) struct Vbios { + pub fwsec_image: Option<FwSecBiosImage>, +} + +impl Vbios { + /// Read bytes from the ROM at the current end of the data vector + fn read_more(bar0: &Devres<Bar0>, data: &mut KVec<u8>, len: usize) -> Result { + let current_len = data.len(); + let start = ROM_OFFSET + current_len; + + // Ensure length is a multiple of 4 for 32-bit reads + if len % core::mem::size_of::<u32>() != 0 { + pr_err!("VBIOS read length {} is not a multiple of 4\n", len); + return Err(EINVAL); + } + + // Allocate and zero-initialize the required memory + data.extend_with(len, 0, GFP_KERNEL)?; + with_bar!(?bar0, |bar0_ref| { + let dst = &mut data[current_len..current_len + len]; + for (idx, chunk) in dst + .chunks_exact_mut(core::mem::size_of::<u32>()) + .enumerate() + { + let addr = start + (idx * core::mem::size_of::<u32>()); + // Convert the u32 to a 4 byte array. We use the .to_ne_bytes() + // method out of convenience to convert the 32-bit integer as it + // is in memory into a byte array without any endianness + // conversion or byte-swapping. 
+ chunk.copy_from_slice(&bar0_ref.try_read32(addr)?.to_ne_bytes()); + } + Ok(()) + })?; + + Ok(()) + } + + /// Read bytes at a specific offset, filling any gap + fn read_more_at_offset( + bar0: &Devres<Bar0>, + data: &mut KVec<u8>, + offset: usize, + len: usize, + ) -> Result { + if offset > BIOS_MAX_SCAN_LEN { + pr_err!("Error: exceeded BIOS scan limit.\n"); + return Err(EINVAL); + } + + // If offset is beyond current data size, fill the gap first + let current_len = data.len(); + let gap_bytes = if offset > current_len { + offset - current_len + } else { + 0 + }; + + // Now read the requested bytes at the offset + Self::read_more(bar0, data, gap_bytes + len) + } + + /// Read a BIOS image at a specific offset and create a BiosImage from it. + /// @data is extended as needed and a new BiosImage is returned. + fn read_bios_image_at_offset( + bar0: &Devres<Bar0>, + data: &mut KVec<u8>, + offset: usize, + len: usize, + ) -> Result<BiosImage> { + if offset + len > data.len() { + Self::read_more_at_offset(bar0, data, offset, len).inspect_err(|e| { + pr_err!("Failed to read more at offset {:#x}: {:?}\n", offset, e) + })?; + } + + BiosImage::try_from(&data[offset..offset + len]).inspect_err(|e| { + pr_err!( + "Failed to create BiosImage at offset {:#x}: {:?}\n", + offset, + e + ) + }) + } + + /// Probe for VBIOS extraction + /// Once the VBIOS object is built, bar0 is not read for vbios purposes anymore. + pub(crate) fn probe(bar0: &Devres<Bar0>) -> Result<Self> { + // VBIOS data vector: As BIOS images are scanned, they are added to this vector + // for reference or copying into other data structures. It is the entire + // scanned contents of the VBIOS which progressively extends. 
It is used + // so that we do not re-read any contents that are already read as we use + // the cumulative length read so far, and re-read any gaps as we extend + // the length + let mut data = KVec::new(); + + // Loop through all the BiosImage and extract relevant ones and relevant data from them + let mut cur_offset = 0; + let mut pci_at_image: Option<PciAtBiosImage> = None; + let mut first_fwsec_image: Option<FwSecBiosImage> = None; + let mut second_fwsec_image: Option<FwSecBiosImage> = None; + + // loop till break + loop { + // Try to parse a BIOS image at the current offset + // This will now check for all valid ROM signatures (0xAA55, 0xBB77, 0x4E56) + let image_size = + Self::read_bios_image_at_offset(bar0, &mut data, cur_offset, BIOS_READ_AHEAD_SIZE) + .and_then(|image| image.image_size_bytes()) + .inspect_err(|e| { + pr_err!( + "Failed to parse initial BIOS image headers at offset {:#x}: {:?}\n", + cur_offset, + e + ); + })?; + + // Create a new BiosImage with the full image data + let full_image = + Self::read_bios_image_at_offset(bar0, &mut data, cur_offset, image_size) + .inspect_err(|e| { + pr_err!( + "Failed to parse full BIOS image at offset {:#x}: {:?}\n", + cur_offset, + e + ); + })?; + + // Determine the image type + let image_type = full_image.image_type_str(); + + pr_info!( + "Found BIOS image at offset {:#x}, size: {:#x}, type: {}\n", + cur_offset, + image_size, + image_type + ); + + let is_last = full_image.is_last(); + // Get references to images we will need after the loop, in order to + // setup the falcon data offset. 
+ match full_image { + BiosImage::PciAt(image) => { + pci_at_image = Some(image); + } + BiosImage::FwSec(image) => { + if first_fwsec_image.is_none() { + first_fwsec_image = Some(image); + } else { + second_fwsec_image = Some(image); + } + } + // For now we don't need to handle these + BiosImage::Efi(_image) => {} + BiosImage::Nbsi(_image) => {} + } + + // Break if this is the last image + if is_last { + break; + } + + // Move to the next image (aligned to 512 bytes) + cur_offset += image_size; + cur_offset = (cur_offset + 511) & !511; + + // Safety check - don't go beyond BIOS_MAX_SCAN_LEN (1MB) + if cur_offset > BIOS_MAX_SCAN_LEN { + pr_err!("Error: exceeded BIOS scan limit, stopping scan\n"); + break; + } + } // end of loop + + // Using all the images, setup the falcon data pointer in Fwsec. + // We need mutable access here, so we handle the Option manually. + let final_fwsec_image = { + let mut second = second_fwsec_image; // Take ownership of the option + let first_ref = first_fwsec_image.as_ref(); + let pci_at_ref = pci_at_image.as_ref(); + + if let (Some(second), Some(first), Some(pci_at)) = + (second.as_mut(), first_ref, pci_at_ref) + { + second + .setup_falcon_data(pci_at, first) + .inspect_err(|e| pr_err!("Falcon data setup failed: {:?}\n", e))?; + } else { + pr_err!("Missing required images for falcon data setup, skipping\n"); + } + second // Return the potentially modified second image + }; + + Ok(Self { + fwsec_image: final_fwsec_image, + }) + } + + pub(crate) fn fwsec_header(&self) -> Result<&FalconUCodeDescV3> { + let image = self.fwsec_image.as_ref().ok_or(EINVAL)?; + image.fwsec_header() + } + + pub(crate) fn fwsec_ucode(&self) -> Result<&[u8]> { + let image = self.fwsec_image.as_ref().ok_or(EINVAL)?; + image.fwsec_ucode(image.fwsec_header()?) + } + + pub(crate) fn fwsec_sigs(&self) -> Result<&[u8]> { + let image = self.fwsec_image.as_ref().ok_or(EINVAL)?; + image.fwsec_sigs(image.fwsec_header()?) 
+ } +} + +/// PCI Data Structure as defined in PCI Firmware Specification +#[derive(Debug, Clone)] +#[repr(C)] +#[allow(dead_code)] +struct PcirStruct { + /// PCI Data Structure signature ("PCIR" or "NPDS") + pub signature: [u8; 4], + /// PCI Vendor ID (e.g., 0x10DE for NVIDIA) + pub vendor_id: u16, + /// PCI Device ID + pub device_id: u16, + /// Device List Pointer + pub device_list_ptr: u16, + /// PCI Data Structure Length + pub pci_data_struct_len: u16, + /// PCI Data Structure Revision + pub pci_data_struct_rev: u8, + /// Class code (3 bytes, 0x03 for display controller) + pub class_code: [u8; 3], + /// Size of this image in 512-byte blocks + pub image_len: u16, + /// Revision Level of the Vendor's ROM + pub vendor_rom_rev: u16, + /// ROM image type (0x00 = PC-AT compatible, 0x03 = EFI, 0x70 = NBSI) + pub code_type: u8, + /// Last image indicator (0x00 = Not last image, 0x80 = Last image) + pub last_image: u8, + /// Maximum Run-time Image Length (units of 512 bytes) + pub max_runtime_image_len: u16, +} + +impl TryFrom<&[u8]> for PcirStruct { + type Error = Error; + + fn try_from(data: &[u8]) -> Result<Self> { + if data.len() < core::mem::size_of::<PcirStruct>() { + pr_err!("Not enough data for PcirStruct\n"); + return Err(EINVAL); + } + + let mut signature = [0u8; 4]; + signature.copy_from_slice(&data[0..4]); + + // Signature should be "PCIR" (0x52494350) or "NPDS" (0x5344504e) + if &signature != b"PCIR" && &signature != b"NPDS" { + pr_err!("Invalid signature for PcirStruct: {:?}\n", signature); + return Err(EINVAL); + } + + let mut class_code = [0u8; 3]; + class_code.copy_from_slice(&data[13..16]); + + Ok(PcirStruct { + signature, + vendor_id: u16::from_le_bytes([data[4], data[5]]), + device_id: u16::from_le_bytes([data[6], data[7]]), + device_list_ptr: u16::from_le_bytes([data[8], data[9]]), + pci_data_struct_len: u16::from_le_bytes([data[10], data[11]]), + pci_data_struct_rev: data[12], + class_code, + image_len: u16::from_le_bytes([data[16], data[17]]), + 
vendor_rom_rev: u16::from_le_bytes([data[18], data[19]]), + code_type: data[20], + last_image: data[21], + max_runtime_image_len: u16::from_le_bytes([data[22], data[23]]), + }) + } +} + +impl PcirStruct { + /// Check if this is the last image in the ROM + fn is_last(&self) -> bool { + self.last_image & 0x80 != 0 + } + + /// Calculate image size in bytes + fn image_size_bytes(&self) -> Result<usize> { + if self.image_len > 0 { + // Image size is in 512-byte blocks + Ok(self.image_len as usize * 512) + } else { + Err(EINVAL) + } + } +} + +/// BIOS Information Table (BIT) Header +/// This is the head of the BIT table, that is used to locate the Falcon data. +/// The BIT table (with its header) is in the PciAtBiosImage and the falcon data +/// it is pointing to is in the FwSecBiosImage. +#[derive(Debug, Clone, Copy)] +#[allow(dead_code)] +struct BitHeader { + /// 0h: BIT Header Identifier (BMP=0x7FFF/BIT=0xB8FF) + pub id: u16, + /// 2h: BIT Header Signature ("BIT\0") + pub signature: [u8; 4], + /// 6h: Binary Coded Decimal Version, ex: 0x0100 is 1.00. 
+ pub bcd_version: u16, + /// 8h: Size of BIT Header (in bytes) + pub header_size: u8, + /// 9h: Size of BIT Tokens (in bytes) + pub token_size: u8, + /// 10h: Number of token entries that follow + pub token_entries: u8, + /// 11h: BIT Header Checksum + pub checksum: u8, +} + +impl TryFrom<&[u8]> for BitHeader { + type Error = Error; + + fn try_from(data: &[u8]) -> Result<Self> { + if data.len() < 12 { + return Err(EINVAL); + } + + let mut signature = [0u8; 4]; + signature.copy_from_slice(&data[2..6]); + + // Check header ID and signature + let id = u16::from_le_bytes([data[0], data[1]]); + if id != 0xB8FF || &signature != b"BIT\0" { + return Err(EINVAL); + } + + Ok(BitHeader { + id, + signature, + bcd_version: u16::from_le_bytes([data[6], data[7]]), + header_size: data[8], + token_size: data[9], + token_entries: data[10], + checksum: data[11], + }) + } +} + +/// BIT Token Entry: Records in the BIT table followed by the BIT header +#[derive(Debug, Clone, Copy)] +#[allow(dead_code)] +struct BitToken { + /// 00h: Token identifier + pub id: u8, + /// 01h: Version of the token data + pub data_version: u8, + /// 02h: Size of token data in bytes + pub data_size: u16, + /// 04h: Offset to the token data + pub data_offset: u16, +} + +// Define the token ID for the Falcon data +pub(in crate::vbios) const BIT_TOKEN_ID_FALCON_DATA: u8 = 0x70; + +impl BitToken { + /// Find a BIT token entry by BIT ID in a PciAtBiosImage + pub(in crate::vbios) fn from_id(image: &PciAtBiosImage, token_id: u8) -> Result<Self> { + let header = image.bit_header.as_ref().ok_or(EINVAL)?; + + // Offset to the first token entry + let tokens_start = image.bit_offset.unwrap() + header.header_size as usize; + + for i in 0..header.token_entries as usize { + let entry_offset = tokens_start + (i * header.token_size as usize); + + // Make sure we don't go out of bounds + if entry_offset + header.token_size as usize > image.base.data.len() { + return Err(EINVAL); + } + + // Check if this token has the 
requested ID + if image.base.data[entry_offset] == token_id { + return Ok(BitToken { + id: image.base.data[entry_offset], + data_version: image.base.data[entry_offset + 1], + data_size: u16::from_le_bytes([ + image.base.data[entry_offset + 2], + image.base.data[entry_offset + 3], + ]), + data_offset: u16::from_le_bytes([ + image.base.data[entry_offset + 4], + image.base.data[entry_offset + 5], + ]), + }); + } + } + + // Token not found + Err(ENOENT) + } +} + +/// PCI ROM Expansion Header as defined in PCI Firmware Specification. +/// This is header is at the beginning of every image in the set of +/// images in the ROM. It contains a pointer to the PCI Data Structure +/// which describes the image. +/// For "NBSI" images (NoteBook System Information), the ROM +/// header deviates from the standard and contains an offset to the +/// NBSI image however we do not yet parse that in this module and keep +/// it for future reference. +#[derive(Debug, Clone, Copy)] +#[allow(dead_code)] +struct PciRomHeader { + /// 00h: Signature (0xAA55) + pub signature: u16, + /// 02h: Reserved bytes for processor architecture unique data (20 bytes) + pub reserved: [u8; 20], + /// 16h: NBSI Data Offset (NBSI-specific, offset from header to NBSI image) + pub nbsi_data_offset: Option<u16>, + /// 18h: Pointer to PCI Data Structure (offset from start of ROM image) + pub pci_data_struct_offset: u16, + /// 1Ah: Size of block (this is NBSI-specific) + pub size_of_block: Option<u32>, +} + +impl TryFrom<&[u8]> for PciRomHeader { + type Error = Error; + + fn try_from(data: &[u8]) -> Result<Self> { + if data.len() < 26 { + // Need at least 26 bytes to read pciDataStrucPtr and sizeOfBlock + return Err(EINVAL); + } + + let signature = u16::from_le_bytes([data[0], data[1]]); + + // Check for valid ROM signatures + match signature { + 0xAA55 | 0xBB77 | 0x4E56 => {} + _ => { + pr_err!("ROM signature unknown {:#x}\n", signature); + return Err(EINVAL); + } + } + + // Read the pointer to the PCI Data 
Structure at offset 0x18 + let pci_data_struct_ptr = u16::from_le_bytes([data[24], data[25]]); + + // Try to read optional fields if enough data + let mut size_of_block = None; + let mut nbsi_data_offset = None; + + if data.len() >= 30 { + // Read size_of_block at offset 0x1A + size_of_block = Some( + (data[29] as u32) << 24 + | (data[28] as u32) << 16 + | (data[27] as u32) << 8 + | (data[26] as u32), + ); + } + + // For NBSI images, try to read the nbsiDataOffset at offset 0x16 + if data.len() >= 24 { + nbsi_data_offset = Some(u16::from_le_bytes([data[22], data[23]])); + } + + Ok(PciRomHeader { + signature, + reserved: [0u8; 20], + pci_data_struct_offset: pci_data_struct_ptr, + size_of_block, + nbsi_data_offset, + }) + } +} + +/// NVIDIA PCI Data Extension Structure. This is similar to the +/// PCI Data Structure, but is Nvidia-specific and is placed right after +/// the PCI Data Structure. It contains some fields that are redundant +/// with the PCI Data Structure, but are needed for traversing the +/// BIOS images. It is expected to be present in all BIOS images except +/// for NBSI images. 
+#[derive(Debug, Clone)] +#[allow(dead_code)] +struct NpdeStruct { + /// 00h: Signature ("NPDE") + pub signature: [u8; 4], + /// 04h: NVIDIA PCI Data Extension Revision + pub npci_data_ext_rev: u16, + /// 06h: NVIDIA PCI Data Extension Length + pub npci_data_ext_len: u16, + /// 08h: Sub-image Length (in 512-byte units) + pub subimage_len: u16, + /// 0Ah: Last image indicator flag + pub last_image: u8, +} + +impl TryFrom<&[u8]> for NpdeStruct { + type Error = Error; + + fn try_from(data: &[u8]) -> Result<Self> { + if data.len() < 11 { + pr_err!("Not enough data for NpdeStruct\n"); + return Err(EINVAL); + } + + let mut signature = [0u8; 4]; + signature.copy_from_slice(&data[0..4]); + + // Signature should be "NPDE" (0x4544504E) + if &signature != b"NPDE" { + pr_err!("Invalid signature for NpdeStruct: {:?}\n", signature); + return Err(EINVAL); + } + + Ok(NpdeStruct { + signature, + npci_data_ext_rev: u16::from_le_bytes([data[4], data[5]]), + npci_data_ext_len: u16::from_le_bytes([data[6], data[7]]), + subimage_len: u16::from_le_bytes([data[8], data[9]]), + last_image: data[10], + }) + } +} + +impl NpdeStruct { + /// Check if this is the last image in the ROM + fn is_last(&self) -> bool { + self.last_image & 0x80 != 0 + } + + /// Calculate image size in bytes + fn image_size_bytes(&self) -> Result<usize> { + if self.subimage_len > 0 { + // Image size is in 512-byte blocks + Ok(self.subimage_len as usize * 512) + } else { + Err(EINVAL) + } + } + + /// Try to find NPDE in the data, the NPDE is right after the PCIR. 
+ fn find_in_data(data: &[u8], rom_header: &PciRomHeader, pcir: &PcirStruct) -> Option<Self> { + // Calculate the offset where NPDE might be located + // NPDE should be right after the PCIR structure, aligned to 16 bytes + let pcir_offset = rom_header.pci_data_struct_offset as usize; + let npde_start = (pcir_offset + pcir.pci_data_struct_len as usize + 0x0F) & !0x0F; + + // Check if we have enough data + if npde_start + 11 > data.len() { + pr_err!("Not enough data for NPDE\n"); + return None; + } + + // Try to create NPDE from the data + NpdeStruct::try_from(&data[npde_start..]) + .inspect_err(|e| { + pr_err!("Error creating NpdeStruct: {:?}\n", e); + }) + .ok() + } +} +// Use a macro to implement BiosImage enum and methods. This avoids having to +// repeat each enum type when implementing functions like base() in BiosImage. +macro_rules! bios_image { + ( + $($variant:ident $class:ident),* $(,)? + ) => { + // BiosImage enum with variants for each image type + enum BiosImage { + $($variant($class)),* + } + + impl BiosImage { + /// Get a reference to the common BIOS image data regardless of type + fn base(&self) -> &BiosImageBase { + match self { + $(Self::$variant(img) => &img.base),* + } + } + + /// Returns a string representing the type of BIOS image + fn image_type_str(&self) -> &'static str { + match self { + $(Self::$variant(_) => stringify!($variant)),* + } + } + } + } +} + +impl BiosImage { + /// Check if this is the last image + fn is_last(&self) -> bool { + let base = self.base(); + + // For NBSI images (type == 0x70), return true as they're + // considered the last image + if matches!(self, Self::Nbsi(_)) { + return true; + } + + // For other image types, check NPDE first if available + if let Some(ref npde) = base.npde { + return npde.is_last(); + } + + // Otherwise, fall back to checking the PCIR last_image flag + base.pcir.is_last() + } + + /// Get the image size in bytes + fn image_size_bytes(&self) -> Result<usize> { + let base = self.base(); + + // 
Prefer NPDE image size if available + if let Some(ref npde) = base.npde { + return npde.image_size_bytes(); + } + + // Otherwise, fall back to the PCIR image size + base.pcir.image_size_bytes() + } +} + +bios_image! { + PciAt PciAtBiosImage, // PCI-AT compatible BIOS image + Efi EfiBiosImage, // EFI (Extensible Firmware Interface) + Nbsi NbsiBiosImage, // NBSI (Nvidia Bios System Interface) + FwSec FwSecBiosImage // FWSEC (Firmware Security) +} + +struct PciAtBiosImage { + base: BiosImageBase, + bit_header: Option<BitHeader>, + bit_offset: Option<usize>, +} + +struct EfiBiosImage { + base: BiosImageBase, + // EFI-specific fields can be added here in the future. +} + +struct NbsiBiosImage { + base: BiosImageBase, + // NBSI-specific fields can be added here in the future. +} + +pub(crate) struct FwSecBiosImage { + base: BiosImageBase, + // FWSEC-specific fields + // The offset of the Falcon data from the start of Fwsec image + falcon_data_offset: Option<usize>, + // The PmuLookupTable starts at the offset of the falcon data pointer + pmu_lookup_table: Option<PmuLookupTable>, + // The offset of the Falcon ucode + falcon_ucode_offset: Option<usize>, +} + +// Convert from BiosImageBase to BiosImage +impl TryFrom<BiosImageBase> for BiosImage { + type Error = Error; + + fn try_from(base: BiosImageBase) -> Result<Self> { + match base.pcir.code_type { + 0x00 => Ok(BiosImage::PciAt(base.try_into()?)), + 0x03 => Ok(BiosImage::Efi(EfiBiosImage { base })), + 0x70 => Ok(BiosImage::Nbsi(NbsiBiosImage { base })), + 0xE0 => Ok(BiosImage::FwSec(FwSecBiosImage { + base, + falcon_data_offset: None, + pmu_lookup_table: None, + falcon_ucode_offset: None, + })), + _ => { + pr_err!("Unknown BIOS image type {:#x}\n", base.pcir.code_type); + Err(EINVAL) + } + } + } +} + +/// BiosImage creation from a byte slice. This creates a BiosImageBase +/// and then converts it to a BiosImage which triggers the constructor of +/// the specific BiosImage enum variant. 
+impl TryFrom<&[u8]> for BiosImage { + type Error = Error; + + fn try_from(data: &[u8]) -> Result<Self> { + let base = BiosImageBase::try_from(data)?; + let image = base.to_image()?; + + image + .image_size_bytes() + .inspect_err(|_| pr_err!("Invalid image size computed during BiosImage creation\n"))?; + + Ok(image) + } +} + +/// BIOS Image structure containing various headers and references +/// fields base to all BIOS images. Each BiosImage type has a +/// BiosImageBase type along with other image-specific fields. +/// Note that Rust favors composition of types over inheritance. +#[derive(Debug)] +#[allow(dead_code)] +struct BiosImageBase { + /// PCI ROM Expansion Header + pub rom_header: PciRomHeader, + /// PCI Data Structure + pub pcir: PcirStruct, + /// NVIDIA PCI Data Extension (optional) + pub npde: Option<NpdeStruct>, + /// Image data (includes ROM header and PCIR) + pub data: KVec<u8>, +} + +impl BiosImageBase { + fn to_image(self) -> Result<BiosImage> { + BiosImage::try_from(self) + } +} + +impl TryFrom<&[u8]> for BiosImageBase { + type Error = Error; + + fn try_from(data: &[u8]) -> Result<Self> { + // Ensure we have enough data for the ROM header + if data.len() < 26 { + pr_err!("Not enough data for ROM header\n"); + return Err(EINVAL); + } + + // Parse the ROM header + let rom_header = PciRomHeader::try_from(&data[0..26]) + .inspect_err(|e| pr_err!("Failed to create PciRomHeader: {:?}\n", e))?; + + // Get the PCI Data Structure using the pointer from the ROM header + let pcir_offset = rom_header.pci_data_struct_offset as usize; + let pcir_data = data + .get(pcir_offset..pcir_offset + core::mem::size_of::<PcirStruct>()) + .ok_or(EINVAL) + .inspect_err(|_| { + pr_err!( + "PCIR offset {:#x} out of bounds (data length: {})\n", + pcir_offset, + data.len() + ); + pr_err!("Consider reading more data for construction of BiosImage\n"); + })?; + + let pcir = PcirStruct::try_from(pcir_data) + .inspect_err(|e| pr_err!("Failed to create PcirStruct: {:?}\n", e))?; + 
+ // Look for NPDE structure if this is not an NBSI image (type != 0x70) + let npde = NpdeStruct::find_in_data(data, &rom_header, &pcir); + + // Create a copy of the data + let mut data_copy = KVec::new(); + data_copy.extend_with(data.len(), 0, GFP_KERNEL)?; + data_copy.copy_from_slice(data); + + Ok(BiosImageBase { + rom_header, + pcir, + npde, + data: data_copy, + }) + } +} + +/// The PciAt BIOS image is typically the first BIOS image type found in the +/// BIOS image chain. It contains the BIT header and the BIT tokens. +impl PciAtBiosImage { + /// Find a byte pattern in a slice + fn find_byte_pattern(haystack: &[u8], needle: &[u8]) -> Option<usize> { + haystack + .windows(needle.len()) + .position(|window| window == needle) + } + + /// Find the BIT header in the PciAtBiosImage + fn find_bit_header(data: &[u8]) -> Result<(BitHeader, usize)> { + let bit_pattern = [0xff, 0xb8, b'B', b'I', b'T', 0x00]; + let bit_offset = Self::find_byte_pattern(data, &bit_pattern); + if bit_offset.is_none() { + return Err(EINVAL); + } + + let bit_header = BitHeader::try_from(&data[bit_offset.unwrap()..])?; + Ok((bit_header, bit_offset.unwrap())) + } + + /// Get a BIT token entry from the BIT table in the PciAtBiosImage + fn get_bit_token(&self, token_id: u8) -> Result<BitToken> { + BitToken::from_id(self, token_id) + } + + /// Find the Falcon data pointer structure in the PciAtBiosImage + /// This is just a 4 byte structure that contains a pointer to the + /// Falcon data in the FWSEC image. 
+    fn falcon_data_ptr(&self) -> Result<u32> {
+        let token = self.get_bit_token(BIT_TOKEN_ID_FALCON_DATA)?;
+
+        // Make sure we don't go out of bounds
+        if token.data_offset as usize + 4 > self.base.data.len() {
+            return Err(EINVAL);
+        }
+
+        // read the 4 bytes at the offset specified in the token
+        let offset = token.data_offset as usize;
+        let bytes: [u8; 4] = self.base.data[offset..offset + 4].try_into().map_err(|_| {
+            pr_err!("Failed to convert data slice to array\n");
+            EINVAL
+        })?;
+
+        let data_ptr = u32::from_le_bytes(bytes);
+
+        // The pointer is relative to the start of the PciAt image but must land past its
+        // end, in the FWSEC image (see setup_falcon_data).
+        if (data_ptr as usize) < self.base.data.len() {
+            pr_err!("Falcon data pointer out of bounds\n");
+            return Err(EINVAL);
+        }
+
+        Ok(data_ptr)
+    }
+}
+
+impl TryFrom<BiosImageBase> for PciAtBiosImage {
+    type Error = Error;
+
+    fn try_from(base: BiosImageBase) -> Result<Self> {
+        let data_slice = &base.data;
+        let (bit_header, bit_offset) = PciAtBiosImage::find_bit_header(data_slice)?;
+
+        Ok(PciAtBiosImage {
+            base,
+            bit_header: Some(bit_header),
+            bit_offset: Some(bit_offset),
+        })
+    }
+}
+
+/// The PmuLookupTableEntry structure is a single entry in the PmuLookupTable.
+/// See the PmuLookupTable description for more information.
+#[allow(dead_code)]
+struct PmuLookupTableEntry {
+    application_id: u8,
+    target_id: u8,
+    data: u32,
+}
+
+impl TryFrom<&[u8]> for PmuLookupTableEntry {
+    type Error = Error;
+
+    fn try_from(data: &[u8]) -> Result<Self> {
+        // An entry is 6 bytes: application_id (1), target_id (1), data (4).
+        if data.len() < 6 {
+            return Err(EINVAL);
+        }
+
+        Ok(PmuLookupTableEntry {
+            application_id: data[0],
+            target_id: data[1],
+            data: u32::from_le_bytes(data[2..6].try_into().map_err(|_| EINVAL)?),
+        })
+    }
+}
+
+/// The PmuLookupTable structure is used to find the PmuLookupTableEntry
+/// for a given application ID. The table of entries is pointed to by the falcon
+/// data pointer in the BIT table, and is used to locate the Falcon Ucode.
+#[allow(dead_code)] +struct PmuLookupTable { + version: u8, + header_len: u8, + entry_len: u8, + entry_count: u8, + table_data: KVec<u8>, +} + +impl TryFrom<&[u8]> for PmuLookupTable { + type Error = Error; + + fn try_from(data: &[u8]) -> Result<Self> { + if data.len() < 4 { + return Err(EINVAL); + } + + let header_len = data[1] as usize; + let entry_len = data[2] as usize; + let entry_count = data[3] as usize; + + let required_bytes = header_len + (entry_count * entry_len); + + if data.len() < required_bytes { + return Err(EINVAL); + } + + // Create a copy of only the table data + let mut table_data = KVec::new(); + + // "last_entry_bytes" is a debugging aid. + // let mut last_entry_bytes: Option<KVec<u8>> = Some(KVec::new()); + + for &byte in &data[header_len..required_bytes] { + table_data.push(byte, GFP_KERNEL)?; + /* + * Uncomment for debugging (dumps the table data to dmesg): + * last_entry_bytes.as_mut().ok_or(EINVAL)?.push(byte, GFP_KERNEL)?; + * + * let last_entry_bytes_len = last_entry_bytes.as_ref().ok_or(EINVAL)?.len(); + * if last_entry_bytes_len == entry_len { + * pr_info!("Last entry bytes: {:02x?}\n", &last_entry_bytes.as_ref().ok_or(EINVAL)?[..]); + * last_entry_bytes = Some(KVec::new()); + * } + */ + } + + Ok(PmuLookupTable { + version: data[0], + header_len: header_len as u8, + entry_len: entry_len as u8, + entry_count: entry_count as u8, + table_data, + }) + } +} + +impl PmuLookupTable { + fn lookup_index(&self, idx: u8) -> Result<PmuLookupTableEntry> { + if idx >= self.entry_count { + return Err(EINVAL); + } + + let index = (idx as usize) * self.entry_len as usize; + Ok(PmuLookupTableEntry::try_from(&self.table_data[index..])?) 
+ } + + // find entry by type value + fn find_entry_by_type(&self, entry_type: u8) -> Result<PmuLookupTableEntry> { + for i in 0..self.entry_count { + let entry = self.lookup_index(i)?; + if entry.application_id == entry_type { + return Ok(entry); + } + } + + Err(EINVAL) + } +} + +/// The FwSecBiosImage structure contains the PMU table and the Falcon Ucode. +/// The PMU table contains voltage/frequency tables as well as a pointer to the +/// Falcon Ucode. +impl FwSecBiosImage { + fn setup_falcon_data( + &mut self, + pci_at_image: &PciAtBiosImage, + first_fwsec_image: &FwSecBiosImage, + ) -> Result<()> { + let mut offset = pci_at_image.falcon_data_ptr()? as usize; + + // The falcon data pointer assumes that the PciAt and FWSEC images + // are contiguous in memory. However, testing shows the EFI image sits in + // between them. So calculate the offset from the end of the PciAt image + // rather than the start of it. Compensate. + offset -= pci_at_image.base.data.len(); + + // The offset is now from the start of the first Fwsec image, however + // the offset points to a location in the second Fwsec image. Since + // the fwsec images are contiguous, subtract the length of the first Fwsec + // image from the offset to get the offset to the start of the second + // Fwsec image. + offset -= first_fwsec_image.base.data.len(); + + self.falcon_data_offset = Some(offset); + + // The PmuLookupTable starts at the offset of the falcon data pointer + self.pmu_lookup_table = Some(PmuLookupTable::try_from(&self.base.data[offset..])?); + + match self + .pmu_lookup_table + .as_ref() + .ok_or(EINVAL)? 
+ .find_entry_by_type(FALCON_UCODE_ENTRY_APPID_FWSEC_PROD) + { + Ok(entry) => { + let mut ucode_offset = entry.data as usize; + ucode_offset -= pci_at_image.base.data.len(); + ucode_offset -= first_fwsec_image.base.data.len(); + self.falcon_ucode_offset = Some(ucode_offset); + + /* + * Uncomment for debug: print the v3_desc header + * let v3_desc = self.fwsec_header()?; + * pr_info!("PmuLookupTableEntry v3_desc: {:#?}\n", v3_desc); + */ + } + Err(e) => { + pr_err!("PmuLookupTableEntry not found, error: {:?}\n", e); + } + } + Ok(()) + } + + /// TODO: These were borrowed from the old code for integrating this module + /// with the outside world. They should be cleaned up and integrated properly. + /// + /// Get the FwSec header (FalconUCodeDescV3) + fn fwsec_header(&self) -> Result<&FalconUCodeDescV3> { + // Get the falcon ucode offset that was found in setup_falcon_data + let falcon_ucode_offset = self.falcon_ucode_offset.ok_or(EINVAL)? as usize; + + // Make sure the offset is within the data bounds + if falcon_ucode_offset + core::mem::size_of::<FalconUCodeDescV3>() > self.base.data.len() { + pr_err!("fwsec-frts header not contained within BIOS bounds\n"); + return Err(ERANGE); + } + + // Read the first 4 bytes to get the version + let hdr_bytes: [u8; 4] = self.base.data[falcon_ucode_offset..falcon_ucode_offset + 4] + .try_into() + .map_err(|_| EINVAL)?; + let hdr = u32::from_le_bytes(hdr_bytes); + let ver = (hdr & 0xff00) >> 8; + + if ver != 3 { + pr_err!("invalid fwsec firmware version\n"); + return Err(EINVAL); + } + + // Return a reference to the FalconUCodeDescV3 structure + Ok(unsafe { + &*(self.base.data.as_ptr().add(falcon_ucode_offset) as *const FalconUCodeDescV3) + }) + } + /// Get the ucode data as a byte slice + fn fwsec_ucode(&self, v3_desc: &FalconUCodeDescV3) -> Result<&[u8]> { + let falcon_ucode_offset = self.falcon_ucode_offset.ok_or(EINVAL)? 
as usize; + + // The ucode data follows the descriptor + let ucode_data_offset = falcon_ucode_offset + v3_desc.size(); + let size = (v3_desc.imem_load_size + v3_desc.dmem_load_size) as usize; + + // Get the data slice, checking bounds in a single operation + self.base + .data + .get(ucode_data_offset..ucode_data_offset + size) + .ok_or(ERANGE) + .inspect_err(|_| pr_err!("fwsec ucode data not contained within BIOS bounds\n")) + } + + /// Get the signatures as a byte slice + fn fwsec_sigs(&self, v3_desc: &FalconUCodeDescV3) -> Result<&[u8]> { + const SIG_SIZE: usize = 96 * 4; + + let falcon_ucode_offset = self.falcon_ucode_offset.ok_or(EINVAL)? as usize; + + // The signatures data follows the descriptor + let sigs_data_offset = falcon_ucode_offset + core::mem::size_of::<FalconUCodeDescV3>(); + let size = v3_desc.signature_count as usize * SIG_SIZE; + + // Make sure the data is within bounds + if sigs_data_offset + size > self.base.data.len() { + pr_err!("fwsec signatures data not contained within BIOS bounds\n"); + return Err(ERANGE); + } + + Ok(&self.base.data[sigs_data_offset..sigs_data_offset + size]) + } +} -- 2.49.0
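The PMU lookup table walk in the patch above can be illustrated outside the kernel with a small standalone sketch. It assumes the 6-byte entry layout implied by `PmuLookupTableEntry` (one byte of application ID, one byte of target ID, a little-endian `u32` of data). The names `Entry`, `parse_entry` and `find_entry` are hypothetical and the table bytes in the usage note are made up; this is not the driver's code.

```rust
/// One lookup-table entry, mirroring the fields of `PmuLookupTableEntry`.
#[derive(Debug, PartialEq)]
struct Entry {
    application_id: u8,
    target_id: u8,
    data: u32,
}

/// Parse one entry from the start of `bytes`, or return `None` if fewer
/// than 6 bytes are available.
fn parse_entry(bytes: &[u8]) -> Option<Entry> {
    // `get` performs the bounds check, so the indexing below cannot panic.
    let raw = bytes.get(..6)?;
    Some(Entry {
        application_id: raw[0],
        target_id: raw[1],
        data: u32::from_le_bytes([raw[2], raw[3], raw[4], raw[5]]),
    })
}

/// Scan `entry_count` entries of `entry_len` bytes each and return the
/// first one whose application ID equals `app_id`.
fn find_entry(table: &[u8], entry_len: usize, entry_count: usize, app_id: u8) -> Option<Entry> {
    (0..entry_count)
        .filter_map(|i| parse_entry(table.get(i.checked_mul(entry_len)?..)?))
        .find(|e| e.application_id == app_id)
}
```

With a made-up two-entry table such as `[0x01, 0x00, 0x10, 0, 0, 0, 0x85, 0x01, 0x78, 0x56, 0x34, 0x12]`, `find_entry(&table, 6, 2, 0x85)` yields the second entry with `data == 0x12345678`, while a 5-byte slice is rejected by `parse_entry` rather than panicking.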
Alexandre Courbot
2025-Apr-20 12:19 UTC
[PATCH 14/16] gpu: nova-core: compute layout of the FRTS region
FWSEC-FRTS is run with the desired address of the FRTS region as parameter, which we need to compute depending on some hardware parameters. Do this in a `FbLayout` structure, that will be later extended to describe more memory regions used to boot the GSP. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/nova-core/gpu.rs | 4 ++ drivers/gpu/nova-core/gsp.rs | 3 + drivers/gpu/nova-core/gsp/fb.rs | 109 +++++++++++++++++++++++++++++++++++++ drivers/gpu/nova-core/nova_core.rs | 1 + drivers/gpu/nova-core/regs.rs | 27 +++++++++ 5 files changed, 144 insertions(+) diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs index 2344dfc69fe4246644437d70572680a4450b5bd7..b43d1fc6bba15ffd76d564eccdb9e2afe239a3a4 100644 --- a/drivers/gpu/nova-core/gpu.rs +++ b/drivers/gpu/nova-core/gpu.rs @@ -8,6 +8,7 @@ use crate::falcon::gsp::GspFalcon; use crate::falcon::sec2::Sec2Falcon; use crate::firmware::Firmware; +use crate::gsp::fb::FbLayout; use crate::regs; use crate::timer::Timer; use crate::util; @@ -241,6 +242,9 @@ pub(crate) fn new( let bios = Vbios::probe(&bar)?; + let fb_layout = FbLayout::new(spec.chipset, &bar)?; + dev_dbg!(pdev.as_ref(), "{:#x?}\n", fb_layout); + Ok(pin_init!(Self { spec, bar, diff --git a/drivers/gpu/nova-core/gsp.rs b/drivers/gpu/nova-core/gsp.rs new file mode 100644 index 0000000000000000000000000000000000000000..27616a9d2b7069b18661fc97811fa1cac285b8f8 --- /dev/null +++ b/drivers/gpu/nova-core/gsp.rs @@ -0,0 +1,3 @@ +// SPDX-License-Identifier: GPL-2.0 + +pub(crate) mod fb; diff --git a/drivers/gpu/nova-core/gsp/fb.rs b/drivers/gpu/nova-core/gsp/fb.rs new file mode 100644 index 0000000000000000000000000000000000000000..63f41dfa184c434aa4eb7d4cb1f5f1e6f0552563 --- /dev/null +++ b/drivers/gpu/nova-core/gsp/fb.rs @@ -0,0 +1,109 @@ +// SPDX-License-Identifier: GPL-2.0 + +use core::ops::Range; + +use kernel::devres::Devres; +use kernel::prelude::*; + +use crate::driver::Bar0; +use crate::gpu::Chipset; +use crate::regs; 
+
+fn align_down(value: u64, align: u64) -> u64 {
+    value & !(align - 1)
+}
+
+/// Layout of the GPU framebuffer memory.
+///
+/// Contains ranges of GPU memory reserved for a given purpose during the GSP bootup process.
+#[derive(Debug)]
+#[allow(dead_code)]
+pub(crate) struct FbLayout {
+    pub fb: Range<u64>,
+
+    pub vga_workspace: Range<u64>,
+    pub bios: Range<u64>,
+
+    pub frts: Range<u64>,
+}
+
+impl FbLayout {
+    pub(crate) fn new(chipset: Chipset, bar: &Devres<Bar0>) -> Result<Self> {
+        let fb = {
+            let fb_size = with_bar!(bar, |b| vidmem_size(b, chipset))?;
+
+            0..fb_size
+        };
+        let fb_len = fb.end - fb.start;
+
+        let vga_workspace = {
+            let vga_base = with_bar!(bar, |b| vga_workspace_addr(&b, fb_len, chipset,))?;
+
+            vga_base..fb.end
+        };
+
+        let bios = vga_workspace.clone();
+
+        let frts = {
+            const FRTS_DOWN_ALIGN: u64 = 0x20000;
+            const FRTS_SIZE: u64 = 0x100000;
+            let frts_base = align_down(vga_workspace.start, FRTS_DOWN_ALIGN) - FRTS_SIZE;
+
+            frts_base..frts_base + FRTS_SIZE
+        };
+
+        Ok(Self {
+            fb,
+            vga_workspace,
+            bios,
+            frts,
+        })
+    }
+}
+
+/// Returns `true` if the display is disabled.
+fn display_disabled(bar: &Bar0, chipset: Chipset) -> bool {
+    if chipset >= Chipset::GA100 {
+        regs::FuseStatusOptDisplayAmpere::read(bar).display_disabled()
+    } else {
+        regs::FuseStatusOptDisplayMaxwell::read(bar).display_disabled()
+    }
+}
+
+/// Returns the video memory size in bytes.
+fn vidmem_size(bar: &Bar0, chipset: Chipset) -> u64 {
+    if chipset >= Chipset::GA102 {
+        (regs::Pgc6AonSecureScratchGroup42::read(bar).value() as u64) << 20
+    } else {
+        let local_mem_range = regs::PfbPriMmuLocalMemoryRange::read(bar);
+        let size =
+            (local_mem_range.lower_mag() as u64) << ((local_mem_range.lower_scale() as u64) + 20);
+
+        if local_mem_range.ecc_mode_enabled() {
+            size / 16 * 15
+        } else {
+            size
+        }
+    }
+}
+
+/// Returns the vga workspace address.
+fn vga_workspace_addr(bar: &Bar0, fb_size: u64, chipset: Chipset) -> u64 { + let base = fb_size - 0x100000; + let vga_workspace_base = if display_disabled(bar, chipset) { + regs::PdispVgaWorkspaceBase::read(bar) + } else { + return base; + }; + + if !vga_workspace_base.status_valid() { + return base; + } + + let addr = (vga_workspace_base.addr() as u64) << 16; + if addr < base { + fb_size - 0x20000 + } else { + addr + } +} diff --git a/drivers/gpu/nova-core/nova_core.rs b/drivers/gpu/nova-core/nova_core.rs index 2858f4a0dc35eb9d6547d5cbd81de44c8fc47bae..b78a71dea6e10707dc594fdc070b71dbb663e505 100644 --- a/drivers/gpu/nova-core/nova_core.rs +++ b/drivers/gpu/nova-core/nova_core.rs @@ -26,6 +26,7 @@ macro_rules! with_bar { mod falcon; mod firmware; mod gpu; +mod gsp; mod regs; mod timer; mod util; diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs index c76a16dc8e7267a4eb54cb71e1cca6fb9e00188f..3954542fdd77debd8f96d111ddd231d72dbf5b5a 100644 --- a/drivers/gpu/nova-core/regs.rs +++ b/drivers/gpu/nova-core/regs.rs @@ -38,6 +38,12 @@ 23:0 adr_63_40 => as u32 ); +register!(PfbPriMmuLocalMemoryRange at 0x00100ce0; + 3:0 lower_scale => as u8; + 9:4 lower_mag => as u8; + 30:30 ecc_mode_enabled => as_bit bool; +); + /* GC6 */ register!(Pgc6AonSecureScratchGroup05PrivLevelMask at 0x00118128; @@ -49,6 +55,27 @@ 31:0 value => as u32 ); +register!(Pgc6AonSecureScratchGroup42 at 0x001183a4; + 31:0 value => as u32 +); + +/* PDISP */ + +register!(PdispVgaWorkspaceBase at 0x00625f04; + 3:3 status_valid => as_bit bool; + 31:8 addr => as u32; +); + +/* FUSE */ + +register!(FuseStatusOptDisplayMaxwell at 0x00021c04; + 0:0 display_disabled => as_bit bool; +); + +register!(FuseStatusOptDisplayAmpere at 0x00820c04; + 0:0 display_disabled => as_bit bool; +); + /* PFALCON */ register!(FalconIrqsclr at +0x00000004; -- 2.49.0
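As a worked illustration of the FRTS placement computed in `FbLayout::new`, the arithmetic can be reproduced in a standalone sketch. The constants are the ones from the patch; the helper name `frts_range` and the sample VGA workspace address are assumptions made for the example only.

```rust
use std::ops::Range;

/// Round `value` down to a multiple of `align`, which must be a power of two.
fn align_down(value: u64, align: u64) -> u64 {
    value & !(align - 1)
}

/// Compute the FRTS range sitting just below the VGA workspace: align the
/// workspace start down to 128 KiB, then reserve 1 MiB below that point.
fn frts_range(vga_workspace_start: u64) -> Range<u64> {
    const FRTS_DOWN_ALIGN: u64 = 0x20000; // 128 KiB, as in the patch
    const FRTS_SIZE: u64 = 0x100000; // 1 MiB, as in the patch
    let base = align_down(vga_workspace_start, FRTS_DOWN_ALIGN) - FRTS_SIZE;
    base..base + FRTS_SIZE
}
```

For a hypothetical workspace starting at `0xfff1_2345`, the start is first rounded down to `0xfff0_0000`, so the FRTS region becomes `0xffe0_0000..0xfff0_0000`.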
Alexandre Courbot
2025-Apr-20 12:19 UTC
[PATCH 15/16] gpu: nova-core: extract FWSEC from BIOS and patch it to run FWSEC-FRTS
The FWSEC firmware needs to be extracted from the VBIOS and patched with the desired command, as well as the right signature. Do this so we are ready to load and run this firmware into the GSP falcon and create the FRTS region. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/nova-core/firmware.rs | 22 ++- drivers/gpu/nova-core/firmware/fwsec.rs | 340 ++++++++++++++++++++++++++++++++ drivers/gpu/nova-core/gpu.rs | 18 +- 3 files changed, 378 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs index 58c0513d49e9a0cef36917c8e2b25c414f6fc596..010763afdd74e92a4380d739a17319e05781007f 100644 --- a/drivers/gpu/nova-core/firmware.rs +++ b/drivers/gpu/nova-core/firmware.rs @@ -8,9 +8,14 @@ use kernel::prelude::*; use kernel::str::CString; +use crate::dma::DmaObject; use crate::gpu; use crate::gpu::Chipset; +pub(crate) mod fwsec; + +pub(crate) const FIRMWARE_VERSION: &'static str = "535.113.01"; + /// Structure encapsulating the firmware blobs required for the GPU to operate. #[expect(dead_code)] pub(crate) struct Firmware { @@ -69,10 +74,25 @@ pub(crate) fn size(&self) -> usize { } } +/// Patch the `ucode_dma` firmware at offset `sig_base_img` with `signature`. +fn patch_signature(ucode_dma: &mut DmaObject, signature: &[u8], sig_base_img: usize) -> Result<()> { + if sig_base_img + signature.len() > ucode_dma.len { + return Err(ERANGE); + } + + // SAFETY: we are the only user of this object, so there cannot be any race. + let dst = unsafe { ucode_dma.dma.start_ptr_mut().offset(sig_base_img as isize) }; + + // SAFETY: `signature` and `dst` are valid, properly aligned, and do not overlap. 
+ unsafe { core::ptr::copy_nonoverlapping(signature.as_ptr(), dst, signature.len()) }; + + Ok(()) +} + pub(crate) struct ModInfoBuilder<const N: usize>(firmware::ModInfoBuilder<N>); impl<const N: usize> ModInfoBuilder<N> { - const VERSION: &'static str = "535.113.01"; + const VERSION: &'static str = FIRMWARE_VERSION; const fn make_entry_file(self, chipset: &str, fw: &str) -> Self { ModInfoBuilder( diff --git a/drivers/gpu/nova-core/firmware/fwsec.rs b/drivers/gpu/nova-core/firmware/fwsec.rs new file mode 100644 index 0000000000000000000000000000000000000000..664319d1d31c9727bb830100641c53b5d914be5a --- /dev/null +++ b/drivers/gpu/nova-core/firmware/fwsec.rs @@ -0,0 +1,340 @@ +// SPDX-License-Identifier: GPL-2.0 + +//! FWSEC is a High Secure firmware that is extracted from the BIOS and performs the first step of +//! the GSP startup by creating the WPR2 memory region and copying critical areas of the VBIOS into +//! it after authenticating them, ensuring they haven't been tampered with. It runs on the GSP +//! falcon. +//! +//! Before being run, it needs to be patched in two areas: +//! +//! - The command to be run, as this firmware can perform several tasks ; +//! - The ucode signature, so the GSP falcon can run FWSEC in HS mode. + +use core::alloc::Layout; + +use kernel::bindings; +use kernel::device::{self, Device}; +use kernel::devres::Devres; +use kernel::prelude::*; +use kernel::transmute::FromBytes; + +use crate::dma::DmaObject; +use crate::driver::Bar0; +use crate::falcon::gsp::Gsp; +use crate::falcon::{Falcon, FalconBromParams, FalconFirmware, FalconLoadTarget}; +use crate::firmware::FalconUCodeDescV3; +use crate::vbios::Vbios; + +const NVFW_FALCON_APPIF_ID_DMEMMAPPER: u32 = 0x4; + +#[repr(C)] +#[derive(Debug)] +struct FalconAppifHdrV1 { + ver: u8, + hdr: u8, + len: u8, + cnt: u8, +} +// SAFETY: any byte sequence is valid for this struct. 
+unsafe impl FromBytes for FalconAppifHdrV1 {} + +#[repr(C, packed)] +#[derive(Debug)] +struct FalconAppifV1 { + id: u32, + dmem_base: u32, +} +// SAFETY: any byte sequence is valid for this struct. +unsafe impl FromBytes for FalconAppifV1 {} + +#[derive(Debug)] +#[repr(C, packed)] +struct FalconAppifDmemmapperV3 { + signature: u32, + version: u16, + size: u16, + cmd_in_buffer_offset: u32, + cmd_in_buffer_size: u32, + cmd_out_buffer_offset: u32, + cmd_out_buffer_size: u32, + nvf_img_data_buffer_offset: u32, + nvf_img_data_buffer_size: u32, + printf_buffer_hdr: u32, + ucode_build_time_stamp: u32, + ucode_signature: u32, + init_cmd: u32, + ucode_feature: u32, + ucode_cmd_mask0: u32, + ucode_cmd_mask1: u32, + multi_tgt_tbl: u32, +} +// SAFETY: any byte sequence is valid for this struct. +unsafe impl FromBytes for FalconAppifDmemmapperV3 {} + +#[derive(Debug)] +#[repr(C, packed)] +struct ReadVbios { + ver: u32, + hdr: u32, + addr: u64, + size: u32, + flags: u32, +} +// SAFETY: any byte sequence is valid for this struct. +unsafe impl FromBytes for ReadVbios {} + +#[derive(Debug)] +#[repr(C, packed)] +struct FrtsRegion { + ver: u32, + hdr: u32, + addr: u32, + size: u32, + ftype: u32, +} +// SAFETY: any byte sequence is valid for this struct. +unsafe impl FromBytes for FrtsRegion {} + +const NVFW_FRTS_CMD_REGION_TYPE_FB: u32 = 2; + +#[repr(C, packed)] +struct FrtsCmd { + read_vbios: ReadVbios, + frts_region: FrtsRegion, +} +// SAFETY: any byte sequence is valid for this struct. +unsafe impl FromBytes for FrtsCmd {} + +const NVFW_FALCON_APPIF_DMEMMAPPER_CMD_FRTS: u32 = 0x15; +const NVFW_FALCON_APPIF_DMEMMAPPER_CMD_SB: u32 = 0x19; + +/// Command for the [`FwsecFirmware`] to execute. +pub(crate) enum FwsecCommand { + /// Asks [`FwsecFirmware`] to carve out the WPR2 area and place a verified copy of the VBIOS + /// image into it. + Frts { frts_addr: u64, frts_size: u64 }, + /// Asks [`FwsecFirmware`] to load pre-OS apps on the PMU. 
+ #[allow(dead_code)] + Sb, +} + +/// Reinterpret the area starting from `offset` in `fw` as an instance of `T` (which must implement +/// [`FromBytes`]) and return a reference to it. +/// +/// # Safety +/// +/// Callers must ensure that the region of memory returned is not written for as long as the +/// returned reference is alive. +unsafe fn transmute<'a, 'b, T: Sized + FromBytes>( + fw: &'a DmaObject, + offset: usize, +) -> Result<&'b T> { + if offset + core::mem::size_of::<T>() > fw.len { + return Err(ERANGE); + } + if (fw.dma.start_ptr() as usize + offset) % core::mem::align_of::<T>() != 0 { + return Err(EINVAL); + } + + // SAFETY: we have checked that the pointer is properly aligned that its pointed memory is + // large enough the contains an instance of `T`, which implements `FromBytes`. + Ok(unsafe { &*(fw.dma.start_ptr().offset(offset as isize) as *const T) }) +} + +/// Reinterpret the area starting from `offset` in `fw` as a mutable instance of `T` (which must +/// implement [`FromBytes`]) and return a reference to it. +/// +/// # Safety +/// +/// Callers must ensure that the region of memory returned is not read or written for as long as +/// the returned reference is alive. +unsafe fn transmute_mut<'a, 'b, T: Sized + FromBytes>( + fw: &'a mut DmaObject, + offset: usize, +) -> Result<&'b mut T> { + if offset + core::mem::size_of::<T>() > fw.len { + return Err(ERANGE); + } + if (fw.dma.start_ptr_mut() as usize + offset) % core::mem::align_of::<T>() != 0 { + return Err(EINVAL); + } + + // SAFETY: we have checked that the pointer is properly aligned that its pointed memory is + // large enough the contains an instance of `T`, which implements `FromBytes`. + Ok(unsafe { &mut *(fw.dma.start_ptr_mut().offset(offset as isize) as *mut T) }) +} + +/// Patch the Fwsec firmware image in `fw` to run the command `cmd`. 
+fn patch_command(fw: &mut DmaObject, v3_desc: &FalconUCodeDescV3, cmd: FwsecCommand) -> Result<()> {
+    let hdr_offset = (v3_desc.imem_load_size + v3_desc.interface_offset) as usize;
+    let hdr: &FalconAppifHdrV1 = unsafe { transmute(fw, hdr_offset) }?;
+
+    if hdr.ver != 1 {
+        return Err(EINVAL);
+    }
+
+    // Find the DMEM mapper section in the firmware.
+    for i in 0..hdr.cnt as usize {
+        let app: &FalconAppifV1 =
+            unsafe { transmute(fw, hdr_offset + hdr.hdr as usize + i * hdr.len as usize) }?;
+
+        if app.id != NVFW_FALCON_APPIF_ID_DMEMMAPPER {
+            continue;
+        }
+
+        let dmem_mapper: &mut FalconAppifDmemmapperV3 =
+            unsafe { transmute_mut(fw, (v3_desc.imem_load_size + app.dmem_base) as usize) }?;
+
+        let frts_cmd: &mut FrtsCmd = unsafe {
+            transmute_mut(
+                fw,
+                (v3_desc.imem_load_size + dmem_mapper.cmd_in_buffer_offset) as usize,
+            )
+        }?;
+
+        frts_cmd.read_vbios = ReadVbios {
+            ver: 1,
+            hdr: core::mem::size_of::<ReadVbios>() as u32,
+            addr: 0,
+            size: 0,
+            flags: 2,
+        };
+
+        dmem_mapper.init_cmd = match cmd {
+            FwsecCommand::Frts {
+                frts_addr,
+                frts_size,
+            } => {
+                frts_cmd.frts_region = FrtsRegion {
+                    ver: 1,
+                    hdr: core::mem::size_of::<FrtsRegion>() as u32,
+                    addr: (frts_addr >> 12) as u32,
+                    size: (frts_size >> 12) as u32,
+                    ftype: NVFW_FRTS_CMD_REGION_TYPE_FB,
+                };
+
+                NVFW_FALCON_APPIF_DMEMMAPPER_CMD_FRTS
+            }
+            FwsecCommand::Sb => NVFW_FALCON_APPIF_DMEMMAPPER_CMD_SB,
+        };
+
+        // Return early as we found and patched the DMEMMAPPER region.
+        return Ok(());
+    }
+
+    Err(ENOTSUPP)
+}
+
+/// Firmware extracted from the VBIOS and responsible for e.g. carving out the WPR2 region as the
+/// first step of the GSP bootflow.
+pub(crate) struct FwsecFirmware { + desc: FalconUCodeDescV3, + ucode: DmaObject, +} + +impl FalconFirmware for FwsecFirmware { + type Target = Gsp; + + fn dma_handle(&self) -> bindings::dma_addr_t { + self.ucode.dma.dma_handle() + } + + fn imem_load(&self) -> FalconLoadTarget { + FalconLoadTarget { + src_start: 0, + dst_start: self.desc.imem_phys_base, + len: self.desc.imem_load_size, + } + } + + fn dmem_load(&self) -> FalconLoadTarget { + FalconLoadTarget { + src_start: self.desc.imem_load_size, + dst_start: self.desc.dmem_phys_base, + len: Layout::from_size_align(self.desc.dmem_load_size as usize, 256) + // Cannot panic, as 256 is non-zero and a power of 2. + .unwrap() + .pad_to_align() + .size() as u32, + } + } + + fn brom_params(&self) -> FalconBromParams { + FalconBromParams { + pkc_data_offset: self.desc.pkc_data_offset, + engine_id_mask: self.desc.engine_id_mask, + ucode_id: self.desc.ucode_id, + } + } + + fn boot_addr(&self) -> u32 { + 0 + } +} + +impl FwsecFirmware { + /// Extract the Fwsec firmware from `bios` and patch it to run with the `cmd` command. + pub(crate) fn new( + falcon: &Falcon<Gsp>, + pdev: &Device<device::Bound>, + bar: &Devres<Bar0>, + bios: &Vbios, + cmd: FwsecCommand, + ) -> Result<Self> { + let v3_desc = bios.fwsec_header()?; + let ucode = bios.fwsec_ucode()?; + + let mut ucode_dma = DmaObject::from_data(pdev, ucode, "fwsec-frts")?; + patch_command(&mut ucode_dma, v3_desc, cmd)?; + + const SIG_SIZE: usize = 96 * 4; + let signatures = bios.fwsec_sigs()?; + let sig_base_img = (v3_desc.imem_load_size + v3_desc.pkc_data_offset) as usize; + + if v3_desc.signature_count != 0 { + // Patch signature. 
+ let mut sig_fuse_version = v3_desc.signature_versions as u32; + pr_debug!("sig_fuse_version: {}\n", sig_fuse_version); + let reg_fuse_version = falcon.hal.get_signature_reg_fuse_version( + bar, + v3_desc.engine_id_mask, + v3_desc.ucode_id, + )?; + let idx = { + let mut reg_fuse_version = 1 << reg_fuse_version; + pr_debug!("reg_fuse_version: {:#x}\n", reg_fuse_version); + if (reg_fuse_version & sig_fuse_version) == 0 { + pr_warn!( + "no matching signature: {:#x} {:#x}\n", + reg_fuse_version, + v3_desc.signature_versions + ); + return Err(EINVAL); + } + + let mut idx = 0; + while (reg_fuse_version & sig_fuse_version & 1) == 0 { + idx += sig_fuse_version & 1; + reg_fuse_version >>= 1; + sig_fuse_version >>= 1; + + if reg_fuse_version == 0 || sig_fuse_version == 0 { + return Err(EINVAL); + } + } + + idx + }; + + pr_debug!("patching signature with idx {}\n", idx); + let signature_start = idx as usize * SIG_SIZE; + let signature = &signatures[signature_start..signature_start + SIG_SIZE]; + super::patch_signature(&mut ucode_dma, signature, sig_base_img)?; + } + + Ok(FwsecFirmware { + desc: v3_desc.clone(), + ucode: ucode_dma, + }) + } +} diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs index b43d1fc6bba15ffd76d564eccdb9e2afe239a3a4..5d15a99f8d1eec3c2e1f6d119eb521361733c709 100644 --- a/drivers/gpu/nova-core/gpu.rs +++ b/drivers/gpu/nova-core/gpu.rs @@ -7,6 +7,7 @@ use crate::driver::Bar0; use crate::falcon::gsp::GspFalcon; use crate::falcon::sec2::Sec2Falcon; +use crate::firmware::fwsec::{FwsecCommand, FwsecFirmware}; use crate::firmware::Firmware; use crate::gsp::fb::FbLayout; use crate::regs; @@ -185,7 +186,11 @@ pub(crate) fn new( bar: Devres<Bar0>, ) -> Result<impl PinInit<Self>> { let spec = Spec::new(&bar)?; - let fw = Firmware::new(pdev.as_ref(), spec.chipset, "535.113.01")?; + let fw = Firmware::new( + pdev.as_ref(), + spec.chipset, + crate::firmware::FIRMWARE_VERSION, + )?; dev_info!( pdev.as_ref(), @@ -245,6 +250,17 @@ pub(crate) fn 
new( let fb_layout = FbLayout::new(spec.chipset, &bar)?; dev_dbg!(pdev.as_ref(), "{:#x?}\n", fb_layout); + let _fwsec_frts = FwsecFirmware::new( + &gsp_falcon, + pdev.as_ref(), + &bar, + &bios, + FwsecCommand::Frts { + frts_addr: fb_layout.frts.start, + frts_size: fb_layout.frts.end - fb_layout.frts.start, + }, + )?; + Ok(pin_init!(Self { spec, bar, -- 2.49.0
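The signature selection loop in `FwsecFirmware::new` counts how many signature versions are present below the fuse version reported by the hardware. Under that reading, the same index could be obtained with a mask and a population count, as in the sketch below. The function names are hypothetical, the `u16` type for the version bitmask is an assumption (the patch only shows it being cast to `u32`), and this is an alternative formulation rather than the code from the patch.

```rust
use std::ops::Range;

/// Size of one signature in bytes, as in the patch (`96 * 4`).
const SIG_SIZE: usize = 96 * 4;

/// Return the index of the signature matching `reg_fuse_version`, where
/// `signature_versions` has one bit set per fuse version that has a signature.
/// Returns `None` if no signature exists for that fuse version.
fn signature_index(signature_versions: u16, reg_fuse_version: u32) -> Option<u32> {
    // `checked_shl` rejects shift amounts that do not fit in a `u32`.
    let reg_bit = 1u32.checked_shl(reg_fuse_version)?;
    let versions = u32::from(signature_versions);
    if versions & reg_bit == 0 {
        return None;
    }
    // Signatures are stored in ascending version order, so the index is the
    // number of versions present below the requested one.
    Some((versions & (reg_bit - 1)).count_ones())
}

/// Byte range of signature `idx` inside the signatures blob.
fn signature_byte_range(idx: u32) -> Range<usize> {
    let start = idx as usize * SIG_SIZE;
    start..start + SIG_SIZE
}
```

For example, with a made-up bitmask of `0b1011` and a fuse version of 3, two versions (0 and 1) sit below bit 3, so the index is 2 and the signature occupies bytes `768..1152`; tracing the loop from the patch by hand on the same inputs gives the same index.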
Alexandre Courbot
2025-Apr-20 12:19 UTC
[PATCH 16/16] gpu: nova-core: load and run FWSEC-FRTS
With all the required pieces in place, load FWSEC-FRTS onto the GSP falcon, run it, and check that it completed successfully by carving out the WPR2 region out of framebuffer memory. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/nova-core/falcon.rs | 3 --- drivers/gpu/nova-core/gpu.rs | 59 ++++++++++++++++++++++++++++++++++++++++- drivers/gpu/nova-core/regs.rs | 15 +++++++++++ drivers/gpu/nova-core/vbios.rs | 3 --- 4 files changed, 73 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/nova-core/falcon.rs b/drivers/gpu/nova-core/falcon.rs index 71f374445ff3277eac628e183942c79f557366d5..f90bb739cb9864d88e3427c7ec76953c69ec2c67 100644 --- a/drivers/gpu/nova-core/falcon.rs +++ b/drivers/gpu/nova-core/falcon.rs @@ -2,9 +2,6 @@ //! Falcon microprocessor base support -// To be removed when all code is used. -#![allow(dead_code)] - use core::hint::unreachable_unchecked; use core::time::Duration; use hal::FalconHal; diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs index 5d15a99f8d1eec3c2e1f6d119eb521361733c709..4d03a0b11b6411e22a652183e975f6889446ed46 100644 --- a/drivers/gpu/nova-core/gpu.rs +++ b/drivers/gpu/nova-core/gpu.rs @@ -250,7 +250,7 @@ pub(crate) fn new( let fb_layout = FbLayout::new(spec.chipset, &bar)?; dev_dbg!(pdev.as_ref(), "{:#x?}\n", fb_layout); - let _fwsec_frts = FwsecFirmware::new( + let fwsec_frts = FwsecFirmware::new( &gsp_falcon, pdev.as_ref(), &bar, @@ -261,6 +261,63 @@ pub(crate) fn new( }, )?; + // Check that the WPR2 region does not already exists - if it does, the GPU needs to be + // reset. + if with_bar!(bar, |b| regs::PfbPriMmuWpr2AddrHi::read(b).hi_val())? != 0 { + dev_err!( + pdev.as_ref(), + "WPR2 region already exists - GPU needs to be reset to proceed\n" + ); + return Err(EBUSY); + } + + // Reset falcon, load FWSEC-FRTS, and run it. 
+ gsp_falcon.reset(&bar, &timer)?; + gsp_falcon.dma_load(&bar, &timer, &fwsec_frts)?; + let (mbox0, _) = gsp_falcon.boot(&bar, &timer, Some(0), None)?; + if mbox0 != 0 { + dev_err!(pdev.as_ref(), "FWSEC firmware returned error {}\n", mbox0); + return Err(EINVAL); + } + + // SCRATCH_E contains FWSEC-FRTS' error code, if any. + let frts_status = with_bar!(bar, |b| regs::PbusSwScratche::read(b).frts_err_code())?; + if frts_status != 0 { + dev_err!( + pdev.as_ref(), + "FWSEC-FRTS returned with error code {:#x}", + frts_status + ); + return Err(EINVAL); + } + + // Check the WPR2 has been created as we requested. + let (wpr2_lo, wpr2_hi) = with_bar!(bar, |b| { + ( + (regs::PfbPriMmuWpr2AddrLo::read(b).lo_val() as u64) << 12, + (regs::PfbPriMmuWpr2AddrHi::read(b).hi_val() as u64) << 12, + ) + })?; + if wpr2_hi == 0 { + dev_err!( + pdev.as_ref(), + "WPR2 region not created after running FWSEC-FRTS\n" + ); + + return Err(ENOTTY); + } else if wpr2_lo != fb_layout.frts.start { + dev_err!( + pdev.as_ref(), + "WPR2 region created at unexpected address {:#x} ; expected {:#x}\n", + wpr2_lo, + fb_layout.frts.start, + ); + return Err(EINVAL); + } + + dev_info!(pdev.as_ref(), "WPR2: {:#x}-{:#x}\n", wpr2_lo, wpr2_hi); + dev_info!(pdev.as_ref(), "GPU instance built\n"); + Ok(pin_init!(Self { spec, bar, diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs index 3954542fdd77debd8f96d111ddd231d72dbf5b5a..eae5b7c13155d2da39f47661024ae52390e04366 100644 --- a/drivers/gpu/nova-core/regs.rs +++ b/drivers/gpu/nova-core/regs.rs @@ -18,6 +18,13 @@ 28:20 chipset => try_into Chipset, "chipset model" ); +/* PBUS */ + +register!(PbusSwScratche at 0x00001438; + 15:0 sb_err_code => as u16; + 31:16 frts_err_code => as u16; +); + /* PTIMER */ register!(PtimerTime0 at 0x00009400; @@ -44,6 +51,14 @@ 30:30 ecc_mode_enabled => as_bit bool; ); +register!(PfbPriMmuWpr2AddrLo at 0x001fa824; + 31:4 lo_val => as u32 +); + +register!(PfbPriMmuWpr2AddrHi at 0x001fa828; + 31:4 hi_val => as 
u32 +); + /* GC6 */ register!(Pgc6AonSecureScratchGroup05PrivLevelMask at 0x00118128; diff --git a/drivers/gpu/nova-core/vbios.rs b/drivers/gpu/nova-core/vbios.rs index 534107b708cab0eb8d0accf7daa5718edf030358..74735c083d472ce955d6d3afaabd46a8d354c792 100644 --- a/drivers/gpu/nova-core/vbios.rs +++ b/drivers/gpu/nova-core/vbios.rs @@ -1,8 +1,5 @@ // SPDX-License-Identifier: GPL-2.0 -// To be removed when all code is used. -#![allow(dead_code)] - //! VBIOS extraction and parsing. use crate::driver::Bar0; -- 2.49.0
Danilo Krummrich
2025-Apr-22 08:40 UTC
[PATCH 00/16] nova-core: run FWSEC-FRTS to perform first stage of GSP initialization
On Sun, Apr 20, 2025 at 09:19:32PM +0900, Alexandre Courbot wrote:
> Hi everyone,
>
> This series is a continuation of my previous RFCs [1] to complete the
> first step of GSP booting (running the FWSEC-FRTS firmware extracted
> from the BIOS) on Ampere devices. While it is still far from bringing
> the GPU into a state where it can do anything useful, it sets up the
> basic layout of the driver upon which we can build in order to continue
> with the next steps of GSP booting, as well as supporting more chipsets.
>
> Upon successful probe, the driver will display the range of the WPR2
> region constructed by FWSEC-FRTS:
>
> [ 95.436000] NovaCore 0000:01:00.0: WPR2: 0xffc00000-0xffce0000
> [ 95.436002] NovaCore 0000:01:00.0: GPU instance built
>
> This code is based on nova-next with the try_access_with patch [2].

Please make sure to compile with CLIPPY=1, the series has quite some clippy
warnings.

I also noticed that there are a lot of compiler warnings about unreachable pub
fields with rustc 1.78, whereas with the latest stable compiler there are none.
I'm not exactly sure why that is (and I haven't looked further), but the
corresponding fields indeed seem to have unnecessary pub visibility.

> There is still a bit of unsafe code where it is not desired, notably to
> transmute byte slices into types that implement FromBytes - this is
> because support for doing such transmute operations safely are not in
> the kernel crate yet.

I assume you refer to [3]? As long as we put a TODO and follow up once the
series lands, that's fine for me.

>
> [1] https://lore.kernel.org/rust-for-linux/20250320-nova_timer-v3-0-79aa2ad25a79 at nvidia.com/
> [2] https://lore.kernel.org/rust-for-linux/20250411-try_with-v4-0-f470ac79e2e2 at nvidia.com/

[3] https://lore.kernel.org/lkml/20250330234039.29814-1-christiansantoslima21 at gmail.com/