Alexandre Courbot
2025-Nov-18 13:04 UTC
[PATCH 09/11] gpu: nova-core: add FalconUCodeDescV2 support
On Tue Nov 18, 2025 at 8:10 AM JST, Joel Fernandes wrote:> On Fri, Nov 14, 2025 at 05:30:42PM -0600, Timur Tabi wrote: >> The FRTS firmware in Turing and GA100 VBIOS has an older header >> format (v2 instead of v3). To support both v2 and v3 at runtime, >> add the FalconUCodeDescV2 struct, and update code that references >> the FalconUCodeDescV3 directly with a FalconUCodeDesc enum that >> encapsulates both. >> >> Signed-off-by: Timur Tabi <ttabi at nvidia.com> >> --- >> drivers/gpu/nova-core/firmware.rs | 108 +++++++++++++++++++++++- >> drivers/gpu/nova-core/firmware/fwsec.rs | 72 ++++++++++------ >> drivers/gpu/nova-core/vbios.rs | 74 ++++++++++------ >> 3 files changed, 202 insertions(+), 52 deletions(-) >> >> diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs >> index 2d2008b33fb4..5ca5bf1fb610 100644 >> --- a/drivers/gpu/nova-core/firmware.rs >> +++ b/drivers/gpu/nova-core/firmware.rs >> @@ -43,6 +43,43 @@ fn request_firmware( >> .and_then(|path| firmware::Firmware::request(&path, dev)) >> } >> >> +/// Structure used to describe some firmwares, notably FWSEC-FRTS. >> +#[repr(C)] >> +#[derive(Debug, Clone)] >> +pub(crate) struct FalconUCodeDescV2 { >> + /// Header defined by 'NV_BIT_FALCON_UCODE_DESC_HEADER_VDESC*' in OpenRM. >> + hdr: u32, >> + /// Stored size of the ucode after the header, compressed or uncompressed >> + stored_size: u32, >> + /// Uncompressed size of the ucode. If store_size == uncompressed_size, then the ucode >> + /// is not compressed. >> + pub(crate) uncompressed_size: u32, >> + /// Code entry point >> + pub(crate) virtual_entry: u32, >> + /// Offset after the code segment at which the Application Interface Table headers are located. >> + pub(crate) interface_offset: u32, >> + /// Base address at which to load the code segment into 'IMEM'. >> + pub(crate) imem_phys_base: u32, >> + /// Size in bytes of the code to copy into 'IMEM'. >> + pub(crate) imem_load_size: u32, >> + /// Virtual 'IMEM' address (i.e. 'tag') at which the code should start. >> + pub(crate) imem_virt_base: u32, >> + /// Virtual address of secure IMEM segment. >> + pub(crate) imem_sec_base: u32, >> + /// Size of secure IMEM segment. >> + pub(crate) imem_sec_size: u32, >> + /// Offset into stored (uncompressed) image at which DMEM begins. >> + pub(crate) dmem_offset: u32, >> + /// Base address at which to load the data segment into 'DMEM'. >> + pub(crate) dmem_phys_base: u32, >> + /// Size in bytes of the data to copy into 'DMEM'. >> + pub(crate) dmem_load_size: u32, >> + /// "Alternate" Size of data to load into IMEM. >> + pub(crate) alt_imem_load_size: u32, >> + /// "Alternate" Size of data to load into DMEM. >> + pub(crate) alt_dmem_load_size: u32, >> +} >> + >> /// Structure used to describe some firmwares, notably FWSEC-FRTS. >> #[repr(C)] >> #[derive(Debug, Clone)] >> @@ -76,13 +113,80 @@ pub(crate) struct FalconUCodeDescV3 { >> _reserved: u16, >> } >> >> -impl FalconUCodeDescV3 { >> +#[derive(Debug, Clone)] >> +pub(crate) enum FalconUCodeDesc { >> + V2(FalconUCodeDescV2), >> + V3(FalconUCodeDescV3), >> +} >> + >> +impl FalconUCodeDesc { >> /// Returns the size in bytes of the header. >> pub(crate) fn size(&self) -> usize { >> + let hdr = match self { >> + FalconUCodeDesc::V2(v2) => v2.hdr, >> + FalconUCodeDesc::V3(v3) => v3.hdr, >> + }; >> + >> const HDR_SIZE_SHIFT: u32 = 16; >> const HDR_SIZE_MASK: u32 = 0xffff0000; >> + ((hdr & HDR_SIZE_MASK) >> HDR_SIZE_SHIFT).into_safe_cast() >> + } >> + >> + pub(crate) fn imem_load_size(&self) -> u32 { >> + match self { >> + FalconUCodeDesc::V2(v2) => v2.imem_load_size, >> + FalconUCodeDesc::V3(v3) => v3.imem_load_size, >> + } >> + } > > > This looks like the perfect use case for a trait object. You can define a > trait, make both descriptors implement the trait and get rid of a lot of the > match statements: > > // First define trait > pub(crate) trait FalconUCodeDescriptor { > fn imem_load_size(&self) -> u32; > fn dmem_load_size(&self) -> u32; > fn engine_id_mask(&self) -> u16; // V3-only field, V2 returns 0 > ... > } > > // Implement trait for both versions > impl FalconUCodeDescriptor for FalconUCodeDescV2 { > fn imem_load_size(&self) -> u32 { self.imem_load_size } > fn dmem_load_size(&self) -> u32 { self.dmem_load_size } > fn engine_id_mask(&self) -> u16 { 0 } // V2 doesn't have this field > ... > } > > impl FalconUCodeDescriptor for FalconUCodeDescV3 { > fn imem_load_size(&self) -> u32 { self.imem_load_size } > fn dmem_load_size(&self) -> u32 { self.dmem_load_size } > fn engine_id_mask(&self) -> u16 { self.engine_id_mask } > ... > } > > // Keep the same enum. If you want to get rid of the enum, you'll need Box, > // but then that requires allocation. > pub(crate) enum FalconUCodeDesc { > V2(FalconUCodeDescV2), > V3(FalconUCodeDescV3), > } > > impl FalconUCodeDesc { > // Return trait object, the only match needed. > pub(crate) fn as_descriptor(&self) -> &dyn FalconUCodeDescriptor { > match self { > FalconUCodeDesc::V2(v2) => v2, > FalconUCodeDesc::V3(v3) => v3, > } > } > > // delegate to trait, no match statements! > pub(crate) fn imem_load_size(&self) -> u32 { > self.as_descriptor().imem_load_size() > } > > pub(crate) fn dmem_load_size(&self) -> u32 { > self.as_descriptor().dmem_load_size() > } > } > > // Usage example - no more match statements needed! > impl FalconLoadParams for FwsecFirmware { > fn dmem_load_params(&self) -> FalconLoadTarget { > FalconLoadTarget { > src_start: 0, > dst_start: 0, > len: self.desc.dmem_load_size(), > } > } > }On principle, I tend to agree. In practice, we will probably never have more than these two variants, so we need to balance the benefit of a trait against the overhead of defining it in the first place (there are quite a few methods in there). Trait objects come with their own complications, i.e. you need to store them on the heap if you need more than a short-lived reference - but in our case the short-lived reference should be what we need anyway.
Timur Tabi
2025-Nov-18 15:08 UTC
[PATCH 09/11] gpu: nova-core: add FalconUCodeDescV2 support
On Tue, 2025-11-18 at 22:04 +0900, Alexandre Courbot wrote:> On principle, I tend to agree. In practice, we will probably never have > more than these two variants, so we need to balance the benefit of a > trait against the overhead of defining it in the first place (there are > quite a few methods in there). > > Trait objects come with their own complications, i.e. you need to store > them on the heap if you need more than a short-lived reference - but in > our case the short-lived reference should be what we need anyway.So I would prefer not to rewrite everything, especially since you did tell me early on than an enum was the right approach.
Joel Fernandes
2025-Nov-18 19:45 UTC
[PATCH 09/11] gpu: nova-core: add FalconUCodeDescV2 support
On 11/18/2025 8:04 AM, Alexandre Courbot wrote:> On Tue Nov 18, 2025 at 8:10 AM JST, Joel Fernandes wrote: >> On Fri, Nov 14, 2025 at 05:30:42PM -0600, Timur Tabi wrote: >>> The FRTS firmware in Turing and GA100 VBIOS has an older header >>> format (v2 instead of v3). To support both v2 and v3 at runtime, >>> add the FalconUCodeDescV2 struct, and update code that references >>> the FalconUCodeDescV3 directly with a FalconUCodeDesc enum that >>> encapsulates both. >>> >>> Signed-off-by: Timur Tabi <ttabi at nvidia.com> >>> --- >>> drivers/gpu/nova-core/firmware.rs | 108 +++++++++++++++++++++++- >>> drivers/gpu/nova-core/firmware/fwsec.rs | 72 ++++++++++------ >>> drivers/gpu/nova-core/vbios.rs | 74 ++++++++++------ >>> 3 files changed, 202 insertions(+), 52 deletions(-) >>> >>> diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs >>> index 2d2008b33fb4..5ca5bf1fb610 100644 >>> --- a/drivers/gpu/nova-core/firmware.rs >>> +++ b/drivers/gpu/nova-core/firmware.rs >>> @@ -43,6 +43,43 @@ fn request_firmware( >>> .and_then(|path| firmware::Firmware::request(&path, dev)) >>> } >>> >>> +/// Structure used to describe some firmwares, notably FWSEC-FRTS. >>> +#[repr(C)] >>> +#[derive(Debug, Clone)] >>> +pub(crate) struct FalconUCodeDescV2 { >>> + /// Header defined by 'NV_BIT_FALCON_UCODE_DESC_HEADER_VDESC*' in OpenRM. >>> + hdr: u32, >>> + /// Stored size of the ucode after the header, compressed or uncompressed >>> + stored_size: u32, >>> + /// Uncompressed size of the ucode. If store_size == uncompressed_size, then the ucode >>> + /// is not compressed. >>> + pub(crate) uncompressed_size: u32, >>> + /// Code entry point >>> + pub(crate) virtual_entry: u32, >>> + /// Offset after the code segment at which the Application Interface Table headers are located. >>> + pub(crate) interface_offset: u32, >>> + /// Base address at which to load the code segment into 'IMEM'. >>> + pub(crate) imem_phys_base: u32, >>> + /// Size in bytes of the code to copy into 'IMEM'. >>> + pub(crate) imem_load_size: u32, >>> + /// Virtual 'IMEM' address (i.e. 'tag') at which the code should start. >>> + pub(crate) imem_virt_base: u32, >>> + /// Virtual address of secure IMEM segment. >>> + pub(crate) imem_sec_base: u32, >>> + /// Size of secure IMEM segment. >>> + pub(crate) imem_sec_size: u32, >>> + /// Offset into stored (uncompressed) image at which DMEM begins. >>> + pub(crate) dmem_offset: u32, >>> + /// Base address at which to load the data segment into 'DMEM'. >>> + pub(crate) dmem_phys_base: u32, >>> + /// Size in bytes of the data to copy into 'DMEM'. >>> + pub(crate) dmem_load_size: u32, >>> + /// "Alternate" Size of data to load into IMEM. >>> + pub(crate) alt_imem_load_size: u32, >>> + /// "Alternate" Size of data to load into DMEM. >>> + pub(crate) alt_dmem_load_size: u32, >>> +} >>> + >>> /// Structure used to describe some firmwares, notably FWSEC-FRTS. >>> #[repr(C)] >>> #[derive(Debug, Clone)] >>> @@ -76,13 +113,80 @@ pub(crate) struct FalconUCodeDescV3 { >>> _reserved: u16, >>> } >>> >>> -impl FalconUCodeDescV3 { >>> +#[derive(Debug, Clone)] >>> +pub(crate) enum FalconUCodeDesc { >>> + V2(FalconUCodeDescV2), >>> + V3(FalconUCodeDescV3), >>> +} >>> + >>> +impl FalconUCodeDesc { >>> /// Returns the size in bytes of the header. >>> pub(crate) fn size(&self) -> usize { >>> + let hdr = match self { >>> + FalconUCodeDesc::V2(v2) => v2.hdr, >>> + FalconUCodeDesc::V3(v3) => v3.hdr, >>> + }; >>> + >>> const HDR_SIZE_SHIFT: u32 = 16; >>> const HDR_SIZE_MASK: u32 = 0xffff0000; >>> + ((hdr & HDR_SIZE_MASK) >> HDR_SIZE_SHIFT).into_safe_cast() >>> + } >>> + >>> + pub(crate) fn imem_load_size(&self) -> u32 { >>> + match self { >>> + FalconUCodeDesc::V2(v2) => v2.imem_load_size, >>> + FalconUCodeDesc::V3(v3) => v3.imem_load_size, >>> + } >>> + } >> >> >> This looks like the perfect use case for a trait object. You can define a >> trait, make both descriptors implement the trait and get rid of a lot of the >> match statements: >> >> // First define trait >> pub(crate) trait FalconUCodeDescriptor { >> fn imem_load_size(&self) -> u32; >> fn dmem_load_size(&self) -> u32; >> fn engine_id_mask(&self) -> u16; // V3-only field, V2 returns 0 >> ... >> } >> >> // Implement trait for both versions >> impl FalconUCodeDescriptor for FalconUCodeDescV2 { >> fn imem_load_size(&self) -> u32 { self.imem_load_size } >> fn dmem_load_size(&self) -> u32 { self.dmem_load_size } >> fn engine_id_mask(&self) -> u16 { 0 } // V2 doesn't have this field >> ... >> } >> >> impl FalconUCodeDescriptor for FalconUCodeDescV3 { >> fn imem_load_size(&self) -> u32 { self.imem_load_size } >> fn dmem_load_size(&self) -> u32 { self.dmem_load_size } >> fn engine_id_mask(&self) -> u16 { self.engine_id_mask } >> ... >> } >> >> // Keep the same enum. If you want to get rid of the enum, you'll need Box, >> // but then that requires allocation. >> pub(crate) enum FalconUCodeDesc { >> V2(FalconUCodeDescV2), >> V3(FalconUCodeDescV3), >> } >> >> impl FalconUCodeDesc { >> // Return trait object, the only match needed. >> pub(crate) fn as_descriptor(&self) -> &dyn FalconUCodeDescriptor { >> match self { >> FalconUCodeDesc::V2(v2) => v2, >> FalconUCodeDesc::V3(v3) => v3, >> } >> } >> >> // delegate to trait, no match statements! >> pub(crate) fn imem_load_size(&self) -> u32 { >> self.as_descriptor().imem_load_size() >> } >> >> pub(crate) fn dmem_load_size(&self) -> u32 { >> self.as_descriptor().dmem_load_size() >> } >> } >> >> // Usage example - no more match statements needed! >> impl FalconLoadParams for FwsecFirmware { >> fn dmem_load_params(&self) -> FalconLoadTarget { >> FalconLoadTarget { >> src_start: 0, >> dst_start: 0, >> len: self.desc.dmem_load_size(), >> } >> } >> } > > On principle, I tend to agree. In practice, we will probably never have > more than these two variants, so we need to balance the benefit of a > trait against the overhead of defining it in the first place (there are > quite a few methods in there).I don't know if we'll never have more than one more variant really, its hard to take a call on that. If a third variant comes into being, then the match arms increase more.> Trait objects come with their own complications, i.e. you need to store > them on the heap if you need more than a short-lived reference - but in > our case the short-lived reference should be what we need anyway.Yeah, true. AFAICS though, and as you mentioned, in Timur's case it looks like that is not an issue and we do not need an allocation. Thanks.