Alexandre Courbot
2025-May-21 06:45 UTC
[PATCH v4 20/20] gpu: nova-core: load and run FWSEC-FRTS
With all the required pieces in place, load FWSEC-FRTS onto the GSP falcon, run it, and check that it successfully carved out the WPR2 region out of framebuffer memory. Signed-off-by: Alexandre Courbot <acourbot at nvidia.com> --- drivers/gpu/nova-core/falcon.rs | 3 --- drivers/gpu/nova-core/gpu.rs | 57 ++++++++++++++++++++++++++++++++++++++++- drivers/gpu/nova-core/regs.rs | 15 +++++++++++ 3 files changed, 71 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/nova-core/falcon.rs b/drivers/gpu/nova-core/falcon.rs index f224ca881b72954d17fee87278ecc7a0ffac5322..91f0451a04e7b4d0631fbcf9b1e76e59d5dfb7e8 100644 --- a/drivers/gpu/nova-core/falcon.rs +++ b/drivers/gpu/nova-core/falcon.rs @@ -2,9 +2,6 @@ //! Falcon microprocessor base support -// To be removed when all code is used. -#![expect(dead_code)] - use core::ops::Deref; use core::time::Duration; use hal::FalconHal; diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs index 5a4c23a7a6c22abc1f6e72a307fa3336d731a396..280929203189fba6ad8e37709927597bb9c7d545 100644 --- a/drivers/gpu/nova-core/gpu.rs +++ b/drivers/gpu/nova-core/gpu.rs @@ -246,7 +246,7 @@ pub(crate) fn new( let bios = Vbios::new(pdev, bar)?; - let _fwsec_frts = FwsecFirmware::new( + let fwsec_frts = FwsecFirmware::new( &gsp_falcon, pdev.as_ref(), bar, @@ -257,6 +257,61 @@ pub(crate) fn new( }, )?; + // Check that the WPR2 region does not already exists - if it does, the GPU needs to be + // reset. + if regs::NV_PFB_PRI_MMU_WPR2_ADDR_HI::read(bar).hi_val() != 0 { + dev_err!( + pdev.as_ref(), + "WPR2 region already exists - GPU needs to be reset to proceed\n" + ); + return Err(EBUSY); + } + + // Reset falcon, load FWSEC-FRTS, and run it. + gsp_falcon.reset(bar)?; + gsp_falcon.dma_load(bar, &fwsec_frts)?; + let (mbox0, _) = gsp_falcon.boot(bar, Some(0), None)?; + if mbox0 != 0 { + dev_err!(pdev.as_ref(), "FWSEC firmware returned error {}\n", mbox0); + return Err(EINVAL); + } + + // SCRATCH_E contains FWSEC-FRTS' error code, if any. + let frts_status = regs::NV_PBUS_SW_SCRATCH_0E::read(bar).frts_err_code(); + if frts_status != 0 { + dev_err!( + pdev.as_ref(), + "FWSEC-FRTS returned with error code {:#x}", + frts_status + ); + return Err(EINVAL); + } + + // Check the WPR2 has been created as we requested. + let (wpr2_lo, wpr2_hi) = ( + (regs::NV_PFB_PRI_MMU_WPR2_ADDR_LO::read(bar).lo_val() as u64) << 12, + (regs::NV_PFB_PRI_MMU_WPR2_ADDR_HI::read(bar).hi_val() as u64) << 12, + ); + if wpr2_hi == 0 { + dev_err!( + pdev.as_ref(), + "WPR2 region not created after running FWSEC-FRTS\n" + ); + + return Err(ENOTTY); + } else if wpr2_lo != fb_layout.frts.start { + dev_err!( + pdev.as_ref(), + "WPR2 region created at unexpected address {:#x} ; expected {:#x}\n", + wpr2_lo, + fb_layout.frts.start, + ); + return Err(EINVAL); + } + + dev_dbg!(pdev.as_ref(), "WPR2: {:#x}-{:#x}\n", wpr2_lo, wpr2_hi); + dev_dbg!(pdev.as_ref(), "GPU instance built\n"); + Ok(pin_init!(Self { spec, bar: devres_bar, diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs index 54d4d37d6bf2c31947b965258d2733009c293a18..2a2d5610e552780957bcf00e0da1ec4cd3ac85d2 100644 --- a/drivers/gpu/nova-core/regs.rs +++ b/drivers/gpu/nova-core/regs.rs @@ -42,6 +42,13 @@ pub(crate) fn chipset(self) -> Result<Chipset> { } } +/* PBUS */ + +// TODO: this is an array of registers. +register!(NV_PBUS_SW_SCRATCH_0E at 0x00001438 { + 31:16 frts_err_code as u16; +}); + /* PFB */ register!(NV_PFB_NISO_FLUSH_SYSMEM_ADDR @ 0x00100c10 { @@ -73,6 +80,14 @@ pub(crate) fn usable_fb_size(self) -> u64 { } } +register!(NV_PFB_PRI_MMU_WPR2_ADDR_LO at 0x001fa824 { + 31:4 lo_val as u32; +}); + +register!(NV_PFB_PRI_MMU_WPR2_ADDR_HI at 0x001fa828 { + 31:4 hi_val as u32; +}); + /* PGC6 */ register!(NV_PGC6_AON_SECURE_SCRATCH_GROUP_05_PRIV_LEVEL_MASK @ 0x00118128 { -- 2.49.0
On Wed, 2025-05-21 at 15:45 +0900, Alexandre Courbot wrote: I noticed something interesting in this change to Gpu::new().> +??????? // Check that the WPR2 region does not already exists - if it does, the GPU needs to be > +??????? // reset. > +??????? if regs::NV_PFB_PRI_MMU_WPR2_ADDR_HI::read(bar).hi_val() != 0 { > +??????????? dev_err!( > +??????????????? pdev.as_ref(), > +??????????????? "WPR2 region already exists - GPU needs to be reset to proceed\n" > +??????????? ); > +??????????? return Err(EBUSY); > +??????? }You have a lot of checks in this code that display an error message and then return an Err(). But then ...> + > +??????? // Reset falcon, load FWSEC-FRTS, and run it. > +??????? gsp_falcon.reset(bar)?; > +??????? gsp_falcon.dma_load(bar, &fwsec_frts)?; > +??????? let (mbox0, _) = gsp_falcon.boot(bar, Some(0), None)?; > +??????? if mbox0 != 0 { > +??????????? dev_err!(pdev.as_ref(), "FWSEC firmware returned error {}\n", mbox0); > +??????????? return Err(EINVAL); > +??????? }There are several lines where you just terminate them with "?". This means that no error message is displays. I think all of these ? should be replaced with something like: gsp_falcon.reset(bar).inspect_err(|e| { dev_err!(pdev.as_ref(), "Failed to reset GSP falcon: {:?}\n", e); })?; This feels like something that would benefit from a macro, but I can't imagine what that would look like.
Alexandre Courbot
2025-Jun-04 01:37 UTC
[PATCH v4 20/20] gpu: nova-core: load and run FWSEC-FRTS
On Fri May 30, 2025 at 6:30 AM JST, Timur Tabi wrote:> On Wed, 2025-05-21 at 15:45 +0900, Alexandre Courbot wrote: > > I noticed something interesting in this change to Gpu::new(). > >> +??????? // Check that the WPR2 region does not already exists - if it does, the GPU needs to be >> +??????? // reset. >> +??????? if regs::NV_PFB_PRI_MMU_WPR2_ADDR_HI::read(bar).hi_val() != 0 { >> +??????????? dev_err!( >> +??????????????? pdev.as_ref(), >> +??????????????? "WPR2 region already exists - GPU needs to be reset to proceed\n" >> +??????????? ); >> +??????????? return Err(EBUSY); >> +??????? } > > You have a lot of checks in this code that display an error message and then return an Err(). > > But then ... > >> + >> +??????? // Reset falcon, load FWSEC-FRTS, and run it. >> +??????? gsp_falcon.reset(bar)?; >> +??????? gsp_falcon.dma_load(bar, &fwsec_frts)?; >> +??????? let (mbox0, _) = gsp_falcon.boot(bar, Some(0), None)?; >> +??????? if mbox0 != 0 { >> +??????????? dev_err!(pdev.as_ref(), "FWSEC firmware returned error {}\n", mbox0); >> +??????????? return Err(EINVAL); >> +??????? } > > There are several lines where you just terminate them with "?". This means that no error message is > displays. > > I think all of these ? should be replaced with something like: > > gsp_falcon.reset(bar).inspect_err(|e| { > dev_err!(pdev.as_ref(), "Failed to reset GSP falcon: {:?}\n", e); > })?; > > This feels like something that would benefit from a macro, but I can't imagine what that would look > like.This is because we are checking the cause of the error (unexpected value after firmware runs) in this file, so it is the correct place to display an error message. If the falcon reset fails, the error happens within the `reset()` method which can display an error message if needed, so I thought it was adequate to just propagate the error here. Now doing so would not tell us *which* falcon failed, and this sequence is so important that it is a good idea to understand where is fails precicely, so I've added a few `inspect_err` as you suggested for clarity. Ideally we would have something like the user-space `thiserror` crate to manage errors nicely and have custom error types like Lyude suggested.