thr3ads.net - Nouveau - [PATCH v3 02/11] gpu: nova-core: move GSP boot code out of `Gpu` constructor [Sep 2025]

If this information is useful, please help other people find it:
Share via:

Alexandre Courbot

2025-Sep-10 04:48 UTC

[PATCH v3 02/11] gpu: nova-core: move GSP boot code out of `Gpu` constructor

On Tue Sep 9, 2025 at 11:43 PM JST, Danilo Krummrich
wrote:> On Tue Sep 9, 2025 at 4:11 PM CEST, Alexandre Courbot wrote:
>> On Wed Sep 3, 2025 at 5:27 PM JST, Danilo Krummrich wrote:
>>> On Wed Sep 3, 2025 at 9:10 AM CEST, Alexandre Courbot wrote:
>>>> On Wed Sep 3, 2025 at 8:12 AM JST, Danilo Krummrich wrote:
>>>>> On 9/2/25 4:31 PM, Alexandre Courbot wrote:
>>>>>>       pub(crate) fn new(
>>>>>>           pdev: &pci::Device<device::Bound>,
>>>>>>           devres_bar: Arc<Devres<Bar0>>,
>>>>>
>>>>> The diff is hiding it, but with this patch we should also
make sure that this
>>>>> returns impl PinInit<Self, Error> rather than
Result<impl PinInit<Self>.
>>>>>
>>>>> I think this should be possible now.
>>>>
>>>> There is still code that can return errors (falcon creation,
etc) - do
>>>> you mean that we should move it into the pin initializer and
turn it
>>>> into a `try_pin_init`?
>>>
>>> Yeah, that would be better practice, if it doesn't work out for
a good reason
>>> we can also fall back to Result<impl PinInit<Self, Error>,
but we should at
>>> least try to avoid it.
>>
>> I tried but could not do it in a way that is satisfying. The problem is
>> that `Gpu::new` receives a `Arc<Devres<Bar0>>`, which we
need to
>> `access` in order to do anything useful with it. If we first store it
>> into the `Gpu` structure, then every subsequent member needs to
`access`
>> it in its own code block in order to perform their own initialization.
>> This is quite cumbersome.
>>
>> If there is a way to obtain the `Bar0` once after the `bar` member of
>> `Gpu` is initialized, and then use that instance with each remaining
>> member, then that problem would go away but I am not aware of such a
>> thing.
>
> What about this?
>
> 	impl Gpu {
> 	    pub(crate) fn new<'a>(
> 	        dev: &'a Device<Bound>,
> 	        bar: &'a Bar0
> 	        devres_bar: Arc<Devres<Bar0>>,
> 	    ) -> impl PinInit<Self, Error> + 'a {
> 	        try_pin_init(Self {
> 	            bar: devres_bar,
> 	            spec: Spec::new(bar)?,
> 	            gsp_falcon: Falcon::<Gsp>::new(dev, spec.chipset)?,
> 	            sec2_falcon: Falcon::<Sec2>::new(dev, spec.chipset)?,
> 	            sysmem_flush: SysmemFlush::register(dev, bar, spec.chipset)?
> 	            gsp <- Gsp::new(gsp_falcon, sec2_falcon, sysmem_flush)?,
> 	        })
> 	    }
> 	}
It does work. The bizareness of passing the `bar` twice aside, here is
what it looks like when I got it to compile:

    pub(crate) fn new<'a>(
        pdev: &'a pci::Device<device::Bound>,
        devres_bar: Arc<Devres<Bar0>>,
        bar: &'a Bar0,
    ) -> impl PinInit<Self, Error> + 'a {
        try_pin_init!(Self {
            spec: Spec::new(bar).inspect(|spec| {
                dev_info!(
                    pdev.as_ref(),
                    "NVIDIA (Chipset: {}, Architecture: {:?}, Revision:
{})\n",
                    spec.chipset,
                    spec.chipset.arch(),
                    spec.revision
                );
            })?,

            sysmem_flush: SysmemFlush::register(pdev.as_ref(), bar,
spec.chipset)?,

            gsp_falcon: Falcon::<Gsp>::new(
                pdev.as_ref(),
                spec.chipset,
                bar,
                spec.chipset > Chipset::GA100,
            )
            .inspect(|falcon| falcon.clear_swgen0_intr(bar))?,

            sec2_falcon: Falcon::<Sec2>::new(pdev.as_ref(), spec.chipset,
bar, true)?,

            gsp: Self::start_gsp(pdev, bar, spec.chipset, gsp_falcon,
sec2_falcon)?,

            bar: devres_bar,
        })
    }

The wait for GFW initialization had to be moved to `probe`, but that's
fine IMO. I do however find the code less readable in this form, less
editable as well. And LSP seems lost, so I don't get any syntax
highlighting in the `try_pin_init` block.

Fundamentally, this changes the method from a fallible method returning
a non-fallible initializer into a non-fallible method returning a
fallible initializer. I'm ok with that, and maybe this new form will
encourage us to keep this method short, which is what we want, but other
than that what benefit are we getting from this change?

Danilo Krummrich

2025-Sep-10 08:01 UTC

head link

[PATCH v3 02/11] gpu: nova-core: move GSP boot code out of `Gpu` constructor

On Wed Sep 10, 2025 at 6:48 AM CEST, Alexandre Courbot
wrote:> On Tue Sep 9, 2025 at 11:43 PM JST, Danilo Krummrich wrote:
>> 	impl Gpu {
>> 	    pub(crate) fn new<'a>(
>> 	        dev: &'a Device<Bound>,
>> 	        bar: &'a Bar0
>> 	        devres_bar: Arc<Devres<Bar0>>,
>> 	    ) -> impl PinInit<Self, Error> + 'a {
>> 	        try_pin_init(Self {
>> 	            bar: devres_bar,
>> 	            spec: Spec::new(bar)?,
>> 	            gsp_falcon: Falcon::<Gsp>::new(dev, spec.chipset)?,
>> 	            sec2_falcon: Falcon::<Sec2>::new(dev,
spec.chipset)?,
>> 	            sysmem_flush: SysmemFlush::register(dev, bar,
spec.chipset)?
>> 	            gsp <- Gsp::new(gsp_falcon, sec2_falcon,
sysmem_flush)?,
>> 	        })
>> 	    }
>> 	}
>
> It does work. The bizareness of passing the `bar` twice aside,
Yeah, but it really seems like a minor inconvinience. (Especially, since this
will be the only occurance of this we'll ever have.)
> here is what it looks like when I got it to compile:
This looks great!
>     pub(crate) fn new<'a>(
>         pdev: &'a pci::Device<device::Bound>,
>         devres_bar: Arc<Devres<Bar0>>,
>         bar: &'a Bar0,
>     ) -> impl PinInit<Self, Error> + 'a {
>         try_pin_init!(Self {
>             spec: Spec::new(bar).inspect(|spec| {
>                 dev_info!(
>                     pdev.as_ref(),
>                     "NVIDIA (Chipset: {}, Architecture: {:?},
Revision: {})\n",
>                     spec.chipset,
>                     spec.chipset.arch(),
>                     spec.revision
>                 );
>             })?,
+             _: {
+                 gfw::wait_gfw_boot_completion(bar)
+                     .inspect_err(|_| dev_err!(pdev.as_ref(), "GFW boot
did not complete"))?;
+             },>
>             sysmem_flush: SysmemFlush::register(pdev.as_ref(), bar,
spec.chipset)?,
>
>             gsp_falcon: Falcon::<Gsp>::new(
>                 pdev.as_ref(),
>                 spec.chipset,
>                 bar,
>                 spec.chipset > Chipset::GA100,
>             )
>             .inspect(|falcon| falcon.clear_swgen0_intr(bar))?,
>
>             sec2_falcon: Falcon::<Sec2>::new(pdev.as_ref(),
spec.chipset, bar, true)?,
>-             gsp: Self::start_gsp(pdev, bar, spec.chipset, gsp_falcon,
sec2_falcon)?,
+             gsp <- Self::start_gsp(pdev, bar, spec.chipset, gsp_falcon,
sec2_falcon),>
>             bar: devres_bar,
>         })
>     }
>
> The wait for GFW initialization had to be moved to `probe`, but that's
> fine IMO.
That's not necessary, you can keep it in Gpu::new() -- I don't see
what's
preventing us from that. I inserted it in the code above.
> I do however find the code less readable in this form, less
> editable as well. And LSP seems lost, so I don't get any syntax
> highlighting in the `try_pin_init` block.
Benno is working on a syntax update, so automatic formatting etc. will properly
work.

Otherwise, I can't see how this is a downgrade. It represents the
initialization
process in a much clearer way that the current implementation of Gsp::new(),
which is rather messy.
> Fundamentally, this changes the method from a fallible method returning
> a non-fallible initializer into a non-fallible method returning a
> fallible initializer.
Yeah, that's the best case when working with pin-init.
> I'm ok with that, and maybe this new form will
> encourage us to keep this method short, which is what we want, but other
> than that what benefit are we getting from this change?
The immediate benefit is that we don't need an extra allocation for the Gsp
structure.

The general benefit is that once we need to add more fields to
structures that require pinning (such as locks -- and we will need a lot of
them) we're prepared for it.

If we're not prepared for it, I'm pretty sure that everytime someone
needs to
add e.g. a new lock for something, it will just result in a new
Pin<KBox<T>>,
because the initial pin-init hierarchy just isn't there, and creating a new
allocation is more convinient than fixing the existing code.

This is exactly what pin-init was introduced for in the kernel, such that
we're
not doing worse than equivalent C code.

	struct Foo {
	   struct Bar bar;
	   void *data;
	   struct mutex data_lock;
	}

	struct Bar {
	   void *data;
	   struct mutex data_lock;
	}

In C you can just kmalloc(sizeof(Foo), GFP_KERNEL) and subsequently initialize
all fields.

In Rust I don't want to fall back having to allocate for Bar *and* for Foo
separately.

If there are things to improve with pin-init, let's improve them, but
let's not
do worse than C code in this aspect.

Nouveau - Sep 2025 - [PATCH v3 02/11] gpu: nova-core: move GSP boot code out of `Gpu` constructor

[PATCH v3 02/11] gpu: nova-core: move GSP boot code out of `Gpu` constructor

[PATCH v3 02/11] gpu: nova-core: move GSP boot code out of `Gpu` constructor