Alexandre Courbot
2025-Dec-05 00:35 UTC
[PATCH 10/11] gpu: nova-core: LibosMemoryRegionInitArgument size must be page aligned
On Fri Dec 5, 2025 at 6:45 AM JST, Timur Tabi wrote:> On Thu, 2025-12-04 at 21:18 +0000, Timur Tabi wrote: >> If you can tell me how to fix the CoherentAllocation allocation call so that it allocates 4K, that >> would be a better fix for this problem.? It's only necessary for Turing/GA100, however. > > Actually, I think I figured it out. > > /// Arguments for GSP startup. > #[repr(C)] > pub(crate) struct GspArgumentsCached( > bindings::GSP_ARGUMENTS_CACHED, > [u8; GSP_PAGE_SIZE - core::mem::size_of::<bindings::GSP_ARGUMENTS_CACHED>()], // Padding to make > it 4K > ); > > impl GspArgumentsCached { > /// Creates the arguments for starting the GSP up using `cmdq` as its command queue. > pub(crate) fn new(cmdq: &Cmdq) -> Self { > Self( > bindings::GSP_ARGUMENTS_CACHED { > messageQueueInitArguments: MessageQueueInitArguments::new(cmdq).0, > bDmemStack: 1, > ..Default::default() > }, > [0u8; GSP_PAGE_SIZE - core::mem::size_of::<bindings::GSP_ARGUMENTS_CACHED>()], > ) > } > } > > If you think this is okay, I'll put it in v3.Yes, that looks like a good way to address cases like this (please also make sure to detail in a comment that RM is requiring that padding). Having the constraint encoded into the type guarantees that we cannot miss it anywhere. With one caveat: `new` now returns a 4K object on the stack, which we definitely want to avoid. So maybe we can have a wrapper for things we want to align the 4K: #[repr(C)] pub(crate) struct GspPageAligned<T> { pub(crate) inner: T, padding: [u8; GSP_PAGE_SIZE - core::mem::size_of::<T>()], } We would then allocate the CoherentAllocation using a `GspPageAligned<GspArgumentsCached>`, and initialize its useful data with: dma_write!(rmargs[0].inner = fw::GspArgumentsCached::new(&cmdq))?;> I had to remove the #[repr(transparent)]. Is that > okay? The code compiles and seems to work.As long as the struct is `repr(C)`, the layout will be what we expect. Actually this made me realize that `repr(C)` is technically what we want for our bindings abstractions, not `repr(transparent)` - both happen to have the same effect since the wrapper struct is `repr(C)` anyway, but the latter is more restrictive than we need. Glad we found an elegant way to address this!
Timur Tabi
2025-Dec-05 20:22 UTC
[PATCH 10/11] gpu: nova-core: LibosMemoryRegionInitArgument size must be page aligned
On Fri, 2025-12-05 at 09:35 +0900, Alexandre Courbot wrote:> > With one caveat: `new` now returns a 4K object on the stack, which we > definitely want to avoid. So maybe we can have a wrapper for things we > want to align the 4K: > > ? #[repr(C)] > ? pub(crate) struct GspPageAligned<T> { > ??? pub(crate) inner: T, > ??? padding: [u8; GSP_PAGE_SIZE - core::mem::size_of::<T>()], > ? } > > We would then allocate the CoherentAllocation using a > `GspPageAligned<GspArgumentsCached>`, and initialize its useful data > with: > > ? dma_write!(rmargs[0].inner = fw::GspArgumentsCached::new(&cmdq))?;I'm confused. Aren't we already avoiding the stack? This is the code today: let rmargs = CoherentAllocation::<GspArgumentsCached>::alloc_coherent( dev, 1, GFP_KERNEL | __GFP_ZERO, )?; dma_write!(rmargs[0] = fw::GspArgumentsCached::new(&cmdq))?; The only difference with what's there today vs what you suggest is the ".inner", and I think I can avoid even that if I make GspPageAligned a tuple instead of a named struct.> > > I had to remove the #[repr(transparent)].? Is that > > okay?? The code compiles and seems to work. > > As long as the struct is `repr(C)`, the layout will be what we expect. > Actually this made me realize that `repr(C)` is technically what we want for our > bindings abstractions, not `repr(transparent)` - both happen to have the > same effect since the wrapper struct is `repr(C)` anyway, but the latter > is more restrictive than we need. > > Glad we found an elegant way to address this!Actually, I think a more elegant solution would be a new variant of CoherentAllocation::alloc_coherent() that takes a size to allocate instead of using size_of::<T>. In fact, I wonder if it makes sense to always grow the size of the allocation to the nearest page, since dma_alloc_attrs() always allocates whole pages anyway. Perhaps CoherentAllocation<T> needs an allocated() method in addition to a size() method. size() returns count*sizeof as today, and allocated() returns that value rounded up to the nearest PAGE_SIZE.
Timur Tabi
2025-Dec-05 23:22 UTC
[PATCH 10/11] gpu: nova-core: LibosMemoryRegionInitArgument size must be page aligned
On Fri, 2025-12-05 at 09:35 +0900, Alexandre Courbot wrote:> With one caveat: `new` now returns a 4K object on the stack, which we > definitely want to avoid. So maybe we can have a wrapper for things we > want to align the 4K: > > ? #[repr(C)] > ? pub(crate) struct GspPageAligned<T> { > ??? pub(crate) inner: T, > ??? padding: [u8; GSP_PAGE_SIZE - core::mem::size_of::<T>()], > ? }This doesn't seem to work. error: generic parameters may not be used in const operations --> drivers/gpu/nova-core/gsp/fw.rs:894:56 | 894 | ...::mem::size_of::<T>()], | ^ cannot perform const operation using `T` | I've tried different things and they all generate various compilation errors. I'll just post a v3 with my version, and then you can experiment with it to see if you can get GspPageAligned aligned to work on top of it.> > We would then allocate the CoherentAllocation using a > `GspPageAligned<GspArgumentsCached>`, and initialize its useful data > with: > > ? dma_write!(rmargs[0].inner = fw::GspArgumentsCached::new(&cmdq))?;So I assume you mean this: let rmargs = CoherentAllocation::<GspPageAligned<GspArgumentsCached>>::alloc_coherent( dev, 1, GFP_KERNEL | __GFP_ZERO, )?; dma_write!(rmargs[0].inner = fw::GspArgumentsCached::new(&cmdq))?;