Timur Tabi
2025-Dec-02 23:48 UTC
[PATCH v2 12/13] gpu: nova-core: add PIO support for loading firmware images
On Tue, 2025-12-02 at 15:40 -0800, John Hubbard wrote:
> In fact, I just finished looking through my Hopper/Blackwell PIO code, which
> also needs 4-byte alignment, and concluded that returning -EINVAL for misaligned
> data seems to be the appropriate way to handle things.

I've added this for v3:

        // Rejecting misaligned images here allows us to avoid checking
        // inside the loops.
        if img.len() % 4 != 0 {
            return Err(EINVAL);
        }

And I manually create the &[u8; 4] now:

        for word in block.chunks_exact(4) {
            let w = [word[0], word[1], word[2], word[3]];
            regs::NV_PFALCON_FALCON_IMEMD::default()
                .set_data(u32::from_le_bytes(w))
                .write(bar, &E::ID, port);

word[3] will always exist because of chunks_exact(4).
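
For readers who want to try the reject-then-chunk pattern outside the kernel tree, here is a minimal userspace sketch. It is not the nova-core code: `words_from_image` and the `Misaligned` error are hypothetical names, `Misaligned` stands in for the kernel's EINVAL, and the words are collected into a Vec where the driver would write each one to the IMEMD register.

```rust
/// Hypothetical error type standing in for the kernel's `EINVAL`.
#[derive(Debug, PartialEq)]
pub struct Misaligned;

/// Splits `img` into little-endian 32-bit words, rejecting any image whose
/// length is not a multiple of 4 bytes.
pub fn words_from_image(img: &[u8]) -> Result<Vec<u32>, Misaligned> {
    // Rejecting misaligned images here avoids any length check inside the loop.
    if img.len() % 4 != 0 {
        return Err(Misaligned);
    }

    let mut words = Vec::with_capacity(img.len() / 4);
    for word in img.chunks_exact(4) {
        // `chunks_exact(4)` yields only 4-byte slices, so indices 0..=3 are in bounds.
        let w = [word[0], word[1], word[2], word[3]];
        // The driver would write this value to a register; the sketch collects it.
        words.push(u32::from_le_bytes(w));
    }
    Ok(words)
}
```

Without the up-front check, `chunks_exact(4)` would silently skip a trailing partial chunk, which is why the length test comes first.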
John Hubbard
2025-Dec-03 00:35 UTC
[PATCH v2 12/13] gpu: nova-core: add PIO support for loading firmware images
On 12/2/25 3:48 PM, Timur Tabi wrote:
> On Tue, 2025-12-02 at 15:40 -0800, John Hubbard wrote:
>> In fact, I just finished looking through my Hopper/Blackwell PIO code, which
>> also needs 4-byte alignment, and concluded that returning -EINVAL for misaligned
>> data seems to be the appropriate way to handle things.
>
> I've added this for v3:
>
>         // Rejecting misaligned images here allows us to avoid checking
>         // inside the loops.
>         if img.len() % 4 != 0 {
>             return Err(EINVAL);
>         }

Looks good.

>
> And I manually create the &[u8; 4] now:
>
>         for word in block.chunks_exact(4) {
>             let w = [word[0], word[1], word[2], word[3]];

Yes, this is probably the best way. Although...

>             regs::NV_PFALCON_FALCON_IMEMD::default()
>                 .set_data(u32::from_le_bytes(w))
>                 .write(bar, &E::ID, port);
>
> word[3] will always exist because of chunks_exact(4).
>

Interesting, I was just looking at this, and the 4-byte manual construction
bothered me a little ("why must I do this?"), so I'm currently wondering if
"// PANIC..." plus an "infallible" .unwrap() is reasonable, for example:

impl Falcon<Fsp> {
...
    pub(crate) fn write_emem(&self, bar: &Bar0, offset: u32, data: &[u8]) -> Result {
        if offset % 4 != 0 || data.len() % 4 != 0 {
            return Err(EINVAL);
        }
...
        for chunk in data.chunks_exact(4) {
            // PANIC: `chunks_exact(4)` guarantees each chunk is exactly 4 bytes.
            let word = u32::from_le_bytes(chunk.try_into().unwrap());
            regs::NV_PFALCON_FALCON_EMEM_DATA::default()
                .set_data(word)
                .write(bar, &Fsp::ID);
        }

...but actually, I think your way is better, because you don't have to
justify an .unwrap().

What do you think? I figured you'd enjoy this, coming as it does just one
email after I wrote "never .unwrap()". haha :)

thanks,
-- 
John Hubbard
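
To make the comparison concrete, the two conversions discussed above can be put side by side in a small userspace sketch. The function names here are hypothetical and nothing below is taken from the driver; the only claim is that both forms build the same little-endian word from a 4-byte chunk.

```rust
// Only needed on editions before 2021, where `TryInto` is not in the prelude.
use std::convert::TryInto;

/// Converts a 4-byte chunk using `try_into()` plus an `unwrap()` that
/// needs a justification comment.
pub fn word_via_try_into(chunk: &[u8]) -> u32 {
    // PANIC: callers pass chunks from `chunks_exact(4)`, which are exactly 4 bytes.
    u32::from_le_bytes(chunk.try_into().unwrap())
}

/// Converts a 4-byte chunk by building the array by hand, with no `unwrap()`.
pub fn word_via_indexing(chunk: &[u8]) -> u32 {
    u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]])
}

/// Returns true if both conversions agree for every full 4-byte chunk of `data`.
pub fn both_agree(data: &[u8]) -> bool {
    data.chunks_exact(4)
        .all(|c| word_via_try_into(c) == word_via_indexing(c))
}
```

Slice indexing is bounds-checked as well, so both forms depend on the same `chunks_exact(4)` guarantee. The difference is only whether an explicit `.unwrap()` appears in the source and has to be justified.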