Timur Tabi
2025-Dec-02 22:51 UTC
[PATCH v2 12/13] gpu: nova-core: add PIO support for loading firmware images
On Tue, 2025-12-02 at 16:23 -0500, Joel Fernandes wrote:> > > On 12/1/2025 6:39 PM, Timur Tabi wrote: > > ? > > + > > +??? /// See nvkm_falcon_pio_wr - takes a byte array instead of a FalconFirmware > > +??? fn pio_wr_bytes( > > +??????? &self, > > +??????? bar: &Bar0, > > +??????? img: &[u8], > > +??????? mem_base: u16, > > +??????? target_mem: FalconMem, > > +??????? port: u8, > > +??????? tag: u16 > > +??? ) { > > +??????? let port = usize::from(port); > > + > > +??????? match target_mem { > > +??????????? FalconMem::ImemSecure | FalconMem::ImemNonSecure => { > > +??????????????? regs::NV_PFALCON_FALCON_IMEMC::default() > > +??????????????????? .set_secure(target_mem == FalconMem::ImemSecure) > > +??????????????????? .set_aincw(true) > > +??????????????????? .set_offs(mem_base) > > +??????????????????? .write(bar, &E::ID, port); > > + > > +??????????????? let mut tag = tag; > > +??????????????? for block in img.chunks(256) { > > +??????????????????? regs::NV_PFALCON_FALCON_IMEMT::default() > > +??????????????????????? .set_tag(tag) > > +??????????????????????? .write(bar, &E::ID, port); > > +??????????????????? for word in block.chunks(4) { > > +??????????????????????? let w = u32::from_le_bytes(word.try_into().unwrap()); > > If img.size is not a multiple of 4 bytes, this can panic right?I think so. I just noticed that I used chunks(4) here and chunks_exact(4) in the Dmem loop below. I need to make it consistent. chunks(4) will return &[u8; 3] if the buffer is shy one byte. chunks_exact(4) will simply skip the last 3 bytes. The problem is that it is an error for these images to not be a multiple of 4. ?Such an image is just not valid. So it's a lot simpler to just reject these misaligned images. The previous version of this function did return a Result, maybe I should put that back. It just seems wasteful to test for misalignment on every pass of the loop. What we really need is for from_le_bytes() to be less picky about the slice size. If I give it &[u8; 3], then it should be able to handle that.> Even if it is unlikely, unwrap() is quite frowned up due to possibility of > panic. I'd recommend something like the following since the function cannot > return an error: > > ??????????????????????? let w = if let Ok(bytes) = word.try_into() { > ??????????????????????????? u32::from_le_bytes(bytes) > ??????????????????????? } else { > ??? // can print a warning here too if needed. > ??????????????????????????? let mut buf = [0u8; 4]; > ??????????????????????????? buf[..word.len()].copy_from_slice(word); > ??????????????????????????? u32::from_le_bytes(buf) > ??????????????????????? };Wouldn't it be simpler to use chunks_exact() and then remainder()? That way, we wouldn't need a test inside the loop?> Btw, I wish we could encode the slice length constraint in the slice type itself > (i.e., the slice length ought to be a certain multiple). But I think there's no > way to do that without introducing a new type.Wouldn't it be a run-time constraint anyway? With the exception of the BootloaderDmemDescV2 write, all of the calls to pio_wr_bytes() have lengths only known at runtime.
Joel Fernandes
2025-Dec-02 23:20 UTC
[PATCH v2 12/13] gpu: nova-core: add PIO support for loading firmware images
On 12/2/2025 5:51 PM, Timur Tabi wrote:> On Tue, 2025-12-02 at 16:23 -0500, Joel Fernandes wrote: >> >> >> On 12/1/2025 6:39 PM, Timur Tabi wrote: >>> ? >>> + >>> +??? /// See nvkm_falcon_pio_wr - takes a byte array instead of a FalconFirmware >>> +??? fn pio_wr_bytes( >>> +??????? &self, >>> +??????? bar: &Bar0, >>> +??????? img: &[u8], >>> +??????? mem_base: u16, >>> +??????? target_mem: FalconMem, >>> +??????? port: u8, >>> +??????? tag: u16 >>> +??? ) { >>> +??????? let port = usize::from(port); >>> + >>> +??????? match target_mem { >>> +??????????? FalconMem::ImemSecure | FalconMem::ImemNonSecure => { >>> +??????????????? regs::NV_PFALCON_FALCON_IMEMC::default() >>> +??????????????????? .set_secure(target_mem == FalconMem::ImemSecure) >>> +??????????????????? .set_aincw(true) >>> +??????????????????? .set_offs(mem_base) >>> +??????????????????? .write(bar, &E::ID, port); >>> + >>> +??????????????? let mut tag = tag; >>> +??????????????? for block in img.chunks(256) { >>> +??????????????????? regs::NV_PFALCON_FALCON_IMEMT::default() >>> +??????????????????????? .set_tag(tag) >>> +??????????????????????? .write(bar, &E::ID, port); >>> +??????????????????? for word in block.chunks(4) { >>> +??????????????????????? let w = u32::from_le_bytes(word.try_into().unwrap()); >> >> If img.size is not a multiple of 4 bytes, this can panic right? > > I think so. I just noticed that I used chunks(4) here and chunks_exact(4) in the Dmem loop below. > I need to make it consistent. > > chunks(4) will return &[u8; 3] if the buffer is shy one byte. chunks_exact(4) will simply skip the > last 3 bytes. > > The problem is that it is an error for these images to not be a multiple of 4. ?Such an image is > just not valid. > > So it's a lot simpler to just reject these misaligned images. The previous version of this function > did return a Result, maybe I should put that back. It just seems wasteful to test for misalignment > on every pass of the loop. > > What we really need is for from_le_bytes() to be less picky about the slice size. If I give it > &[u8; 3], then it should be able to handle that.Yeah, but I guess whether to pad with zero's or error out when the slice is not a multiple of 4 is questionable.> >> Even if it is unlikely, unwrap() is quite frowned up due to possibility of >> panic. I'd recommend something like the following since the function cannot >> return an error: >> >> ??????????????????????? let w = if let Ok(bytes) = word.try_into() { >> ??????????????????????????? u32::from_le_bytes(bytes) >> ??????????????????????? } else { >> ??? // can print a warning here too if needed. >> ??????????????????????????? let mut buf = [0u8; 4]; >> ??????????????????????????? buf[..word.len()].copy_from_slice(word); >> ??????????????????????????? u32::from_le_bytes(buf) >> ??????????????????????? }; > > Wouldn't it be simpler to use chunks_exact() and then remainder()? That way, we wouldn't need a > test inside the loop?Sure.> >> Btw, I wish we could encode the slice length constraint in the slice type itself >> (i.e., the slice length ought to be a certain multiple). But I think there's no >> way to do that without introducing a new type. > > Wouldn't it be a run-time constraint anyway? With the exception of the BootloaderDmemDescV2 write, > all of the calls to pio_wr_bytes() have lengths only known at runtime.I am not sure but I think rust code is expected to not panic and handle situations gracefully even in the face of runtime constraints being violated, you could argue that the image length being violated is UB but I don't think that'd be enough to justify the unwrap(). But perhaps someone from the rust core team can chime in about that because I also have that question. Can a "FW image corruption" type of scenarios be considered something that safe rust code not need to worry about since it falls under the UB umbrella (similar to memory corruption)? Thanks.