Alexandre Courbot
2025-Dec-03 11:54 UTC
[PATCH 10/11] gpu: nova-core: LibosMemoryRegionInitArgument size must be page aligned
On Tue Dec 2, 2025 at 8:25 AM JST, Timur Tabi wrote:> On Wed, 2025-11-19 at 12:36 +0900, Alexandre Courbot wrote: >> You can use the `Alignment` type here, as the rest of the code does: >> >> ??? let size = num::usize_as_u64(obj.size()) >> ??????? .align_up(Alignment::new::<GSP_PAGE_SIZE>())?; >> >> Now `align_up` returns an error in case of overflow, that we will need >> to pass down to the caller by changing the return type of `new`. It is a >> bit annoying, but better than the behavior of `next_mutiple_of` in such >> a case, which is to panic. :) > > I see your point, but these are u64s that we're talking about. The only way next_mutiple_of() can > panic is if obj.size() is greater than 0xFFFFFFFFFFFFF000, which is not possible. > > I would say in this case, a panic is preferable to a convoluted error return that will never be > exercised, because failure here indicates a coding error, not an input error.The input data is a usize, so technically we could get an input that triggers that error. I know it's a very edge case, and clearly indicates a bug, but the general rule is: don't panic the kernel. And in Rust, if possible, don't even let me compiler insert panic-handling code. If you don't want to change the return type of the method, then maybe use `unwrap_or` and `inspect_err` to print an error before returning e.g. `0`. But others have already thought "naah, that's never gonna happen" and got burnt very publicly [1], so let's learn from that. :P [1] https://blog.cloudflare.com/18-november-2025-outage/
Alice Ryhl
2025-Dec-03 12:03 UTC
[PATCH 10/11] gpu: nova-core: LibosMemoryRegionInitArgument size must be page aligned
On Wed, Dec 03, 2025 at 08:54:31PM +0900, Alexandre Courbot wrote:> On Tue Dec 2, 2025 at 8:25 AM JST, Timur Tabi wrote: > > On Wed, 2025-11-19 at 12:36 +0900, Alexandre Courbot wrote: > >> You can use the `Alignment` type here, as the rest of the code does: > >> > >> ??? let size = num::usize_as_u64(obj.size()) > >> ??????? .align_up(Alignment::new::<GSP_PAGE_SIZE>())?; > >> > >> Now `align_up` returns an error in case of overflow, that we will need > >> to pass down to the caller by changing the return type of `new`. It is a > >> bit annoying, but better than the behavior of `next_mutiple_of` in such > >> a case, which is to panic. :) > > > > I see your point, but these are u64s that we're talking about. The only way next_mutiple_of() can > > panic is if obj.size() is greater than 0xFFFFFFFFFFFFF000, which is not possible. > > > > I would say in this case, a panic is preferable to a convoluted error return that will never be > > exercised, because failure here indicates a coding error, not an input error. > > The input data is a usize, so technically we could get an input that > triggers that error. > > I know it's a very edge case, and clearly indicates a bug, but the > general rule is: don't panic the kernel. And in Rust, if possible, don't > even let me compiler insert panic-handling code. If you don't want to > change the return type of the method, then maybe use `unwrap_or` and > `inspect_err` to print an error before returning e.g. `0`. > > But others have already thought "naah, that's never gonna happen" and > got burnt very publicly [1], so let's learn from that. :P > > [1] https://blog.cloudflare.com/18-november-2025-outage/For what it's worth, my friend at Cloudflare tells me they would have failed in the same way had they used ? instead of unwrap. Alice
Timur Tabi
2025-Dec-03 18:31 UTC
[PATCH 10/11] gpu: nova-core: LibosMemoryRegionInitArgument size must be page aligned
On Wed, 2025-12-03 at 20:54 +0900, Alexandre Courbot wrote:> On Tue Dec 2, 2025 at 8:25 AM JST, Timur Tabi wrote: > > On Wed, 2025-11-19 at 12:36 +0900, Alexandre Courbot wrote: > > > You can use the `Alignment` type here, as the rest of the code does: > > > > > > ??? let size = num::usize_as_u64(obj.size()) > > > ??????? .align_up(Alignment::new::<GSP_PAGE_SIZE>())?; > > > > > > Now `align_up` returns an error in case of overflow, that we will need > > > to pass down to the caller by changing the return type of `new`. It is a > > > bit annoying, but better than the behavior of `next_mutiple_of` in such > > > a case, which is to panic. :) > > > > I see your point, but these are u64s that we're talking about.? The only way next_mutiple_of() > > can > > panic is if obj.size() is greater than 0xFFFFFFFFFFFFF000, which is not possible.? > > > > I would say in this case, a panic is preferable to a convoluted error return that will never be > > exercised, because failure here indicates a coding error, not an input error. > > The input data is a usize, so technically we could get an input that > triggers that error.Actually, I still say it's not possible. Say I change the code to this, so that .next_multiple_of is called on a u64 instead of a usize: let size = num::usize_as_u64(obj.size()).next_multiple_of(GSP_PAGE_SIZE); Again, the only way this can fail is if the allocated object being passed in is almost 16 exabytes in size, which is physically impossible.> I know it's a very edge case, and clearly indicates a bug, but the > general rule is: don't panic the kernel. And in Rust, if possible, don't > even let me compiler insert panic-handling code. If you don't want to > change the return type of the method, then maybe use `unwrap_or` and > `inspect_err` to print an error before returning e.g. `0`.How about this: if .next_multiple_of(GSP_PAGE_SIZE) does return an error, I'll just assign size to obj.size() as-is? After all, at about 16GB/second for DMA, it will take about 31 years to DMA all that memory, so I will have long since retired before that bug shows up.