James Jones
2025-Jul-03 22:36 UTC
[RFC 0/4] Add Format Modifiers for NVIDIA Blackwell chipsets
The layout of bits within the individual tiles (referred to as sectors in the DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D() macro) changed for some formats starting in Blackwell 2 GPUs. New format modifiers are needed to denote the difference and prevent direct sharing of these incompatible buffers with older GPUs. This patch series proposes first adding some helper macros and inline functions to drm_fourcc.h to make the NVIDIA block-linear format modifiers easier to work with given the proposed solution makes them harder to parse, then extending the existing sector type field in the parametric format modifier macro DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D() by another bit to accommodate the new layout type. There are a few ways the parameteric format modifier definition could have been altered to handle the new layouts: -The GOB Height and Page Kind field has a reserved value that could have been used. However, the GOB height and page kind enums did not change relative to prior chips, so this is sort of a lie. However, this is the least-invasive change. -An entirely new field could have been added. This seems inappropriate given the presence of an existing appropriate field. The advantage here is it avoids splitting the sector layout field across two bitfields. The proposed approach is the logically consistent one, but has the downside of being the most complex, and that it causes the DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D() macro to evaluate its 's' parameter twice. However, I believe the added helper functions and macros address the complexity, and I have audited the relevant code and do not believe the double evaluation should cause any problems in practice. James Jones (4): drm: macros defining fields of NVIDIA modifiers drm: add inline helper funcs for NVIDIA modifiers drm/nouveau: use format modifier helper funcs drm: define NVIDIA DRM format modifiers for GB20x drivers/gpu/drm/nouveau/nouveau_display.c | 12 ++- include/uapi/drm/drm_fourcc.h | 100 ++++++++++++++++++---- 2 files changed, 92 insertions(+), 20 deletions(-) -- 2.49.0
James Jones
2025-Jul-03 22:36 UTC
[RFC 1/4] drm: macros defining fields of NVIDIA modifiers
Code parsing the fields of the NVIDIA block-linear format modifiers frequently needs to use magic values defined within the DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D() macro and document in the comment above it. Such code is error prone and hard to read. See for example, nouveau_decode_mod() in the nouveau driver. This change adds macros defining the mask and bit shift associated with each parameteric field in these modifiers. Signed-off-by: James Jones <jajones at nvidia.com> --- include/uapi/drm/drm_fourcc.h | 38 ++++++++++++++++++++++++++--------- 1 file changed, 29 insertions(+), 9 deletions(-) diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h index 6483f76a2165..4240a4a146b6 100644 --- a/include/uapi/drm/drm_fourcc.h +++ b/include/uapi/drm/drm_fourcc.h @@ -902,6 +902,18 @@ extern "C" { */ #define DRM_FORMAT_MOD_NVIDIA_TEGRA_TILED fourcc_mod_code(NVIDIA, 1) +/* Fields in the parameteric NVIDIA block linear format modifiers */ +#define __fourcc_mod_nvidia_l2gpbh_mask 0xfULL +#define __fourcc_mod_nvidia_l2gpbh_shift 0 +#define __fourcc_mod_nvidia_pkind_mask 0xffULL +#define __fourcc_mod_nvidia_pkind_shift 12 +#define __fourcc_mod_nvidia_kgen_mask 0x3ULL +#define __fourcc_mod_nvidia_kgen_shift 20 +#define __fourcc_mod_nvidia_slayout_mask 0x1ULL +#define __fourcc_mod_nvidia_slayout_shift 22 +#define __fourcc_mod_nvidia_comp_mask 0x7ULL +#define __fourcc_mod_nvidia_comp_shift 23 + /* * Generalized Block Linear layout, used by desktop GPUs starting with NV50/G80, * and Tegra GPUs starting with Tegra K1. @@ -983,15 +995,21 @@ extern "C" { * 6 = Reserved for future use * 7 = Reserved for future use * - * 55:25 - Reserved for future use. Must be zero. + * 55:26 - Reserved for future use. Must be zero. */ #define DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D(c, s, g, k, h) \ - fourcc_mod_code(NVIDIA, (0x10 | \ - ((h) & 0xf) | \ - (((k) & 0xff) << 12) | \ - (((g) & 0x3) << 20) | \ - (((s) & 0x1) << 22) | \ - (((c) & 0x7) << 23))) + fourcc_mod_code(NVIDIA, \ + (0x10 | \ + (((h) & __fourcc_mod_nvidia_l2gpbh_mask) << \ + __fourcc_mod_nvidia_l2gpbh_shift) | \ + (((k) & __fourcc_mod_nvidia_pkind_mask) << \ + __fourcc_mod_nvidia_pkind_shift) | \ + (((g) & __fourcc_mod_nvidia_kgen_mask) << \ + __fourcc_mod_nvidia_kgen_shift) | \ + (((s) & __fourcc_mod_nvidia_slayout_mask) << \ + __fourcc_mod_nvidia_slayout_shift) | \ + (((c) & __fourcc_mod_nvidia_comp_mask) << \ + __fourcc_mod_nvidia_comp_shift))) /* To grandfather in prior block linear format modifiers to the above layout, * the page kind "0", which corresponds to "pitch/linear" and hence is unusable @@ -1002,10 +1020,12 @@ extern "C" { static inline __u64 drm_fourcc_canonicalize_nvidia_format_mod(__u64 modifier) { - if (!(modifier & 0x10) || (modifier & (0xff << 12))) + if (!(modifier & 0x10) || + (modifier & (__fourcc_mod_nvidia_pkind_mask << + __fourcc_mod_nvidia_pkind_shift))) return modifier; else - return modifier | (0xfe << 12); + return modifier | (0xfeULL << __fourcc_mod_nvidia_pkind_shift); } /* -- 2.49.0
James Jones
2025-Jul-03 22:36 UTC
[RFC 2/4] drm: add inline helper funcs for NVIDIA modifiers
Code parsing the fields of the NVIDIA block-linear format modifiers frequently needs to use magic values defined within the DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D() macro and document in the comment above it. Such code is error prone and hard to read. See for example, nouveau_decode_mod() in the nouveau driver. This change adds macros using the previously added field definition macros to extract values from individual fields in a valid NVIDIA block-linear format modifier. Signed-off-by: James Jones <jajones at nvidia.com> --- include/uapi/drm/drm_fourcc.h | 44 ++++++++++++++++++++++++++++++++--- 1 file changed, 41 insertions(+), 3 deletions(-) diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h index 4240a4a146b6..052e5fdd1d3b 100644 --- a/include/uapi/drm/drm_fourcc.h +++ b/include/uapi/drm/drm_fourcc.h @@ -1011,6 +1011,46 @@ extern "C" { (((c) & __fourcc_mod_nvidia_comp_mask) << \ __fourcc_mod_nvidia_comp_shift))) +#define __DRM_FOURCC_MKNVHELPER_FUNC(f__) \ + static inline __u64 \ + drm_fourcc_nvidia_format_mod_##f__(__u64 mod) \ +{ \ + return (mod >> __fourcc_mod_nvidia_##f__##_shift) & \ + __fourcc_mod_nvidia_##f__##_mask; \ +} + +/* + * Get log2 of the block height in gobs specified by mod: + * static inline __u64 drm_fourcc_nvidia_format_mod_l2gpbh(__u64 mod) + */ +__DRM_FOURCC_MKNVHELPER_FUNC(l2gpbh) + +/* + * Get the page kind specified by mod: + * static inline __u64 drm_fourcc_nvidia_format_mod_pkind(__u64 mod) + */ +__DRM_FOURCC_MKNVHELPER_FUNC(pkind) + +/* + * Get the page kind generation specified by mod: + * static inline __u64 drm_fourcc_nvidia_format_mod_kgen(__u64 mod) + */ +__DRM_FOURCC_MKNVHELPER_FUNC(kgen) + +/* + * Get the sector layout specified by mod: + * static inline __u64 drm_fourcc_nvidia_format_mod_slayout(__u64 mod) + */ +__DRM_FOURCC_MKNVHELPER_FUNC(slayout) + +/* + * Get the lossless framebuffer compression specified by mod: + * static inline __u64 drm_fourcc_nvidia_format_mod_kgen(__u64 mod) + */ +__DRM_FOURCC_MKNVHELPER_FUNC(comp) + +#undef __DRM_FOURCC_MKNVHELPER_FUNC + /* To grandfather in prior block linear format modifiers to the above layout, * the page kind "0", which corresponds to "pitch/linear" and hence is unusable * with block-linear layouts, is remapped within drivers to the value 0xfe, @@ -1020,9 +1060,7 @@ extern "C" { static inline __u64 drm_fourcc_canonicalize_nvidia_format_mod(__u64 modifier) { - if (!(modifier & 0x10) || - (modifier & (__fourcc_mod_nvidia_pkind_mask << - __fourcc_mod_nvidia_pkind_shift))) + if (!(modifier & 0x10) || drm_fourcc_nvidia_format_mod_pkind(modifier)) return modifier; else return modifier | (0xfeULL << __fourcc_mod_nvidia_pkind_shift); -- 2.49.0
When parsing the parameteric NVIDIA block-linear format modifiers to determine surface tiling attributes, use the new helper functions to extract values from various fields. This avoids using magic values to extract the bitfields from the modifier, which makes the code more readable. Signed-off-by: James Jones <jajones at nvidia.com> --- drivers/gpu/drm/nouveau/nouveau_display.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c b/drivers/gpu/drm/nouveau/nouveau_display.c index add006fc8d81..1bec664a2b67 100644 --- a/drivers/gpu/drm/nouveau/nouveau_display.c +++ b/drivers/gpu/drm/nouveau/nouveau_display.c @@ -146,14 +146,18 @@ nouveau_decode_mod(struct nouveau_drm *drm, * Extract the block height and kind from the corresponding * modifier fields. See drm_fourcc.h for details. */ + uint64_t pkind = drm_fourcc_nvidia_format_mod_pkind(modifier); - if ((modifier & (0xffull << 12)) == 0ull) { + if (pkind == 0ull) { /* Legacy modifier. Translate to this dev's 'kind.' */ - modifier |= disp->format_modifiers[0] & (0xffull << 12); + const uint64_t any_dev_mod = disp->format_modifiers[0]; + + pkind = drm_fourcc_nvidia_format_mod_pkind(any_dev_mod); } - *tile_mode = (uint32_t)(modifier & 0xF); - *kind = (uint8_t)((modifier >> 12) & 0xFF); + *tile_mode + (uint32_t)drm_fourcc_nvidia_format_mod_l2gpbh(modifier); + *kind = (uint8_t)pkind; if (drm->client.device.info.chipset >= 0xc0) *tile_mode <<= 4; -- 2.49.0
James Jones
2025-Jul-03 22:36 UTC
[RFC 4/4] drm: define NVIDIA DRM format modifiers for GB20x
The layout of bits within the individual tiles (referred to as sectors in the DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D() macro) changed for some formats starting in Blackwell 2 GPUs. To denote the difference, extend the sector field in the parametric format modifier definition used to generate modifier values for NVIDIA hardware. Without this change, it would be impossible to differentiate the two layouts based on modifiers, and as a result software could attempt to share surfaces directly between pre-GB20x and GB20x cards, resulting in corruption when the surface was accessed on one of the GPUs after being populated with content by the other. Of note: This change causes the DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D() macro to evaluate its "s" parameter twice, with the side effects that entails. I surveyed all usage of the modifier in the kernel and Mesa code, and that does not appear to be problematic in any current usage, but I thought it was worth calling out. Signed-off-by: James Jones <jajones at nvidia.com> --- include/uapi/drm/drm_fourcc.h | 46 +++++++++++++++++++++-------------- 1 file changed, 28 insertions(+), 18 deletions(-) diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h index 052e5fdd1d3b..348b2f1c1cb7 100644 --- a/include/uapi/drm/drm_fourcc.h +++ b/include/uapi/drm/drm_fourcc.h @@ -909,8 +909,10 @@ extern "C" { #define __fourcc_mod_nvidia_pkind_shift 12 #define __fourcc_mod_nvidia_kgen_mask 0x3ULL #define __fourcc_mod_nvidia_kgen_shift 20 -#define __fourcc_mod_nvidia_slayout_mask 0x1ULL -#define __fourcc_mod_nvidia_slayout_shift 22 +#define __fourcc_mod_nvidia_slayout_low_mask 0x1ULL +#define __fourcc_mod_nvidia_slayout_low_shift 22 +#define __fourcc_mod_nvidia_slayout_high_mask 0x2ULL +#define __fourcc_mod_nvidia_slayout_high_shift 25 #define __fourcc_mod_nvidia_comp_mask 0x7ULL #define __fourcc_mod_nvidia_comp_shift 23 @@ -973,14 +975,16 @@ extern "C" { * 2 = Gob Height 8, Turing+ Page Kind mapping * 3 = Reserved for future use. * - * 22:22 s Sector layout. On Tegra GPUs prior to Xavier, there is a further - * bit remapping step that occurs at an even lower level than the - * page kind and block linear swizzles. This causes the layout of - * surfaces mapped in those SOC's GPUs to be incompatible with the - * equivalent mapping on other GPUs in the same system. + * 22:22 s Sector layout. There is a further bit remapping step that occurs + * 26:26 at an even lower level than the page kind and block linear + * swizzles. This causes the bit arrangement of surfaces in memory + * to differ subtly, and prevents direct sharing of surfaces between + * GPUs with different layouts. * - * 0 = Tegra K1 - Tegra Parker/TX2 Layout. - * 1 = Desktop GPU and Tegra Xavier+ Layout + * 0 = Tegra K1 - Tegra Parker/TX2 Layout + * 1 = Pre-GB20x, Tegra Xavier-Orin, GB10 Layout + * 2 = GB20x(Blackwell 2)+ Layout for some pixel/texel sizes + * 3 = reserved for future use. * * 25:23 c Lossless Framebuffer Compression type. * @@ -995,7 +999,7 @@ extern "C" { * 6 = Reserved for future use * 7 = Reserved for future use * - * 55:26 - Reserved for future use. Must be zero. + * 55:27 - Reserved for future use. Must be zero. */ #define DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D(c, s, g, k, h) \ fourcc_mod_code(NVIDIA, \ @@ -1006,8 +1010,10 @@ extern "C" { __fourcc_mod_nvidia_pkind_shift) | \ (((g) & __fourcc_mod_nvidia_kgen_mask) << \ __fourcc_mod_nvidia_kgen_shift) | \ - (((s) & __fourcc_mod_nvidia_slayout_mask) << \ - __fourcc_mod_nvidia_slayout_shift) | \ + (((s) & __fourcc_mod_nvidia_slayout_low_mask) << \ + __fourcc_mod_nvidia_slayout_low_shift) | \ + (((s) & __fourcc_mod_nvidia_slayout_high_mask) << \ + __fourcc_mod_nvidia_slayout_high_shift) | \ (((c) & __fourcc_mod_nvidia_comp_mask) << \ __fourcc_mod_nvidia_comp_shift))) @@ -1037,12 +1043,6 @@ __DRM_FOURCC_MKNVHELPER_FUNC(pkind) */ __DRM_FOURCC_MKNVHELPER_FUNC(kgen) -/* - * Get the sector layout specified by mod: - * static inline __u64 drm_fourcc_nvidia_format_mod_slayout(__u64 mod) - */ -__DRM_FOURCC_MKNVHELPER_FUNC(slayout) - /* * Get the lossless framebuffer compression specified by mod: * static inline __u64 drm_fourcc_nvidia_format_mod_kgen(__u64 mod) @@ -1051,6 +1051,16 @@ __DRM_FOURCC_MKNVHELPER_FUNC(comp) #undef __DRM_FOURCC_MKNVHELPER_FUNC +/* Get the sector layout specified by mod: */ +static inline __u64 +drm_fourcc_nvidia_format_mod_slayout(__u64 mod) +{ + return ((mod >> __fourcc_mod_nvidia_slayout_low_shift) & + __fourcc_mod_nvidia_slayout_low_mask) | + ((mod >> __fourcc_mod_nvidia_slayout_high_shift) & + __fourcc_mod_nvidia_slayout_high_mask); +} + /* To grandfather in prior block linear format modifiers to the above layout, * the page kind "0", which corresponds to "pitch/linear" and hence is unusable * with block-linear layouts, is remapped within drivers to the value 0xfe, -- 2.49.0
Faith Ekstrand
2025-Jul-04 14:45 UTC
[RFC 0/4] Add Format Modifiers for NVIDIA Blackwell chipsets
On Thu, Jul 3, 2025 at 6:34?PM James Jones <jajones at nvidia.com> wrote:> The layout of bits within the individual tiles (referred to as > sectors in the DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D() macro) > changed for some formats starting in Blackwell 2 GPUs. New format > modifiers are needed to denote the difference and prevent direct > sharing of these incompatible buffers with older GPUs. > > This patch series proposes first adding some helper macros and > inline functions to drm_fourcc.h to make the NVIDIA block-linear > format modifiers easier to work with given the proposed solution > makes them harder to parse, then extending the existing sector type > field in the parametric format modifier macro > DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D() by another bit to > accommodate the new layout type. > > There are a few ways the parameteric format modifier definition > could have been altered to handle the new layouts: > > -The GOB Height and Page Kind field has a reserved value that could > have been used. However, the GOB height and page kind enums did > not change relative to prior chips, so this is sort of a lie. > However, this is the least-invasive change. > > -An entirely new field could have been added. This seems > inappropriate given the presence of an existing appropriate field. > The advantage here is it avoids splitting the sector layout field > across two bitfields. > > The proposed approach is the logically consistent one, but has the > downside of being the most complex, and that it causes the > DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D() macro to evaluate its > 's' parameter twice. However, I believe the added helper functions > and macros address the complexity, and I have audited the relevant > code and do not believe the double evaluation should cause any > problems in practice. >I think we'll converge pretty quickly on the last patch. I'm less sure about the first 3. While I like the idea of having static inlines for modifiers that are shared by everybody, we can't actually use them from NVK because our image layout code is in rust and bindgen can't generate bindings for inlines so we're going to end up re-typing that all anyway. Also, I'm not seeing a patch to fix KMS to advertise the correct modifiers. Were you planning to type that or should I ask Lyude or Ben? ~Faith> James Jones (4): > drm: macros defining fields of NVIDIA modifiers > drm: add inline helper funcs for NVIDIA modifiers > drm/nouveau: use format modifier helper funcs > drm: define NVIDIA DRM format modifiers for GB20x > > drivers/gpu/drm/nouveau/nouveau_display.c | 12 ++- > include/uapi/drm/drm_fourcc.h | 100 ++++++++++++++++++---- > 2 files changed, 92 insertions(+), 20 deletions(-) > > -- > 2.49.0 > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <https://lists.freedesktop.org/archives/nouveau/attachments/20250704/559af368/attachment.htm>