James Jones
2025-Aug-11 22:00 UTC
[PATCH 0/3] drm/nouveau: Advertise correct modifiers on GB20x
This series adds new format modifiers for 8 and 16-bit formats on GB20x GPUs, preventing them from mistakenly sharing block-linear surfaces using these formats with prior GPUs that use a different layout. There are a few ways the parameteric format modifier definition could have been altered to handle the new layouts: -The GOB Height and Page Kind field has a reserved value that could have been used. However, the GOB height and page kind enums did not change relative to prior chips, so this is sort of a lie. However, this is the least-invasive change. -An entirely new field could have been added. This seems inappropriate given the presence of an existing appropriate field. The advantage here is it avoids splitting the sector layout field across two bitfields. The chosen approach is the logically consistent one, but has the downside of being the most complex, and that it causes the DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D() macro to evaluate its 's' parameter twice. However, utilizing simple helper functions in client code when accessing the parameteric format modifier fields easily addresses the complexity, and I have audited the relevant code and do not believe the double evaluation should cause any problems in practice. Tested on GB20x and TU10x cards using the following: -kmscube w/NVK+Zink built with these patches applied: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36336 with various manually specified formats and both manually specified and automatically selected modifiers. -drmfmtmods, a tiny test program that lists modifiers: https://github.com/cubanismo/drmfmtmods Changes since the RFC version here: https://lore.kernel.org/nouveau/20250703223658.1457-1-jajones at nvidia.com/ -Dropped the helper macros & static inlines in drm_fourcc.h as requested by Faith Ekstrand, who noted these aren't helpful for UMD code, which is all written in rust now. I may re- introduce some of these in a subsequent series, but we both agreed we do not want to delay progress on the modifiers themselves while we debate the details of those cometic details. -Reserved an extra bit for future sector layouts. -Fixed handling of linear modifiers on GB20x and NV5x/G8x/G9x/GT2xx chips. James Jones (3): drm: define NVIDIA DRM format modifiers for GB20x drm/nouveau/disp: Always accept linear modifier drm/nouveau: Advertise correct modifiers on GB20x drivers/gpu/drm/nouveau/dispnv50/disp.c | 3 ++ drivers/gpu/drm/nouveau/dispnv50/disp.h | 1 + drivers/gpu/drm/nouveau/dispnv50/wndw.c | 25 ++++++++++++++-- drivers/gpu/drm/nouveau/dispnv50/wndwca7e.c | 33 +++++++++++++++++++++ include/uapi/drm/drm_fourcc.h | 25 ++++++++++------ 5 files changed, 76 insertions(+), 11 deletions(-) -- 2.50.1
James Jones
2025-Aug-11 22:00 UTC
[PATCH 1/3] drm: define NVIDIA DRM format modifiers for GB20x
The layout of bits within the individual tiles (referred to as sectors in the DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D() macro) changed for 8 and 16-bit surfaces starting in Blackwell 2 GPUs (With the exception of GB10). To denote the difference, extend the sector field in the parametric format modifier definition used to generate modifier values for NVIDIA hardware. Without this change, it would be impossible to differentiate the two layouts based on modifiers, and as a result software could attempt to share surfaces directly between pre-GB20x and GB20x cards, resulting in corruption when the surface was accessed on one of the GPUs after being populated with content by the other. Of note: This change causes the DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D() macro to evaluate its "s" parameter twice, with the side effects that entails. I surveyed all usage of the modifier in the kernel and Mesa code, and that does not appear to be problematic in any current usage, but I thought it was worth calling out. Signed-off-by: James Jones <jajones at nvidia.com> --- include/uapi/drm/drm_fourcc.h | 25 ++++++++++++++++--------- 1 file changed, 16 insertions(+), 9 deletions(-) diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h index ea91aa8afde9..e527b24bd824 100644 --- a/include/uapi/drm/drm_fourcc.h +++ b/include/uapi/drm/drm_fourcc.h @@ -979,14 +979,20 @@ extern "C" { * 2 = Gob Height 8, Turing+ Page Kind mapping * 3 = Reserved for future use. * - * 22:22 s Sector layout. On Tegra GPUs prior to Xavier, there is a further - * bit remapping step that occurs at an even lower level than the - * page kind and block linear swizzles. This causes the layout of - * surfaces mapped in those SOC's GPUs to be incompatible with the - * equivalent mapping on other GPUs in the same system. - * - * 0 = Tegra K1 - Tegra Parker/TX2 Layout. - * 1 = Desktop GPU and Tegra Xavier+ Layout + * 22:22 s Sector layout. There is a further bit remapping step that occurs + * 26:27 at an even lower level than the page kind and block linear + * swizzles. This causes the bit arrangement of surfaces in memory + * to differ subtly, and prevents direct sharing of surfaces between + * GPUs with different layouts. + * + * 0 = Tegra K1 - Tegra Parker/TX2 Layout + * 1 = Pre-GB20x, GB20x 32+ bpp, GB10, Tegra Xavier-Orin Layout + * 2 = GB20x(Blackwell 2)+ 8 bpp surface layout + * 3 = GB20x(Blackwell 2)+ 16 bpp surface layout + * 4 = Reserved for future use. + * 5 = Reserved for future use. + * 6 = Reserved for future use. + * 7 = Reserved for future use. * * 25:23 c Lossless Framebuffer Compression type. * @@ -1001,7 +1007,7 @@ extern "C" { * 6 = Reserved for future use * 7 = Reserved for future use * - * 55:25 - Reserved for future use. Must be zero. + * 55:28 - Reserved for future use. Must be zero. */ #define DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D(c, s, g, k, h) \ fourcc_mod_code(NVIDIA, (0x10 | \ @@ -1009,6 +1015,7 @@ extern "C" { (((k) & 0xff) << 12) | \ (((g) & 0x3) << 20) | \ (((s) & 0x1) << 22) | \ + (((s) & 0x6) << 25) | \ (((c) & 0x7) << 23))) /* To grandfather in prior block linear format modifiers to the above layout, -- 2.50.1
James Jones
2025-Aug-11 22:00 UTC
[PATCH 2/3] drm/nouveau/disp: Always accept linear modifier
On some chipsets, which block-linear modifiers are supported is format-specific. However, linear modifiers are always be supported. The prior modifier filtering logic was not accounting for the linear case. Fixes: c586f30bf74c ("drm/nouveau/kms: Add format mod prop to base/ovly/nvdisp") Signed-off-by: James Jones <jajones at nvidia.com> --- drivers/gpu/drm/nouveau/dispnv50/wndw.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/nouveau/dispnv50/wndw.c b/drivers/gpu/drm/nouveau/dispnv50/wndw.c index 11d5b923d6e7..e2c55f4b9c5a 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/wndw.c +++ b/drivers/gpu/drm/nouveau/dispnv50/wndw.c @@ -795,6 +795,10 @@ static bool nv50_plane_format_mod_supported(struct drm_plane *plane, struct nouveau_drm *drm = nouveau_drm(plane->dev); uint8_t i; + /* All chipsets can display all formats in linear layout */ + if (modifier == DRM_FORMAT_MOD_LINEAR) + return true; + if (drm->client.device.info.chipset < 0xc0) { const struct drm_format_info *info = drm_format_info(format); const uint8_t kind = (modifier >> 12) & 0xff; -- 2.50.1
James Jones
2025-Aug-11 22:00 UTC
[PATCH 3/3] drm/nouveau: Advertise correct modifiers on GB20x
8 and 16 bit formats use a different layout on GB20x than they did on prior chips. Add the corresponding DRM format modifiers to the list of modifiers supported by the display engine on such chips, and filter the supported modifiers for each format based on its bytes per pixel in nv50_plane_format_mod_supported(). Note this logic will need to be updated when GB10 support is added, since it is a GB20x chip that uses the pre-GB20x sector layout for all formats. Signed-off-by: James Jones <jajones at nvidia.com> --- drivers/gpu/drm/nouveau/dispnv50/disp.c | 4 ++- drivers/gpu/drm/nouveau/dispnv50/disp.h | 1 + drivers/gpu/drm/nouveau/dispnv50/wndw.c | 24 +++++++++++++-- drivers/gpu/drm/nouveau/dispnv50/wndwca7e.c | 33 +++++++++++++++++++++ 4 files changed, 59 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c b/drivers/gpu/drm/nouveau/dispnv50/disp.c index e97e39abf3a2..12b1dba8e05d 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/disp.c +++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c @@ -2867,7 +2867,9 @@ nv50_display_create(struct drm_device *dev) } /* Assign the correct format modifiers */ - if (disp->disp->object.oclass >= TU102_DISP) + if (disp->disp->object.oclass >= GB202_DISP) + nouveau_display(dev)->format_modifiers = wndwca7e_modifiers; + else if (disp->disp->object.oclass >= TU102_DISP) nouveau_display(dev)->format_modifiers = wndwc57e_modifiers; else if (drm->client.device.info.family >= NV_DEVICE_INFO_V0_FERMI) diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.h b/drivers/gpu/drm/nouveau/dispnv50/disp.h index 15f9242b72ac..5d998f0319dc 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/disp.h +++ b/drivers/gpu/drm/nouveau/dispnv50/disp.h @@ -104,4 +104,5 @@ struct nouveau_encoder *nv50_real_outp(struct drm_encoder *encoder); extern const u64 disp50xx_modifiers[]; extern const u64 disp90xx_modifiers[]; extern const u64 wndwc57e_modifiers[]; +extern const u64 wndwca7e_modifiers[]; #endif diff --git a/drivers/gpu/drm/nouveau/dispnv50/wndw.c b/drivers/gpu/drm/nouveau/dispnv50/wndw.c index e2c55f4b9c5a..ef9e410babbf 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/wndw.c +++ b/drivers/gpu/drm/nouveau/dispnv50/wndw.c @@ -786,13 +786,14 @@ nv50_wndw_destroy(struct drm_plane *plane) } /* This function assumes the format has already been validated against the plane - * and the modifier was validated against the device-wides modifier list at FB + * and the modifier was validated against the device-wide modifier list at FB * creation time. */ static bool nv50_plane_format_mod_supported(struct drm_plane *plane, u32 format, u64 modifier) { struct nouveau_drm *drm = nouveau_drm(plane->dev); + const struct drm_format_info *info = drm_format_info(format); uint8_t i; /* All chipsets can display all formats in linear layout */ @@ -800,13 +801,32 @@ static bool nv50_plane_format_mod_supported(struct drm_plane *plane, return true; if (drm->client.device.info.chipset < 0xc0) { - const struct drm_format_info *info = drm_format_info(format); const uint8_t kind = (modifier >> 12) & 0xff; if (!format) return false; for (i = 0; i < info->num_planes; i++) if ((info->cpp[i] != 4) && kind != 0x70) return false; + } else if (drm->client.device.info.chipset >= 0x1b2) { + const uint8_t slayout = ((modifier >> 22) & 0x1) | + ((modifier >> 25) & 0x6); + + if (!format) + return false; + + /* + * Note in practice this implies only formats where cpp is equal + * for each plane, or >= 4 for all planes, are supported. + */ + for (i = 0; i < info->num_planes; i++) { + if (((info->cpp[i] == 2) && slayout != 3) || + ((info->cpp[i] == 1) && slayout != 2) || + ((info->cpp[i] >= 4) && slayout != 1)) + return false; + + /* 24-bit not supported. It has yet another layout */ + WARN_ON(info->cpp[i] == 3); + } } return true; diff --git a/drivers/gpu/drm/nouveau/dispnv50/wndwca7e.c b/drivers/gpu/drm/nouveau/dispnv50/wndwca7e.c index 0d8e9a9d1a57..2cec8cfbd546 100644 --- a/drivers/gpu/drm/nouveau/dispnv50/wndwca7e.c +++ b/drivers/gpu/drm/nouveau/dispnv50/wndwca7e.c @@ -179,6 +179,39 @@ wndwca7e_ntfy_set(struct nv50_wndw *wndw, struct nv50_wndw_atom *asyw) return 0; } +/**************************************************************** + * Log2(block height) ----------------------------+ * + * Page Kind ----------------------------------+ | * + * Gob Height/Page Kind Generation ------+ | | * + * Sector layout -------+ | | | * + * Compression ------+ | | | | */ +const u64 wndwca7e_modifiers[] = { /* | | | | | */ + /* 4cpp+ modifiers */ + DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D(0, 1, 2, 0x06, 0), + DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D(0, 1, 2, 0x06, 1), + DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D(0, 1, 2, 0x06, 2), + DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D(0, 1, 2, 0x06, 3), + DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D(0, 1, 2, 0x06, 4), + DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D(0, 1, 2, 0x06, 5), + /* 1cpp/8bpp modifiers */ + DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D(0, 2, 2, 0x06, 0), + DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D(0, 2, 2, 0x06, 1), + DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D(0, 2, 2, 0x06, 2), + DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D(0, 2, 2, 0x06, 3), + DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D(0, 2, 2, 0x06, 4), + DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D(0, 2, 2, 0x06, 5), + /* 2cpp/16bpp modifiers */ + DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D(0, 3, 2, 0x06, 0), + DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D(0, 3, 2, 0x06, 1), + DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D(0, 3, 2, 0x06, 2), + DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D(0, 3, 2, 0x06, 3), + DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D(0, 3, 2, 0x06, 4), + DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D(0, 3, 2, 0x06, 5), + /* All formats support linear */ + DRM_FORMAT_MOD_LINEAR, + DRM_FORMAT_MOD_INVALID +}; + static const struct nv50_wndw_func wndwca7e = { .acquire = wndwc37e_acquire, -- 2.50.1
Faith Ekstrand
2025-Aug-22 20:44 UTC
[PATCH 0/3] drm/nouveau: Advertise correct modifiers on GB20x
On Mon, Aug 11, 2025 at 5:57?PM James Jones <jajones at nvidia.com> wrote:> > This series adds new format modifiers for 8 and 16-bit formats on GB20x > GPUs, preventing them from mistakenly sharing block-linear surfaces > using these formats with prior GPUs that use a different layout. > > There are a few ways the parameteric format modifier definition > could have been altered to handle the new layouts: > > -The GOB Height and Page Kind field has a reserved value that could > have been used. However, the GOB height and page kind enums did > not change relative to prior chips, so this is sort of a lie. > However, this is the least-invasive change. > > -An entirely new field could have been added. This seems > inappropriate given the presence of an existing appropriate field. > The advantage here is it avoids splitting the sector layout field > across two bitfields. > > The chosen approach is the logically consistent one, but has the > downside of being the most complex, and that it causes the > DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D() macro to evaluate its 's' > parameter twice. However, utilizing simple helper functions in > client code when accessing the parameteric format modifier fields > easily addresses the complexity, and I have audited the relevant code > and do not believe the double evaluation should cause any problems in > practice. > > Tested on GB20x and TU10x cards using the following: > > -kmscube w/NVK+Zink built with these patches applied: > > https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36336Both the Mesa and kernel pieces are now reviewed so I think we're good to start landing. I've independently tested with kmscube to verify. ~Faith> > with various manually specified formats > and both manually specified and automatically > selected modifiers. > > -drmfmtmods, a tiny test program that lists modifiers: > > https://github.com/cubanismo/drmfmtmods > > Changes since the RFC version here: > > https://lore.kernel.org/nouveau/20250703223658.1457-1-jajones at nvidia.com/ > > -Dropped the helper macros & static inlines in > drm_fourcc.h as requested by Faith Ekstrand, > who noted these aren't helpful for UMD code, > which is all written in rust now. I may re- > introduce some of these in a subsequent series, > but we both agreed we do not want to delay > progress on the modifiers themselves while we > debate the details of those cometic details. > > -Reserved an extra bit for future sector > layouts. > > -Fixed handling of linear modifiers on GB20x > and NV5x/G8x/G9x/GT2xx chips. > > James Jones (3): > drm: define NVIDIA DRM format modifiers for GB20x > drm/nouveau/disp: Always accept linear modifier > drm/nouveau: Advertise correct modifiers on GB20x > > drivers/gpu/drm/nouveau/dispnv50/disp.c | 3 ++ > drivers/gpu/drm/nouveau/dispnv50/disp.h | 1 + > drivers/gpu/drm/nouveau/dispnv50/wndw.c | 25 ++++++++++++++-- > drivers/gpu/drm/nouveau/dispnv50/wndwca7e.c | 33 +++++++++++++++++++++ > include/uapi/drm/drm_fourcc.h | 25 ++++++++++------ > 5 files changed, 76 insertions(+), 11 deletions(-) > > -- > 2.50.1 >