Zhi Wang
2024-Nov-22 12:57 UTC
[RFC 0/8] drm/nouveau: scrubber ucode image support for vGPU
Hi folks:

Supporting multiple vGPUs is one of the goals of the next version of the
RFC of NVIDIA vGPU support. Requesting a larger GSP heap size is the
first step. However, the pre-scrubbed FB memory size on Ada is 256MB.
Thus, using a GSP heap larger than 256MB requires an extra scrubber ucode
image to scrub the FB memory before any other ucode images are executed.
The scrubber ucode image support is therefore a pre-condition for
supporting the max vGPUs.

I would like to start this RFC for discussion and collect people's
feedback before the next RFC of NVIDIA vGPU support. Besides, a kernel
doc is attached to explain the story.

This series should also address the comment [1] from Jason in the
RFCv1 [2]. The series can also be found in a repo [3].

Tested on the vGPU RFCv1 repo [2] and [3] by running Heaven for 3 hrs and
Vulkan CTS without any problem.

PATCH 1 - 2: Factor out some common routines for all the SKUs.
PATCH 3: Load the scrubber ucode image when WPR2 heap size > 256MB.
PATCH 4: Execute the scrubber ucode image when the scrubber firmware is
         loaded.
PATCH 5 - 6: Set the WPR2 heap size to 576MB when vGPU (SRIOV) is
             supported.
PATCH 7: Set the max supported vGPU count when SRIOV is supported.
PATCH 8: Introduce a kernel doc.

Generating the scrubber ucode image
===================================

The following patch is required before generating the scrubber ucode
image via open-gpu-kernel-modules [4]:

diff --git a/nouveau/extract-firmware-nouveau.py b/nouveau/extract-firmware-nouveau.py
index 837edc8d..6268934c 100755
--- a/nouveau/extract-firmware-nouveau.py
+++ b/nouveau/extract-firmware-nouveau.py
@@ -335,7 +335,7 @@ def main():
     booter("ad102", "load", 384)
     booter("ad102", "unload", 384)
     bootloader("ad102", "_prod_")
-    # scrubber("ad102", 384) # Not currently used by Nouveau
+    scrubber("ad102", 384) # Not currently used by Nouveau

 if __name__ == "__main__":
     main()

Once the script is patched, it will generate the scrubber ucode image
binary.

[1] https://lore.kernel.org/all/20241015163556.GN3394334 at nvidia.com/
[2] https://lore.kernel.org/all/20240922124951.1946072-1-zhiw at nvidia.com/
[3] https://github.com/zhiwang-nvidia/linux/tree/zhi/scrubber-support
[4] https://github.com/NVIDIA/open-gpu-kernel-modules/tree/535

Zhi Wang (8):
  drm/nouveau: factor out nvkm_gsp_init_fw_heap()
  drm/nouveau: introduce tu102_gsp_init_fw_heap()
  drm/nouveau: load scrubber ucode image when WPR2 heap size > 256MB
  drm/nouveau: scrub the FB memory when scrubber firmware is loaded
  drm/nouveau: support WPR2 heap size override
  drm/nouveau: override the WPR2 heap size when SRIOV is supported on Ada
  drm/nouveau: set max supported vGPU count when SRIOV is supported
  drm/nouveau: introduce the scrubber on Ada in a kernel doc

 .../gpu/drm/nouveau/include/nvkm/subdev/gsp.h |  4 +-
 .../gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c   | 81 ++++++++++++++++++
 .../gpu/drm/nouveau/nvkm/subdev/gsp/ga100.c   |  1 +
 .../gpu/drm/nouveau/nvkm/subdev/gsp/ga102.c   |  1 +
 .../gpu/drm/nouveau/nvkm/subdev/gsp/priv.h    |  8 +-
 .../gpu/drm/nouveau/nvkm/subdev/gsp/r535.c    | 85 ++++++++++++-------
 .../gpu/drm/nouveau/nvkm/subdev/gsp/tu102.c   |  9 ++
 .../gpu/drm/nouveau/nvkm/subdev/gsp/tu116.c   |  1 +
 8 files changed, 157 insertions(+), 33 deletions(-)

-- 
2.34.1
Zhi Wang
[RFC 1/8] drm/nouveau: factor out nvkm_gsp_init_fw_heap()
To support per-SKU GSP WPR2 heap initialization, first factor out the
common routine shared by all the SKUs.

Factor out nvkm_gsp_init_fw_heap(). Adjust some indentation to make
checkpatch.pl happy.

No functional change is intended.

Cc: Milos Tijanic <mtijanic at nvidia.com>
Signed-off-by: Zhi Wang <zhiw at nvidia.com>
---
 .../gpu/drm/nouveau/nvkm/subdev/gsp/priv.h |  1 +
 .../gpu/drm/nouveau/nvkm/subdev/gsp/r535.c | 69 +++++++++++--------
 2 files changed, 40 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/priv.h b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/priv.h
index 9f4a62375a27..579d83048164 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/priv.h
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/priv.h
@@ -58,6 +58,7 @@ int ga102_gsp_booter_ctor(struct nvkm_gsp *, const char *, const struct firmware
 int ga102_gsp_reset(struct nvkm_gsp *);

 void r535_gsp_dtor(struct nvkm_gsp *);
+void nvkm_gsp_init_fw_heap(struct nvkm_gsp *gsp);
 int r535_gsp_oneinit(struct nvkm_gsp *);
 int r535_gsp_init(struct nvkm_gsp *);
 int r535_gsp_fini(struct nvkm_gsp *, bool suspend);
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
index cf58f9da9139..6f2319845322 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
@@ -2517,6 +2517,44 @@ r535_gsp_dtor(struct nvkm_gsp *gsp)
 	nvkm_gsp_mem_dtor(gsp, &gsp->logrm);
 }

+void nvkm_gsp_init_fw_heap(struct nvkm_gsp *gsp)
+{
+	/* Calculate FB layout. */
+	gsp->fb.wpr2.frts.size = 0x100000;
+	gsp->fb.wpr2.frts.addr = ALIGN_DOWN(gsp->fb.bios.addr, 0x20000) - gsp->fb.wpr2.frts.size;
+
+	gsp->fb.wpr2.boot.size = gsp->boot.fw.size;
+	gsp->fb.wpr2.boot.addr = ALIGN_DOWN(gsp->fb.wpr2.frts.addr - gsp->fb.wpr2.boot.size,
+					    0x1000);
+
+	gsp->fb.wpr2.elf.size = gsp->fw.len;
+	gsp->fb.wpr2.elf.addr = ALIGN_DOWN(gsp->fb.wpr2.boot.addr - gsp->fb.wpr2.elf.size,
+					   0x10000);
+
+	{
+		u32 fb_size_gb = DIV_ROUND_UP_ULL(gsp->fb.size, 1 << 30);
+
+		gsp->fb.wpr2.heap.size =
+			gsp->func->wpr_heap.os_carveout_size +
+			gsp->func->wpr_heap.base_size +
+			ALIGN(GSP_FW_HEAP_PARAM_SIZE_PER_GB_FB * fb_size_gb, 1 << 20) +
+			ALIGN(GSP_FW_HEAP_PARAM_CLIENT_ALLOC_SIZE, 1 << 20);
+
+		gsp->fb.wpr2.heap.size = max(gsp->fb.wpr2.heap.size, gsp->func->wpr_heap.min_size);
+	}
+
+	gsp->fb.wpr2.heap.addr = ALIGN_DOWN(gsp->fb.wpr2.elf.addr - gsp->fb.wpr2.heap.size,
+					    0x100000);
+	gsp->fb.wpr2.heap.size = ALIGN_DOWN(gsp->fb.wpr2.elf.addr - gsp->fb.wpr2.heap.addr,
+					    0x100000);
+
+	gsp->fb.wpr2.addr = ALIGN_DOWN(gsp->fb.wpr2.heap.addr - sizeof(GspFwWprMeta), 0x100000);
+	gsp->fb.wpr2.size = gsp->fb.wpr2.frts.addr + gsp->fb.wpr2.frts.size - gsp->fb.wpr2.addr;
+
+	gsp->fb.heap.size = 0x100000;
+	gsp->fb.heap.addr = gsp->fb.wpr2.addr - gsp->fb.heap.size;
+}
+
 int
 r535_gsp_oneinit(struct nvkm_gsp *gsp)
 {
@@ -2581,36 +2619,7 @@ r535_gsp_oneinit(struct nvkm_gsp *gsp)
 	/* Release FW images - we've copied them to DMA buffers now. */
 	r535_gsp_dtor_fws(gsp);

-	/* Calculate FB layout. */
-	gsp->fb.wpr2.frts.size = 0x100000;
-	gsp->fb.wpr2.frts.addr = ALIGN_DOWN(gsp->fb.bios.addr, 0x20000) - gsp->fb.wpr2.frts.size;
-
-	gsp->fb.wpr2.boot.size = gsp->boot.fw.size;
-	gsp->fb.wpr2.boot.addr = ALIGN_DOWN(gsp->fb.wpr2.frts.addr - gsp->fb.wpr2.boot.size, 0x1000);
-
-	gsp->fb.wpr2.elf.size = gsp->fw.len;
-	gsp->fb.wpr2.elf.addr = ALIGN_DOWN(gsp->fb.wpr2.boot.addr - gsp->fb.wpr2.elf.size, 0x10000);
-
-	{
-		u32 fb_size_gb = DIV_ROUND_UP_ULL(gsp->fb.size, 1 << 30);
-
-		gsp->fb.wpr2.heap.size =
-			gsp->func->wpr_heap.os_carveout_size +
-			gsp->func->wpr_heap.base_size +
-			ALIGN(GSP_FW_HEAP_PARAM_SIZE_PER_GB_FB * fb_size_gb, 1 << 20) +
-			ALIGN(GSP_FW_HEAP_PARAM_CLIENT_ALLOC_SIZE, 1 << 20);
-
-		gsp->fb.wpr2.heap.size = max(gsp->fb.wpr2.heap.size, gsp->func->wpr_heap.min_size);
-	}
-
-	gsp->fb.wpr2.heap.addr = ALIGN_DOWN(gsp->fb.wpr2.elf.addr - gsp->fb.wpr2.heap.size, 0x100000);
-	gsp->fb.wpr2.heap.size = ALIGN_DOWN(gsp->fb.wpr2.elf.addr - gsp->fb.wpr2.heap.addr, 0x100000);
-
-	gsp->fb.wpr2.addr = ALIGN_DOWN(gsp->fb.wpr2.heap.addr - sizeof(GspFwWprMeta), 0x100000);
-	gsp->fb.wpr2.size = gsp->fb.wpr2.frts.addr + gsp->fb.wpr2.frts.size - gsp->fb.wpr2.addr;
-
-	gsp->fb.heap.size = 0x100000;
-	gsp->fb.heap.addr = gsp->fb.wpr2.addr - gsp->fb.heap.size;
+	nvkm_gsp_init_fw_heap(gsp);

 	ret = nvkm_gsp_fwsec_frts(gsp);
 	if (WARN_ON(ret))
-- 
2.34.1
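
The FB layout that nvkm_gsp_init_fw_heap() computes stacks the FRTS, boot, ELF and heap regions downward from the VBIOS address. The following is a minimal standalone sketch of that arithmetic, not driver code. The struct and function names are made up for illustration, the firmware-derived constants (GSP_FW_HEAP_PARAM_* and sizeof(GspFwWprMeta)) are passed in as parameters, and every input value in example_layout() is a hypothetical placeholder rather than something read from hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Round x down/up to a power-of-two alignment a. */
static uint64_t align_down(uint64_t x, uint64_t a) { return x & ~(a - 1); }
static uint64_t align_up(uint64_t x, uint64_t a) { return (x + a - 1) & ~(a - 1); }

struct region { uint64_t addr, size; };

/* Mirrors gsp->fb.wpr2.* and gsp->fb.heap from the patch (names are ours). */
struct wpr2_layout {
	struct region frts, boot, elf, heap;
	uint64_t addr, size;        /* the whole WPR2 region */
	struct region nonwpr_heap;  /* gsp->fb.heap in the patch */
};

/* Caller-supplied inputs; the last three stand in for firmware constants. */
struct layout_inputs {
	uint64_t bios_addr, fb_size, boot_size, elf_size;
	uint64_t os_carveout_size, base_size, min_size;
	uint64_t per_gb_size, client_alloc_size, meta_size;
};

static struct wpr2_layout compute_layout(const struct layout_inputs *in)
{
	struct wpr2_layout l;
	uint64_t fb_size_gb = (in->fb_size + (1ULL << 30) - 1) >> 30;
	uint64_t heap;

	assert(in->bios_addr != 0);

	l.frts.size = 0x100000;
	l.frts.addr = align_down(in->bios_addr, 0x20000) - l.frts.size;

	l.boot.size = in->boot_size;
	l.boot.addr = align_down(l.frts.addr - l.boot.size, 0x1000);

	l.elf.size = in->elf_size;
	l.elf.addr = align_down(l.boot.addr - l.elf.size, 0x10000);

	/* Default heap size: fixed parts plus a per-GB-of-FB part, with a floor. */
	heap = in->os_carveout_size + in->base_size +
	       align_up(in->per_gb_size * fb_size_gb, 1ULL << 20) +
	       align_up(in->client_alloc_size, 1ULL << 20);
	if (heap < in->min_size)
		heap = in->min_size;

	/* Place the heap below the ELF, then recompute its size after alignment. */
	l.heap.addr = align_down(l.elf.addr - heap, 0x100000);
	l.heap.size = align_down(l.elf.addr - l.heap.addr, 0x100000);

	l.addr = align_down(l.heap.addr - in->meta_size, 0x100000);
	l.size = l.frts.addr + l.frts.size - l.addr;

	l.nonwpr_heap.size = 0x100000;
	l.nonwpr_heap.addr = l.addr - l.nonwpr_heap.size;
	return l;
}

/* Hypothetical 24GB board; all numbers below are illustrative only. */
static struct wpr2_layout example_layout(void)
{
	struct layout_inputs in = {
		.bios_addr = 0x5FFF00000ULL,
		.fb_size = 24ULL << 30,
		.boot_size = 0x10000,
		.elf_size = 32ULL << 20,
		.os_carveout_size = 20ULL << 20,
		.base_size = 8ULL << 20,
		.min_size = 84ULL << 20,
		.per_gb_size = 96ULL << 10,
		.client_alloc_size = 96ULL << 20,
		.meta_size = 256,
	};
	return compute_layout(&in);
}
```

With these placeholder inputs the default heap comes out well below 256MB, which is the case where no scrubber is needed.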
Zhi Wang
[RFC 2/8] drm/nouveau: introduce tu102_gsp_init_fw_heap()
To support per-SKU GSP WPR2 heap initialization, introduce
tu102_gsp_init_fw_heap() as the common function for the supported SKUs.

No functional change is intended.

Signed-off-by: Zhi Wang <zhiw at nvidia.com>
---
 drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c | 1 +
 drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ga100.c | 1 +
 drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ga102.c | 1 +
 drivers/gpu/drm/nouveau/nvkm/subdev/gsp/priv.h  | 2 ++
 drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c  | 4 +++-
 drivers/gpu/drm/nouveau/nvkm/subdev/gsp/tu102.c | 9 +++++++++
 drivers/gpu/drm/nouveau/nvkm/subdev/gsp/tu116.c | 1 +
 7 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c
index c849c6299c52..00a7ec875400 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c
@@ -31,6 +31,7 @@ ad102_gsp_r535_113_01 = {
 	.wpr_heap.os_carveout_size = 20 << 20,
 	.wpr_heap.base_size = 8 << 20,
 	.wpr_heap.min_size = 84 << 20,
+	.wpr_heap.init_fw_heap = tu102_gsp_init_fw_heap,

 	.booter.ctor = ga102_gsp_booter_ctor,

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ga100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ga100.c
index 223f68b532ef..e5423199232a 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ga100.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ga100.c
@@ -47,6 +47,7 @@ ga100_gsp_r535_113_01 = {

 	.wpr_heap.base_size = 8 << 20,
 	.wpr_heap.min_size = 64 << 20,
+	.wpr_heap.init_fw_heap = tu102_gsp_init_fw_heap,

 	.booter.ctor = tu102_gsp_booter_ctor,

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ga102.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ga102.c
index 4c4b4168a266..a79dcca873f0 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ga102.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ga102.c
@@ -159,6 +159,7 @@ ga102_gsp_r535_113_01 = {
 	.wpr_heap.os_carveout_size = 20 << 20,
 	.wpr_heap.base_size = 8 << 20,
 	.wpr_heap.min_size = 84 << 20,
+	.wpr_heap.init_fw_heap = tu102_gsp_init_fw_heap,

 	.booter.ctor = ga102_gsp_booter_ctor,

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/priv.h b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/priv.h
index 579d83048164..dfb41be3d677 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/priv.h
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/priv.h
@@ -28,6 +28,7 @@ struct nvkm_gsp_func {
 		u32 os_carveout_size;
 		u32 base_size;
 		u64 min_size;
+		int (*init_fw_heap)(struct nvkm_gsp *gsp);
 	} wpr_heap;

 	struct {
@@ -48,6 +49,7 @@ extern const struct nvkm_falcon_func tu102_gsp_flcn;
 extern const struct nvkm_falcon_fw_func tu102_gsp_fwsec;
 int tu102_gsp_booter_ctor(struct nvkm_gsp *, const char *, const struct firmware *,
 			  struct nvkm_falcon *, struct nvkm_falcon_fw *);
+int tu102_gsp_init_fw_heap(struct nvkm_gsp *gsp);
 int tu102_gsp_oneinit(struct nvkm_gsp *);
 int tu102_gsp_reset(struct nvkm_gsp *);

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
index 6f2319845322..c56c545f2bdb 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
@@ -2619,7 +2619,9 @@ r535_gsp_oneinit(struct nvkm_gsp *gsp)
 	/* Release FW images - we've copied them to DMA buffers now. */
 	r535_gsp_dtor_fws(gsp);

-	nvkm_gsp_init_fw_heap(gsp);
+	ret = gsp->func->wpr_heap.init_fw_heap(gsp);
+	if (WARN_ON(ret))
+		return ret;

 	ret = nvkm_gsp_fwsec_frts(gsp);
 	if (WARN_ON(ret))
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/tu102.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/tu102.c
index 59c5f2b9172a..e279a322704a 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/tu102.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/tu102.c
@@ -76,6 +76,14 @@ tu102_gsp_booter_ctor(struct nvkm_gsp *gsp, const char *name, const struct firmw
 	return ret;
 }

+int
+tu102_gsp_init_fw_heap(struct nvkm_gsp *gsp)
+{
+	nvkm_gsp_init_fw_heap(gsp);
+
+	return 0;
+}
+
 static int
 tu102_gsp_fwsec_load_bld(struct nvkm_falcon_fw *fw)
 {
@@ -171,6 +179,7 @@ tu102_gsp_r535_113_01 = {

 	.wpr_heap.base_size = 8 << 20,
 	.wpr_heap.min_size = 64 << 20,
+	.wpr_heap.init_fw_heap = tu102_gsp_init_fw_heap,

 	.booter.ctor = tu102_gsp_booter_ctor,

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/tu116.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/tu116.c
index 04fbd9ed28b1..daa954835ef9 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/tu116.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/tu116.c
@@ -30,6 +30,7 @@ tu116_gsp_r535_113_01 = {

 	.wpr_heap.base_size = 8 << 20,
 	.wpr_heap.min_size = 64 << 20,
+	.wpr_heap.init_fw_heap = tu102_gsp_init_fw_heap,

 	.booter.ctor = tu102_gsp_booter_ctor,

-- 
2.34.1
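
The per-SKU hook in this patch is a plain function-pointer dispatch: each SKU table supplies init_fw_heap, and r535_gsp_oneinit() calls it and propagates any error. The sketch below shows only that shape with simplified stand-in types. struct gsp, struct gsp_func and all function names here are hypothetical, and the failing hook exists purely to show error propagation.

```c
#include <assert.h>
#include <stddef.h>

struct gsp;

/* Per-SKU hooks, loosely modelled on the wpr_heap members of nvkm_gsp_func. */
struct gsp_func {
	int (*init_fw_heap)(struct gsp *gsp);
};

struct gsp {
	const struct gsp_func *func;
	int heap_ready; /* set by the common layout routine */
};

/* Stand-in for nvkm_gsp_init_fw_heap(): the shared layout code. */
static void common_init_fw_heap(struct gsp *gsp)
{
	gsp->heap_ready = 1;
}

/* Default hook: same shape as tu102_gsp_init_fw_heap() in the patch. */
static int default_init_fw_heap(struct gsp *gsp)
{
	common_init_fw_heap(gsp);
	return 0;
}

/* Hypothetical SKU hook that fails before the layout is computed. */
static int failing_init_fw_heap(struct gsp *gsp)
{
	(void)gsp;
	return -1;
}

/* Caller side: the hook is invoked unconditionally and errors are returned. */
static int oneinit(struct gsp *gsp)
{
	int ret;

	assert(gsp->func != NULL && gsp->func->init_fw_heap != NULL);

	ret = gsp->func->init_fw_heap(gsp);
	if (ret)
		return ret;
	return 0;
}

static const struct gsp_func default_func = { .init_fw_heap = default_init_fw_heap };
static const struct gsp_func failing_func = { .init_fw_heap = failing_init_fw_heap };

/* Returns 1 when init succeeded and the layout was computed. */
static int demo_default(void)
{
	struct gsp gsp = { .func = &default_func, .heap_ready = 0 };

	return oneinit(&gsp) == 0 && gsp.heap_ready;
}

/* Returns the error code propagated from the failing hook. */
static int demo_failing(void)
{
	struct gsp gsp = { .func = &failing_func, .heap_ready = 0 };

	return oneinit(&gsp);
}
```

Because the caller in the patch does not check the pointer for NULL, every SKU table reached through r535_gsp_oneinit() has to set the hook, which is why the patch touches the ad102, ga100, ga102, tu102 and tu116 tables.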
Zhi Wang
2024-Nov-22 12:57 UTC
[RFC 3/8] drm/nouveau: load scrubber ucode image when WPR2 heap size > 256MB
When WPR2 heap size > 256MB, the FB memory needs to be scrubbed before
use. If not, the GSP firmware hangs when booting.

Introduce ad102_gsp_init_fw_heap(). Load the scrubber ucode image when
WPR2 heap size > 256MB after the FB memory layout initialization.

Save the fwif in nvkm_gsp for firmware loading in
ad102_gsp_init_fw_heap().

Cc: Surath Mitra <smitra at nvidia.com>
Signed-off-by: Zhi Wang <zhiw at nvidia.com>
---
 .../gpu/drm/nouveau/include/nvkm/subdev/gsp.h |  3 ++-
 .../gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c   | 21 ++++++++++++++++++-
 .../gpu/drm/nouveau/nvkm/subdev/gsp/priv.h    |  4 +++-
 .../gpu/drm/nouveau/nvkm/subdev/gsp/r535.c    |  6 +++++-
 4 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h b/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h
index a2055f2a014a..c6fe2d9d47de 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h
@@ -33,7 +33,7 @@ struct nvkm_gsp {
 	struct nvkm_subdev subdev;

 	struct nvkm_falcon falcon;
-
+	const struct nvkm_gsp_fwif *fwif;
 	struct {
 		struct {
 			const struct firmware *load;
@@ -41,6 +41,7 @@ struct nvkm_gsp {
 		} booter;
 		const struct firmware *bl;
 		const struct firmware *rm;
+		const struct firmware *scrubber;
 	} fws;

 	struct nvkm_firmware fw;
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c
index 00a7ec875400..bd8bd37955fa 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c
@@ -21,6 +21,25 @@
  */
 #include "priv.h"

+static int
+ad102_gsp_init_fw_heap(struct nvkm_gsp *gsp)
+{
+	int ret;
+
+	nvkm_gsp_init_fw_heap(gsp);
+
+	if (gsp->fb.wpr2.heap.size <= SZ_256M)
+		return 0;
+
+	/* Load scrubber ucode image */
+	ret = r535_gsp_load_fw(gsp, "scrubber", gsp->fwif->ver,
+			       &gsp->fws.scrubber);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
 static const struct nvkm_gsp_func
 ad102_gsp_r535_113_01 = {
 	.flcn = &ga102_gsp_flcn,
@@ -31,7 +50,7 @@ ad102_gsp_r535_113_01 = {
 	.wpr_heap.os_carveout_size = 20 << 20,
 	.wpr_heap.base_size = 8 << 20,
 	.wpr_heap.min_size = 84 << 20,
-	.wpr_heap.init_fw_heap = tu102_gsp_init_fw_heap,
+	.wpr_heap.init_fw_heap = ad102_gsp_init_fw_heap,

 	.booter.ctor = ga102_gsp_booter_ctor,

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/priv.h b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/priv.h
index dfb41be3d677..a89ab7b22263 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/priv.h
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/priv.h
@@ -16,7 +16,9 @@ struct nvkm_gsp_fwif {
 };

 int gv100_gsp_nofw(struct nvkm_gsp *, int, const struct nvkm_gsp_fwif *);
-int r535_gsp_load(struct nvkm_gsp *, int, const struct nvkm_gsp_fwif *);
+int r535_gsp_load_fw(struct nvkm_gsp *gsp, const char *name,
+		     const char *ver, const struct firmware **pfw);
+int r535_gsp_load(struct nvkm_gsp *gsp, int ver, const struct nvkm_gsp_fwif *fwif);

 struct nvkm_gsp_func {
 	const struct nvkm_falcon_func *flcn;
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
index c56c545f2bdb..ef867eb20cff 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
@@ -2489,6 +2489,8 @@ r535_gsp_dtor_fws(struct nvkm_gsp *gsp)
 	gsp->fws.booter.load = NULL;
 	nvkm_firmware_put(gsp->fws.rm);
 	gsp->fws.rm = NULL;
+	nvkm_firmware_put(gsp->fws.scrubber);
+	gsp->fws.scrubber = NULL;
 }

 void
@@ -2656,7 +2658,7 @@ r535_gsp_oneinit(struct nvkm_gsp *gsp)
 	return 0;
 }

-static int
+int
 r535_gsp_load_fw(struct nvkm_gsp *gsp, const char *name, const char *ver,
 		 const struct firmware **pfw)
 {
@@ -2687,6 +2689,8 @@ r535_gsp_load(struct nvkm_gsp *gsp, int ver, const struct nvkm_gsp_fwif *fwif)
 		return ret;
 	}

+	gsp->fwif = fwif;
+
 	return 0;
 }

-- 
2.34.1
Zhi Wang
2024-Nov-22 12:57 UTC
[RFC 4/8] drm/nouveau: scrub the FB memory when scrubber firmware is loaded
When WPR2 heap size > 256MB, the FB memory needs to be scrubbed before
use. If not, the GSP firmware hangs when booting.

If the scrubber firmware is present, execute it to scrub the FB memory
before executing any other ucode images.

Signed-off-by: Zhi Wang <zhiw at nvidia.com>
---
 .../gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c | 35 +++++++++++++++++++
 .../gpu/drm/nouveau/nvkm/subdev/gsp/priv.h  |  1 +
 .../gpu/drm/nouveau/nvkm/subdev/gsp/r535.c  | 12 +++++--
 3 files changed, 45 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c
index bd8bd37955fa..596ccd758e66 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c
@@ -19,8 +19,42 @@
  * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
  * OTHER DEALINGS IN THE SOFTWARE.
  */
+
+#include <engine/sec2.h>
 #include "priv.h"

+static bool is_scrubber_completed(struct nvkm_gsp *gsp)
+{
+	return ((nvkm_rd32(gsp->subdev.device, 0x001180fc) >> 29) >= 0x3);
+}
+
+static int
+ad102_execute_scrubber(struct nvkm_gsp *gsp)
+{
+	struct nvkm_falcon_fw fw = {0};
+	struct nvkm_subdev *subdev = &gsp->subdev;
+	struct nvkm_device *device = subdev->device;
+	int ret;
+
+	if (!gsp->fws.scrubber || is_scrubber_completed(gsp))
+		return 0;
+
+	ret = gsp->func->booter.ctor(gsp, "scrubber", gsp->fws.scrubber,
+				     &device->sec2->falcon, &fw);
+	if (ret)
+		return ret;
+
+	ret = nvkm_falcon_fw_boot(&fw, subdev, true, NULL, NULL, 0, 0);
+	nvkm_falcon_fw_dtor(&fw);
+	if (ret)
+		return ret;
+
+	if (WARN_ON(!is_scrubber_completed(gsp)))
+		return -ENOSPC;
+
+	return 0;
+}
+
 static int
 ad102_gsp_init_fw_heap(struct nvkm_gsp *gsp)
 {
@@ -51,6 +85,7 @@ ad102_gsp_r535_113_01 = {
 	.wpr_heap.base_size = 8 << 20,
 	.wpr_heap.min_size = 84 << 20,
 	.wpr_heap.init_fw_heap = ad102_gsp_init_fw_heap,
+	.wpr_heap.execute_scrubber = ad102_execute_scrubber,

 	.booter.ctor = ga102_gsp_booter_ctor,

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/priv.h b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/priv.h
index a89ab7b22263..fe56ced9b369 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/priv.h
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/priv.h
@@ -31,6 +31,7 @@ struct nvkm_gsp_func {
 		u32 base_size;
 		u64 min_size;
 		int (*init_fw_heap)(struct nvkm_gsp *gsp);
+		int (*execute_scrubber)(struct nvkm_gsp *gsp);
 	} wpr_heap;

 	struct {
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
index ef867eb20cff..d5d6d0df863e 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
@@ -2618,13 +2618,19 @@ r535_gsp_oneinit(struct nvkm_gsp *gsp)
 	if (ret)
 		return ret;

-	/* Release FW images - we've copied them to DMA buffers now. */
-	r535_gsp_dtor_fws(gsp);
-
 	ret = gsp->func->wpr_heap.init_fw_heap(gsp);
 	if (WARN_ON(ret))
 		return ret;

+	if (gsp->func->wpr_heap.execute_scrubber) {
+		ret = gsp->func->wpr_heap.execute_scrubber(gsp);
+		if (ret)
+			return ret;
+	}
+
+	/* Release FW images - we've copied them to DMA buffers now. */
+	r535_gsp_dtor_fws(gsp);
+
 	ret = nvkm_gsp_fwsec_frts(gsp);
 	if (WARN_ON(ret))
 		return ret;
-- 
2.34.1
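
The completion test in is_scrubber_completed() looks only at the top three bits of the register at 0x001180fc and treats a field value of 3 or more as done. The series does not document what the individual field values mean, so the sketch below models only the comparison and the skip/run/verify control flow of ad102_execute_scrubber(). The register reads are replaced by plain parameters, and the enum and function names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Same comparison as is_scrubber_completed(), applied to a raw 32-bit value
 * instead of nvkm_rd32(device, 0x001180fc).
 */
static bool scrubber_completed(uint32_t reg)
{
	return (reg >> 29) >= 0x3;
}

enum scrub_result { SCRUB_SKIPPED, SCRUB_DONE, SCRUB_FAILED };

/*
 * Control flow of ad102_execute_scrubber() with hardware access abstracted:
 * 'before' and 'after' are the register values seen before and after the
 * scrubber image would have been booted.
 */
static enum scrub_result execute_scrubber(bool have_image, uint32_t before,
					  uint32_t after)
{
	/* Nothing to do if no image was loaded or scrubbing already finished. */
	if (!have_image || scrubber_completed(before))
		return SCRUB_SKIPPED;

	/* The real code constructs and boots the image on the SEC2 falcon here. */

	/* The patch returns -ENOSPC (with a WARN) in this case. */
	if (!scrubber_completed(after))
		return SCRUB_FAILED;

	return SCRUB_DONE;
}
```

For example, 0x60000000 has the field equal to 3 and counts as completed, while 0x40000000 (field equal to 2) does not.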
Zhi Wang
[RFC 5/8] drm/nouveau: support WPR2 heap size override
To support the maximum vGPUs on devices that support SRIOV, a larger
WPR2 heap size is required.

Support WPR2 heap size override when initializing the WPR2 heap memory
layout. If zero, use the default WPR2 heap size.

No functional change is intended.

Signed-off-by: Zhi Wang <zhiw at nvidia.com>
---
 drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c | 2 +-
 drivers/gpu/drm/nouveau/nvkm/subdev/gsp/priv.h  | 2 +-
 drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c  | 7 ++++---
 drivers/gpu/drm/nouveau/nvkm/subdev/gsp/tu102.c | 2 +-
 4 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c
index 596ccd758e66..3ba67eab08d7 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c
@@ -60,7 +60,7 @@ ad102_gsp_init_fw_heap(struct nvkm_gsp *gsp)
 {
 	int ret;

-	nvkm_gsp_init_fw_heap(gsp);
+	nvkm_gsp_init_fw_heap(gsp, 0);

 	if (gsp->fb.wpr2.heap.size <= SZ_256M)
 		return 0;
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/priv.h b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/priv.h
index fe56ced9b369..fe2ad4753d5e 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/priv.h
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/priv.h
@@ -63,7 +63,7 @@ int ga102_gsp_booter_ctor(struct nvkm_gsp *, const char *, const struct firmware
 int ga102_gsp_reset(struct nvkm_gsp *);

 void r535_gsp_dtor(struct nvkm_gsp *);
-void nvkm_gsp_init_fw_heap(struct nvkm_gsp *gsp);
+void nvkm_gsp_init_fw_heap(struct nvkm_gsp *gsp, u64 wpr2_heap_size);
 int r535_gsp_oneinit(struct nvkm_gsp *);
 int r535_gsp_init(struct nvkm_gsp *);
 int r535_gsp_fini(struct nvkm_gsp *, bool suspend);
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
index d5d6d0df863e..5a47201bf0c4 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
@@ -2519,7 +2519,7 @@ r535_gsp_dtor(struct nvkm_gsp *gsp)
 	nvkm_gsp_mem_dtor(gsp, &gsp->logrm);
 }

-void nvkm_gsp_init_fw_heap(struct nvkm_gsp *gsp)
+void nvkm_gsp_init_fw_heap(struct nvkm_gsp *gsp, u64 wpr2_heap_size)
 {
 	/* Calculate FB layout. */
 	gsp->fb.wpr2.frts.size = 0x100000;
@@ -2533,7 +2533,7 @@ void nvkm_gsp_init_fw_heap(struct nvkm_gsp *gsp)
 	gsp->fb.wpr2.elf.addr = ALIGN_DOWN(gsp->fb.wpr2.boot.addr - gsp->fb.wpr2.elf.size,
 					   0x10000);

-	{
+	if (!wpr2_heap_size) {
 		u32 fb_size_gb = DIV_ROUND_UP_ULL(gsp->fb.size, 1 << 30);

 		gsp->fb.wpr2.heap.size =
@@ -2543,7 +2543,8 @@ void nvkm_gsp_init_fw_heap(struct nvkm_gsp *gsp)
 			ALIGN(GSP_FW_HEAP_PARAM_CLIENT_ALLOC_SIZE, 1 << 20);

 		gsp->fb.wpr2.heap.size = max(gsp->fb.wpr2.heap.size, gsp->func->wpr_heap.min_size);
-	}
+	} else
+		gsp->fb.wpr2.heap.size = wpr2_heap_size;

 	gsp->fb.wpr2.heap.addr = ALIGN_DOWN(gsp->fb.wpr2.elf.addr - gsp->fb.wpr2.heap.size,
 					    0x100000);
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/tu102.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/tu102.c
index e279a322704a..eb6081946c13 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/tu102.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/tu102.c
@@ -79,7 +79,7 @@ tu102_gsp_booter_ctor(struct nvkm_gsp *gsp, const char *name, const struct firmw
 int
 tu102_gsp_init_fw_heap(struct nvkm_gsp *gsp)
 {
-	nvkm_gsp_init_fw_heap(gsp);
+	nvkm_gsp_init_fw_heap(gsp, 0);

 	return 0;
 }
-- 
2.34.1
Zhi Wang
2024-Nov-22 12:57 UTC
[RFC 6/8] drm/nouveau: override the WPR2 heap size when SRIOV is supported on Ada
To support the maximum vGPUs on devices that support SRIOV, a larger
WPR2 heap size is required.

On Ada with SRIOV supported, the size should be set to at least 549MB.
Since a WPR2 heap of that size exceeds the 256MB of pre-scrubbed FB
memory, the scrubber ucode image is required to scrub the FB memory
before any other ucode image is executed.

Override the default WPR2 heap size on Ada when SRIOV is supported. Set
the WPR2 heap size to 576MB when SRIOV is supported on Ada.

Cc: Milos Tijanic <mtijanic at nvidia.com>
Signed-off-by: Zhi Wang <zhiw at nvidia.com>
---
 drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c
index 3ba67eab08d7..1e403dbd7323 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c
@@ -20,6 +20,7 @@
  * OTHER DEALINGS IN THE SOFTWARE.
  */

+#include <core/pci.h>
 #include <engine/sec2.h>
 #include "priv.h"

@@ -58,9 +59,18 @@ ad102_execute_scrubber(struct nvkm_gsp *gsp)
 static int
 ad102_gsp_init_fw_heap(struct nvkm_gsp *gsp)
 {
+	struct nvkm_subdev *subdev = &gsp->subdev;
+	struct nvkm_device *device = subdev->device;
+	struct nvkm_device_pci *device_pci = container_of(device,
+			typeof(*device_pci), device);
+	int num_vfs;
 	int ret;

-	nvkm_gsp_init_fw_heap(gsp, 0);
+	num_vfs = pci_sriov_get_totalvfs(device_pci->pdev);
+	if (!num_vfs)
+		nvkm_gsp_init_fw_heap(gsp, 0);
+	else
+		nvkm_gsp_init_fw_heap(gsp, 576 * SZ_1M);

 	if (gsp->fb.wpr2.heap.size <= SZ_256M)
 		return 0;
-- 
2.34.1
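
Patches 5 and 6 together reduce to a small decision: pick an override of 576MB when the device reports any VFs, otherwise keep the computed default, and load the scrubber only when the result exceeds the 256MB of pre-scrubbed FB. The sketch below restates that decision with plain integers. The function and macro names are illustrative, the 127MB default in the usage note is a made-up figure, and the 1MB re-alignment that nvkm_gsp_init_fw_heap() applies to the final heap size is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MiB                  (1ULL << 20)
#define PRESCRUBBED_FB_SIZE  (256 * MiB)  /* from the cover letter */
#define SRIOV_WPR2_HEAP_SIZE (576 * MiB)  /* value used by patch 6 */

/*
 * Override passed to nvkm_gsp_init_fw_heap(): 0 means "no override",
 * in which case the default heap size is computed from the FB size.
 * total_vfs stands in for the result of pci_sriov_get_totalvfs().
 */
static uint64_t wpr2_heap_override(int total_vfs)
{
	return total_vfs ? SRIOV_WPR2_HEAP_SIZE : 0;
}

/* Heap size before alignment: the override wins when it is non-zero. */
static uint64_t wpr2_heap_size(uint64_t override, uint64_t default_size)
{
	return override ? override : default_size;
}

/* Same threshold as the SZ_256M check in ad102_gsp_init_fw_heap(). */
static bool needs_scrubber(uint64_t heap_size)
{
	return heap_size > PRESCRUBBED_FB_SIZE;
}
```

With a hypothetical default of 127MB, a device without VFs keeps the default and skips the scrubber, while a device reporting 16 VFs gets the 576MB heap and therefore needs it. A heap of exactly 256MB still fits in the pre-scrubbed region, matching the `<=` in the patch.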
Zhi Wang
2024-Nov-22 12:57 UTC
[RFC 7/8] drm/nouveau: set max supported vGPU count when SRIOV is supported
Set the max supported vGPU count according to the number of VFs when
SRIOV is supported on Ada.

Suggested-by: Jason Gunthorpe <jgg at nvidia.com>
Cc: Surath Mitra <smitra at nvidia.com>
Signed-off-by: Zhi Wang <zhiw at nvidia.com>
---
 drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h | 1 +
 drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c   | 4 +++-
 drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c    | 1 +
 3 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h b/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h
index c6fe2d9d47de..6e244af1e815 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/gsp.h
@@ -64,6 +64,7 @@ struct nvkm_gsp {
 			} frts, boot, elf, heap;
 			u64 addr;
 			u64 size;
+			u64 max_vgpu_count;
 		} wpr2;
 		struct {
 			u64 addr;
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c
index 1e403dbd7323..80d6d73fe352 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c
@@ -69,8 +69,10 @@ ad102_gsp_init_fw_heap(struct nvkm_gsp *gsp)
 	num_vfs = pci_sriov_get_totalvfs(device_pci->pdev);
 	if (!num_vfs)
 		nvkm_gsp_init_fw_heap(gsp, 0);
-	else
+	else {
 		nvkm_gsp_init_fw_heap(gsp, 576 * SZ_1M);
+		gsp->fb.wpr2.max_vgpu_count = num_vfs;
+	}

 	if (gsp->fb.wpr2.heap.size <= SZ_256M)
 		return 0;
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
index 5a47201bf0c4..2647a83773d2 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
@@ -1968,6 +1968,7 @@ r535_gsp_wpr_meta_init(struct nvkm_gsp *gsp)
 	meta->partitionRpcAddr = 0;
 	meta->partitionRpcRequestOffset = 0;
 	meta->partitionRpcReplyOffset = 0;
+	meta->gspFwHeapVfPartitionCount = gsp->fb.wpr2.max_vgpu_count;
 	meta->verified = 0;
 	return 0;
 }
-- 
2.34.1
Zhi Wang
2024-Nov-22 12:57 UTC
[RFC 8/8] drm/nouveau: introduce the scrubber on Ada in a kernel doc
Introduce a kernel doc to explain the scrubber on Ada.

Cc: Milos Tijanic <mtijanic at nvidia.com>
Signed-off-by: Zhi Wang <zhiw at nvidia.com>
---
 drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c
index 80d6d73fe352..327e733e3e8b 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/ad102.c
@@ -24,6 +24,20 @@
 #include <engine/sec2.h>
 #include "priv.h"

+/*
+ * DOC: Pre-scrubbed FB memory on Ada
+ *
+ * https://github.com/NVIDIA/open-gpu-kernel-modules/blob/565.57.01/src/nvidia/src/kernel/gpu/gsp/kernel_gsp.c#L3151
+ *
+ * The size of the pre-scrubbed FB memory on Ada is 256MB. When allocating
+ * a GSP WPR2 heap larger than 256MB, the scrubber ucode image is required
+ * to be executed before executing any other ucode images. Or, GSP
+ * firmware hangs when booting.
+ *
+ * The large GSP WPR2 heap is required especially by vGPU when supporting
+ * max vGPU count. The required size on Ada is at least 549MB.
+ */
+
 static bool is_scrubber_completed(struct nvkm_gsp *gsp)
 {
 	return ((nvkm_rd32(gsp->subdev.device, 0x001180fc) >> 29) >= 0x3);
-- 
2.34.1
Timur Tabi
2024-Nov-22 16:37 UTC
[RFC 0/8] drm/nouveau: scrubber ucode image support for vGPU
On Fri, 2024-11-22 at 04:57 -0800, Zhi Wang wrote:
> diff --git a/nouveau/extract-firmware-nouveau.py b/nouveau/extract-firmware-nouveau.py
> index 837edc8d..6268934c 100755
> --- a/nouveau/extract-firmware-nouveau.py
> +++ b/nouveau/extract-firmware-nouveau.py
> @@ -335,7 +335,7 @@ def main():
>      booter("ad102", "load", 384)
>      booter("ad102", "unload", 384)
>      bootloader("ad102", "_prod_")
> -    # scrubber("ad102", 384) # Not currently used by Nouveau
> +    scrubber("ad102", 384) # Not currently used by Nouveau

Should I go ahead and submit this change to chips_a?