Daniel Vetter
2024-Jun-17 13:55 UTC
[RFC] GPU driver with separate "core" and "DRM" modules
On Fri, Jun 14, 2024 at 03:02:09AM +1000, Ben Skeggs wrote:
> NVIDIA has been exploring ways to better support the effort for an
> upstream kernel mode driver for GPUs that are capable of running GSP-RM
> firmware, since the introduction[1] to Nova.
>
> Use cases have been identified for which separating the core GPU
> programming out of the full DRM driver stack is a strong requirement
> from our key customers.
>
> An upstreamed NVIDIA GPU driver should be able to support current and
> emerging customer use cases for vGPU hosts. NVIDIA's vGPU deployments
> to date do not support compute or graphics functionality within the
> hypervisor host, and have no dependency on the Linux graphics subsystem,
> instead implementing the minimal functionality required to run vGPU
> guest VMs.
>
> For security-sensitive environments such as cloud infrastructure, it's
> important to continue support for running a minimal footprint vGPU host
> driver in a stripped-down / barebones kernel environment.
>
> This can be achieved by supporting both VFIO and DRM drivers as clients
> of a core driver, without requiring a full-fledged DRM driver (or the
> DRM subsystem itself) to be built into the host kernel.
>
> A core driver would be responsible for booting and communicating with
> GSP-RM, enumeration of HW configuration, shared/partitioned resource
> management, exception handling, and event dispatch.
>
> The DRM driver would do all the standard things a DRM driver does, and
> implement GPU memory management (TTM/HMM), KMS, command submission etc,
> as well as providing UAPI for userspace clients. These features would
> be implemented using HW resources allocated from a core driver, rather
> than the DRM driver being directly responsible for HW programming.
>
> As Nouveau's KMD is already split (in the logical sense) along similar
> lines, we're using it here for the purposes of this RFC to demonstrate
> the feasibility of such an architecture, and open it up for discussion.

Sounds reasonable.

Only bikeshed I have to add is that the blessed way (according to the cool
kernel maintainers at least or something) to structure this is using
auxbus. Definitely when you end up with more than one driver binding to
the core (like maybe some system management interface thing, or perhaps a
special compute-only kernel driver).

https://dri.freedesktop.org/docs/drm/driver-api/auxiliary_bus.html

Cheers, Sima

> A link[2] to a tree containing the patches is below.
>
> [1] https://lore.kernel.org/all/3ed356488c9b0ca93845501425d427309f4cf616.camel@redhat.com/
> [2] https://gitlab.freedesktop.org/bskeggs/nouveau/-/tree/00.03-module
>
> *** BLURB HERE ***
>
> Ben Skeggs (2):
>   drm/nouveau/nvkm: export symbols needed by the drm driver
>   drm/nouveau/nvkm: separate out into nvkm.ko
>
>  drivers/gpu/drm/nouveau/Kbuild                      |  4 ++--
>  drivers/gpu/drm/nouveau/include/nvkm/core/module.h  |  3 ---
>  drivers/gpu/drm/nouveau/nouveau_drm.c               | 10 +---------
>  drivers/gpu/drm/nouveau/nvkm/core/driver.c          |  1 +
>  drivers/gpu/drm/nouveau/nvkm/core/gpuobj.c          |  2 ++
>  drivers/gpu/drm/nouveau/nvkm/core/mm.c              |  4 ++++
>  drivers/gpu/drm/nouveau/nvkm/device/acpi.c          |  1 +
>  drivers/gpu/drm/nouveau/nvkm/engine/gr/base.c       |  1 +
>  drivers/gpu/drm/nouveau/nvkm/module.c               |  8 ++++++--
>  drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c     |  1 +
>  drivers/gpu/drm/nouveau/nvkm/subdev/bios/pll.c      |  1 +
>  drivers/gpu/drm/nouveau/nvkm/subdev/fb/base.c       |  3 +++
>  drivers/gpu/drm/nouveau/nvkm/subdev/gpio/base.c     |  3 +++
>  drivers/gpu/drm/nouveau/nvkm/subdev/i2c/base.c      |  2 ++
>  drivers/gpu/drm/nouveau/nvkm/subdev/i2c/bus.c       |  1 +
>  drivers/gpu/drm/nouveau/nvkm/subdev/iccsense/base.c |  1 +
>  drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c    |  1 +
>  drivers/gpu/drm/nouveau/nvkm/subdev/therm/fan.c     |  1 +
>  drivers/gpu/drm/nouveau/nvkm/subdev/volt/base.c     |  1 +
>  19 files changed, 33 insertions(+), 16 deletions(-)
>
> --
> 2.44.0

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
On 17/6/24 23:55, Daniel Vetter wrote:
> On Fri, Jun 14, 2024 at 03:02:09AM +1000, Ben Skeggs wrote:
>> [...]
>> As Nouveau's KMD is already split (in the logical sense) along similar
>> lines, we're using it here for the purposes of this RFC to demonstrate
>> the feasibility of such an architecture, and open it up for discussion.
> Sounds reasonable.
>
> Only bikeshed I have to add is that the blessed way (according to the cool
> kernel maintainers at least or something) to structure this is using
> auxbus. Definitely when you end up with more than one driver binding to
> the core (like maybe some system management interface thing, or perhaps a
> special compute-only kernel driver).
>
> https://dri.freedesktop.org/docs/drm/driver-api/auxiliary_bus.html

Hey!

Yes indeed. I sent this[1] series at the same time, which was initially
written so that nouveau.ko would still get auto-loaded alongside nvkm.ko.

Ben.

[1] https://lists.freedesktop.org/archives/nouveau/2024-June/044861.html

> Cheers, Sima
>
>> [...]