Jason Gunthorpe
2024-Sep-23 15:02 UTC
[RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
On Mon, Sep 23, 2024 at 06:22:33AM +0000, Tian, Kevin wrote:
> > From: Zhi Wang <zhiw at nvidia.com>
> > Sent: Sunday, September 22, 2024 8:49 PM
> >
> [...]
> >
> > The NVIDIA vGPU VFIO module together with VFIO sits on VFs, provides
> > extended management and features, e.g. selecting the vGPU types, support
> > live migration and driver warm update.
> >
> > Like other devices that VFIO supports, VFIO provides the standard
> > userspace APIs for device lifecycle management and advance feature
> > support.
> >
> > The NVIDIA vGPU manager provides necessary support to the NVIDIA vGPU VFIO
> > variant driver to create/destroy vGPUs, query available vGPU types, select
> > the vGPU type, etc.
> >
> > On the other side, NVIDIA vGPU manager talks to the NVIDIA GPU core driver,
> > which provide necessary support to reach the HW functions.
> >
>
> I'm not sure VFIO is the right place to host the NVIDIA vGPU manager.
> It's very NVIDIA specific and naturally fit in the PF driver.

drm isn't a particularly logical place for that either :|

Jason
Tian, Kevin
2024-Sep-26 06:43 UTC
[RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
> From: Jason Gunthorpe <jgg at nvidia.com>
> Sent: Monday, September 23, 2024 11:02 PM
>
> On Mon, Sep 23, 2024 at 06:22:33AM +0000, Tian, Kevin wrote:
> > > From: Zhi Wang <zhiw at nvidia.com>
> > > Sent: Sunday, September 22, 2024 8:49 PM
> > >
> > [...]
> > >
> > > The NVIDIA vGPU VFIO module together with VFIO sits on VFs, provides
> > > extended management and features, e.g. selecting the vGPU types, support
> > > live migration and driver warm update.
> > >
> > > Like other devices that VFIO supports, VFIO provides the standard
> > > userspace APIs for device lifecycle management and advance feature
> > > support.
> > >
> > > The NVIDIA vGPU manager provides necessary support to the NVIDIA vGPU VFIO
> > > variant driver to create/destroy vGPUs, query available vGPU types, select
> > > the vGPU type, etc.
> > >
> > > On the other side, NVIDIA vGPU manager talks to the NVIDIA GPU core driver,
> > > which provide necessary support to reach the HW functions.
> > >
> >
> > I'm not sure VFIO is the right place to host the NVIDIA vGPU manager.
> > It's very NVIDIA specific and naturally fit in the PF driver.
>
> drm isn't a particularly logical place for that either :|
>

This RFC doesn't expose any new uAPI in the vGPU manager, e.g. the vGPU
type is hard-coded to L40-24Q. Given that, where to draw the boundary
between code in VFIO and code in the PF driver is probably more of a
vendor-specific choice.

However, according to the cover letter, it's reasonable to expect future
extensions to implement new uAPI for the admin to select the vGPU type and
potentially do more manual configuration before the target VF can be used.

That raises an open question: is VFIO the right place to host such a
vendor-specific provisioning interface? The existing mdev-type-based
provisioning mechanism was already considered a bad fit.

IIRC the previous discussion ended up suggesting that the provisioning
interface go in the PF driver. There may be a chance to generalize it and
move it to VFIO, but there is no telling what that would look like until
multiple drivers have demonstrated their own implementations as a base for
discussion.

But now it seems you prefer vendors putting their own provisioning
interface in VFIO directly?

Thanks
Kevin