Jason Gunthorpe
2024-Sep-23 15:01 UTC
[RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
On Mon, Sep 23, 2024 at 10:49:07AM +0200, Danilo Krummrich wrote:
> > 2. Proposal for upstream
> > =======================
>
> What is the strategy in the mid / long term with this?
>
> As you know, we're trying to move to Nova and the blockers with the device /
> driver infrastructure have been resolved and we're able to move forward. Besides
> that, Dave made great progress on the firmware abstraction side of things.
>
> Is this more of a proof of concept? Do you plan to work on Nova in general and
> vGPU support for Nova?

This is intended to be a real product that customers would use, it is
not a proof of concept. There is a lot of demand for this kind of
simplified virtualization infrastructure in the host side. The series
here is the first attempt at making thin host infrastructure and
Zhi/etc are doing it with an upstream-first approach.

From the VFIO side I would like to see something like this merged in
nearish future as it would bring a previously out of tree approach to
be fully intree using our modern infrastructure. This is a big win for
the VFIO world.

As a commercial product this will be backported extensively to many
old kernels and that is harder/impossible if it isn't exclusively in
C. So, I think nova needs to co-exist in some way.

Jason
Danilo Krummrich
2024-Sep-23 22:50 UTC
[RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
On Mon, Sep 23, 2024 at 12:01:40PM -0300, Jason Gunthorpe wrote:
> On Mon, Sep 23, 2024 at 10:49:07AM +0200, Danilo Krummrich wrote:
> > > 2. Proposal for upstream
> > > =======================
> >
> > What is the strategy in the mid / long term with this?
> >
> > As you know, we're trying to move to Nova and the blockers with the device /
> > driver infrastructure have been resolved and we're able to move forward. Besides
> > that, Dave made great progress on the firmware abstraction side of things.
> >
> > Is this more of a proof of concept? Do you plan to work on Nova in general and
> > vGPU support for Nova?
>
> This is intended to be a real product that customers would use, it is
> not a proof of concept. There is a lot of demand for this kind of
> simplified virtualization infrastructure in the host side.

I see...

> The series
> here is the first attempt at making thin host infrastructure and
> Zhi/etc are doing it with an upstream-first approach.

This is great!

>
> From the VFIO side I would like to see something like this merged in
> nearish future as it would bring a previously out of tree approach to
> be fully intree using our modern infrastructure. This is a big win for
> the VFIO world.
>
> As a commercial product this will be backported extensively to many
> old kernels and that is harder/impossible if it isn't exclusively in
> C. So, I think nova needs to co-exist in some way.

We'll surely not support two drivers for the same thing in the long term,
neither does it make sense, nor is it sustainable.

We have a lot of good reasons why we decided to move forward with Nova as a
successor of Nouveau for GSP-based GPUs in the long term -- I also just held a
talk about this at LPC.

For the short/mid term I think it may be reasonable to start with Nouveau, but
this must be based on some agreements, for instance:

- take responsibility, e.g. commitment to help with maintenance with some of
  NVKM / NVIDIA GPU core (or whatever we want to call it) within Nouveau
- commitment to help with Nova in general and, once applicable, move the vGPU
  parts over to Nova

But I think the very last one naturally happens if we stop further support for
new HW in Nouveau at some point.

>
> Jason
>
Greg KH
2024-Sep-26 09:14 UTC
[RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
On Mon, Sep 23, 2024 at 12:01:40PM -0300, Jason Gunthorpe wrote:
> On Mon, Sep 23, 2024 at 10:49:07AM +0200, Danilo Krummrich wrote:
> > > 2. Proposal for upstream
> > > =======================
> >
> > What is the strategy in the mid / long term with this?
> >
> > As you know, we're trying to move to Nova and the blockers with the device /
> > driver infrastructure have been resolved and we're able to move forward. Besides
> > that, Dave made great progress on the firmware abstraction side of things.
> >
> > Is this more of a proof of concept? Do you plan to work on Nova in general and
> > vGPU support for Nova?
>
> This is intended to be a real product that customers would use, it is
> not a proof of concept. There is a lot of demand for this kind of
> simplified virtualization infrastructure in the host side. The series
> here is the first attempt at making thin host infrastructure and
> Zhi/etc are doing it with an upstream-first approach.
>
> From the VFIO side I would like to see something like this merged in
> nearish future as it would bring a previously out of tree approach to
> be fully intree using our modern infrastructure. This is a big win for
> the VFIO world.
>
> As a commercial product this will be backported extensively to many
> old kernels and that is harder/impossible if it isn't exclusively in
> C. So, I think nova needs to co-exist in some way.

Please never make design decisions based on old ancient commercial
kernels that have any relevance to upstream kernel development today.
If you care about those kernels, work with the companies that get paid
to support such things. Otherwise development upstream would just
completely stall and never go forward, as you well know.

As it seems that future support for this hardware is going to be in
rust, just use those apis going forward and backport the small number of
missing infrastructure patches to the relevant ancient kernels as well,
it's not like that would even be noticed in the overall number of
patches they take for normal subsystem improvements :)

thanks,

greg k-h