Jason Gunthorpe
2024-Sep-25 00:53 UTC
[RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
On Tue, Sep 24, 2024 at 09:56:58PM +0200, Danilo Krummrich wrote:

> Currently - and please correct me if I'm wrong - you make it sound to me as if
> you're not willing to respect the decisions that have been taken by Nouveau and
> DRM maintainers.

I've never said anything about your work, go do Nova, have fun.

I'm just not agreeing to being forced into taking Rust dependencies in
VFIO because Nova is participating in the Rust Experiment.

I think the reasonable answer is to accept some code duplication, or
try to consolidate around a small C core. I understand this is
different than you may have planned so far for Nova, but all projects
are subject to community feedback, especially when faced with new
requirements.

I think this discussion is getting a little overheated, there is lots
of space here for everyone to do their things. Let's not get too
excited.

> I encourage that NVIDIA wants to move things upstream and I'm absolutely willing
> to collaborate and help with the use-cases and goals NVIDIA has. But it really
> has to be a collaboration and this starts with acknowledging the goals of *each
> other*.

I've always acknowledged Nova's goal - it is fine.

It is just quite incompatible with the VFIO side requirement of no
Rust in our stack until the ecosystem can consume it.

I believe there is no reason we can't find an agreeable compromise.

> > I expect the core code would continue to support new HW going forward
> > to support the VFIO driver, even if nouveau doesn't use it, until Rust
> > reaches some full ecosystem readiness for the server space.
>
> From an upstream perspective the kernel doesn't need to consider OOT drivers,
> i.e. the guest driver.

?? VFIO already took the decision that it is agnostic to what is
running in the VM. Run Windows-only VMs for all we care, it is still
supposed to be virtualized correctly.

> > There are going to be a lot of users of this code, let's not rush to
> > harm them please.
>
> Please abstain from such kind of unconstructive insinuations; it's ridiculous to
> imply that upstream kernel developers and maintainers would harm the users of
> NVIDIA GPUs.

You literally just said you'd want to effectively block usable VFIO
support for new GPU HW when "we stop further support for new HW in
Nouveau at some point" and "move the vGPU parts over to Nova(& rust)".

I don't agree to that, it harms VFIO users, and is not acknowledging
that conflicting goals exist.

VFIO will decide when it starts to depend on rust, Nova should not
force that decision on VFIO. They are very different ecosystems with
different needs.

Jason
Dave Airlie
2024-Sep-25 01:08 UTC
[RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
On Wed, 25 Sept 2024 at 10:53, Jason Gunthorpe <jgg at nvidia.com> wrote:
>
> On Tue, Sep 24, 2024 at 09:56:58PM +0200, Danilo Krummrich wrote:
>
> > Currently - and please correct me if I'm wrong - you make it sound to me as if
> > you're not willing to respect the decisions that have been taken by Nouveau and
> > DRM maintainers.
>
> I've never said anything about your work, go do Nova, have fun.
>
> I'm just not agreeing to being forced into taking Rust dependencies in
> VFIO because Nova is participating in the Rust Experiment.
>
> I think the reasonable answer is to accept some code duplication, or
> try to consolidate around a small C core. I understand this is
> different than you may have planned so far for Nova, but all projects
> are subject to community feedback, especially when faced with new
> requirements.
>
> I think this discussion is getting a little overheated, there is lots
> of space here for everyone to do their things. Let's not get too
> excited.

How do you intend to solve the stable ABI problem caused by the GSP firmware?

If you haven't got an answer to that, that is reasonable, you can talk
about VFIO and DRM and who is in charge all you like, but it doesn't
matter.

Fundamentally the problem is the unstable API exposure isn't something
you can build a castle on top of, the nova idea is to use rust to
solve a fundamental problem that the NVIDIA driver design process
forces on us (vfio included), I'm not seeing anything constructive as
to why doing that in C would be worth the investment.

Nothing has changed from when we designed nova, this idea was on the
table then, it has all sorts of problems leaking the unstable ABI that
have to be solved, and when I see a solution for that in C that is
maintainable and doesn't leak like a sieve I might be interested, but,
you know, keep thinking we are using rust so we can have fun, not
because we are using it to solve maintainability problems caused by an
internal NVIDIA design decision over which we have zero influence.

Dave.
Danilo Krummrich
2024-Sep-25 10:55 UTC
[RFC 00/29] Introduce NVIDIA GPU Virtualization (vGPU) Support
On Tue, Sep 24, 2024 at 09:53:19PM -0300, Jason Gunthorpe wrote:
> On Tue, Sep 24, 2024 at 09:56:58PM +0200, Danilo Krummrich wrote:
>
> > Currently - and please correct me if I'm wrong - you make it sound to me as if
> > you're not willing to respect the decisions that have been taken by Nouveau and
> > DRM maintainers.
>
> I've never said anything about your work, go do Nova, have fun.

See, that's the attitude that doesn't get us anywhere.

You act as if we'd just be toying around to have fun, position yourself as the
one who wants to do the "real deal" and just claim that our decisions would harm
users.

And at the same time you prove that you did not get up to speed on what were the
reasons to move in this direction and what are the problems we try to solve.

This just won't lead to a constructive discussion that addresses your concerns.

Try to not go like a bull at a gate. Instead start with asking questions to
understand why we chose this direction and then feel free to raise concerns.

I assure you, we will hear and recognize them! And I'm also sure that we'll find
solutions and compromises.

>
> I'm just not agreeing to being forced into taking Rust dependencies in
> VFIO because Nova is participating in the Rust Experiment.
>
> I think the reasonable answer is to accept some code duplication, or
> try to consolidate around a small C core. I understand this is
> different than you may have planned so far for Nova, but all projects
> are subject to community feedback, especially when faced with new
> requirements.

Fully agree, and I'm absolutely open to consider feedback and new requirements.

But again, consider what I said above -- you're creating counterproposals out of
thin air, without considering what we have planned for so far at all.

So, I wonder what kind of reaction you expect approaching things this way?

>
> I think this discussion is getting a little overheated, there is lots
> of space here for everyone to do their things. Let's not get too
> excited.
>
> > I encourage that NVIDIA wants to move things upstream and I'm absolutely willing
> > to collaborate and help with the use-cases and goals NVIDIA has. But it really
> > has to be a collaboration and this starts with acknowledging the goals of *each
> > other*.
>
> I've always acknowledged Nova's goal - it is fine.
>
> It is just quite incompatible with the VFIO side requirement of no
> Rust in our stack until the ecosystem can consume it.
>
> I believe there is no reason we can't find an agreeable compromise.

I'm pretty sure we indeed can find an agreeable compromise. But again, please
understand that the way of approaching this you've chosen so far won't get us
there.

>
> > > I expect the core code would continue to support new HW going forward
> > > to support the VFIO driver, even if nouveau doesn't use it, until Rust
> > > reaches some full ecosystem readiness for the server space.
> >
> > From an upstream perspective the kernel doesn't need to consider OOT drivers,
> > i.e. the guest driver.
>
> ?? VFIO already took the decision that it is agnostic to what is
> running in the VM. Run Windows-only VMs for all we care, it is still
> supposed to be virtualized correctly.
>
> > > There are going to be a lot of users of this code, let's not rush to
> > > harm them please.
> >
> > Please abstain from such kind of unconstructive insinuations; it's ridiculous to
> > imply that upstream kernel developers and maintainers would harm the users of
> > NVIDIA GPUs.
>
> You literally just said you'd want to effectively block usable VFIO
> support for new GPU HW when "we stop further support for new HW in
> Nouveau at some point" and "move the vGPU parts over to Nova(& rust)".

Well, working on a successor means that once it's in place the support for the
replaced thing has to end at some point.

This doesn't mean that we can't work out ways to address your concerns.

You just make it a binary thing and claim that if we don't choose 1 we harm
users.

This effectively denies looking for solutions to your concerns in the first
place. And again, this won't get us anywhere. It just creates the impression
that you're not interested in solutions, but push through your agenda.

>
> I don't agree to that, it harms VFIO users, and is not acknowledging
> that conflicting goals exist.
>
> VFIO will decide when it starts to depend on rust, Nova should not
> force that decision on VFIO. They are very different ecosystems with
> different needs.
>
> Jason
>