Zhi Wang
2025-Oct-07 06:51 UTC
[PATCH 0/2] rust: pci: expose is_virtfn() and reject VFs in nova-core
On 2.10.2025 17.31, Jason Gunthorpe wrote:
> On Thu, Oct 02, 2025 at 02:29:09PM +0000, Zhi Wang wrote:
>> On 2.10.2025 16.42, Jason Gunthorpe wrote:
>>> On Thu, Oct 02, 2025 at 12:59:59PM +0000, Zhi Wang wrote:
>>>> On 2.10.2025 14.58, Jason Gunthorpe wrote:
>>>>> On Wed, Oct 01, 2025 at 09:13:33PM +0000, Zhi Wang wrote:
>>>>>
>>>>>> Right, I also mentioned the same use cases of NIC/GPU in another reply
>>>>>> to Danilo. But what I get is NVIDIA doesn't use bare metal VF to support
>>>>>> linux container,
>>>>>
>>>>> I don't think it matters what "NVIDIA" does - this is the upstream
>>>>> architecture it should be followed unless there is some significant
>>>>> reason.
>>>>
>>>> Hmm. Can you elaborate why?
>>>>
>>>> From the device vendor's stance, they know what is the best approach
>>>> to offer the better the user experience according to their device
>>>> characteristic.
>>>
>>> You can easily push the code to nova core not vfio and make it work
>>> generically, some significant reason is needed beyond "the vendor
>>> doesn't want to".
>
> You'd have to be more specific, I didn't see really any mediation
> stuff in the vfio driver to explain why the VF in the VM would act so
> differently that it "couldn't work"

From the device vendor's perspective, we have no support or use case for
a bare-metal VF model, not now and not in the foreseeable future. Even
hypothetically, such support would not come from nova-core.ko, since
that would defeat the purpose of maintaining a trimmed-down kernel
module where minimizing the attack surface and preserving strict
security boundaries are primary design goals.

> Even if there is some small FW issue, it is better to still structure
> things in the normal way and assume it will get fixed sometime later
> than to forever close that door.
>
> Jason
Danilo Krummrich
2025-Oct-07 10:14 UTC
[PATCH 0/2] rust: pci: expose is_virtfn() and reject VFs in nova-core
On Tue Oct 7, 2025 at 8:51 AM CEST, Zhi Wang wrote:
> From the device vendor's perspective, we have no support or use case for
> a bare-metal VF model, not now and not in the foreseeable future.

Who is we? I think there'd be a ton of users that do see such use-cases.

What does "no support" mean? Are there technical limitations that prevent an
implementation (I haven't seen any so far)?

> Even
> hypothetically, such support would not come from nova-core.ko, since
> that would defeat the purpose of maintaining a trimmed-down kernel
> module where minimizing the attack surface and preserving strict
> security boundaries are primary design goals.

I wouldn't say the *primary* design goal is to be as trimmed-down as possible.
The primary design goals are rather proper firmware abstraction, addressing
design incompatibilities with modern graphics and compute APIs, memory safety
concerns and general maintainability.

It does make sense to not run the vGPU use-case on top of all the additional
DRM stuff that will go into nova-drm, since this is clearly not needed in the
vGPU use-case. But it doesn't mean that we have to keep everything out of
nova-core for this purpose.

I think the bare-metal VF model is a very interesting use-case and if it is
technically feasible we should support it. And I think it should be in
nova-core.

The difference between nova-core running on a bare metal VF and nova-core
running on the same VF in a VM shouldn't be that big anyways, no?
Jason Gunthorpe
2025-Oct-07 11:26 UTC
[PATCH 0/2] rust: pci: expose is_virtfn() and reject VFs in nova-core
On Tue, Oct 07, 2025 at 06:51:47AM +0000, Zhi Wang wrote:
> > You'd have to be more specific, I didn't see really any mediation
> > stuff in the vfio driver to explain why the VF in the VM would act so
> > differently that it "couldn't work"
>
> From the device vendor's perspective, we have no support or use case for
> a bare-metal VF model, not now and not in the foreseeable future.

Again, be specific: exactly what mediation in vfio is missing.

> Even hypothetically, such support would not come from nova-core.ko,
> since that would defeat the purpose of maintaining a trimmed-down
> kernel module where minimizing the attack surface and preserving
> strict security boundaries are primary design goals.

Nonsense. If you moved stuff from vfio to nova-core it doesn't change
the "trimmed-down" nature one bit.

I'm strongly against adding that profiling stuff to vfio, and I'm not
hearing any reasons why nova is special and it must be done that way.

Jason