Benjamin Herrenschmidt
2014-Sep-24 22:04 UTC
[PATCH v5 2/3] virtio_pci: Use the DMA API for virtqueues when possible
On Wed, 2014-09-24 at 14:59 -0700, Andy Lutomirski wrote:
> Scratch that idea, then.
>
> The best that I can currently come up with is to say that pre-1.0
> devices on PPC bypass the IOMMU and that 1.0 devices on PPC and all
> devices on all other architectures do not bypass the IOMMU.

Well, the thing is, we *want* them to bypass the IOMMU for performance reasons in the long run. Today we have no ways to tell our guests that a PCI bus doesn't have an IOMMU, they always do !

Also qemu can mix and match devices on a given PCI bus so making this a bus property wouldn't work either.

> I agree that this is far from ideal. Any ideas? Is there any other
> channel that QEMU can use to signal to a PPC guest that a specific PCI
> virtio device isn't really behind an IOMMU?

Any reason why this can't be a virtio capability ?

> From my POV, the main
> consideration is that existing QEMU versions hosting Xen hypervisors
> should work, which is not the case in current kernels, but which is
> the case with my patches without any known regressions.
>
> Is there some evil trick that a PPC guest could use to detect whether
> the IOMMU is honored? As an example that I don't like at all, the
> guest could program the IOMMU so that the ring's physical address maps
> to a second copy of the ring and then see which one works.

Cheers,
Ben.
Andy Lutomirski
2014-Sep-24 22:15 UTC
[PATCH v5 2/3] virtio_pci: Use the DMA API for virtqueues when possible
On Wed, Sep 24, 2014 at 3:04 PM, Benjamin Herrenschmidt <benh at kernel.crashing.org> wrote:
> On Wed, 2014-09-24 at 14:59 -0700, Andy Lutomirski wrote:
>
>> Scratch that idea, then.
>>
>> The best that I can currently come up with is to say that pre-1.0
>> devices on PPC bypass the IOMMU and that 1.0 devices on PPC and all
>> devices on all other architectures do not bypass the IOMMU.
>
> Well, the thing is, we *want* them to bypass the IOMMU for performance
> reasons in the long run. Today we have no ways to tell our guests that
> a PCI bus doesn't have an IOMMU, they always do !

Alternatively, the IOMMU could be extended to allow an efficient identity map of the entire physical address space.

> Also qemu can mix and match devices on a given PCI bus so making this a
> bus property wouldn't work either.
>
>> I agree that this is far from ideal. Any ideas? Is there any other
>> channel that QEMU can use to signal to a PPC guest that a specific PCI
>> virtio device isn't really behind an IOMMU?
>
> Any reason why this can't be a virtio capability ?

For Xen.

On Xen, unless we do something quite kludgey, QEMU *cannot* bypass the DMA API, and QEMU doesn't even know that Xen is there.

On Linux kernels on Xen, the "physical" address (as returned by sg_phys, etc) is not a real physical address. The DMA API translates back and forth. Without using the DMA API, all of the DMA ends up in the wrong place, and everything breaks.

We could teach the driver to bypass the IOMMU but to still use Xen's mapping correctly, but that's complicated, and I think it serves little purpose.

As I understand it, there is no reason to ever bypass an IOMMU on x86, since x86 IOMMUs can be programmed to efficiently identity map everything. So, on x86, I think that the right long-term solution is to just use the DMA API always.

There is a potential security issue, though. If virtio-pci sets up an identity map, then a malicious PCI device could pretend to speak virtio and then take over the system using that identity map. Using a virtio capability here doesn't help, since a malicious device will just set that capability to indicate that it should be identity mapped.

I guess that we ideally want a secure way for the guest to distinguish between three cases:

a) IOMMU should be used.

b) Device is trusted but the IOMMU still works. The guest may want to set up an identity map.

c) (PPC only) Device bypasses the IOMMU. There's no security issue here: a malicious device wouldn't be able to bypass the IOMMU, so it would be unable to do DMA at all.

--Andy

-- 
Andy Lutomirski
AMA Capital Management, LLC
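The three cases Andy lists can be sketched as a small decision function. This is only an illustration of the argument, not code from the patch series: the enum, the struct, and every field name are hypothetical. It assumes the guest has some trustworthy platform channel (represented by `platform_trusts_device`) that is separate from anything the device itself advertises, and that the guest installs no IOMMU mapping for a device it treats as bypassing.

```c
#include <stdbool.h>

/* All names below are hypothetical; none of this is kernel or virtio API. */

enum dma_policy {
    DMA_USE_IOMMU,     /* case (a): map every buffer through the IOMMU */
    DMA_IDENTITY_MAP,  /* case (b): trusted device, guest may identity map */
    DMA_BYPASS_IOMMU,  /* case (c): device is not behind the IOMMU at all */
};

struct dma_hints {
    bool device_claims_bypass;   /* set by the device, e.g. a capability */
    bool device_claims_trusted;  /* set by the device, so it can be forged */
    bool platform_trusts_device; /* assumed to come from a channel the
                                    device cannot forge */
};

enum dma_policy choose_dma_policy(const struct dma_hints *h)
{
    /*
     * Honouring a device's bypass claim grants it nothing: if the device
     * lies and really sits behind the IOMMU, the guest has installed no
     * mapping for it, so its DMA simply fails.
     */
    if (h->device_claims_bypass)
        return DMA_BYPASS_IOMMU;

    /*
     * An identity map hands the device all of memory, so only a
     * platform-level statement of trust is accepted here. The
     * device_claims_trusted flag is deliberately ignored.
     */
    if (h->platform_trusts_device)
        return DMA_IDENTITY_MAP;

    return DMA_USE_IOMMU;
}
```

The asymmetry is the point of the sketch: a device-set flag is acceptable for case (c) but not for case (b), which is why a plain virtio capability does not settle the identity-map question.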
Benjamin Herrenschmidt
2014-Sep-24 22:38 UTC
[PATCH v5 2/3] virtio_pci: Use the DMA API for virtqueues when possible
On Wed, 2014-09-24 at 15:15 -0700, Andy Lutomirski wrote:
> On Wed, Sep 24, 2014 at 3:04 PM, Benjamin Herrenschmidt
> <benh at kernel.crashing.org> wrote:
> > On Wed, 2014-09-24 at 14:59 -0700, Andy Lutomirski wrote:
> >
> >> Scratch that idea, then.
> >>
> >> The best that I can currently come up with is to say that pre-1.0
> >> devices on PPC bypass the IOMMU and that 1.0 devices on PPC and all
> >> devices on all other architectures do not bypass the IOMMU.
> >
> > Well, the thing is, we *want* them to bypass the IOMMU for performance
> > reasons in the long run. Today we have no ways to tell our guests that
> > a PCI bus doesn't have an IOMMU, they always do !
>
> Alternatively, the IOMMU could be extended to allow an efficient
> identity map of the entire physical address space.

We have that option, but I'm a bit reluctant to rely on it, it has its issues, but it's something we can look into.

> > Also qemu can mix and match devices on a given PCI bus so making this a
> > bus property wouldn't work either.
> >
> >> I agree that this is far from ideal. Any ideas? Is there any other
> >> channel that QEMU can use to signal to a PPC guest that a specific PCI
> >> virtio device isn't really behind an IOMMU?
> >
> > Any reason why this can't be a virtio capability ?
>
> For Xen.

And ? I don't see the problem... When running Xen, you don't put the capability in. They are negotiated... Why can't qemu set the capability accordingly based on whether it's using KVM or Xen ? It's not like we care about backward compat of ppc on Xen anyway...

.../...

> a) IOMMU should be used.
> b) Device is trusted but the IOMMU still works. The guest may want to
> set up an identity map.
> c) (PPC only) Device bypasses the IOMMU. There's no security issue
> here: a malicious device wouldn't be able to bypass the IOMMU, so it
> would be unable to do DMA at all.

I wouldn't make it "ppc only" at this point but yes.

Also keep in mind that we do have ppc cases where we will want to use the iommu (in which case the capability would be enforced by the other half).

Cheers,
Ben.
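Ben's suggestion, that the host offers the capability only when it is appropriate and the two sides negotiate it, could look roughly like the sketch below. The feature bit name and number are made up for illustration and do not come from the virtio specification or this patch series. The sketch also assumes QEMU can tell whether it is running under KVM or Xen, which Andy's earlier message calls into question.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bit; the name and bit number are invented here. */
#define EXAMPLE_F_IOMMU_BYPASS (1ULL << 40)

enum hypervisor { HV_KVM, HV_XEN };

/* Host side: offer the bypass bit only under KVM, never under Xen. */
uint64_t host_offer_features(enum hypervisor hv, uint64_t base_features)
{
    if (hv == HV_KVM)
        return base_features | EXAMPLE_F_IOMMU_BYPASS;
    return base_features & ~EXAMPLE_F_IOMMU_BYPASS;
}

/* Negotiation: a feature is active only if both sides support it. */
uint64_t negotiate_features(uint64_t offered, uint64_t driver_supported)
{
    return offered & driver_supported;
}

/* Guest side: without the negotiated bit, go through the DMA API. */
bool driver_must_use_dma_api(uint64_t negotiated)
{
    return (negotiated & EXAMPLE_F_IOMMU_BYPASS) == 0;
}
```

Under these assumptions a Xen guest never sees the bit and always uses the DMA API, while a driver that does not know the bit also falls back to the DMA API. The sketch does nothing about the forged-capability concern raised for identity maps.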