Benjamin Herrenschmidt
2014-Sep-24 21:50 UTC
[PATCH v5 2/3] virtio_pci: Use the DMA API for virtqueues when possible
On Wed, 2014-09-24 at 14:41 -0700, Andy Lutomirski wrote:
> On Sat, Sep 20, 2014 at 10:05 PM, Benjamin Herrenschmidt
> <benh at kernel.crashing.org> wrote:
> > On Sun, 2014-09-21 at 15:03 +1000, Benjamin Herrenschmidt wrote:
> >
> >> The exception I mentioned is that I would really like the virtio device
> >> to expose via whatever transport we chose to use (though capability
> >> exchange sounds like a reasonable one) whether the "server"
> >> implementation is bypassing IOMMUs or not instead on relying on client
> >> side heuristics.
> >>
> >> IE. Basically, we are trying to "guess" with an ifdef CONFIG_PPC, what
> >> is essentially an attribute of the server-side, ie, whether is bypasses
> >> the iommu for the PCI bus it resides on.
> >
> >> I believe all the arguments about whether this should be a bus property
> >> or whether the x86 case can be worked around via ACPI tables etc... are
> >> all moot. Today, qemu implementation can put virtio devices on busses
> >> with an iommu and bypass it, so at the very least for backward
> >> compatibility, we should expose that appropriately from the "server"
> >> side.
> >
> > And of course, since we are talking about backward compatibility with
> > existing qemus here, the capability should be the opposite, ie "honor
> > iommu", with the assumption that without it, the implementation bypasses
> > it, which reflects what the current qemu implementation does on any
> > architecture, whether you configure the bus to have an iommu emulated on
> > it or not.
>
> Can PPC do this using a new devicetree property?

The DT props for PCI devices are created by the FW inside the guest from
standard PCI probing, so it would have to get the info from qemu via
config or register space.

Cheers,
Ben.
Andy Lutomirski
2014-Sep-24 21:59 UTC
[PATCH v5 2/3] virtio_pci: Use the DMA API for virtqueues when possible
On Wed, Sep 24, 2014 at 2:50 PM, Benjamin Herrenschmidt
<benh at kernel.crashing.org> wrote:
> On Wed, 2014-09-24 at 14:41 -0700, Andy Lutomirski wrote:
>> On Sat, Sep 20, 2014 at 10:05 PM, Benjamin Herrenschmidt
>> <benh at kernel.crashing.org> wrote:
>> > On Sun, 2014-09-21 at 15:03 +1000, Benjamin Herrenschmidt wrote:
>> >
>> >> The exception I mentioned is that I would really like the virtio device
>> >> to expose via whatever transport we chose to use (though capability
>> >> exchange sounds like a reasonable one) whether the "server"
>> >> implementation is bypassing IOMMUs or not instead on relying on client
>> >> side heuristics.
>> >>
>> >> IE. Basically, we are trying to "guess" with an ifdef CONFIG_PPC, what
>> >> is essentially an attribute of the server-side, ie, whether is bypasses
>> >> the iommu for the PCI bus it resides on.
>> >
>> >> I believe all the arguments about whether this should be a bus property
>> >> or whether the x86 case can be worked around via ACPI tables etc... are
>> >> all moot. Today, qemu implementation can put virtio devices on busses
>> >> with an iommu and bypass it, so at the very least for backward
>> >> compatibility, we should expose that appropriately from the "server"
>> >> side.
>> >
>> > And of course, since we are talking about backward compatibility with
>> > existing qemus here, the capability should be the opposite, ie "honor
>> > iommu", with the assumption that without it, the implementation bypasses
>> > it, which reflects what the current qemu implementation does on any
>> > architecture, whether you configure the bus to have an iommu emulated on
>> > it or not.
>>
>> Can PPC do this using a new devicetree property?
>
> The DT props for PCI devices are created by the FW inside the guest from
> standard PCI probing, so it would have to get the info from qemu via
> config or register space.
>

Scratch that idea, then.
The best that I can currently come up with is to say that pre-1.0
devices on PPC bypass the IOMMU and that 1.0 devices on PPC and all
devices on all other architectures do not bypass the IOMMU.

I agree that this is far from ideal. Any ideas? Is there any other
channel that QEMU can use to signal to a PPC guest that a specific PCI
virtio device isn't really behind an IOMMU? From my POV, the main
consideration is that existing QEMU versions hosting Xen hypervisors
should work, which is not the case in current kernels, but which is
the case with my patches without any known regressions.

Is there some evil trick that a PPC guest could use to detect whether
the IOMMU is honored? As an example that I don't like at all, the
guest could program the IOMMU so that the ring's physical address maps
to a second copy of the ring and then see which one works.

--Andy
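
The fallback rule in the message above (pre-1.0 devices on PPC bypass the IOMMU, everything else does not) can be written as a single predicate. This is only a sketch of the rule as stated, not code from the patch series; the function name and both parameters are hypothetical, and how a guest would actually learn the architecture and the device revision is left out.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * guest_is_ppc:         true when the guest runs on PPC.
 * device_is_virtio_1_0: true when the device is a virtio 1.0 device,
 *                       false for a pre-1.0 device.
 *
 * Returns true when the rule says the device bypasses the IOMMU.
 */
static bool example_heuristic_bypasses_iommu(bool guest_is_ppc,
                                             bool device_is_virtio_1_0)
{
    /* Only pre-1.0 devices on PPC are assumed to bypass the IOMMU. */
    return guest_is_ppc && !device_is_virtio_1_0;
}
```

The predicate depends only on guest-side knowledge, which is exactly the kind of client-side guess that a device-provided signal would replace.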
Benjamin Herrenschmidt
2014-Sep-24 22:04 UTC
[PATCH v5 2/3] virtio_pci: Use the DMA API for virtqueues when possible
On Wed, 2014-09-24 at 14:59 -0700, Andy Lutomirski wrote:
> Scratch that idea, then.
>
> The best that I can currently come up with is to say that pre-1.0
> devices on PPC bypass the IOMMU and that 1.0 devices on PPC and all
> devices on all other architectures do not bypass the IOMMU.

Well, the thing is, we *want* them to bypass the IOMMU for performance
reasons in the long run. Today we have no ways to tell our guests that a
PCI bus doesn't have an IOMMU, they always do !

Also qemu can mix and match devices on a given PCI bus so making this a
bus property wouldn't work either.

> I agree that this is far from ideal. Any ideas? Is there any other
> channel that QEMU can use to signal to a PPC guest that a specific PCI
> virtio device isn't really behind an IOMMU?

Any reason why this can't be a virtio capability ?

> From my POV, the main
> consideration is that existing QEMU versions hosting Xen hypervisors
> should work, which is not the case in current kernels, but which is
> the case with my patches without any known regressions.
>
> Is there some evil trick that a PPC guest could use to detect whether
> the IOMMU is honored? As an example that I don't like at all, the
> guest could program the IOMMU so that the ring's physical address maps
> to a second copy of the ring and then see which one works.

Cheers,
Ben.
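
A per-device virtio capability with the inverted "honor iommu" meaning quoted earlier in the thread might be checked roughly as follows. This is a minimal sketch under stated assumptions: the macro name, the bit number, and the function name are all hypothetical, since the thread defines none of them, and the feature word is assumed to be a plain 64-bit mask read from the device.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical feature bit, for illustration only. The thread does not
 * assign a name or a bit number to the proposed capability.
 */
#define EXAMPLE_F_HONOR_IOMMU 40

/*
 * Returns true when the device advertises that it honors the IOMMU.
 * A device that does not offer the bit, which includes every existing
 * qemu, is treated as bypassing the IOMMU.
 */
static bool example_device_honors_iommu(uint64_t device_features)
{
    return (device_features & (UINT64_C(1) << EXAMPLE_F_HONOR_IOMMU)) != 0;
}
```

Because the bit lives in the device's own feature word, two virtio devices on the same PCI bus can give different answers, which a bus-level property could not express.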