Benjamin Herrenschmidt
2015-Nov-10 10:37 UTC
[PATCH v4 0/6] virtio core DMA API conversion
On Mon, 2015-11-09 at 21:35 -0800, Andy Lutomirski wrote:
> We could do it the other way around: on powerpc, if a PCI device is in
> that range and doesn't have the "bypass" property at all, then it's
> assumed to bypass the IOMMU. This means that everything that
> currently works continues working. If someone builds a physical
> virtio device or uses another system in PCIe target mode speaking
> virtio, then it won't work until they upgrade their firmware to set
> bypass=0. Meanwhile everyone using hypothetical new QEMU also gets
> bypass=0 and no ambiguity.
>
> vfio will presumably notice the bypass and correctly refuse to map any
> current virtio devices.
>
> Would that work?

That would be extremely strange from a platform perspective. Any device in that vendor/device range would bypass the iommu unless some new property "actually-works-like-a-real-pci-device" happens to exist in the device-tree, which we would then need to define somewhere and handle across at least 3 different platforms who get their device-tree from wildly different places.

Also if tomorrow I create a PCI device that implements virtio-net and put it in a machine running IBM proprietary firmware (or Apple's or Sun's), it won't have that property...

This is not hypothetical. People are using virtio to do point-to-point communication between machines via PCIe today.

Cheers,
Ben.
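
As a rough illustration of the rule being debated in this message (not code from the patch series), the proposed powerpc default could be sketched in C as below. The enum, the function name, and the tri-state encoding of the "bypass" property are all hypothetical, chosen only to make the decision table explicit; the existence of a bypass=1 value is likewise an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tri-state for the proposed "bypass" device-tree property. */
enum bypass_prop {
    BYPASS_ABSENT, /* firmware does not provide the property at all */
    BYPASS_ZERO,   /* bypass=0: device goes through the IOMMU */
    BYPASS_ONE     /* bypass=1 (assumed value): device bypasses the IOMMU */
};

/*
 * Sketch of the proposed default: a device in the virtio vendor/device
 * range with no property at all is assumed to bypass the IOMMU.
 */
bool assume_iommu_bypass(bool in_virtio_id_range, enum bypass_prop prop)
{
    assert(prop == BYPASS_ABSENT || prop == BYPASS_ZERO || prop == BYPASS_ONE);

    if (prop != BYPASS_ABSENT)
        return prop == BYPASS_ONE;  /* an explicit firmware statement wins */

    return in_virtio_id_range;      /* legacy default for the virtio range */
}
```

Under this sketch, a physical virtio device behind firmware that has never heard of the property evaluates as `assume_iommu_bypass(true, BYPASS_ABSENT)`, which returns true even though such hardware would sit behind the IOMMU. That is the case the reply above objects to.
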
On Tue, Nov 10, 2015 at 09:37:54PM +1100, Benjamin Herrenschmidt wrote:
> On Mon, 2015-11-09 at 21:35 -0800, Andy Lutomirski wrote:
> >
> > We could do it the other way around: on powerpc, if a PCI device is in
> > that range and doesn't have the "bypass" property at all, then it's
> > assumed to bypass the IOMMU. This means that everything that
> > currently works continues working. If someone builds a physical
> > virtio device or uses another system in PCIe target mode speaking
> > virtio, then it won't work until they upgrade their firmware to set
> > bypass=0. Meanwhile everyone using hypothetical new QEMU also gets
> > bypass=0 and no ambiguity.
> >
> > vfio will presumably notice the bypass and correctly refuse to map any
> > current virtio devices.
> >
> > Would that work?
>
> That would be extremely strange from a platform perspective. Any device
> in that vendor/device range would bypass the iommu unless some new
> property "actually-works-like-a-real-pci-device" happens to exist in
> the device-tree, which we would then need to define somewhere and
> handle across at least 3 different platforms who get their device-tree
> from wildly different places.

Then we are back to virtio driver telling DMA core whether it wants a 1:1 mapping in the iommu? If that's acceptable to others, I don't think that's too bad.

> Also if tomorrow I create a PCI device that implements virtio-net and
> put it in a machine running IBM proprietary firmware (or Apple's or
> Sun's), it won't have that property...
>
> This is not hypothetical. People are using virtio to do point-to-point
> communication between machines via PCIe today.
>
> Cheers,
> Ben.

But not virtio-pci I think - that's broken for that usecase since we use weaker barriers than required for real IO, as these have measurable overhead. We could have a feature "is a real PCI device", that's completely reasonable.

--
MST
On Nov 10, 2015 2:38 AM, "Benjamin Herrenschmidt" <benh at kernel.crashing.org> wrote:
>
> On Mon, 2015-11-09 at 21:35 -0800, Andy Lutomirski wrote:
> >
> > We could do it the other way around: on powerpc, if a PCI device is in
> > that range and doesn't have the "bypass" property at all, then it's
> > assumed to bypass the IOMMU. This means that everything that
> > currently works continues working. If someone builds a physical
> > virtio device or uses another system in PCIe target mode speaking
> > virtio, then it won't work until they upgrade their firmware to set
> > bypass=0. Meanwhile everyone using hypothetical new QEMU also gets
> > bypass=0 and no ambiguity.
> >
> > vfio will presumably notice the bypass and correctly refuse to map any
> > current virtio devices.
> >
> > Would that work?
>
> That would be extremely strange from a platform perspective. Any device
> in that vendor/device range would bypass the iommu unless some new
> property "actually-works-like-a-real-pci-device" happens to exist in
> the device-tree, which we would then need to define somewhere and
> handle across at least 3 different platforms who get their device-tree
> from wildly different places.
>
> Also if tomorrow I create a PCI device that implements virtio-net and
> put it in a machine running IBM proprietary firmware (or Apple's or
> Sun's), it won't have that property...
>
> This is not hypothetical. People are using virtio to do point-to-point
> communication between machines via PCIe today.

Does that work on powerpc on existing kernels?

Anyway, here's another crazy idea: make the quirk assume that the IOMMU is bypassed if and only if the weak barriers bit is set on systems that are missing the new DT binding.

--Andy

>
> Cheers,
> Ben.
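
To make the second idea in this message concrete, here is a minimal C sketch of the quirk as worded above: consult the new DT binding when it exists, and fall back to the weak barriers bit only when it is missing. The struct, its field names, and the function name are hypothetical and do not come from the patch series.

```c
#include <stdbool.h>

/* Hypothetical inputs to the quirk; names are illustrative only. */
struct virtio_quirk_input {
    bool has_dt_binding;  /* firmware provides the new DT binding */
    bool dt_says_bypass;  /* value of the binding; only valid if present */
    bool weak_barriers;   /* the weak barriers bit is set for this device */
};

/*
 * Sketch of the suggested quirk: on systems missing the binding, assume
 * IOMMU bypass if and only if the weak barriers bit is set.
 */
bool quirk_assume_bypass(const struct virtio_quirk_input *in)
{
    if (in->has_dt_binding)
        return in->dt_says_bypass;  /* explicit information takes priority */

    return in->weak_barriers;       /* fallback heuristic from the proposal */
}
```

Note that on a machine whose firmware never gains the binding, `has_dt_binding` stays false, so the outcome depends entirely on `weak_barriers`.
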
Benjamin Herrenschmidt
2015-Nov-10 19:37 UTC
[PATCH v4 0/6] virtio core DMA API conversion
On Tue, 2015-11-10 at 14:43 +0200, Michael S. Tsirkin wrote:
> But not virtio-pci I think - that's broken for that usecase since we use
> weaker barriers than required for real IO, as these have measurable
> overhead. We could have a feature "is a real PCI device",
> that's completely reasonable.

Do we use weaker barriers on the Linux driver side? I didn't think so...

Cheers,
Ben.
Benjamin Herrenschmidt
2015-Nov-10 22:27 UTC
[PATCH v4 0/6] virtio core DMA API conversion
On Tue, 2015-11-10 at 10:54 -0800, Andy Lutomirski wrote:
>
> Does that work on powerpc on existing kernels?
>
> Anyway, here's another crazy idea: make the quirk assume that the
> IOMMU is bypassed if and only if the weak barriers bit is set on
> systems that are missing the new DT binding.

"New DT bindings" doesn't mean much... how do we change DT bindings on existing machines with a FW in flash?

What about partition <-> partition virtio such as what we could do on PAPR systems? That would have the weak barrier bit.

Ben.