On Wed, Oct 28, 2015 at 05:09:47PM +0900, David Woodhouse wrote:
> On Wed, 2015-10-28 at 16:40 +0900, Christian Borntraeger wrote:
> > On 28.10.2015 at 16:17, Michael S. Tsirkin wrote:
> > > On Tue, Oct 27, 2015 at 11:38:57PM -0700, Andy Lutomirski wrote:
> > > > This switches virtio to use the DMA API unconditionally. I'm sure
> > > > it breaks things, but it seems to work on x86 using virtio-pci, with
> > > > and without Xen, and using both the modern 1.0 variant and the
> > > > legacy variant.
> > >
> > > I'm very glad to see work on this making progress.
> > >
> > > I suspect we'll have to find a way to make this optional though, and
> > > keep doing the non-DMA API thing with old devices. And I've been
> > > debating with myself whether a pci specific thing or a feature bit is
> > > preferable.
> > >
> >
> > We have discussed that at kernel summit. I will try to implement a dummy dma_ops for
> > s390 that does 1:1 mapping and Ben will look into doing some quirk to handle "old"
> > code in addition to also make it possible to mark devices as iommu bypass (IIRC,
> > via device tree, Ben?)
>
> Right. You never eschew the DMA API in the *driver* - you just expect
> the DMA API to do the right thing for devices which don't need
> translation (with platforms using per-device dma_ops and generally
> getting their act together).
> We're pushing that on the platforms where it's currently an issue,
> including Power, SPARC and S390.
>
> --
> dwmw2
>
>

Well, APIs are just that - internal kernel APIs. If the only user of an
API is virtio, we can stick the code in virtio.h just as well.

I think controlling this dynamically and not statically in e.g.
devicetree is important though.

E.g. on Intel x86, there's an option iommu=pt which does the 1:1
thing for devices when used by kernel, but enables
the iommu if used by userspace/VMs.

Something like this would be needed for other platforms IMHO.

And given that
1. virtio seems the only user so far
2. supporting this per device seems like something that
   might become useful in the future
maybe we'd better make this part of virtio transports.

--
MST

On Wed, 2015-10-28 at 13:35 +0200, Michael S. Tsirkin wrote:
> E.g. on intel x86, there's an option iommu=pt which does the 1:1
> thing for devices when used by kernel, but enables
> the iommu if used by userspace/VMs.

That's none of your business.

You call the DMA API when you do DMA. That's all there is to it.

If the IOMMU happens to be in passthrough mode, or your device happens
not to be routed through an IOMMU today, then the I/O virtual address
you get back from the DMA API will look a *lot* like the physical
address you asked the DMA API to map. You might think there's no IOMMU. We
couldn't possibly comment.

Use the DMA API. Always. Let the platform worry about whether it
actually needs to *do* anything or not.

--
dwmw2

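The point above (the driver always calls one mapping function, and the platform decides per device whether any translation happens) can be sketched as a small userspace model. This is only an illustration under stated assumptions: every name below (toy_device, toy_dma_ops, toy_dma_map and so on) is hypothetical and is not a kernel API, and the "translation" is a stand-in that places the page offset inside a per-device I/O address window.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
typedef uint64_t toy_phys_addr;
typedef uint64_t toy_io_addr;

struct toy_device;

/* Per-device mapping hook, chosen by the "platform", never by the driver. */
struct toy_dma_ops {
    toy_io_addr (*map)(const struct toy_device *dev, toy_phys_addr pa);
};

struct toy_device {
    const struct toy_dma_ops *ops;
    uint64_t iova_base; /* only consulted by the translating ops */
};

/* Passthrough or no IOMMU: the I/O address equals the physical address. */
toy_io_addr toy_map_identity(const struct toy_device *dev, toy_phys_addr pa)
{
    (void)dev;
    return pa;
}

/* Stand-in for a translating IOMMU: page offset inside a per-device window. */
toy_io_addr toy_map_translated(const struct toy_device *dev, toy_phys_addr pa)
{
    return dev->iova_base + (pa & 0xfffu);
}

const struct toy_dma_ops toy_identity_ops = { .map = toy_map_identity };
const struct toy_dma_ops toy_iommu_ops = { .map = toy_map_translated };

/* The only call the driver makes, whatever the platform has configured. */
toy_io_addr toy_dma_map(const struct toy_device *dev, toy_phys_addr pa)
{
    assert(dev != NULL && dev->ops != NULL && dev->ops->map != NULL);
    return dev->ops->map(dev, pa);
}
```

In this model a device set up with toy_identity_ops gets back the physical address it passed in, while one set up with toy_iommu_ops gets an address inside its window; the calling code is identical in both cases.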
On Wed, Oct 28, 2015 at 10:35:27PM +0900, David Woodhouse wrote:
> On Wed, 2015-10-28 at 13:35 +0200, Michael S. Tsirkin wrote:
> > E.g. on intel x86, there's an option iommu=pt which does the 1:1
> > thing for devices when used by kernel, but enables
> > the iommu if used by userspace/VMs.
>
> That's none of your business.
>
> You call the DMA API when you do DMA. That's all there is to it.
>
> If the IOMMU happens to be in passthrough mode, or your device happens
> not to be routed through an IOMMU today, then the I/O virtual address
> you get back from the DMA API will look a *lot* like the physical
> address you asked the DMA API to map. You might think there's no IOMMU. We
> couldn't possibly comment.
>
> Use the DMA API. Always. Let the platform worry about whether it
> actually needs to *do* anything or not.
> --
> dwmw2
>
>

Short answer - platforms need a way to discover, and express, different
security requirements of different devices. If they continue to lack
that, we'll need a custom API in virtio, and while this seems a bit less
elegant, I would not see that as the end of the world at all; there are
not that many virtio drivers. And hey - that's just an internal API. We
can change it later at a whim.

Long answer - PV is weird. It's not always the same as real hardware.

For PV, it's generally the hypervisor doing writes into memory.

If it's monolithic, with device emulation in the same memory space as the
hypervisor (e.g. in the case of the current QEMU, or using vhost in the
host kernel), then you gain *no security* by "restricting" it by means of
the IOMMU - the IOMMU is part of the same hypervisor.

If it is modular, with device emulation in a separate memory space (e.g.
in the case of Xen, or vhost-user in modern QEMU), then you do gain
security: the part emulating the IOMMU limits the part doing DMA.

In both cases, for assigned devices it is always modular in a sense, so
you do gain security, since that is restricted by the hardware IOMMU.

The way things are set up at the moment, it's mostly global, with
iommu=pt on Intel being a kind of exception. We need host/guest and API
interfaces that are more nuanced than that.

--
MST
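
The three cases in the long answer can be written down as a small decision function. This is only a restatement of the reasoning in this message, not an existing interface: the enum and function names are hypothetical, and a real per-device policy would need to be discovered from the host rather than hard-coded.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification of who performs the DMA on the host side. */
enum toy_backend {
    TOY_EMULATED_MONOLITHIC, /* emulation shares the hypervisor's memory space */
    TOY_EMULATED_MODULAR,    /* emulation runs in a separate memory space */
    TOY_ASSIGNED             /* physical device behind the hardware IOMMU */
};

/* Does routing this device through the (v)IOMMU actually restrict anything? */
bool toy_iommu_adds_isolation(enum toy_backend backend)
{
    switch (backend) {
    case TOY_EMULATED_MONOLITHIC:
        return false; /* the IOMMU is part of the same hypervisor */
    case TOY_EMULATED_MODULAR:
    case TOY_ASSIGNED:
        return true;  /* a separate component or the hardware enforces it */
    }
    assert(!"unknown backend");
    return false;
}
```

A per-device answer like this is what a global switch cannot express, which is the case for making the decision per device.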