Currently, a lot of the virtio code assumes that bus (i.e. hypervisor)
addresses are the same as physical addresses.  This is false on Xen, so
virtio is completely broken.  I wouldn't be surprised if it also
becomes a problem the first time that someone sticks a physical
"virtio" device on a 32-bit bus on an ARM SoC with more than 4G RAM.

Would you accept patches to convert virtio_ring and virtio_pci to use
the DMA APIs?  I think that the only real catch will be that
virtio_ring's approach to freeing indirect blocks is currently
incompatible with the DMA API -- it assumes that knowing the bus
address is enough to call kfree, and I don't think that the DMA API
provides a reverse mapping like that.

--Andy

--
Andy Lutomirski
AMA Capital Management, LLC
On Mon, Aug 25, 2014 at 10:18:46AM -0700, Andy Lutomirski wrote:
> Currently, a lot of the virtio code assumes that bus (i.e. hypervisor)
> addresses are the same as physical addresses.  This is false on Xen, so
> virtio is completely broken.  I wouldn't be surprised if it also
> becomes a problem the first time that someone sticks a physical
> "virtio" device on a 32-bit bus on an ARM SoC with more than 4G RAM.
>
> Would you accept patches to convert virtio_ring and virtio_pci to use
> the DMA APIs?  I think that the only real catch will be that
> virtio_ring's approach to freeing indirect blocks is currently
> incompatible with the DMA API -- it assumes that knowing the bus
> address is enough to call kfree, and I don't think that the DMA API
> provides a reverse mapping like that.

If you use dma_map/unmap_sg, all of that ends up being stored in the
sg structure (sg->dma_address ends up with the DMA address, and
sg_phys(sg) gives you the physical address).

> --Andy
>
> --
> Andy Lutomirski
> AMA Capital Management, LLC
On Mon, Aug 25, 2014 at 11:54 AM, Konrad Rzeszutek Wilk
<konrad.wilk at oracle.com> wrote:
> On Mon, Aug 25, 2014 at 10:18:46AM -0700, Andy Lutomirski wrote:
>> Currently, a lot of the virtio code assumes that bus (i.e. hypervisor)
>> addresses are the same as physical addresses.  This is false on Xen, so
>> virtio is completely broken.  I wouldn't be surprised if it also
>> becomes a problem the first time that someone sticks a physical
>> "virtio" device on a 32-bit bus on an ARM SoC with more than 4G RAM.
>>
>> Would you accept patches to convert virtio_ring and virtio_pci to use
>> the DMA APIs?  I think that the only real catch will be that
>> virtio_ring's approach to freeing indirect blocks is currently
>> incompatible with the DMA API -- it assumes that knowing the bus
>> address is enough to call kfree, and I don't think that the DMA API
>> provides a reverse mapping like that.
>
> If you use dma_map/unmap_sg, all of that ends up being stored in the
> sg structure (sg->dma_address ends up with the DMA address, and
> sg_phys(sg) gives you the physical address).

Unfortunately, virtio_ring doesn't hang on to the sg structure until
completion.  I don't think it can, either -- if I read it right, the
virtio_net driver uses one scatterlist per queue instead of one
scatterlist per pending skb, so the sg entries could be overwritten by
the time virtio_ring needs to unmap them.

Fortunately, I think that dma_unmap_single can handle this case just
fine.

I have a WIP here:

https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/log/?h=virtio_ring_xen

It works, but it's mostly missing unmap calls.  If there's no IOMMU or
swiotlb, then there's nothing to leak, so it's okay.  If there is, then
this driver will eventually explode.  I'll send patches once I have it
fixed up.

--Andy
Andy Lutomirski <luto at amacapital.net> writes:
> Currently, a lot of the virtio code assumes that bus (i.e. hypervisor)
> addresses are the same as physical addresses.  This is false on Xen, so
> virtio is completely broken.  I wouldn't be surprised if it also
> becomes a problem the first time that someone sticks a physical
> "virtio" device on a 32-bit bus on an ARM SoC with more than 4G RAM.
>
> Would you accept patches to convert virtio_ring and virtio_pci to use
> the DMA APIs?  I think that the only real catch will be that
> virtio_ring's approach to freeing indirect blocks is currently
> incompatible with the DMA API -- it assumes that knowing the bus
> address is enough to call kfree, and I don't think that the DMA API
> provides a reverse mapping like that.

Hi Andy,

This has long been a source of contention.  virtio assumes that
the hypervisor can decode guest-physical addresses.

PowerPC, in particular, doesn't want to pay the cost of IOMMU
manipulations, and all arguments presented so far for using an IOMMU
for a virtio device are weak.  And changing to use the DMA APIs would
break them anyway.

Of course, it's Just A Matter of Code, so it's possible to create a
Xen-specific variant which uses the DMA APIs.  I'm not sure what that
would look like in the virtio standard, however.

Cheers,
Rusty.
On Wed, Aug 27, 2014 at 08:40:51PM +0930, Rusty Russell wrote:
> Andy Lutomirski <luto at amacapital.net> writes:
> > Currently, a lot of the virtio code assumes that bus (i.e. hypervisor)
> > addresses are the same as physical addresses.  This is false on Xen, so
> > virtio is completely broken.  I wouldn't be surprised if it also
> > becomes a problem the first time that someone sticks a physical
> > "virtio" device on a 32-bit bus on an ARM SoC with more than 4G RAM.
> >
> > Would you accept patches to convert virtio_ring and virtio_pci to use
> > the DMA APIs?  I think that the only real catch will be that
> > virtio_ring's approach to freeing indirect blocks is currently
> > incompatible with the DMA API -- it assumes that knowing the bus
> > address is enough to call kfree, and I don't think that the DMA API
> > provides a reverse mapping like that.
>
> Hi Andy,
>
> This has long been a source of contention.  virtio assumes that
> the hypervisor can decode guest-physical addresses.
>
> PowerPC, in particular, doesn't want to pay the cost of IOMMU
> manipulations, and all arguments presented so far for using an IOMMU
> for a virtio device are weak.  And changing to use the DMA APIs would
> break them anyway.
>
> Of course, it's Just A Matter of Code, so it's possible to
> create a Xen-specific variant which uses the DMA APIs.  I'm not sure
> what that would look like in the virtio standard, however.
>
> Cheers,
> Rusty.

For x86, as of QEMU 2.0 there's no IOMMU.  So a reasonable thing to do
for that platform might be to always use the IOMMU *if it's there*.
My understanding is that this isn't the case for powerpc?

--
MST
On Aug 27, 2014 4:30 AM, "Rusty Russell" <rusty at rustcorp.com.au> wrote:
>
> Andy Lutomirski <luto at amacapital.net> writes:
> > Currently, a lot of the virtio code assumes that bus (i.e. hypervisor)
> > addresses are the same as physical addresses.  This is false on Xen, so
> > virtio is completely broken.  I wouldn't be surprised if it also
> > becomes a problem the first time that someone sticks a physical
> > "virtio" device on a 32-bit bus on an ARM SoC with more than 4G RAM.
> >
> > Would you accept patches to convert virtio_ring and virtio_pci to use
> > the DMA APIs?  I think that the only real catch will be that
> > virtio_ring's approach to freeing indirect blocks is currently
> > incompatible with the DMA API -- it assumes that knowing the bus
> > address is enough to call kfree, and I don't think that the DMA API
> > provides a reverse mapping like that.
>
> Hi Andy,
>
> This has long been a source of contention.  virtio assumes that
> the hypervisor can decode guest-physical addresses.
>
> PowerPC, in particular, doesn't want to pay the cost of IOMMU
> manipulations, and all arguments presented so far for using an IOMMU
> for a virtio device are weak.  And changing to use the DMA APIs would
> break them anyway.
>
> Of course, it's Just A Matter of Code, so it's possible to
> create a Xen-specific variant which uses the DMA APIs.  I'm not sure
> what that would look like in the virtio standard, however.

I'll reply in the other thread to keep everything in one place.

> Cheers,
> Rusty.
On Wed, 2014-08-27 at 20:40 +0930, Rusty Russell wrote:
> Hi Andy,
>
> This has long been a source of contention.  virtio assumes that
> the hypervisor can decode guest-physical addresses.
>
> PowerPC, in particular, doesn't want to pay the cost of IOMMU
> manipulations, and all arguments presented so far for using an IOMMU
> for a virtio device are weak.  And changing to use the DMA APIs would
> break them anyway.
>
> Of course, it's Just A Matter of Code, so it's possible to
> create a Xen-specific variant which uses the DMA APIs.  I'm not sure
> what that would look like in the virtio standard, however.

So this has popped up in the past a few times already from people who
want to use virtio as a transport between physical systems connected
via a bus like PCI, using non-transparent bridges for example.

There's a way to get both here that isn't too nasty... we can make the
virtio drivers use the dma_map_* APIs and just switch the dma_ops in
the struct device based on the hypervisor requirements.  I.e. for KVM
we could attach a set of ops that basically just returns the physical
address; real PCI transport would use the normal callbacks, etc...

The only problem at the moment is that the dma_map_ops, while defined
generically, aren't plumbed into the generic struct device but instead
into dev_archdata on some architectures.  This includes powerpc, ARM
and x86 (under a CONFIG option for the latter, which is only enabled
on x86_64 and some oddball i386 variant).

So either we switch all the architectures we care about to always use
the generic DMA ops and move the pointer into struct device, or we
create another inline "indirection" to deal with the cases without
dma_map_ops...

Cheers,
Ben.