Hi all, this patch series introduces support for running Linux on top of Xen inside a virtual machine with virtio devices (nested virt scenario). The problem is that Linux virtio drivers use virt_to_phys to get the guest pseudo-physical addresses to pass to the backend, which doesn't work as expected on Xen. Switching the virtio drivers to the dma APIs (dma_alloc_coherent, dma_map/unmap_single and dma_map/unmap_sg) would solve the problem, as Xen support in Linux provides an implementation of the dma API which takes care of the additional address conversions. However using the dma API would increase the complexity of the non-Xen case too. We would also need to keep track of the physical or virtual address in addition to the dma address for each vring_desc to be able to free the memory in detach_buf (see patch #3). Instead this series adds few obvious checks to perform address translations in a couple of key places, without changing non-Xen code paths. You are welcome to suggest improvements or alternative implementations. Thanks, Stefano Stefano Stabellini (3): xen: export xen_phys_to_bus, xen_bus_to_phys and xen_virt_to_bus xen/virtio: allocate a contiguous region to be use as virtio queue xen/virtio_ring: introduce cpu_to_virtio_addr and virtio_addr_to_cpu drivers/virtio/virtio_pci_legacy.c | 19 +++++++++++++++---- drivers/virtio/virtio_ring.c | 9 +++++---- drivers/xen/swiotlb-xen.c | 31 ------------------------------- include/linux/virtio_config.h | 14 ++++++++++++++ include/xen/swiotlb-xen.h | 32 ++++++++++++++++++++++++++++++++ 5 files changed, 66 insertions(+), 39 deletions(-)
Stefano Stabellini
2015-Dec-07 16:19 UTC
[PATCH RFC 1/3] xen: export xen_phys_to_bus, xen_bus_to_phys and xen_virt_to_bus
Signed-off-by: Stefano Stabellini <stefano.stabellini at eu.citrix.com> CC: konrad.wilk at oracle.com CC: boris.ostrovsky at oracle.com CC: david.vrabel at citrix.com --- drivers/xen/swiotlb-xen.c | 31 ------------------------------- include/xen/swiotlb-xen.h | 32 ++++++++++++++++++++++++++++++++ 2 files changed, 32 insertions(+), 31 deletions(-) diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c index 79bc493..56014d5 100644 --- a/drivers/xen/swiotlb-xen.c +++ b/drivers/xen/swiotlb-xen.c @@ -75,37 +75,6 @@ static unsigned long xen_io_tlb_nslabs; static u64 start_dma_addr; -/* - * Both of these functions should avoid PFN_PHYS because phys_addr_t - * can be 32bit when dma_addr_t is 64bit leading to a loss in - * information if the shift is done before casting to 64bit. - */ -static inline dma_addr_t xen_phys_to_bus(phys_addr_t paddr) -{ - unsigned long bfn = pfn_to_bfn(PFN_DOWN(paddr)); - dma_addr_t dma = (dma_addr_t)bfn << PAGE_SHIFT; - - dma |= paddr & ~PAGE_MASK; - - return dma; -} - -static inline phys_addr_t xen_bus_to_phys(dma_addr_t baddr) -{ - unsigned long pfn = bfn_to_pfn(PFN_DOWN(baddr)); - dma_addr_t dma = (dma_addr_t)pfn << PAGE_SHIFT; - phys_addr_t paddr = dma; - - paddr |= baddr & ~PAGE_MASK; - - return paddr; -} - -static inline dma_addr_t xen_virt_to_bus(void *address) -{ - return xen_phys_to_bus(virt_to_phys(address)); -} - static int check_pages_physically_contiguous(unsigned long pfn, unsigned int offset, size_t length) diff --git a/include/xen/swiotlb-xen.h b/include/xen/swiotlb-xen.h index 8b2eb93..d55aee8 100644 --- a/include/xen/swiotlb-xen.h +++ b/include/xen/swiotlb-xen.h @@ -3,9 +3,41 @@ #include <linux/dma-direction.h> #include <linux/swiotlb.h> +#include <asm/xen/page.h> extern int xen_swiotlb_init(int verbose, bool early); +/* + * Both of these functions should avoid PFN_PHYS because phys_addr_t + * can be 32bit when dma_addr_t is 64bit leading to a loss in + * information if the shift is done before casting to 64bit. + */ +static inline dma_addr_t xen_phys_to_bus(phys_addr_t paddr) +{ + unsigned long bfn = pfn_to_bfn(PFN_DOWN(paddr)); + dma_addr_t dma = (dma_addr_t)bfn << PAGE_SHIFT; + + dma |= paddr & ~PAGE_MASK; + + return dma; +} + +static inline phys_addr_t xen_bus_to_phys(dma_addr_t baddr) +{ + unsigned long pfn = bfn_to_pfn(PFN_DOWN(baddr)); + dma_addr_t dma = (dma_addr_t)pfn << PAGE_SHIFT; + phys_addr_t paddr = dma; + + paddr |= baddr & ~PAGE_MASK; + + return paddr; +} + +static inline dma_addr_t xen_virt_to_bus(void *address) +{ + return xen_phys_to_bus(virt_to_phys(address)); +} + extern void *xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size, dma_addr_t *dma_handle, gfp_t flags, -- 1.7.10.4
Stefano Stabellini
2015-Dec-07 16:19 UTC
[PATCH RFC 2/3] xen/virtio: allocate a contiguous region to be use as virtio queue
When running on Xen inside as virtual machine (nested virt scenario), memory allocated by alloc_pages_exact might not actually be contiguous. Call xen_swiotlb_alloc_coherent instead, which is going to take care of making the buffer contiguous in machine memory. No changes in behavior for the non-Xen case. Signed-off-by: Stefano Stabellini <stefano.stabellini at eu.citrix.com> --- Alternatively we could call dma_alloc_coherent in all cases, but that would make the non-Xen code path more complex. --- drivers/virtio/virtio_pci_legacy.c | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-) diff --git a/drivers/virtio/virtio_pci_legacy.c b/drivers/virtio/virtio_pci_legacy.c index 48bc979..27359ac 100644 --- a/drivers/virtio/virtio_pci_legacy.c +++ b/drivers/virtio/virtio_pci_legacy.c @@ -18,6 +18,8 @@ */ #include "virtio_pci_common.h" +#include <xen/xen.h> +#include <xen/swiotlb-xen.h> /* virtio config->get_features() implementation */ static u64 vp_get_features(struct virtio_device *vdev) @@ -122,6 +124,7 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev, unsigned long size; u16 num; int err; + dma_addr_t dma_addr; /* Select the queue we're interested in */ iowrite16(index, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_SEL); @@ -135,12 +138,20 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev, info->msix_vector = msix_vec; size = PAGE_ALIGN(vring_size(num, VIRTIO_PCI_VRING_ALIGN)); - info->queue = alloc_pages_exact(size, GFP_KERNEL|__GFP_ZERO); + /* activate the queue */ + if (xen_domain()) { + info->queue = xen_swiotlb_alloc_coherent(NULL, + size, + &dma_addr, + GFP_KERNEL|__GFP_ZERO, + NULL); + } else { + info->queue = alloc_pages_exact(size, GFP_KERNEL|__GFP_ZERO); + dma_addr = virt_to_phys(info->queue); + } if (info->queue == NULL) return ERR_PTR(-ENOMEM); - - /* activate the queue */ - iowrite32(virt_to_phys(info->queue) >> VIRTIO_PCI_QUEUE_ADDR_SHIFT, + iowrite32(dma_addr >> VIRTIO_PCI_QUEUE_ADDR_SHIFT, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN); /* create the vring */ -- 1.7.10.4
Stefano Stabellini
2015-Dec-07 16:19 UTC
[PATCH RFC 3/3] xen/virtio_ring: introduce cpu_to_virtio_addr and virtio_addr_to_cpu
When running on Xen inside as virtual machine (nested virt scenario), addresses need to be translated from phys to machine to get the actual guest pseudo-physical address. Introduce a new pair of functions, cpu_to_virtio_addr and virtio_addr_to_cpu, which call the appriopriate __virtio64_to_cpu and __cpu_to_virtio64 functions after doing the phys_to_bus and bus_to_phys translations for Xen. No changes in behavior for the non-Xen case. Signed-off-by: Stefano Stabellini <stefano.stabellini at eu.citrix.com> --- I realize that this patch is not very nice, but at least it is easy to understand. I welcome any suggestions on how to improve it. I considered introducing regular dma API calls, like dma_map/unmap_single and dma_map/unmap_sg. However they would make the non-Xen code path more complex than it is today. We would also need to keep track of the physical or virtual address in addition to the dma address for each vring_desc to be able to free the memory in detach_buf. --- drivers/virtio/virtio_ring.c | 9 +++++---- include/linux/virtio_config.h | 14 ++++++++++++++ 2 files changed, 19 insertions(+), 4 deletions(-) diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c index 096b857..34a1d42 100644 --- a/drivers/virtio/virtio_ring.c +++ b/drivers/virtio/virtio_ring.c @@ -16,6 +16,7 @@ * along with this program; if not, write to the Free Software * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */ +#include <linux/dma-mapping.h> #include <linux/virtio.h> #include <linux/virtio_ring.h> #include <linux/virtio_config.h> @@ -172,7 +173,7 @@ static inline int virtqueue_add(struct virtqueue *_vq, if (desc) { /* Use a single buffer which doesn't continue */ vq->vring.desc[head].flags = cpu_to_virtio16(_vq->vdev, VRING_DESC_F_INDIRECT); - vq->vring.desc[head].addr = cpu_to_virtio64(_vq->vdev, virt_to_phys(desc)); + vq->vring.desc[head].addr = cpu_to_virtio_addr(_vq->vdev, virt_to_phys(desc)); /* avoid kmemleak false positive (hidden by virt_to_phys) */ kmemleak_ignore(desc); vq->vring.desc[head].len = cpu_to_virtio32(_vq->vdev, total_sg * sizeof(struct vring_desc)); @@ -206,7 +207,7 @@ static inline int virtqueue_add(struct virtqueue *_vq, for (n = 0; n < out_sgs; n++) { for (sg = sgs[n]; sg; sg = sg_next(sg)) { desc[i].flags = cpu_to_virtio16(_vq->vdev, VRING_DESC_F_NEXT); - desc[i].addr = cpu_to_virtio64(_vq->vdev, sg_phys(sg)); + desc[i].addr = cpu_to_virtio_addr(_vq->vdev, sg_phys(sg)); desc[i].len = cpu_to_virtio32(_vq->vdev, sg->length); prev = i; i = virtio16_to_cpu(_vq->vdev, desc[i].next); @@ -215,7 +216,7 @@ static inline int virtqueue_add(struct virtqueue *_vq, for (; n < (out_sgs + in_sgs); n++) { for (sg = sgs[n]; sg; sg = sg_next(sg)) { desc[i].flags = cpu_to_virtio16(_vq->vdev, VRING_DESC_F_NEXT | VRING_DESC_F_WRITE); - desc[i].addr = cpu_to_virtio64(_vq->vdev, sg_phys(sg)); + desc[i].addr = cpu_to_virtio_addr(_vq->vdev, sg_phys(sg)); desc[i].len = cpu_to_virtio32(_vq->vdev, sg->length); prev = i; i = virtio16_to_cpu(_vq->vdev, desc[i].next); @@ -433,7 +434,7 @@ static void detach_buf(struct vring_virtqueue *vq, unsigned int head) /* Free the indirect table */ if (vq->vring.desc[i].flags & cpu_to_virtio16(vq->vq.vdev, VRING_DESC_F_INDIRECT)) - kfree(phys_to_virt(virtio64_to_cpu(vq->vq.vdev, vq->vring.desc[i].addr))); + kfree(phys_to_virt(virtio_addr_to_cpu(vq->vq.vdev, vq->vring.desc[i].addr))); while (vq->vring.desc[i].flags & cpu_to_virtio16(vq->vq.vdev, VRING_DESC_F_NEXT)) { i = virtio16_to_cpu(vq->vq.vdev, vq->vring.desc[i].next); diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h index e5ce8ab..861803f 100644 --- a/include/linux/virtio_config.h +++ b/include/linux/virtio_config.h @@ -6,6 +6,8 @@ #include <linux/virtio.h> #include <linux/virtio_byteorder.h> #include <uapi/linux/virtio_config.h> +#include <xen/xen.h> +#include <xen/swiotlb-xen.h> /** * virtio_config_ops - operations for configuring a virtio device @@ -237,11 +239,23 @@ static inline u64 virtio64_to_cpu(struct virtio_device *vdev, __virtio64 val) return __virtio64_to_cpu(virtio_is_little_endian(vdev), val); } +static inline u64 virtio_addr_to_cpu(struct virtio_device *vdev, __virtio64 val) +{ + val = xen_pv_domain() ? xen_bus_to_phys(val) : val; + return __virtio64_to_cpu(virtio_is_little_endian(vdev), val); +} + static inline __virtio64 cpu_to_virtio64(struct virtio_device *vdev, u64 val) { return __cpu_to_virtio64(virtio_is_little_endian(vdev), val); } +static inline __virtio64 cpu_to_virtio_addr(struct virtio_device *vdev, u64 val) +{ + val = xen_pv_domain() ? xen_phys_to_bus(val) : val; + return __cpu_to_virtio64(virtio_is_little_endian(vdev), val); +} + /* Config space accessors. */ #define virtio_cread(vdev, structname, member, ptr) \ do { \ -- 1.7.10.4
On 07/12/15 16:19, Stefano Stabellini wrote:> Hi all, > > this patch series introduces support for running Linux on top of Xen > inside a virtual machine with virtio devices (nested virt scenario). > The problem is that Linux virtio drivers use virt_to_phys to get the > guest pseudo-physical addresses to pass to the backend, which doesn't > work as expected on Xen. > > Switching the virtio drivers to the dma APIs (dma_alloc_coherent, > dma_map/unmap_single and dma_map/unmap_sg) would solve the problem, as > Xen support in Linux provides an implementation of the dma API which > takes care of the additional address conversions. However using the dma > API would increase the complexity of the non-Xen case too. We would also > need to keep track of the physical or virtual address in addition to the > dma address for each vring_desc to be able to free the memory in > detach_buf (see patch #3). > > Instead this series adds few obvious checks to perform address > translations in a couple of key places, without changing non-Xen code > paths. You are welcome to suggest improvements or alternative > implementations.Andy Lutomirski also looked at this. Andy what happened to this work? David
On Mon, Dec 14, 2015 at 02:00:05PM +0000, David Vrabel wrote:> On 07/12/15 16:19, Stefano Stabellini wrote: > > Hi all, > > > > this patch series introduces support for running Linux on top of Xen > > inside a virtual machine with virtio devices (nested virt scenario). > > The problem is that Linux virtio drivers use virt_to_phys to get the > > guest pseudo-physical addresses to pass to the backend, which doesn't > > work as expected on Xen. > > > > Switching the virtio drivers to the dma APIs (dma_alloc_coherent, > > dma_map/unmap_single and dma_map/unmap_sg) would solve the problem, as > > Xen support in Linux provides an implementation of the dma API which > > takes care of the additional address conversions. However using the dma > > API would increase the complexity of the non-Xen case too. We would also > > need to keep track of the physical or virtual address in addition to the > > dma address for each vring_desc to be able to free the memory in > > detach_buf (see patch #3). > > > > Instead this series adds few obvious checks to perform address > > translations in a couple of key places, without changing non-Xen code > > paths. You are welcome to suggest improvements or alternative > > implementations. > > Andy Lutomirski also looked at this. Andy what happened to this work? > > DavidThe approach there was to try and convert all virtio to use DMA API unconditionally. This is reasonable if there's a way for devices to request 1:1 mappings individually. As that is currently missing, that patchset can not be merged yet. -- MST
David Vrabel
2015-Dec-14 14:12 UTC
[Xen-devel] [PATCH RFC 1/3] xen: export xen_phys_to_bus, xen_bus_to_phys and xen_virt_to_bus
On 07/12/15 16:19, Stefano Stabellini wrote:> Signed-off-by: Stefano Stabellini <stefano.stabellini at eu.citrix.com>Can you add a brief description about why these are being moved? Then, assuming this is needed in the end: Acked-by: David Vrabel <david.vrabel at citrix.com> David