This patch series is the follow up on the discussions we had before about the RFC titled [RFC,V2] virtio: Add platform specific DMA API translation for virito devices (https://patchwork.kernel.org/patch/10417371/). There were suggestions about doing away with two different paths of transactions with the host/QEMU, first being the direct GPA and the other being the DMA API based translations. First patch attempts to create a direct GPA mapping based DMA operations structure called 'virtio_direct_dma_ops' with exact same implementation of the direct GPA path which virtio core currently has but just wrapped in a DMA API format. Virtio core must use 'virtio_direct_dma_ops' instead of the arch default in absence of VIRTIO_F_IOMMU_PLATFORM flag to preserve the existing semantics. The second patch does exactly that inside the function virtio_finalize_features(). The third patch removes the default direct GPA path from virtio core forcing it to use DMA API callbacks for all devices. Now with that change, every device must have a DMA operations structure associated with it. The fourth patch adds an additional hook which gives the platform an opportunity to do yet another override if required. This platform hook can be used on POWER Ultravisor based protected guests to load up SWIOTLB DMA callbacks to do the required (as discussed previously in the above mentioned thread how host is allowed to access only parts of the guest GPA range) bounce buffering into the shared memory for all I/O scatter gather buffers to be consumed on the host side. Please go through these patches and review whether this approach broadly makes sense. I will appreciate suggestions, inputs, comments regarding the patches or the approach in general. Thank you. Anshuman Khandual (4): virtio: Define virtio_direct_dma_ops structure virtio: Override device's DMA OPS with virtio_direct_dma_ops selectively virtio: Force virtio core to use DMA API callbacks for all virtio devices virtio: Add platform specific DMA API translation for virito devices arch/powerpc/include/asm/dma-mapping.h | 6 +++ arch/powerpc/platforms/pseries/iommu.c | 6 +++ drivers/virtio/virtio.c | 72 ++++++++++++++++++++++++++++++++++ drivers/virtio/virtio_pci_common.h | 3 ++ drivers/virtio/virtio_ring.c | 65 +----------------------------- 5 files changed, 89 insertions(+), 63 deletions(-) -- 2.9.3
Anshuman Khandual
2018-Jul-20 03:59 UTC
[RFC 1/4] virtio: Define virtio_direct_dma_ops structure
Current implementation of DMA API inside virtio core calls device's DMA OPS callback functions when the flag VIRTIO_F_IOMMU_PLATFORM flag is set. But in absence of the flag, virtio core falls back calling basic transformation of the incoming SG addresses as GPA. Going forward virtio should only call DMA API based transformations generating either GPA or IOVA depending on QEMU expectations again based on VIRTIO_F_IOMMU_PLATFORM flag. It requires removing existing fallback code path for GPA transformation and replacing that with a direct map DMA OPS structure. This adds that direct mapping DMA OPS structure to be used in later patches which will make virtio core call DMA API all the time for all virtio devices. Signed-off-by: Anshuman Khandual <khandual at linux.vnet.ibm.com> --- drivers/virtio/virtio.c | 60 ++++++++++++++++++++++++++++++++++++++ drivers/virtio/virtio_pci_common.h | 3 ++ 2 files changed, 63 insertions(+) diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c index 59e36ef..7907ad3 100644 --- a/drivers/virtio/virtio.c +++ b/drivers/virtio/virtio.c @@ -3,6 +3,7 @@ #include <linux/virtio_config.h> #include <linux/module.h> #include <linux/idr.h> +#include <linux/dma-mapping.h> #include <uapi/linux/virtio_ids.h> /* Unique numbering for virtio devices. */ @@ -442,3 +443,62 @@ core_initcall(virtio_init); module_exit(virtio_exit); MODULE_LICENSE("GPL"); + +/* + * Virtio direct mapping DMA API operations structure + * + * This defines DMA API structure for all virtio devices which would not + * either bring in their own DMA OPS from architecture or they would not + * like to use architecture specific IOMMU based DMA OPS because QEMU + * expects GPA instead of an IOVA in absence of VIRTIO_F_IOMMU_PLATFORM. + */ +dma_addr_t virtio_direct_map_page(struct device *dev, struct page *page, + unsigned long offset, size_t size, + enum dma_data_direction dir, + unsigned long attrs) +{ + return page_to_phys(page) + offset; +} + +void virtio_direct_unmap_page(struct device *hwdev, dma_addr_t dev_addr, + size_t size, enum dma_data_direction dir, + unsigned long attrs) +{ +} + +int virtio_direct_mapping_error(struct device *hwdev, dma_addr_t dma_addr) +{ + return 0; +} + +void *virtio_direct_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle, + gfp_t gfp, unsigned long attrs) +{ + void *queue = alloc_pages_exact(PAGE_ALIGN(size), gfp); + + if (queue) { + phys_addr_t phys_addr = virt_to_phys(queue); + *dma_handle = (dma_addr_t)phys_addr; + + if (WARN_ON_ONCE(*dma_handle != phys_addr)) { + free_pages_exact(queue, PAGE_ALIGN(size)); + return NULL; + } + } + return queue; +} + +void virtio_direct_free(struct device *dev, size_t size, void *vaddr, + dma_addr_t dma_addr, unsigned long attrs) +{ + free_pages_exact(vaddr, PAGE_ALIGN(size)); +} + +const struct dma_map_ops virtio_direct_dma_ops = { + .alloc = virtio_direct_alloc, + .free = virtio_direct_free, + .map_page = virtio_direct_map_page, + .unmap_page = virtio_direct_unmap_page, + .mapping_error = virtio_direct_mapping_error, +}; +EXPORT_SYMBOL(virtio_direct_dma_ops); diff --git a/drivers/virtio/virtio_pci_common.h b/drivers/virtio/virtio_pci_common.h index 135ee3c..ec44d2f 100644 --- a/drivers/virtio/virtio_pci_common.h +++ b/drivers/virtio/virtio_pci_common.h @@ -31,6 +31,9 @@ #include <linux/highmem.h> #include <linux/spinlock.h> +extern struct dma_map_ops virtio_direct_dma_ops; + + struct virtio_pci_vq_info { /* the actual virtqueue */ struct virtqueue *vq; -- 2.9.3
Anshuman Khandual
2018-Jul-20 03:59 UTC
[RFC 2/4] virtio: Override device's DMA OPS with virtio_direct_dma_ops selectively
Now that virtio core always needs all virtio devices to have DMA OPS, we need to make sure that the structure it points is the right one. In the absence of VIRTIO_F_IOMMU_PLATFORM flag QEMU expects GPA from guest kernel. In such case, virtio device must use default virtio_direct_dma_ops DMA OPS structure which transforms scatter gather buffer addresses as GPA. This DMA OPS override must happen as early as possible during virtio device initializatin sequence before virtio core starts using given device's DMA OPS callbacks for I/O transactions. This change detects device's IOMMU flag and does the override in case the flag is cleared. Signed-off-by: Anshuman Khandual <khandual at linux.vnet.ibm.com> --- drivers/virtio/virtio.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c index 7907ad3..6b13987 100644 --- a/drivers/virtio/virtio.c +++ b/drivers/virtio/virtio.c @@ -166,6 +166,8 @@ void virtio_add_status(struct virtio_device *dev, unsigned int status) } EXPORT_SYMBOL_GPL(virtio_add_status); +const struct dma_map_ops virtio_direct_dma_ops; + int virtio_finalize_features(struct virtio_device *dev) { int ret = dev->config->finalize_features(dev); @@ -174,6 +176,9 @@ int virtio_finalize_features(struct virtio_device *dev) if (ret) return ret; + if (virtio_has_iommu_quirk(dev)) + set_dma_ops(dev->dev.parent, &virtio_direct_dma_ops); + if (!virtio_has_feature(dev, VIRTIO_F_VERSION_1)) return 0; -- 2.9.3
Anshuman Khandual
2018-Jul-20 03:59 UTC
[RFC 3/4] virtio: Force virtio core to use DMA API callbacks for all virtio devices
Virtio core should use DMA API callbacks for all virtio devices which may generate either GAP or IOVA depending on VIRTIO_F_IOMMU_PLATFORM flag and resulting QEMU expectations. This implies that every virtio device needs to have a DMA OPS structure. This removes previous GPA fallback code paths. Signed-off-by: Anshuman Khandual <khandual at linux.vnet.ibm.com> --- drivers/virtio/virtio_ring.c | 65 ++------------------------------------------ 1 file changed, 2 insertions(+), 63 deletions(-) diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c index 814b395..c265964 100644 --- a/drivers/virtio/virtio_ring.c +++ b/drivers/virtio/virtio_ring.c @@ -141,26 +141,6 @@ struct vring_virtqueue { * unconditionally on data path. */ -static bool vring_use_dma_api(struct virtio_device *vdev) -{ - if (!virtio_has_iommu_quirk(vdev)) - return true; - - /* Otherwise, we are left to guess. */ - /* - * In theory, it's possible to have a buggy QEMU-supposed - * emulated Q35 IOMMU and Xen enabled at the same time. On - * such a configuration, virtio has never worked and will - * not work without an even larger kludge. Instead, enable - * the DMA API if we're a Xen guest, which at least allows - * all of the sensible Xen configurations to work correctly. - */ - if (xen_domain()) - return true; - - return false; -} - /* * The DMA ops on various arches are rather gnarly right now, and * making all of the arch DMA ops work on the vring device itself @@ -176,9 +156,6 @@ static dma_addr_t vring_map_one_sg(const struct vring_virtqueue *vq, struct scatterlist *sg, enum dma_data_direction direction) { - if (!vring_use_dma_api(vq->vq.vdev)) - return (dma_addr_t)sg_phys(sg); - /* * We can't use dma_map_sg, because we don't use scatterlists in * the way it expects (we don't guarantee that the scatterlist @@ -193,9 +170,6 @@ static dma_addr_t vring_map_single(const struct vring_virtqueue *vq, void *cpu_addr, size_t size, enum dma_data_direction direction) { - if (!vring_use_dma_api(vq->vq.vdev)) - return (dma_addr_t)virt_to_phys(cpu_addr); - return dma_map_single(vring_dma_dev(vq), cpu_addr, size, direction); } @@ -205,9 +179,6 @@ static void vring_unmap_one(const struct vring_virtqueue *vq, { u16 flags; - if (!vring_use_dma_api(vq->vq.vdev)) - return; - flags = virtio16_to_cpu(vq->vq.vdev, desc->flags); if (flags & VRING_DESC_F_INDIRECT) { @@ -228,9 +199,6 @@ static void vring_unmap_one(const struct vring_virtqueue *vq, static int vring_mapping_error(const struct vring_virtqueue *vq, dma_addr_t addr) { - if (!vring_use_dma_api(vq->vq.vdev)) - return 0; - return dma_mapping_error(vring_dma_dev(vq), addr); } @@ -1016,43 +984,14 @@ EXPORT_SYMBOL_GPL(__vring_new_virtqueue); static void *vring_alloc_queue(struct virtio_device *vdev, size_t size, dma_addr_t *dma_handle, gfp_t flag) { - if (vring_use_dma_api(vdev)) { - return dma_alloc_coherent(vdev->dev.parent, size, + return dma_alloc_coherent(vdev->dev.parent, size, dma_handle, flag); - } else { - void *queue = alloc_pages_exact(PAGE_ALIGN(size), flag); - if (queue) { - phys_addr_t phys_addr = virt_to_phys(queue); - *dma_handle = (dma_addr_t)phys_addr; - - /* - * Sanity check: make sure we dind't truncate - * the address. The only arches I can find that - * have 64-bit phys_addr_t but 32-bit dma_addr_t - * are certain non-highmem MIPS and x86 - * configurations, but these configurations - * should never allocate physical pages above 32 - * bits, so this is fine. Just in case, throw a - * warning and abort if we end up with an - * unrepresentable address. - */ - if (WARN_ON_ONCE(*dma_handle != phys_addr)) { - free_pages_exact(queue, PAGE_ALIGN(size)); - return NULL; - } - } - return queue; - } } static void vring_free_queue(struct virtio_device *vdev, size_t size, void *queue, dma_addr_t dma_handle) { - if (vring_use_dma_api(vdev)) { - dma_free_coherent(vdev->dev.parent, size, queue, dma_handle); - } else { - free_pages_exact(queue, PAGE_ALIGN(size)); - } + dma_free_coherent(vdev->dev.parent, size, queue, dma_handle); } struct virtqueue *vring_create_virtqueue( -- 2.9.3
Anshuman Khandual
2018-Jul-20 03:59 UTC
[RFC 4/4] virtio: Add platform specific DMA API translation for virito devices
This adds a hook which a platform can define in order to allow it to override virtio device's DMA OPS irrespective of whether it has the flag VIRTIO_F_IOMMU_PLATFORM set or not. We want to use this to do bounce-buffering of data on the new secure pSeries platform, currently under development, where a KVM host cannot access all of the memory space of a secure KVM guest. The host can only access the pages which the guest has explicitly requested to be shared with the host, thus the virtio implementation in the guest has to copy data to and from shared pages. With this hook, the platform code in the secure guest can force the use of swiotlb for virtio buffers, with a back-end for swiotlb which will use a pool of pre-allocated shared pages. Thus all data being sent or received by virtio devices will be copied through pages which the host has access to. Signed-off-by: Anshuman Khandual <khandual at linux.vnet.ibm.com> --- arch/powerpc/include/asm/dma-mapping.h | 6 ++++++ arch/powerpc/platforms/pseries/iommu.c | 6 ++++++ drivers/virtio/virtio.c | 7 +++++++ 3 files changed, 19 insertions(+) diff --git a/arch/powerpc/include/asm/dma-mapping.h b/arch/powerpc/include/asm/dma-mapping.h index 8fa3945..bc5a9d3 100644 --- a/arch/powerpc/include/asm/dma-mapping.h +++ b/arch/powerpc/include/asm/dma-mapping.h @@ -116,3 +116,9 @@ extern u64 __dma_get_required_mask(struct device *dev); #endif /* __KERNEL__ */ #endif /* _ASM_DMA_MAPPING_H */ + +#define platform_override_dma_ops platform_override_dma_ops + +struct virtio_device; + +extern void platform_override_dma_ops(struct virtio_device *vdev); diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c index 06f0296..5773bc7 100644 --- a/arch/powerpc/platforms/pseries/iommu.c +++ b/arch/powerpc/platforms/pseries/iommu.c @@ -38,6 +38,7 @@ #include <linux/of.h> #include <linux/iommu.h> #include <linux/rculist.h> +#include <linux/virtio.h> #include <asm/io.h> #include <asm/prom.h> #include <asm/rtas.h> @@ -1396,3 +1397,8 @@ static int __init disable_multitce(char *str) __setup("multitce=", disable_multitce); machine_subsys_initcall_sync(pseries, tce_iommu_bus_notifier_init); + +void platform_override_dma_ops(struct virtio_device *vdev) +{ + /* Override vdev->parent.dma_ops if required */ +} diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c index 6b13987..432c332 100644 --- a/drivers/virtio/virtio.c +++ b/drivers/virtio/virtio.c @@ -168,6 +168,12 @@ EXPORT_SYMBOL_GPL(virtio_add_status); const struct dma_map_ops virtio_direct_dma_ops; +#ifndef platform_override_dma_ops +static inline void platform_override_dma_ops(struct virtio_device *vdev) +{ +} +#endif + int virtio_finalize_features(struct virtio_device *dev) { int ret = dev->config->finalize_features(dev); @@ -179,6 +185,7 @@ int virtio_finalize_features(struct virtio_device *dev) if (virtio_has_iommu_quirk(dev)) set_dma_ops(dev->dev.parent, &virtio_direct_dma_ops); + platform_override_dma_ops(dev); if (!virtio_has_feature(dev, VIRTIO_F_VERSION_1)) return 0; -- 2.9.3
Michael S. Tsirkin
2018-Jul-20 13:15 UTC
[RFC 4/4] virtio: Add platform specific DMA API translation for virito devices
On Fri, Jul 20, 2018 at 09:29:41AM +0530, Anshuman Khandual wrote:>Subject: Re: [RFC 4/4] virtio: Add platform specific DMA API translation for > virito devicess/virito/virtio/> This adds a hook which a platform can define in order to allow it to > override virtio device's DMA OPS irrespective of whether it has the > flag VIRTIO_F_IOMMU_PLATFORM set or not. We want to use this to do > bounce-buffering of data on the new secure pSeries platform, currently > under development, where a KVM host cannot access all of the memory > space of a secure KVM guest. The host can only access the pages which > the guest has explicitly requested to be shared with the host, thus > the virtio implementation in the guest has to copy data to and from > shared pages. > > With this hook, the platform code in the secure guest can force the > use of swiotlb for virtio buffers, with a back-end for swiotlb which > will use a pool of pre-allocated shared pages. Thus all data being > sent or received by virtio devices will be copied through pages which > the host has access to. > > Signed-off-by: Anshuman Khandual <khandual at linux.vnet.ibm.com> > --- > arch/powerpc/include/asm/dma-mapping.h | 6 ++++++ > arch/powerpc/platforms/pseries/iommu.c | 6 ++++++ > drivers/virtio/virtio.c | 7 +++++++ > 3 files changed, 19 insertions(+) > > diff --git a/arch/powerpc/include/asm/dma-mapping.h b/arch/powerpc/include/asm/dma-mapping.h > index 8fa3945..bc5a9d3 100644 > --- a/arch/powerpc/include/asm/dma-mapping.h > +++ b/arch/powerpc/include/asm/dma-mapping.h > @@ -116,3 +116,9 @@ extern u64 __dma_get_required_mask(struct device *dev); > > #endif /* __KERNEL__ */ > #endif /* _ASM_DMA_MAPPING_H */ > + > +#define platform_override_dma_ops platform_override_dma_ops > + > +struct virtio_device; > + > +extern void platform_override_dma_ops(struct virtio_device *vdev); > diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c > index 06f0296..5773bc7 100644 > --- a/arch/powerpc/platforms/pseries/iommu.c > +++ b/arch/powerpc/platforms/pseries/iommu.c > @@ -38,6 +38,7 @@ > #include <linux/of.h> > #include <linux/iommu.h> > #include <linux/rculist.h> > +#include <linux/virtio.h> > #include <asm/io.h> > #include <asm/prom.h> > #include <asm/rtas.h> > @@ -1396,3 +1397,8 @@ static int __init disable_multitce(char *str) > __setup("multitce=", disable_multitce); > > machine_subsys_initcall_sync(pseries, tce_iommu_bus_notifier_init); > + > +void platform_override_dma_ops(struct virtio_device *vdev) > +{ > + /* Override vdev->parent.dma_ops if required */ > +} > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c > index 6b13987..432c332 100644 > --- a/drivers/virtio/virtio.c > +++ b/drivers/virtio/virtio.c > @@ -168,6 +168,12 @@ EXPORT_SYMBOL_GPL(virtio_add_status); > > const struct dma_map_ops virtio_direct_dma_ops; > > +#ifndef platform_override_dma_ops > +static inline void platform_override_dma_ops(struct virtio_device *vdev) > +{ > +} > +#endif > + > int virtio_finalize_features(struct virtio_device *dev) > { > int ret = dev->config->finalize_features(dev); > @@ -179,6 +185,7 @@ int virtio_finalize_features(struct virtio_device *dev) > if (virtio_has_iommu_quirk(dev)) > set_dma_ops(dev->dev.parent, &virtio_direct_dma_ops); > > + platform_override_dma_ops(dev);Is there a single place where virtio_has_iommu_quirk is called now? If so, we could put this into virtio_has_iommu_quirk then.> if (!virtio_has_feature(dev, VIRTIO_F_VERSION_1)) > return 0; > > -- > 2.9.3
On Fri, Jul 20, 2018 at 09:29:37AM +0530, Anshuman Khandual wrote:> This patch series is the follow up on the discussions we had before about > the RFC titled [RFC,V2] virtio: Add platform specific DMA API translation > for virito devices (https://patchwork.kernel.org/patch/10417371/). There > were suggestions about doing away with two different paths of transactions > with the host/QEMU, first being the direct GPA and the other being the DMA > API based translations. > > First patch attempts to create a direct GPA mapping based DMA operations > structure called 'virtio_direct_dma_ops' with exact same implementation > of the direct GPA path which virtio core currently has but just wrapped in > a DMA API format. Virtio core must use 'virtio_direct_dma_ops' instead of > the arch default in absence of VIRTIO_F_IOMMU_PLATFORM flag to preserve the > existing semantics. The second patch does exactly that inside the function > virtio_finalize_features(). The third patch removes the default direct GPA > path from virtio core forcing it to use DMA API callbacks for all devices. > Now with that change, every device must have a DMA operations structure > associated with it. The fourth patch adds an additional hook which gives > the platform an opportunity to do yet another override if required. This > platform hook can be used on POWER Ultravisor based protected guests to > load up SWIOTLB DMA callbacks to do the required (as discussed previously > in the above mentioned thread how host is allowed to access only parts of > the guest GPA range) bounce buffering into the shared memory for all I/O > scatter gather buffers to be consumed on the host side. > > Please go through these patches and review whether this approach broadly > makes sense. I will appreciate suggestions, inputs, comments regarding > the patches or the approach in general. Thank you.I like how patches 1-3 look. Could you test performance with/without to see whether the extra indirection through use of DMA ops causes a measurable slow-down?> Anshuman Khandual (4): > virtio: Define virtio_direct_dma_ops structure > virtio: Override device's DMA OPS with virtio_direct_dma_ops selectively > virtio: Force virtio core to use DMA API callbacks for all virtio devices > virtio: Add platform specific DMA API translation for virito devices > > arch/powerpc/include/asm/dma-mapping.h | 6 +++ > arch/powerpc/platforms/pseries/iommu.c | 6 +++ > drivers/virtio/virtio.c | 72 ++++++++++++++++++++++++++++++++++ > drivers/virtio/virtio_pci_common.h | 3 ++ > drivers/virtio/virtio_ring.c | 65 +----------------------------- > 5 files changed, 89 insertions(+), 63 deletions(-) > > -- > 2.9.3
On 07/20/2018 06:46 PM, Michael S. Tsirkin wrote:> On Fri, Jul 20, 2018 at 09:29:37AM +0530, Anshuman Khandual wrote: >> This patch series is the follow up on the discussions we had before about >> the RFC titled [RFC,V2] virtio: Add platform specific DMA API translation >> for virito devices (https://patchwork.kernel.org/patch/10417371/). There >> were suggestions about doing away with two different paths of transactions >> with the host/QEMU, first being the direct GPA and the other being the DMA >> API based translations. >> >> First patch attempts to create a direct GPA mapping based DMA operations >> structure called 'virtio_direct_dma_ops' with exact same implementation >> of the direct GPA path which virtio core currently has but just wrapped in >> a DMA API format. Virtio core must use 'virtio_direct_dma_ops' instead of >> the arch default in absence of VIRTIO_F_IOMMU_PLATFORM flag to preserve the >> existing semantics. The second patch does exactly that inside the function >> virtio_finalize_features(). The third patch removes the default direct GPA >> path from virtio core forcing it to use DMA API callbacks for all devices. >> Now with that change, every device must have a DMA operations structure >> associated with it. The fourth patch adds an additional hook which gives >> the platform an opportunity to do yet another override if required. This >> platform hook can be used on POWER Ultravisor based protected guests to >> load up SWIOTLB DMA callbacks to do the required (as discussed previously >> in the above mentioned thread how host is allowed to access only parts of >> the guest GPA range) bounce buffering into the shared memory for all I/O >> scatter gather buffers to be consumed on the host side. >> >> Please go through these patches and review whether this approach broadly >> makes sense. I will appreciate suggestions, inputs, comments regarding >> the patches or the approach in general. Thank you. > I like how patches 1-3 look. Could you test performance > with/without to see whether the extra indirection through > use of DMA ops causes a measurable slow-down?I ran this simple DD command 10 times where /dev/vda is a virtio block device of 10GB size. dd if=/dev/zero of=/dev/vda bs=8M count=1024 oflag=direct With and without patches bandwidth which has a bit wide range does not look that different from each other. Without patches ============== ---------- 1 --------- 1024+0 records in 1024+0 records out 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 1.95557 s, 4.4 GB/s ---------- 2 --------- 1024+0 records in 1024+0 records out 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 2.05176 s, 4.2 GB/s ---------- 3 --------- 1024+0 records in 1024+0 records out 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 1.88314 s, 4.6 GB/s ---------- 4 --------- 1024+0 records in 1024+0 records out 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 1.84899 s, 4.6 GB/s ---------- 5 --------- 1024+0 records in 1024+0 records out 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 5.37184 s, 1.6 GB/s ---------- 6 --------- 1024+0 records in 1024+0 records out 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 1.9205 s, 4.5 GB/s ---------- 7 --------- 1024+0 records in 1024+0 records out 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 6.85166 s, 1.3 GB/s ---------- 8 --------- 1024+0 records in 1024+0 records out 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 1.74049 s, 4.9 GB/s ---------- 9 --------- 1024+0 records in 1024+0 records out 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 6.31699 s, 1.4 GB/s ---------- 10 --------- 1024+0 records in 1024+0 records out 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 2.47057 s, 3.5 GB/s With patches =========== ---------- 1 --------- 1024+0 records in 1024+0 records out 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 2.25993 s, 3.8 GB/s ---------- 2 --------- 1024+0 records in 1024+0 records out 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 1.82438 s, 4.7 GB/s ---------- 3 --------- 1024+0 records in 1024+0 records out 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 1.93856 s, 4.4 GB/s ---------- 4 --------- 1024+0 records in 1024+0 records out 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 1.83405 s, 4.7 GB/s ---------- 5 --------- 1024+0 records in 1024+0 records out 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 7.50199 s, 1.1 GB/s ---------- 6 --------- 1024+0 records in 1024+0 records out 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 2.28742 s, 3.8 GB/s ---------- 7 --------- 1024+0 records in 1024+0 records out 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 5.74958 s, 1.5 GB/s ---------- 8 --------- 1024+0 records in 1024+0 records out 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 1.99149 s, 4.3 GB/s ---------- 9 --------- 1024+0 records in 1024+0 records out 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 5.67647 s, 1.5 GB/s ---------- 10 --------- 1024+0 records in 1024+0 records out 8589934592 bytes (8.6 GB, 8.0 GiB) copied, 2.93957 s, 2.9 GB/s Does this look okay ?
Hi Anshuman, On Fri, Jul 20, 2018 at 09:29:37AM +0530, Anshuman Khandual wrote:> This patch series is the follow up on the discussions we had before about > the RFC titled [RFC,V2] virtio: Add platform specific DMA API translation > for virito devices (https://patchwork.kernel.org/patch/10417371/). There > were suggestions about doing away with two different paths of transactions > with the host/QEMU, first being the direct GPA and the other being the DMA > API based translations. > > First patch attempts to create a direct GPA mapping based DMA operations > structure called 'virtio_direct_dma_ops' with exact same implementation > of the direct GPA path which virtio core currently has but just wrapped in > a DMA API format. Virtio core must use 'virtio_direct_dma_ops' instead of > the arch default in absence of VIRTIO_F_IOMMU_PLATFORM flag to preserve the > existing semantics. The second patch does exactly that inside the function > virtio_finalize_features(). The third patch removes the default direct GPA > path from virtio core forcing it to use DMA API callbacks for all devices. > Now with that change, every device must have a DMA operations structure > associated with it. The fourth patch adds an additional hook which gives > the platform an opportunity to do yet another override if required. This > platform hook can be used on POWER Ultravisor based protected guests to > load up SWIOTLB DMA callbacks to do the required (as discussed previously > in the above mentioned thread how host is allowed to access only parts of > the guest GPA range) bounce buffering into the shared memory for all I/O > scatter gather buffers to be consumed on the host side. > > Please go through these patches and review whether this approach broadly > makes sense. I will appreciate suggestions, inputs, comments regarding > the patches or the approach in general. Thank you.I just wanted to say that this patch series provides a means for us to force the coherent DMA ops for legacy virtio devices on arm64, which in turn means that we can enable the SMMU with legacy devices in our fastmodel emulation platform (which is slowly being upgraded to virtio 1.0) without hanging during boot. Patch below. So: Acked-by: Will Deacon <will.deacon at arm.com> Tested-by: Will Deacon <will.deacon at arm.com> Thanks! Will --->8>From 4ef39e9de2c87c97bf046816ca762832f92e39b5 Mon Sep 17 00:00:00 2001From: Will Deacon <will.deacon at arm.com> Date: Fri, 27 Jul 2018 10:49:25 +0100 Subject: [PATCH] arm64: dma: Override DMA ops for legacy virtio devices Virtio devices are always cache-coherent, so force use of the coherent DMA ops for legacy virtio devices where the dma-coherent is known to be omitted by QEMU for the MMIO transport. Signed-off-by: Will Deacon <will.deacon at arm.com> --- arch/arm64/include/asm/dma-mapping.h | 6 ++++++ arch/arm64/mm/dma-mapping.c | 19 +++++++++++++++++++ 2 files changed, 25 insertions(+) diff --git a/arch/arm64/include/asm/dma-mapping.h b/arch/arm64/include/asm/dma-mapping.h index b7847eb8a7bb..30aa8fb62dc3 100644 --- a/arch/arm64/include/asm/dma-mapping.h +++ b/arch/arm64/include/asm/dma-mapping.h @@ -44,6 +44,12 @@ void arch_teardown_dma_ops(struct device *dev); #define arch_teardown_dma_ops arch_teardown_dma_ops #endif +#ifdef CONFIG_VIRTIO +struct virtio_device; +void platform_override_dma_ops(struct virtio_device *vdev); +#define platform_override_dma_ops platform_override_dma_ops +#endif + /* do not use this function in a driver */ static inline bool is_device_dma_coherent(struct device *dev) { diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c index 61e93f0b5482..f9ca61b1b34d 100644 --- a/arch/arm64/mm/dma-mapping.c +++ b/arch/arm64/mm/dma-mapping.c @@ -891,3 +891,22 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size, } #endif } + +#ifdef CONFIG_VIRTIO +#include <linux/virtio_config.h> + +void platform_override_dma_ops(struct virtio_device *vdev) +{ + struct device *dev = vdev->dev.parent; + const struct dma_map_ops *dma_ops = &arm64_swiotlb_dma_ops; + + if (virtio_has_feature(vdev, VIRTIO_F_VERSION_1)) + return; + + dev->archdata.dma_coherent = true; + if (iommu_get_domain_for_dev(dev)) + dma_ops = &iommu_dma_ops; + + set_dma_ops(dev, dma_ops); +} +#endif /* CONFIG_VIRTIO */ -- 2.1.4
On 07/27/2018 03:28 PM, Will Deacon wrote:> Hi Anshuman, > > On Fri, Jul 20, 2018 at 09:29:37AM +0530, Anshuman Khandual wrote: >> This patch series is the follow up on the discussions we had before about >> the RFC titled [RFC,V2] virtio: Add platform specific DMA API translation >> for virito devices (https://patchwork.kernel.org/patch/10417371/). There >> were suggestions about doing away with two different paths of transactions >> with the host/QEMU, first being the direct GPA and the other being the DMA >> API based translations. >> >> First patch attempts to create a direct GPA mapping based DMA operations >> structure called 'virtio_direct_dma_ops' with exact same implementation >> of the direct GPA path which virtio core currently has but just wrapped in >> a DMA API format. Virtio core must use 'virtio_direct_dma_ops' instead of >> the arch default in absence of VIRTIO_F_IOMMU_PLATFORM flag to preserve the >> existing semantics. The second patch does exactly that inside the function >> virtio_finalize_features(). The third patch removes the default direct GPA >> path from virtio core forcing it to use DMA API callbacks for all devices. >> Now with that change, every device must have a DMA operations structure >> associated with it. The fourth patch adds an additional hook which gives >> the platform an opportunity to do yet another override if required. This >> platform hook can be used on POWER Ultravisor based protected guests to >> load up SWIOTLB DMA callbacks to do the required (as discussed previously >> in the above mentioned thread how host is allowed to access only parts of >> the guest GPA range) bounce buffering into the shared memory for all I/O >> scatter gather buffers to be consumed on the host side. >> >> Please go through these patches and review whether this approach broadly >> makes sense. I will appreciate suggestions, inputs, comments regarding >> the patches or the approach in general. Thank you. > I just wanted to say that this patch series provides a means for us to > force the coherent DMA ops for legacy virtio devices on arm64, which in turn > means that we can enable the SMMU with legacy devices in our fastmodel > emulation platform (which is slowly being upgraded to virtio 1.0) without > hanging during boot. Patch below. > > So: > > Acked-by: Will Deacon <will.deacon at arm.com> > Tested-by: Will Deacon <will.deacon at arm.com>Thanks Will.
Anshuman Khandual
2018-Jul-28 08:56 UTC
[RFC 2/4] virtio: Override device's DMA OPS with virtio_direct_dma_ops selectively
On 07/20/2018 09:29 AM, Anshuman Khandual wrote:> Now that virtio core always needs all virtio devices to have DMA OPS, we > need to make sure that the structure it points is the right one. In the > absence of VIRTIO_F_IOMMU_PLATFORM flag QEMU expects GPA from guest kernel. > In such case, virtio device must use default virtio_direct_dma_ops DMA OPS > structure which transforms scatter gather buffer addresses as GPA. This > DMA OPS override must happen as early as possible during virtio device > initializatin sequence before virtio core starts using given device's DMA > OPS callbacks for I/O transactions. This change detects device's IOMMU flag > and does the override in case the flag is cleared. > > Signed-off-by: Anshuman Khandual <khandual at linux.vnet.ibm.com> > --- > drivers/virtio/virtio.c | 5 +++++ > 1 file changed, 5 insertions(+) > > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c > index 7907ad3..6b13987 100644 > --- a/drivers/virtio/virtio.c > +++ b/drivers/virtio/virtio.c > @@ -166,6 +166,8 @@ void virtio_add_status(struct virtio_device *dev, unsigned int status) > } > EXPORT_SYMBOL_GPL(virtio_add_status); > > +const struct dma_map_ops virtio_direct_dma_ops; > + > int virtio_finalize_features(struct virtio_device *dev) > { > int ret = dev->config->finalize_features(dev); > @@ -174,6 +176,9 @@ int virtio_finalize_features(struct virtio_device *dev) > if (ret) > return ret;The previous patch removed the code block for XEN guests which forced the use of DMA API all the time irrespective of VIRTIO_F_IOMMU_PLATFORM flag on the device. Here is what I have removed with patch 2/4 which breaks the existing semantics on XEN guests. -static bool vring_use_dma_api(struct virtio_device *vdev) -{ - if (!virtio_has_iommu_quirk(vdev)) - return true; - - /* Otherwise, we are left to guess. */ - /* - * In theory, it's possible to have a buggy QEMU-supposed - * emulated Q35 IOMMU and Xen enabled at the same time. On - * such a configuration, virtio has never worked and will - * not work without an even larger kludge. Instead, enable - * the DMA API if we're a Xen guest, which at least allows - * all of the sensible Xen configurations to work correctly. - */ - if (xen_domain()) - return true; - - return false; -} XEN guests would not like override with virtio_direct_dma_ops in any case irrespective of the flag VIRTIO_F_IOMMU_PLATFORM. So the existing semantics can be preserved with something like this. It just assumes that dev->dma_ops is non-NULL and a valid one set by the architecture. If required we can add those tests here before skipping the override. diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c index 7907ad3..6b13987 100644 --- a/drivers/virtio/virtio.c +++ b/drivers/virtio/virtio.c @@ -166,6 +166,8 @@ void virtio_add_status(struct virtio_device *dev, unsigned int status) } EXPORT_SYMBOL_GPL(virtio_add_status); +const struct dma_map_ops virtio_direct_dma_ops; + int virtio_finalize_features(struct virtio_device *dev) { int ret = dev->config->finalize_features(dev); @@ -174,6 +176,9 @@ int virtio_finalize_features(struct virtio_device *dev) if (ret) return ret; + + if (xen_domain()) + goto skip_override; + + if (virtio_has_iommu_quirk(dev)) + set_dma_ops(dev->dev.parent, &virtio_direct_dma_ops); + + skip_override: + if (!virtio_has_feature(dev, VIRTIO_F_VERSION_1)) return 0 Will incorporate these changes in the next version.
Christoph Hellwig
2018-Jul-30 09:24 UTC
[RFC 1/4] virtio: Define virtio_direct_dma_ops structure
> +/* > + * Virtio direct mapping DMA API operations structure > + * > + * This defines DMA API structure for all virtio devices which would not > + * either bring in their own DMA OPS from architecture or they would not > + * like to use architecture specific IOMMU based DMA OPS because QEMU > + * expects GPA instead of an IOVA in absence of VIRTIO_F_IOMMU_PLATFORM. > + */ > +dma_addr_t virtio_direct_map_page(struct device *dev, struct page *page, > + unsigned long offset, size_t size, > + enum dma_data_direction dir, > + unsigned long attrs)All these functions should probably be marked static.> +void virtio_direct_unmap_page(struct device *hwdev, dma_addr_t dev_addr, > + size_t size, enum dma_data_direction dir, > + unsigned long attrs) > +{ > +}No need to implement no-op callbacks in struct dma_map_ops.> + > +int virtio_direct_mapping_error(struct device *hwdev, dma_addr_t dma_addr) > +{ > + return 0; > +}Including this one.> +void *virtio_direct_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle, > + gfp_t gfp, unsigned long attrs) > +{ > + void *queue = alloc_pages_exact(PAGE_ALIGN(size), gfp); > + > + if (queue) { > + phys_addr_t phys_addr = virt_to_phys(queue); > + *dma_handle = (dma_addr_t)phys_addr; > + > + if (WARN_ON_ONCE(*dma_handle != phys_addr)) { > + free_pages_exact(queue, PAGE_ALIGN(size)); > + return NULL; > + } > + } > + return queue;queue is a very odd name in a generic memory allocator.> +void virtio_direct_free(struct device *dev, size_t size, void *vaddr, > + dma_addr_t dma_addr, unsigned long attrs) > +{ > + free_pages_exact(vaddr, PAGE_ALIGN(size)); > +} > + > +const struct dma_map_ops virtio_direct_dma_ops = { > + .alloc = virtio_direct_alloc, > + .free = virtio_direct_free, > + .map_page = virtio_direct_map_page, > + .unmap_page = virtio_direct_unmap_page, > + .mapping_error = virtio_direct_mapping_error, > +};This is missing a dma_map_sg implementation. In general this is mandatory for dma_ops. So either you implement it or explain in a common why you think you can skip it.> +EXPORT_SYMBOL(virtio_direct_dma_ops);EXPORT_SYMBOL_GPL like all virtio symbols, please.
Christoph Hellwig
2018-Jul-30 09:25 UTC
[RFC 2/4] virtio: Override device's DMA OPS with virtio_direct_dma_ops selectively
> +const struct dma_map_ops virtio_direct_dma_ops;This belongs into a header if it is non-static. If you only use it in this file anyway please mark it static and avoid a forward declaration.> + > int virtio_finalize_features(struct virtio_device *dev) > { > int ret = dev->config->finalize_features(dev); > @@ -174,6 +176,9 @@ int virtio_finalize_features(struct virtio_device *dev) > if (ret) > return ret; > > + if (virtio_has_iommu_quirk(dev)) > + set_dma_ops(dev->dev.parent, &virtio_direct_dma_ops);This needs a big fat comment explaining what is going on here. Also not new, but I find the existance of virtio_has_iommu_quirk and its name horribly confusing. It might be better to open code it here once only a single caller is left.
On Fri, Jul 27, 2018 at 10:58:05AM +0100, Will Deacon wrote:> > I just wanted to say that this patch series provides a means for us to > force the coherent DMA ops for legacy virtio devices on arm64, which in turn > means that we can enable the SMMU with legacy devices in our fastmodel > emulation platform (which is slowly being upgraded to virtio 1.0) without > hanging during boot. Patch below.Yikes, this is a nightmare. That is exactly where I do not want things to end up. We really need to distinguish between legacy virtual crappy virtio (and that includes v1) that totally ignores the bus it pretends to be on, and sane virtio (to be defined) that sit on a real (or properly emulated including iommu and details for dma mapping) bus. Having a mumble jumble of arch specific undocumented magic as in the powerpc patch replied to or this arm patch is a complete no-go. Nacked-by: Christoph Hellwig <hch at lst.de> for both.
On Fri, Jul 20, 2018 at 09:29:37AM +0530, Anshuman Khandual wrote:> This patch series is the follow up on the discussions we had before about > the RFC titled [RFC,V2] virtio: Add platform specific DMA API translation > for virito devices (https://patchwork.kernel.org/patch/10417371/). There > were suggestions about doing away with two different paths of transactions > with the host/QEMU, first being the direct GPA and the other being the DMA > API based translations. > > First patch attempts to create a direct GPA mapping based DMA operations > structure called 'virtio_direct_dma_ops' with exact same implementation > of the direct GPA path which virtio core currently has but just wrapped in > a DMA API format. Virtio core must use 'virtio_direct_dma_ops' instead of > the arch default in absence of VIRTIO_F_IOMMU_PLATFORM flag to preserve the > existing semantics. The second patch does exactly that inside the function > virtio_finalize_features(). The third patch removes the default direct GPA > path from virtio core forcing it to use DMA API callbacks for all devices. > Now with that change, every device must have a DMA operations structure > associated with it. The fourth patch adds an additional hook which gives > the platform an opportunity to do yet another override if required. This > platform hook can be used on POWER Ultravisor based protected guests to > load up SWIOTLB DMA callbacks to do the required (as discussed previously > in the above mentioned thread how host is allowed to access only parts of > the guest GPA range) bounce buffering into the shared memory for all I/O > scatter gather buffers to be consumed on the host side. > > Please go through these patches and review whether this approach broadly > makes sense. I will appreciate suggestions, inputs, comments regarding > the patches or the approach in general. Thank you.Jason did some work on profiling this. Unfortunately he reports about 4% extra overhead from this switch on x86 with no vIOMMU. I expect he's writing up the data in more detail, but just wanted to let you know this would be one more thing to debug before we can just switch to DMA APIs.> Anshuman Khandual (4): > virtio: Define virtio_direct_dma_ops structure > virtio: Override device's DMA OPS with virtio_direct_dma_ops selectively > virtio: Force virtio core to use DMA API callbacks for all virtio devices > virtio: Add platform specific DMA API translation for virito devices > > arch/powerpc/include/asm/dma-mapping.h | 6 +++ > arch/powerpc/platforms/pseries/iommu.c | 6 +++ > drivers/virtio/virtio.c | 72 ++++++++++++++++++++++++++++++++++ > drivers/virtio/virtio_pci_common.h | 3 ++ > drivers/virtio/virtio_ring.c | 65 +----------------------------- > 5 files changed, 89 insertions(+), 63 deletions(-) > > -- > 2.9.3
On 2018?08?03? 04:55, Michael S. Tsirkin wrote:> On Fri, Jul 20, 2018 at 09:29:37AM +0530, Anshuman Khandual wrote: >> This patch series is the follow up on the discussions we had before about >> the RFC titled [RFC,V2] virtio: Add platform specific DMA API translation >> for virito devices (https://patchwork.kernel.org/patch/10417371/). There >> were suggestions about doing away with two different paths of transactions >> with the host/QEMU, first being the direct GPA and the other being the DMA >> API based translations. >> >> First patch attempts to create a direct GPA mapping based DMA operations >> structure called 'virtio_direct_dma_ops' with exact same implementation >> of the direct GPA path which virtio core currently has but just wrapped in >> a DMA API format. Virtio core must use 'virtio_direct_dma_ops' instead of >> the arch default in absence of VIRTIO_F_IOMMU_PLATFORM flag to preserve the >> existing semantics. The second patch does exactly that inside the function >> virtio_finalize_features(). The third patch removes the default direct GPA >> path from virtio core forcing it to use DMA API callbacks for all devices. >> Now with that change, every device must have a DMA operations structure >> associated with it. The fourth patch adds an additional hook which gives >> the platform an opportunity to do yet another override if required. This >> platform hook can be used on POWER Ultravisor based protected guests to >> load up SWIOTLB DMA callbacks to do the required (as discussed previously >> in the above mentioned thread how host is allowed to access only parts of >> the guest GPA range) bounce buffering into the shared memory for all I/O >> scatter gather buffers to be consumed on the host side. >> >> Please go through these patches and review whether this approach broadly >> makes sense. I will appreciate suggestions, inputs, comments regarding >> the patches or the approach in general. Thank you. > Jason did some work on profiling this. Unfortunately he reports > about 4% extra overhead from this switch on x86 with no vIOMMU.The test is rather simple, just run pktgen (pktgen_sample01_simple.sh) in guest and measure PPS on tap on host. Thanks> > I expect he's writing up the data in more detail, but > just wanted to let you know this would be one more > thing to debug before we can just switch to DMA APIs. > > >> Anshuman Khandual (4): >> virtio: Define virtio_direct_dma_ops structure >> virtio: Override device's DMA OPS with virtio_direct_dma_ops selectively >> virtio: Force virtio core to use DMA API callbacks for all virtio devices >> virtio: Add platform specific DMA API translation for virito devices >> >> arch/powerpc/include/asm/dma-mapping.h | 6 +++ >> arch/powerpc/platforms/pseries/iommu.c | 6 +++ >> drivers/virtio/virtio.c | 72 ++++++++++++++++++++++++++++++++++ >> drivers/virtio/virtio_pci_common.h | 3 ++ >> drivers/virtio/virtio_ring.c | 65 +----------------------------- >> 5 files changed, 89 insertions(+), 63 deletions(-) >> >> -- >> 2.9.3