Stefano Stabellini
2013-Sep-27 16:09 UTC
[PATCH v6 0/19] enable swiotlb-xen on arm and arm64
Hi all, this patch series enables xen-swiotlb on arm and arm64. Considering that all guests, including dom0, run on xen on arm with second stage translation enabled, it follows that without an IOMMU no guests could actually drive the hardware. The solution for platforms without an IOMMU is to use swiotlb-xen, adapted to autotranslate guests. swiotlb-xen provides a set of dma_ops that can be used by Linux to setup a contiguous buffer in stage-2 addresses and use it for dma operations. Basically Linux asks Xen to make a buffer contiguous and gets the machine address for it. This buffer is going to be used by lib/swiotlb.c to allocate bounce buffers. In few cases we can actually avoid going through the swiotlb bounce buffer: if a dma request involves only a single page we can try to pin the page passed in as an argument and return its machine address. In order to do this we need to keep track of the machine to physical mappings of the pinned pages. Since this approach is preferable to bouncing all the times but it only works with single pages, we force biovec requests not to be merged. Cheers, Stefano Changes in v6: - check for dev->dma_mask being NULL in dma_capable; - update the comments and the hypercalls structures; - add a xen_dma_info entry to the rbtree in xen_swiotlb_alloc_coherent to keep track of the new mapping. Free the entry in xen_swiotlb_free_coherent; - rename xen_dma_seg to dma_info in xen_swiotlb_alloc/free_coherent to avoid confusions; - introduce and export xen_dma_ops; - call xen_mm_init from as arch_initcall; - call __get_dma_ops to get the native dma_ops pointer on arm; - do not merge biovecs; - add single page optimization: pin the page rather than bouncing. Changes in v5: - dropped the first two patches, already in the Xen tree; - implement dma_mark_clean using dmac_flush_range on arm; - add "arm64: define DMA_ERROR_CODE" - better comment for XENMEM_exchange_and_pin return codes; - fix xen_dma_add_entry error path; - remove the spin_lock: the red-black tree is not modified at run time; - add "swiotlb-xen: introduce xen_swiotlb_set_dma_mask"; - add "xen: introduce xen_alloc/free_coherent_pages"; - add "swiotlb-xen: use xen_alloc/free_coherent_pages"; - add "swiotlb: don''t assume that io_tlb_start-io_tlb_end is coherent". Changes in v4: - rename XENMEM_get_dma_buf to XENMEM_exchange_and_pin; - rename XENMEM_put_dma_buf to XENMEM_unpin; - improve the documentation of the new hypercalls; - add a note about out.address_bits for XENMEM_exchange; - code style fixes; - add err_out label in xen_dma_add_entry; - remove INVALID_ADDRESS, use DMA_ERROR_CODE instead; - add in-code comments regarding the usage of xen_dma_seg[0].dma_addr. Changes in v3: - add a patch to compile SWIOTLB without CONFIG_NEED_SG_DMA_LENGTH; - add a patch to compile SWIOTLB_XEN without CONFIG_NEED_SG_DMA_LENGTH; - arm/dma_capable: do not treat dma_mask as a limit; - arm/dmabounce: keep using arm_dma_ops; - add missing __init in xen_early_init declaration; - many code style and name changes in swiotlb-xen.c; - improve error checks in xen_dma_add_entry; - warn on XENMEM_put_dma_buf failures. Changes in v2: - fixed a couple of errors in xen_bus_to_phys, xen_phys_to_bus and xen_swiotlb_fixup. Julien Grall (1): ASoC: Samsung: Rename dma_ops by samsung_dma_ops Stefano Stabellini (18): arm: make SWIOTLB available arm64: define DMA_ERROR_CODE xen: introduce XENMEM_exchange_and_pin and XENMEM_unpin xen: make xen_create_contiguous_region return the dma address swiotlb-xen: support autotranslate guests xen/arm,arm64: enable SWIOTLB_XEN swiotlb-xen: introduce xen_swiotlb_set_dma_mask arm/xen: get_dma_ops: return xen_dma_ops if we are running on Xen arm64/xen: get_dma_ops: return xen_dma_ops if we are running on Xen xen: introduce xen_alloc/free_coherent_pages swiotlb-xen: use xen_alloc/free_coherent_pages swiotlb: don''t assume that io_tlb_start-io_tlb_end is coherent swiotlb: print a warning when the swiotlb is full swiotlb-xen: call dma_capable only if dev->dma_mask is allocated arm,arm64: do not always merge biovec if we are running on Xen xen: introduce XENMEM_pin swiotlb-xen: introduce a rbtree to track phys to bus mappings swiotlb-xen: instead of bouncing on the swiotlb, pin single pages arch/arm/Kconfig | 7 + arch/arm/include/asm/dma-mapping.h | 50 +++- arch/arm/include/asm/io.h | 8 + arch/arm/include/asm/xen/hypervisor.h | 2 + arch/arm/include/asm/xen/page-coherent.h | 22 ++ arch/arm/include/asm/xen/page.h | 2 + arch/arm/xen/Makefile | 2 +- arch/arm/xen/mm.c | 137 ++++++++ arch/arm64/Kconfig | 1 + arch/arm64/include/asm/dma-mapping.h | 14 +- arch/arm64/include/asm/io.h | 9 + arch/arm64/include/asm/xen/page-coherent.h | 24 ++ arch/arm64/xen/Makefile | 2 +- arch/ia64/include/asm/xen/page-coherent.h | 24 ++ arch/x86/include/asm/xen/page-coherent.h | 24 ++ arch/x86/xen/mmu.c | 18 +- drivers/xen/Kconfig | 1 - drivers/xen/biomerge.c | 4 +- drivers/xen/swiotlb-xen.c | 464 +++++++++++++++++++++++++--- include/xen/interface/memory.h | 83 +++++ include/xen/swiotlb-xen.h | 2 + include/xen/xen-ops.h | 8 +- lib/swiotlb.c | 14 +- sound/soc/samsung/dma.c | 4 +- 24 files changed, 870 insertions(+), 56 deletions(-) create mode 100644 arch/arm/include/asm/xen/page-coherent.h create mode 100644 arch/arm/xen/mm.c create mode 100644 arch/arm64/include/asm/xen/page-coherent.h create mode 100644 arch/ia64/include/asm/xen/page-coherent.h create mode 100644 arch/x86/include/asm/xen/page-coherent.h git://git.kernel.org/pub/scm/linux/kernel/git/sstabellini/xen.git swiotlb-xen-6
IOMMU_HELPER is needed because SWIOTLB calls iommu_is_span_boundary, provided by lib/iommu_helper.c. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> CC: will.deacon@arm.com CC: linux@arm.linux.org.uk Changes in v6: - check for dev->dma_mask being NULL in dma_capable. Changes in v5: - implement dma_mark_clean using dmac_flush_range. Changes in v3: - dma_capable: do not treat dma_mask as a limit; - remove SWIOTLB dependency on NEED_SG_DMA_LENGTH. --- arch/arm/Kconfig | 6 +++++ arch/arm/include/asm/dma-mapping.h | 37 ++++++++++++++++++++++++++++++++++++ 2 files changed, 43 insertions(+), 0 deletions(-) diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index ba412e0..c0bfb33 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -1832,6 +1832,12 @@ config CC_STACKPROTECTOR neutralized via a kernel panic. This feature requires gcc version 4.2 or above. +config SWIOTLB + def_bool y + +config IOMMU_HELPER + def_bool SWIOTLB + config XEN_DOM0 def_bool y depends on XEN diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h index 5b579b9..8807124 100644 --- a/arch/arm/include/asm/dma-mapping.h +++ b/arch/arm/include/asm/dma-mapping.h @@ -10,6 +10,7 @@ #include <asm-generic/dma-coherent.h> #include <asm/memory.h> +#include <asm/cacheflush.h> #define DMA_ERROR_CODE (~0) extern struct dma_map_ops arm_dma_ops; @@ -86,6 +87,42 @@ static inline dma_addr_t virt_to_dma(struct device *dev, void *addr) } #endif +static inline dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr) +{ + unsigned int offset = paddr & ~PAGE_MASK; + return pfn_to_dma(dev, paddr >> PAGE_SHIFT) + offset; +} + +static inline phys_addr_t dma_to_phys(struct device *dev, dma_addr_t dev_addr) +{ + unsigned int offset = dev_addr & ~PAGE_MASK; + return (dma_to_pfn(dev, dev_addr) << PAGE_SHIFT) + offset; +} + +static inline bool dma_capable(struct device *dev, dma_addr_t addr, size_t size) +{ + u64 limit, mask; + + if (!dev->dma_mask) + return 0; + + mask = *dev->dma_mask; + + limit = (mask + 1) & ~mask; + if (limit && size > limit) + return 0; + + if ((addr | (addr + size - 1)) & ~mask) + return 0; + + return 1; +} + +static inline void dma_mark_clean(void *addr, size_t size) +{ + dmac_flush_range(addr, addr + size); +} + /* * DMA errors are defined by all-bits-set in the DMA address. */ -- 1.7.2.5
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> CC: catalin.marinas@arm.com CC: will.deacon@arm.com --- arch/arm64/include/asm/dma-mapping.h | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/arch/arm64/include/asm/dma-mapping.h b/arch/arm64/include/asm/dma-mapping.h index 8d18100..c2cb8a0 100644 --- a/arch/arm64/include/asm/dma-mapping.h +++ b/arch/arm64/include/asm/dma-mapping.h @@ -25,6 +25,7 @@ #define ARCH_HAS_DMA_GET_REQUIRED_MASK +#define DMA_ERROR_CODE (~0) extern struct dma_map_ops *dma_ops; static inline struct dma_map_ops *get_dma_ops(struct device *dev) -- 1.7.2.5
Stefano Stabellini
2013-Sep-27 16:09 UTC
[PATCH v6 03/19] xen: introduce XENMEM_exchange_and_pin and XENMEM_unpin
XENMEM_exchange can''t be used by autotranslate guests because of two severe limitations: - it does not copy back the mfns into the out field for autotranslate guests; - it does not guarantee that the hypervisor won''t change the p2m mappings for the exchanged pages while the guest is using them. Xen never promises to keep the p2m mapping stable for autotranslate guests in general. In practice it won''t happen unless one uses uncommon features like memory sharing or paging. To overcome these problems I am introducing two new hypercalls. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Changes in v6: - update the comments and the hypercalls structures. Changes in v5: - better comment for XENMEM_exchange_and_pin return codes; Changes in v4: - rename XENMEM_get_dma_buf to XENMEM_exchange_and_pin; - rename XENMEM_put_dma_buf to XENMEM_unpin; - improve the documentation of the new hypercalls; - add a note about out.address_bits for XENMEM_exchange. --- include/xen/interface/memory.h | 51 ++++++++++++++++++++++++++++++++++++++++ 1 files changed, 51 insertions(+), 0 deletions(-) diff --git a/include/xen/interface/memory.h b/include/xen/interface/memory.h index 2ecfe4f..49db252 100644 --- a/include/xen/interface/memory.h +++ b/include/xen/interface/memory.h @@ -66,6 +66,9 @@ struct xen_memory_exchange { /* * [IN] Details of memory extents to be exchanged (GMFN bases). * Note that @in.address_bits is ignored and unused. + * @out.address_bits contains the maximum number of bits addressable + * by the caller. The addresses of the newly allocated pages have to + * meet this restriction. */ struct xen_memory_reservation in; @@ -263,4 +266,52 @@ struct xen_remove_from_physmap { }; DEFINE_GUEST_HANDLE_STRUCT(xen_remove_from_physmap); +/* + * This hypercall is similar to XENMEM_exchange: it takes the same + * struct as an argument and it exchanges the pages passed in with a new + * set of pages. The new pages are going to be "pinned": it''s guaranteed + * that their p2m mapping won''t be changed until explicitly "unpinned". + * Only normal guest r/w memory can be pinned: no granted pages or + * ballooned pages. + * If return code is zero then @out.extent_list provides the frame + * numbers of the newly-allocated memory. + * On X86 the frame numbers are machine frame numbers (mfns). + * On ARMv7 and ARMv8 the frame numbers are machine frame numbers (mfns). + * Returns zero on complete success, otherwise a negative error code. + * The most common error codes are: + * -ENOSYS if not implemented + * -EPERM if the domain is not privileged for this operation + * -EBUSY if the page is already pinned + * -EFAULT if an internal error occurs + * On complete success then always @nr_exchanged == @in.nr_extents. On + * partial success @nr_exchanged indicates how much work was done and a + * negative error code is returned. + */ +#define XENMEM_exchange_and_pin 26 + +/* + * XENMEM_unpin unpins a set of pages, previously pinned by + * XENMEM_exchange_and_pin. After this call the p2m mapping of the pages can + * be transparently changed by the hypervisor, as usual. The pages are + * still accessible from the guest. + */ +#define XENMEM_unpin 27 +struct xen_unpin { + /* + * [IN] Details of memory extents to be unpinned (GMFN bases). + * Note that @in.address_bits is ignored and unused. + */ + struct xen_memory_reservation in; + /* + * [OUT] Number of input extents that were successfully unpinned. + * 1. The first @nr_unpinned input extents were successfully + * unpinned. + * 2. All other input extents are untouched. + * 3. If not all input extents are unpinned then the return code of this + * command will be non-zero. + */ + xen_ulong_t nr_unpinned; +}; +DEFINE_GUEST_HANDLE_STRUCT(xen_unpin); + #endif /* __XEN_PUBLIC_MEMORY_H__ */ -- 1.7.2.5
Stefano Stabellini
2013-Sep-27 16:09 UTC
[PATCH v6 04/19] xen: make xen_create_contiguous_region return the dma address
Modify xen_create_contiguous_region to return the dma address of the newly contiguous buffer. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Changes in v4: - use virt_to_machine instead of virt_to_bus. --- arch/x86/xen/mmu.c | 4 +++- drivers/xen/swiotlb-xen.c | 6 +++--- include/xen/xen-ops.h | 3 ++- 3 files changed, 8 insertions(+), 5 deletions(-) diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c index fdc3ba2..6c34d7c 100644 --- a/arch/x86/xen/mmu.c +++ b/arch/x86/xen/mmu.c @@ -2329,7 +2329,8 @@ static int xen_exchange_memory(unsigned long extents_in, unsigned int order_in, } int xen_create_contiguous_region(unsigned long vstart, unsigned int order, - unsigned int address_bits) + unsigned int address_bits, + dma_addr_t *dma_handle) { unsigned long *in_frames = discontig_frames, out_frame; unsigned long flags; @@ -2368,6 +2369,7 @@ int xen_create_contiguous_region(unsigned long vstart, unsigned int order, spin_unlock_irqrestore(&xen_reservation_lock, flags); + *dma_handle = virt_to_machine(vstart).maddr; return success ? 0 : -ENOMEM; } EXPORT_SYMBOL_GPL(xen_create_contiguous_region); diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c index 1b2277c..b72f31c 100644 --- a/drivers/xen/swiotlb-xen.c +++ b/drivers/xen/swiotlb-xen.c @@ -126,6 +126,7 @@ xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs) { int i, rc; int dma_bits; + dma_addr_t dma_handle; dma_bits = get_order(IO_TLB_SEGSIZE << IO_TLB_SHIFT) + PAGE_SHIFT; @@ -137,7 +138,7 @@ xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs) rc = xen_create_contiguous_region( (unsigned long)buf + (i << IO_TLB_SHIFT), get_order(slabs << IO_TLB_SHIFT), - dma_bits); + dma_bits, &dma_handle); } while (rc && dma_bits++ < max_dma_bits); if (rc) return rc; @@ -294,11 +295,10 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size, *dma_handle = dev_addr; else { if (xen_create_contiguous_region(vstart, order, - fls64(dma_mask)) != 0) { + fls64(dma_mask), dma_handle) != 0) { free_pages(vstart, order); return NULL; } - *dma_handle = virt_to_machine(ret).maddr; } memset(ret, 0, size); return ret; diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h index d6fe062..9ef704d 100644 --- a/include/xen/xen-ops.h +++ b/include/xen/xen-ops.h @@ -20,7 +20,8 @@ int xen_setup_shutdown_event(void); extern unsigned long *xen_contiguous_bitmap; int xen_create_contiguous_region(unsigned long vstart, unsigned int order, - unsigned int address_bits); + unsigned int address_bits, + dma_addr_t *dma_handle); void xen_destroy_contiguous_region(unsigned long vstart, unsigned int order); -- 1.7.2.5
Stefano Stabellini
2013-Sep-27 16:09 UTC
[PATCH v6 05/19] swiotlb-xen: support autotranslate guests
Support autotranslate guests in swiotlb-xen by keeping track of the phys-to-bus and bus-to-phys mappings of the swiotlb buffer (xen_io_tlb_start-xen_io_tlb_end). Use a simple direct access on a pre-allocated array for phys-to-bus queries. Use a red-black tree for bus-to-phys queries. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Changes in v6: - add a xen_dma_info entry to the rbtree in xen_swiotlb_alloc_coherent to keep track of the new mapping. Free the entry in xen_swiotlb_free_coherent; - rename xen_dma_seg to dma_info in xen_swiotlb_alloc/free_coherent to avoid confusions. Changes in v5: - fix xen_dma_add_entry error path; - remove the spin_lock: the red-black tree is not modified at run time. Changes in v4: - add err_out label in xen_dma_add_entry; - remove INVALID_ADDRESS, use DMA_ERROR_CODE instead; - code style fixes; - add in-code comments regarding the usage of xen_dma_seg[0].dma_addr. Changes in v3: - many code style and name changes; - improve error checks in xen_dma_add_entry. --- drivers/xen/swiotlb-xen.c | 177 +++++++++++++++++++++++++++++++++++++++++---- 1 files changed, 161 insertions(+), 16 deletions(-) diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c index b72f31c..84aef43 100644 --- a/drivers/xen/swiotlb-xen.c +++ b/drivers/xen/swiotlb-xen.c @@ -38,32 +38,131 @@ #include <linux/bootmem.h> #include <linux/dma-mapping.h> #include <linux/export.h> +#include <linux/slab.h> +#include <linux/rbtree.h> #include <xen/swiotlb-xen.h> #include <xen/page.h> #include <xen/xen-ops.h> #include <xen/hvc-console.h> +#include <xen/features.h> /* * Used to do a quick range check in swiotlb_tbl_unmap_single and * swiotlb_tbl_sync_single_*, to see if the memory was in fact allocated by this * API. */ +#define NR_DMA_SEGS ((xen_io_tlb_nslabs + IO_TLB_SEGSIZE - 1) / IO_TLB_SEGSIZE) static char *xen_io_tlb_start, *xen_io_tlb_end; static unsigned long xen_io_tlb_nslabs; /* * Quick lookup value of the bus address of the IOTLB. */ -static u64 start_dma_addr; +struct xen_dma_info { + dma_addr_t dma_addr; + phys_addr_t phys_addr; + size_t size; + struct rb_node rbnode; +}; + +/* + * This array of struct xen_dma_info is indexed by physical addresses, + * starting from virt_to_phys(xen_io_tlb_start). Each entry maps + * (IO_TLB_SEGSIZE << IO_TLB_SHIFT) bytes, except the last one that is + * smaller. Getting the dma address corresponding to a given physical + * address can be done by direct access with the right index on the + * array. + */ +static struct xen_dma_info *xen_dma_seg; +/* + * This tree keeps track of bus address to physical address + * mappings. + */ +static struct rb_root bus_to_phys = RB_ROOT; + +static int xen_dma_add_entry(struct xen_dma_info *new) +{ + struct rb_node **link = &bus_to_phys.rb_node; + struct rb_node *parent = NULL; + struct xen_dma_info *entry; + int rc = 0; + + while (*link) { + parent = *link; + entry = rb_entry(parent, struct xen_dma_info, rbnode); + + if (new->dma_addr == entry->dma_addr) + goto err_out; + if (new->phys_addr == entry->phys_addr) + goto err_out; + + if (new->dma_addr < entry->dma_addr) + link = &(*link)->rb_left; + else + link = &(*link)->rb_right; + } + rb_link_node(&new->rbnode, parent, link); + rb_insert_color(&new->rbnode, &bus_to_phys); + goto out; + +err_out: + rc = -EINVAL; + pr_warn("%s: cannot add phys=%pa -> dma=%pa: phys=%pa -> dma=%pa already exists\n", + __func__, &new->phys_addr, &new->dma_addr, &entry->phys_addr, &entry->dma_addr); +out: + return rc; +} + +static struct xen_dma_info *xen_get_dma_info_from_dma(dma_addr_t dma_addr) +{ + struct rb_node *n = bus_to_phys.rb_node; + struct xen_dma_info *entry; + + while (n) { + entry = rb_entry(n, struct xen_dma_info, rbnode); + if (entry->dma_addr <= dma_addr && + entry->dma_addr + entry->size > dma_addr) { + return entry; + } + if (dma_addr < entry->dma_addr) + n = n->rb_left; + else + n = n->rb_right; + } + + return NULL; +} static dma_addr_t xen_phys_to_bus(phys_addr_t paddr) { - return phys_to_machine(XPADDR(paddr)).maddr; + int nr_seg; + unsigned long offset; + char *vaddr; + + if (!xen_feature(XENFEAT_auto_translated_physmap)) + return phys_to_machine(XPADDR(paddr)).maddr; + + vaddr = (char *)phys_to_virt(paddr); + if (vaddr >= xen_io_tlb_end || vaddr < xen_io_tlb_start) + return DMA_ERROR_CODE; + + offset = vaddr - xen_io_tlb_start; + nr_seg = offset / (IO_TLB_SEGSIZE << IO_TLB_SHIFT); + + return xen_dma_seg[nr_seg].dma_addr + + (paddr - xen_dma_seg[nr_seg].phys_addr); } static phys_addr_t xen_bus_to_phys(dma_addr_t baddr) { - return machine_to_phys(XMADDR(baddr)).paddr; + if (xen_feature(XENFEAT_auto_translated_physmap)) { + struct xen_dma_info *dma = xen_get_dma_info_from_dma(baddr); + if (dma == NULL) + return DMA_ERROR_CODE; + else + return dma->phys_addr + (baddr - dma->dma_addr); + } else + return machine_to_phys(XMADDR(baddr)).paddr; } static dma_addr_t xen_virt_to_bus(void *address) @@ -107,6 +206,9 @@ static int is_xen_swiotlb_buffer(dma_addr_t dma_addr) unsigned long pfn = mfn_to_local_pfn(mfn); phys_addr_t paddr; + if (xen_feature(XENFEAT_auto_translated_physmap)) + return 1; + /* If the address is outside our domain, it CAN * have the same virtual address as another address * in our domain. Therefore _only_ check address within our domain. @@ -124,13 +226,12 @@ static int max_dma_bits = 32; static int xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs) { - int i, rc; + int i, j, rc; int dma_bits; - dma_addr_t dma_handle; dma_bits = get_order(IO_TLB_SEGSIZE << IO_TLB_SHIFT) + PAGE_SHIFT; - i = 0; + i = j = 0; do { int slabs = min(nslabs - i, (unsigned long)IO_TLB_SEGSIZE); @@ -138,12 +239,18 @@ xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs) rc = xen_create_contiguous_region( (unsigned long)buf + (i << IO_TLB_SHIFT), get_order(slabs << IO_TLB_SHIFT), - dma_bits, &dma_handle); + dma_bits, &xen_dma_seg[j].dma_addr); } while (rc && dma_bits++ < max_dma_bits); if (rc) return rc; + xen_dma_seg[j].phys_addr = virt_to_phys(buf + (i << IO_TLB_SHIFT)); + xen_dma_seg[j].size = slabs << IO_TLB_SHIFT; + rc = xen_dma_add_entry(&xen_dma_seg[j]); + if (rc != 0) + return rc; i += slabs; + j++; } while (i < nslabs); return 0; } @@ -193,9 +300,10 @@ retry: /* * Get IO TLB memory from any location. */ - if (early) + if (early) { xen_io_tlb_start = alloc_bootmem_pages(PAGE_ALIGN(bytes)); - else { + xen_dma_seg = alloc_bootmem(sizeof(struct xen_dma_info) * NR_DMA_SEGS); + } else { #define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT)) #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT) while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) { @@ -210,6 +318,8 @@ retry: xen_io_tlb_nslabs = SLABS_PER_PAGE << order; bytes = xen_io_tlb_nslabs << IO_TLB_SHIFT; } + xen_dma_seg = kzalloc(sizeof(struct xen_dma_info) * NR_DMA_SEGS, + GFP_KERNEL); } if (!xen_io_tlb_start) { m_ret = XEN_SWIOTLB_ENOMEM; @@ -232,7 +342,6 @@ retry: m_ret = XEN_SWIOTLB_EFIXUP; goto error; } - start_dma_addr = xen_virt_to_bus(xen_io_tlb_start); if (early) { if (swiotlb_init_with_tbl(xen_io_tlb_start, xen_io_tlb_nslabs, verbose)) @@ -267,6 +376,7 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size, unsigned long vstart; phys_addr_t phys; dma_addr_t dev_addr; + struct xen_dma_info *dma_info = NULL; /* * Ignore region specifiers - the kernel''s ideas of @@ -290,7 +400,8 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size, phys = virt_to_phys(ret); dev_addr = xen_phys_to_bus(phys); - if (((dev_addr + size - 1 <= dma_mask)) && + if (!xen_feature(XENFEAT_auto_translated_physmap) && + ((dev_addr + size - 1 <= dma_mask)) && !range_straddles_page_boundary(phys, size)) *dma_handle = dev_addr; else { @@ -299,6 +410,22 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size, free_pages(vstart, order); return NULL; } + + dma_info = kzalloc(sizeof(struct xen_dma_info), GFP_KERNEL); + if (!dma_info) { + pr_warn("cannot allocate xen_dma_info\n"); + xen_destroy_contiguous_region(phys, order); + return NULL; + } + dma_info->phys_addr = phys; + dma_info->size = size; + dma_info->dma_addr = *dma_handle; + if (xen_dma_add_entry(dma_info)) { + pr_warn("cannot add new entry to bus_to_phys\n"); + xen_destroy_contiguous_region(phys, order); + kfree(dma_info); + return NULL; + } } memset(ret, 0, size); return ret; @@ -312,6 +439,7 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr, int order = get_order(size); phys_addr_t phys; u64 dma_mask = DMA_BIT_MASK(32); + struct xen_dma_info *dma_info = NULL; if (dma_release_from_coherent(hwdev, order, vaddr)) return; @@ -321,9 +449,14 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr, phys = virt_to_phys(vaddr); - if (((dev_addr + size - 1 > dma_mask)) || - range_straddles_page_boundary(phys, size)) + if (xen_feature(XENFEAT_auto_translated_physmap) || + (((dev_addr + size - 1 > dma_mask)) || + range_straddles_page_boundary(phys, size))) { xen_destroy_contiguous_region((unsigned long)vaddr, order); + dma_info = xen_get_dma_info_from_dma(dev_addr); + rb_erase(&dma_info->rbnode, &bus_to_phys); + kfree(dma_info); + } free_pages((unsigned long)vaddr, order); } @@ -351,14 +484,19 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page, * we can safely return the device addr and not worry about bounce * buffering it. */ - if (dma_capable(dev, dev_addr, size) && + if (!xen_feature(XENFEAT_auto_translated_physmap) && + dma_capable(dev, dev_addr, size) && !range_straddles_page_boundary(phys, size) && !swiotlb_force) return dev_addr; /* * Oh well, have to allocate and map a bounce buffer. + * Pass the dma_addr of the first slab in the iotlb buffer as + * argument so that swiotlb_tbl_map_single is free to allocate + * the bounce buffer anywhere appropriate in io_tlb_start - + * io_tlb_end. */ - map = swiotlb_tbl_map_single(dev, start_dma_addr, phys, size, dir); + map = swiotlb_tbl_map_single(dev, xen_dma_seg[0].dma_addr, phys, size, dir); if (map == SWIOTLB_MAP_ERROR) return DMA_ERROR_CODE; @@ -494,10 +632,17 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, dma_addr_t dev_addr = xen_phys_to_bus(paddr); if (swiotlb_force || + xen_feature(XENFEAT_auto_translated_physmap) || !dma_capable(hwdev, dev_addr, sg->length) || range_straddles_page_boundary(paddr, sg->length)) { + /* + * Pass the dma_addr of the first slab in the iotlb buffer as + * argument so that swiotlb_tbl_map_single is free to allocate + * the bounce buffer anywhere appropriate in io_tlb_start - + * io_tlb_end. + */ phys_addr_t map = swiotlb_tbl_map_single(hwdev, - start_dma_addr, + xen_dma_seg[0].dma_addr, sg_phys(sg), sg->length, dir); -- 1.7.2.5
Stefano Stabellini
2013-Sep-27 16:09 UTC
[PATCH v6 06/19] xen/arm,arm64: enable SWIOTLB_XEN
Xen on arm and arm64 needs SWIOTLB_XEN: when running on Xen we need to program the hardware with mfns rather than pfns for dma addresses. Remove SWIOTLB_XEN dependency on X86 and PCI and make XEN select SWIOTLB_XEN on arm and arm64. At the moment always rely on swiotlb-xen, but when Xen starts supporting hardware IOMMUs we''ll be able to avoid it conditionally on the presence of an IOMMU on the platform. Implement xen_create_contiguous_region on arm and arm64 by using XENMEM_exchange_and_pin. Initialize the xen-swiotlb from xen_early_init (before the native dma_ops are initialized), set xen_dma_ops to &xen_swiotlb_dma_ops. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Changes in v6: - introduce and export xen_dma_ops; - call xen_mm_init from as arch_initcall. Changes in v4: - remove redefinition of DMA_ERROR_CODE; - update the code to use XENMEM_exchange_and_pin and XENMEM_unpin; - add a note about hardware IOMMU in the commit message. Changes in v3: - code style changes; - warn on XENMEM_put_dma_buf failures. --- arch/arm/Kconfig | 1 + arch/arm/include/asm/xen/hypervisor.h | 2 + arch/arm/include/asm/xen/page.h | 2 + arch/arm/xen/Makefile | 2 +- arch/arm/xen/mm.c | 121 +++++++++++++++++++++++++++++++++ arch/arm64/Kconfig | 1 + arch/arm64/xen/Makefile | 2 +- drivers/xen/Kconfig | 1 - drivers/xen/swiotlb-xen.c | 16 +++++ 9 files changed, 145 insertions(+), 3 deletions(-) create mode 100644 arch/arm/xen/mm.c diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index c0bfb33..2c9d112 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -1848,6 +1848,7 @@ config XEN depends on CPU_V7 && !CPU_V6 depends on !GENERIC_ATOMIC64 select ARM_PSCI + select SWIOTLB_XEN help Say Y if you want to run Linux in a Virtual Machine on Xen on ARM. diff --git a/arch/arm/include/asm/xen/hypervisor.h b/arch/arm/include/asm/xen/hypervisor.h index d7ab99a..1317ee4 100644 --- a/arch/arm/include/asm/xen/hypervisor.h +++ b/arch/arm/include/asm/xen/hypervisor.h @@ -16,4 +16,6 @@ static inline enum paravirt_lazy_mode paravirt_get_lazy_mode(void) return PARAVIRT_LAZY_NONE; } +extern struct dma_map_ops *xen_dma_ops; + #endif /* _ASM_ARM_XEN_HYPERVISOR_H */ diff --git a/arch/arm/include/asm/xen/page.h b/arch/arm/include/asm/xen/page.h index 359a7b5..b0f7150 100644 --- a/arch/arm/include/asm/xen/page.h +++ b/arch/arm/include/asm/xen/page.h @@ -6,12 +6,14 @@ #include <linux/pfn.h> #include <linux/types.h> +#include <linux/dma-mapping.h> #include <xen/interface/grant_table.h> #define pfn_to_mfn(pfn) (pfn) #define phys_to_machine_mapping_valid(pfn) (1) #define mfn_to_pfn(mfn) (mfn) +#define mfn_to_local_pfn(m) (mfn_to_pfn(m)) #define mfn_to_virt(m) (__va(mfn_to_pfn(m) << PAGE_SHIFT)) #define pte_mfn pte_pfn diff --git a/arch/arm/xen/Makefile b/arch/arm/xen/Makefile index 4384103..66fc35d 100644 --- a/arch/arm/xen/Makefile +++ b/arch/arm/xen/Makefile @@ -1 +1 @@ -obj-y := enlighten.o hypercall.o grant-table.o +obj-y := enlighten.o hypercall.o grant-table.o mm.o diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c new file mode 100644 index 0000000..b065c98 --- /dev/null +++ b/arch/arm/xen/mm.c @@ -0,0 +1,121 @@ +#include <linux/bootmem.h> +#include <linux/gfp.h> +#include <linux/export.h> +#include <linux/slab.h> +#include <linux/types.h> +#include <linux/dma-mapping.h> +#include <linux/vmalloc.h> +#include <linux/swiotlb.h> + +#include <xen/xen.h> +#include <xen/interface/memory.h> +#include <xen/swiotlb-xen.h> + +#include <asm/cacheflush.h> +#include <asm/xen/page.h> +#include <asm/xen/hypercall.h> +#include <asm/xen/interface.h> + +static int xen_exchange_memory(xen_ulong_t extents_in, + unsigned int order_in, + xen_pfn_t *pfns_in, + xen_ulong_t extents_out, + unsigned int order_out, + xen_pfn_t *mfns_out, + unsigned int address_bits) +{ + long rc; + int success; + + struct xen_memory_exchange exchange = { + .in = { + .nr_extents = extents_in, + .extent_order = order_in, + .domid = DOMID_SELF + }, + .out = { + .nr_extents = extents_out, + .extent_order = order_out, + .address_bits = address_bits, + .domid = DOMID_SELF + } + }; + set_xen_guest_handle(exchange.in.extent_start, pfns_in); + set_xen_guest_handle(exchange.out.extent_start, mfns_out); + + BUG_ON(extents_in << order_in != extents_out << order_out); + + + rc = HYPERVISOR_memory_op(XENMEM_exchange_and_pin, &exchange); + success = (exchange.nr_exchanged == extents_in); + + BUG_ON(!success && ((exchange.nr_exchanged != 0) || (rc == 0))); + BUG_ON(success && (rc != 0)); + + return success; +} + +int xen_create_contiguous_region(unsigned long vstart, unsigned int order, + unsigned int address_bits, + dma_addr_t *dma_handle) +{ + phys_addr_t pstart = __pa(vstart); + xen_pfn_t in_frame, out_frame; + int success; + + /* Get a new contiguous memory extent. */ + in_frame = out_frame = pstart >> PAGE_SHIFT; + success = xen_exchange_memory(1, order, &in_frame, + 1, order, &out_frame, + address_bits); + + if (!success) + return -ENOMEM; + + *dma_handle = out_frame << PAGE_SHIFT; + + return success ? 0 : -ENOMEM; +} +EXPORT_SYMBOL_GPL(xen_create_contiguous_region); + +void xen_destroy_contiguous_region(unsigned long vstart, unsigned int order) +{ + xen_pfn_t in_frame = __pa(vstart) >> PAGE_SHIFT; + struct xen_unpin unpin = { + .in = { + .nr_extents = 1, + .extent_order = order, + .domid = DOMID_SELF + }, + }; + set_xen_guest_handle(unpin.in.extent_start, &in_frame); + + WARN_ON(HYPERVISOR_memory_op(XENMEM_unpin, &unpin)); +} +EXPORT_SYMBOL_GPL(xen_destroy_contiguous_region); + +struct dma_map_ops *xen_dma_ops; +EXPORT_SYMBOL_GPL(xen_dma_ops); + +static struct dma_map_ops xen_swiotlb_dma_ops = { + .mapping_error = xen_swiotlb_dma_mapping_error, + .alloc = xen_swiotlb_alloc_coherent, + .free = xen_swiotlb_free_coherent, + .sync_single_for_cpu = xen_swiotlb_sync_single_for_cpu, + .sync_single_for_device = xen_swiotlb_sync_single_for_device, + .sync_sg_for_cpu = xen_swiotlb_sync_sg_for_cpu, + .sync_sg_for_device = xen_swiotlb_sync_sg_for_device, + .map_sg = xen_swiotlb_map_sg_attrs, + .unmap_sg = xen_swiotlb_unmap_sg_attrs, + .map_page = xen_swiotlb_map_page, + .unmap_page = xen_swiotlb_unmap_page, + .dma_supported = xen_swiotlb_dma_supported, +}; + +int __init xen_mm_init(void) +{ + xen_swiotlb_init(1, false); + xen_dma_ops = &xen_swiotlb_dma_ops; + return 0; +} +arch_initcall(xen_mm_init); diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 9737e97..aa1f6fb 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -209,6 +209,7 @@ config XEN_DOM0 config XEN bool "Xen guest support on ARM64 (EXPERIMENTAL)" depends on ARM64 && OF + select SWIOTLB_XEN help Say Y if you want to run Linux in a Virtual Machine on Xen on ARM64. diff --git a/arch/arm64/xen/Makefile b/arch/arm64/xen/Makefile index be24040..0ef9637 100644 --- a/arch/arm64/xen/Makefile +++ b/arch/arm64/xen/Makefile @@ -1,2 +1,2 @@ -xen-arm-y += $(addprefix ../../arm/xen/, enlighten.o grant-table.o) +xen-arm-y += $(addprefix ../../arm/xen/, enlighten.o grant-table.o mm.o) obj-y := xen-arm.o hypercall.o diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig index 9e02d60..7e83688 100644 --- a/drivers/xen/Kconfig +++ b/drivers/xen/Kconfig @@ -140,7 +140,6 @@ config XEN_GRANT_DEV_ALLOC config SWIOTLB_XEN def_bool y - depends on PCI && X86 select SWIOTLB config XEN_TMEM diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c index 84aef43..a1cb0f4 100644 --- a/drivers/xen/swiotlb-xen.c +++ b/drivers/xen/swiotlb-xen.c @@ -45,6 +45,8 @@ #include <xen/xen-ops.h> #include <xen/hvc-console.h> #include <xen/features.h> +#include <asm/dma-mapping.h> + /* * Used to do a quick range check in swiotlb_tbl_unmap_single and * swiotlb_tbl_sync_single_*, to see if the memory was in fact allocated by this @@ -58,6 +60,20 @@ static unsigned long xen_io_tlb_nslabs; * Quick lookup value of the bus address of the IOTLB. */ +#ifndef CONFIG_X86 +static unsigned long dma_alloc_coherent_mask(struct device *dev, + gfp_t gfp) +{ + unsigned long dma_mask = 0; + + dma_mask = dev->coherent_dma_mask; + if (!dma_mask) + dma_mask = (gfp & GFP_DMA) ? DMA_BIT_MASK(24) : DMA_BIT_MASK(32); + + return dma_mask; +} +#endif + struct xen_dma_info { dma_addr_t dma_addr; phys_addr_t phys_addr; -- 1.7.2.5
Stefano Stabellini
2013-Sep-27 16:09 UTC
[PATCH v6 07/19] swiotlb-xen: introduce xen_swiotlb_set_dma_mask
Implement xen_swiotlb_set_dma_mask, use it for set_dma_mask on arm. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- arch/arm/xen/mm.c | 1 + drivers/xen/swiotlb-xen.c | 12 ++++++++++++ include/xen/swiotlb-xen.h | 2 ++ 3 files changed, 15 insertions(+), 0 deletions(-) diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c index b065c98..4330b15 100644 --- a/arch/arm/xen/mm.c +++ b/arch/arm/xen/mm.c @@ -110,6 +110,7 @@ static struct dma_map_ops xen_swiotlb_dma_ops = { .map_page = xen_swiotlb_map_page, .unmap_page = xen_swiotlb_unmap_page, .dma_supported = xen_swiotlb_dma_supported, + .set_dma_mask = xen_swiotlb_set_dma_mask, }; int __init xen_mm_init(void) diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c index a1cb0f4..deb9131 100644 --- a/drivers/xen/swiotlb-xen.c +++ b/drivers/xen/swiotlb-xen.c @@ -754,3 +754,15 @@ xen_swiotlb_dma_supported(struct device *hwdev, u64 mask) return xen_virt_to_bus(xen_io_tlb_end - 1) <= mask; } EXPORT_SYMBOL_GPL(xen_swiotlb_dma_supported); + +int +xen_swiotlb_set_dma_mask(struct device *dev, u64 dma_mask) +{ + if (!dev->dma_mask || !xen_swiotlb_dma_supported(dev, dma_mask)) + return -EIO; + + *dev->dma_mask = dma_mask; + + return 0; +} +EXPORT_SYMBOL_GPL(xen_swiotlb_set_dma_mask); diff --git a/include/xen/swiotlb-xen.h b/include/xen/swiotlb-xen.h index de8bcc6..7b64465 100644 --- a/include/xen/swiotlb-xen.h +++ b/include/xen/swiotlb-xen.h @@ -55,4 +55,6 @@ xen_swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t dma_addr); extern int xen_swiotlb_dma_supported(struct device *hwdev, u64 mask); +extern int +xen_swiotlb_set_dma_mask(struct device *dev, u64 dma_mask); #endif /* __LINUX_SWIOTLB_XEN_H */ -- 1.7.2.5
Stefano Stabellini
2013-Sep-27 16:09 UTC
[PATCH v6 08/19] arm/xen: get_dma_ops: return xen_dma_ops if we are running on Xen
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Suggested-by: Catalin Marinas <catalin.marinas@arm.com> CC: will.deacon@arm.com CC: linux@arm.linux.org.uk --- arch/arm/include/asm/dma-mapping.h | 13 ++++++++++++- 1 files changed, 12 insertions(+), 1 deletions(-) diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h index 8807124..b638d6d 100644 --- a/arch/arm/include/asm/dma-mapping.h +++ b/arch/arm/include/asm/dma-mapping.h @@ -12,17 +12,28 @@ #include <asm/memory.h> #include <asm/cacheflush.h> +#include <xen/xen.h> +#include <asm/xen/hypervisor.h> + #define DMA_ERROR_CODE (~0) extern struct dma_map_ops arm_dma_ops; extern struct dma_map_ops arm_coherent_dma_ops; -static inline struct dma_map_ops *get_dma_ops(struct device *dev) +static inline struct dma_map_ops *__get_dma_ops(struct device *dev) { if (dev && dev->archdata.dma_ops) return dev->archdata.dma_ops; return &arm_dma_ops; } +static inline struct dma_map_ops *get_dma_ops(struct device *dev) +{ + if (xen_domain()) + return xen_dma_ops; + else + return __get_dma_ops(dev); +} + static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops) { BUG_ON(!dev); -- 1.7.2.5
Stefano Stabellini
2013-Sep-27 16:09 UTC
[PATCH v6 09/19] arm64/xen: get_dma_ops: return xen_dma_ops if we are running on Xen
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Suggested-by: Catalin Marinas <catalin.marinas@arm.com> CC: catalin.marinas@arm.com CC: will.deacon@arm.com --- arch/arm64/include/asm/dma-mapping.h | 13 ++++++++++++- 1 files changed, 12 insertions(+), 1 deletions(-) diff --git a/arch/arm64/include/asm/dma-mapping.h b/arch/arm64/include/asm/dma-mapping.h index c2cb8a0..1dbd1d5 100644 --- a/arch/arm64/include/asm/dma-mapping.h +++ b/arch/arm64/include/asm/dma-mapping.h @@ -23,12 +23,15 @@ #include <asm-generic/dma-coherent.h> +#include <xen/xen.h> +#include <asm/xen/hypervisor.h> + #define ARCH_HAS_DMA_GET_REQUIRED_MASK #define DMA_ERROR_CODE (~0) extern struct dma_map_ops *dma_ops; -static inline struct dma_map_ops *get_dma_ops(struct device *dev) +static inline struct dma_map_ops *__get_dma_ops(struct device *dev) { if (unlikely(!dev) || !dev->archdata.dma_ops) return dma_ops; @@ -36,6 +39,14 @@ static inline struct dma_map_ops *get_dma_ops(struct device *dev) return dev->archdata.dma_ops; } +static inline struct dma_map_ops *get_dma_ops(struct device *dev) +{ + if (xen_domain()) + return xen_dma_ops; + else + return __get_dma_ops(dev); +} + #include <asm-generic/dma-mapping-common.h> static inline dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr) -- 1.7.2.5
Stefano Stabellini
2013-Sep-27 16:09 UTC
[PATCH v6 10/19] xen: introduce xen_alloc/free_coherent_pages
xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu and devices. On native x86 and ARMv8 is sufficient to call __get_free_pages in order to get a coherent buffer, while on ARM we need to call the native dma_ops->alloc implementation. When arm64 stops using the swiotlb by default and starts having multiple dma_ops implementations, we''ll use __get_dma_ops there too. Introduce xen_alloc_coherent_pages to abstract the arch specific buffer allocation. Similarly introduce xen_free_coherent_pages to free a coherent buffer: on x86 and ARM64 is simply a call to free_pages while on ARM is arm_dma_ops.free. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Changes in v6: - call __get_dma_ops to get the native dma_ops pointer on arm. --- arch/arm/include/asm/xen/page-coherent.h | 22 ++++++++++++++++++++++ arch/arm64/include/asm/xen/page-coherent.h | 24 ++++++++++++++++++++++++ arch/ia64/include/asm/xen/page-coherent.h | 24 ++++++++++++++++++++++++ arch/x86/include/asm/xen/page-coherent.h | 24 ++++++++++++++++++++++++ 4 files changed, 94 insertions(+), 0 deletions(-) create mode 100644 arch/arm/include/asm/xen/page-coherent.h create mode 100644 arch/arm64/include/asm/xen/page-coherent.h create mode 100644 arch/ia64/include/asm/xen/page-coherent.h create mode 100644 arch/x86/include/asm/xen/page-coherent.h diff --git a/arch/arm/include/asm/xen/page-coherent.h b/arch/arm/include/asm/xen/page-coherent.h new file mode 100644 index 0000000..92b7a8a --- /dev/null +++ b/arch/arm/include/asm/xen/page-coherent.h @@ -0,0 +1,22 @@ +#ifndef _ASM_ARM_XEN_PAGE_COHERENT_H +#define _ASM_ARM_XEN_PAGE_COHERENT_H + +#include <asm/page.h> +#include <linux/dma-attrs.h> +#include <linux/dma-mapping.h> + +static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size, + dma_addr_t *dma_handle, gfp_t flags, + struct dma_attrs *attrs) +{ + return __get_dma_ops(hwdev)->alloc(hwdev, size, dma_handle, flags, attrs); +} + +static inline void xen_free_coherent_pages(struct device *hwdev, size_t size, + void *cpu_addr, dma_addr_t dma_handle, + struct dma_attrs *attrs) +{ + __get_dma_ops(hwdev)->free(hwdev, size, cpu_addr, dma_handle, attrs); +} + +#endif /* _ASM_ARM_XEN_PAGE_COHERENT_H */ diff --git a/arch/arm64/include/asm/xen/page-coherent.h b/arch/arm64/include/asm/xen/page-coherent.h new file mode 100644 index 0000000..0d6ad25 --- /dev/null +++ b/arch/arm64/include/asm/xen/page-coherent.h @@ -0,0 +1,24 @@ +#ifndef _ASM_ARM64_XEN_PAGE_COHERENT_H +#define _ASM_ARM64_XEN_PAGE_COHERENT_H + +#include <asm/page.h> +#include <linux/dma-attrs.h> +#include <linux/dma-mapping.h> + +static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size, + dma_addr_t *dma_handle, gfp_t flags, + struct dma_attrs *attrs) +{ + void *vstart = (void*)__get_free_pages(flags, get_order(size)); + *dma_handle = virt_to_phys(vstart); + return vstart; +} + +static inline void xen_free_coherent_pages(struct device *hwdev, size_t size, + void *cpu_addr, dma_addr_t dma_handle, + struct dma_attrs *attrs) +{ + free_pages((unsigned long) cpu_addr, get_order(size)); +} + +#endif /* _ASM_ARM64_XEN_PAGE_COHERENT_H */ diff --git a/arch/ia64/include/asm/xen/page-coherent.h b/arch/ia64/include/asm/xen/page-coherent.h new file mode 100644 index 0000000..37b929c --- /dev/null +++ b/arch/ia64/include/asm/xen/page-coherent.h @@ -0,0 +1,24 @@ +#ifndef _ASM_IA64_XEN_PAGE_COHERENT_H +#define _ASM_IA64_XEN_PAGE_COHERENT_H + +#include <asm/page.h> +#include <linux/dma-attrs.h> +#include <linux/dma-mapping.h> + +static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size, + dma_addr_t *dma_handle, gfp_t flags, + struct dma_attrs *attrs) +{ + void *vstart = (void*)__get_free_pages(flags, get_order(size)); + *dma_handle = virt_to_phys(vstart); + return vstart; +} + +static inline void xen_free_coherent_pages(struct device *hwdev, size_t size, + void *cpu_addr, dma_addr_t dma_handle, + struct dma_attrs *attrs) +{ + free_pages((unsigned long) cpu_addr, get_order(size)); +} + +#endif /* _ASM_IA64_XEN_PAGE_COHERENT_H */ diff --git a/arch/x86/include/asm/xen/page-coherent.h b/arch/x86/include/asm/xen/page-coherent.h new file mode 100644 index 0000000..31de2e0 --- /dev/null +++ b/arch/x86/include/asm/xen/page-coherent.h @@ -0,0 +1,24 @@ +#ifndef _ASM_X86_XEN_PAGE_COHERENT_H +#define _ASM_X86_XEN_PAGE_COHERENT_H + +#include <asm/page.h> +#include <linux/dma-attrs.h> +#include <linux/dma-mapping.h> + +static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size, + dma_addr_t *dma_handle, gfp_t flags, + struct dma_attrs *attrs) +{ + void *vstart = (void*)__get_free_pages(flags, get_order(size)); + *dma_handle = virt_to_phys(vstart); + return vstart; +} + +static inline void xen_free_coherent_pages(struct device *hwdev, size_t size, + void *cpu_addr, dma_addr_t dma_handle, + struct dma_attrs *attrs) +{ + free_pages((unsigned long) cpu_addr, get_order(size)); +} + +#endif /* _ASM_X86_XEN_PAGE_COHERENT_H */ -- 1.7.2.5
Stefano Stabellini
2013-Sep-27 16:09 UTC
[PATCH v6 11/19] swiotlb-xen: use xen_alloc/free_coherent_pages
Use xen_alloc_coherent_pages and xen_free_coherent_pages to allocate or free coherent pages. We need to be careful handling the pointer returned by xen_alloc_coherent_pages, because on ARM the pointer is not equal to phys_to_virt(*dma_handle). In fact virt_to_phys only works for kernel direct mapped RAM memory. In ARM case the pointer could be an ioremap address, therefore passing it to virt_to_phys would give you another physical address that doesn''t correspond to it. Make xen_create_contiguous_region take a phys_addr_t as start parameter to avoid the virt_to_phys calls which would be incorrect. Changes in v6: - remove extra spaces. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- arch/arm/xen/mm.c | 7 +++---- arch/x86/xen/mmu.c | 7 +++++-- drivers/xen/swiotlb-xen.c | 31 +++++++++++++++++++++---------- include/xen/xen-ops.h | 4 ++-- 4 files changed, 31 insertions(+), 18 deletions(-) diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c index 4330b15..b305b94 100644 --- a/arch/arm/xen/mm.c +++ b/arch/arm/xen/mm.c @@ -55,11 +55,10 @@ static int xen_exchange_memory(xen_ulong_t extents_in, return success; } -int xen_create_contiguous_region(unsigned long vstart, unsigned int order, +int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order, unsigned int address_bits, dma_addr_t *dma_handle) { - phys_addr_t pstart = __pa(vstart); xen_pfn_t in_frame, out_frame; int success; @@ -78,9 +77,9 @@ int xen_create_contiguous_region(unsigned long vstart, unsigned int order, } EXPORT_SYMBOL_GPL(xen_create_contiguous_region); -void xen_destroy_contiguous_region(unsigned long vstart, unsigned int order) +void xen_destroy_contiguous_region(phys_addr_t pstart, unsigned int order) { - xen_pfn_t in_frame = __pa(vstart) >> PAGE_SHIFT; + xen_pfn_t in_frame = pstart >> PAGE_SHIFT; struct xen_unpin unpin = { .in = { .nr_extents = 1, diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c index 6c34d7c..8830883 100644 --- a/arch/x86/xen/mmu.c +++ b/arch/x86/xen/mmu.c @@ -2328,13 +2328,14 @@ static int xen_exchange_memory(unsigned long extents_in, unsigned int order_in, return success; } -int xen_create_contiguous_region(unsigned long vstart, unsigned int order, +int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order, unsigned int address_bits, dma_addr_t *dma_handle) { unsigned long *in_frames = discontig_frames, out_frame; unsigned long flags; int success; + unsigned long vstart = (unsigned long)phys_to_virt(pstart); /* * Currently an auto-translated guest will not perform I/O, nor will @@ -2374,11 +2375,12 @@ int xen_create_contiguous_region(unsigned long vstart, unsigned int order, } EXPORT_SYMBOL_GPL(xen_create_contiguous_region); -void xen_destroy_contiguous_region(unsigned long vstart, unsigned int order) +void xen_destroy_contiguous_region(phys_addr_t pstart, unsigned int order) { unsigned long *out_frames = discontig_frames, in_frame; unsigned long flags; int success; + unsigned long vstart; if (xen_feature(XENFEAT_auto_translated_physmap)) return; @@ -2386,6 +2388,7 @@ void xen_destroy_contiguous_region(unsigned long vstart, unsigned int order) if (unlikely(order > MAX_CONTIG_ORDER)) return; + vstart = (unsigned long)phys_to_virt(pstart); memset((void *) vstart, 0, PAGE_SIZE << order); spin_lock_irqsave(&xen_reservation_lock, flags); diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c index deb9131..96ad316 100644 --- a/drivers/xen/swiotlb-xen.c +++ b/drivers/xen/swiotlb-xen.c @@ -46,6 +46,7 @@ #include <xen/hvc-console.h> #include <xen/features.h> #include <asm/dma-mapping.h> +#include <asm/xen/page-coherent.h> /* * Used to do a quick range check in swiotlb_tbl_unmap_single and @@ -244,6 +245,7 @@ xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs) { int i, j, rc; int dma_bits; + phys_addr_t p = virt_to_phys(buf); dma_bits = get_order(IO_TLB_SEGSIZE << IO_TLB_SHIFT) + PAGE_SHIFT; @@ -253,7 +255,7 @@ xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs) do { rc = xen_create_contiguous_region( - (unsigned long)buf + (i << IO_TLB_SHIFT), + p + (i << IO_TLB_SHIFT), get_order(slabs << IO_TLB_SHIFT), dma_bits, &xen_dma_seg[j].dma_addr); } while (rc && dma_bits++ < max_dma_bits); @@ -389,7 +391,6 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size, void *ret; int order = get_order(size); u64 dma_mask = DMA_BIT_MASK(32); - unsigned long vstart; phys_addr_t phys; dma_addr_t dev_addr; struct xen_dma_info *dma_info = NULL; @@ -405,8 +406,12 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size, if (dma_alloc_from_coherent(hwdev, size, dma_handle, &ret)) return ret; - vstart = __get_free_pages(flags, order); - ret = (void *)vstart; + /* On ARM this function returns an ioremap''ped virtual address for + * which virt_to_phys doesn''t return the corresponding physical + * address. In fact on ARM virt_to_phys only works for kernel direct + * mapped RAM memory. Also see comment below. + */ + ret = xen_alloc_coherent_pages(hwdev, size, dma_handle, flags, attrs); if (!ret) return ret; @@ -414,16 +419,20 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size, if (hwdev && hwdev->coherent_dma_mask) dma_mask = dma_alloc_coherent_mask(hwdev, flags); - phys = virt_to_phys(ret); + /* At this point dma_handle is the physical address, next we are + * going to set it to the machine address. + * Do not use virt_to_phys(ret) because on ARM it doesn''t correspond + * to *dma_handle. */ + phys = *dma_handle; dev_addr = xen_phys_to_bus(phys); if (!xen_feature(XENFEAT_auto_translated_physmap) && ((dev_addr + size - 1 <= dma_mask)) && !range_straddles_page_boundary(phys, size)) *dma_handle = dev_addr; else { - if (xen_create_contiguous_region(vstart, order, + if (xen_create_contiguous_region(phys, order, fls64(dma_mask), dma_handle) != 0) { - free_pages(vstart, order); + xen_free_coherent_pages(hwdev, size, ret, (dma_addr_t)phys, attrs); return NULL; } @@ -463,18 +472,20 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr, if (hwdev && hwdev->coherent_dma_mask) dma_mask = hwdev->coherent_dma_mask; - phys = virt_to_phys(vaddr); + /* do not use virt_to_phys because on ARM it doesn''t return you the + * physical address */ + phys = xen_bus_to_phys(dev_addr); if (xen_feature(XENFEAT_auto_translated_physmap) || (((dev_addr + size - 1 > dma_mask)) || range_straddles_page_boundary(phys, size))) { - xen_destroy_contiguous_region((unsigned long)vaddr, order); + xen_destroy_contiguous_region(phys, order); dma_info = xen_get_dma_info_from_dma(dev_addr); rb_erase(&dma_info->rbnode, &bus_to_phys); kfree(dma_info); } - free_pages((unsigned long)vaddr, order); + xen_free_coherent_pages(hwdev, size, vaddr, (dma_addr_t)phys, attrs); } EXPORT_SYMBOL_GPL(xen_swiotlb_free_coherent); diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h index 9ef704d..fb2ea8f 100644 --- a/include/xen/xen-ops.h +++ b/include/xen/xen-ops.h @@ -19,11 +19,11 @@ void xen_arch_resume(void); int xen_setup_shutdown_event(void); extern unsigned long *xen_contiguous_bitmap; -int xen_create_contiguous_region(unsigned long vstart, unsigned int order, +int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order, unsigned int address_bits, dma_addr_t *dma_handle); -void xen_destroy_contiguous_region(unsigned long vstart, unsigned int order); +void xen_destroy_contiguous_region(phys_addr_t pstart, unsigned int order); struct vm_area_struct; int xen_remap_domain_mfn_range(struct vm_area_struct *vma, -- 1.7.2.5
Stefano Stabellini
2013-Sep-27 16:10 UTC
[PATCH v6 12/19] swiotlb: don''t assume that io_tlb_start-io_tlb_end is coherent
The swiotlb code has appropriate calls to dma_mark_clean in place for buffers passed to swiotlb_map_page as an argument. However it assumes that the swiotlb bounce buffer (io_tlb_start-io_tlb_end) is already coherent and doesn''t need any calls to dma_mark_clean. On ARM the swiotlb bounce buffer is not coherent (the memory is writealloc while it should be bufferable) and therefore we need to call dma_mark_clean appropriately on the bounce buffer code paths too. Note that most architecures have an empty dma_mark_clean implementation anyway. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- lib/swiotlb.c | 13 ++++++++++--- 1 files changed, 10 insertions(+), 3 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index 4e8686c..eb45d17 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -515,6 +515,7 @@ found: io_tlb_orig_addr[index+i] = orig_addr + (i << IO_TLB_SHIFT); if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL) swiotlb_bounce(orig_addr, tlb_addr, size, DMA_TO_DEVICE); + dma_mark_clean(phys_to_virt(tlb_addr), size); return tlb_addr; } @@ -547,7 +548,10 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr, * First, sync the memory before unmapping the entry */ if (orig_addr && ((dir == DMA_FROM_DEVICE) || (dir == DMA_BIDIRECTIONAL))) + { + dma_mark_clean(phys_to_virt(tlb_addr), size); swiotlb_bounce(orig_addr, tlb_addr, size, DMA_FROM_DEVICE); + } /* * Return the buffer to the free list by setting the corresponding @@ -587,17 +591,20 @@ void swiotlb_tbl_sync_single(struct device *hwdev, phys_addr_t tlb_addr, switch (target) { case SYNC_FOR_CPU: - if (likely(dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL)) + if (likely(dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL)) { + dma_mark_clean(phys_to_virt(tlb_addr), size); swiotlb_bounce(orig_addr, tlb_addr, size, DMA_FROM_DEVICE); + } else BUG_ON(dir != DMA_TO_DEVICE); break; case SYNC_FOR_DEVICE: - if (likely(dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)) + if (likely(dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)) { swiotlb_bounce(orig_addr, tlb_addr, size, DMA_TO_DEVICE); - else + dma_mark_clean(phys_to_virt(tlb_addr), size); + } else BUG_ON(dir != DMA_FROM_DEVICE); break; default: -- 1.7.2.5
Stefano Stabellini
2013-Sep-27 16:10 UTC
[PATCH v6 13/19] ASoC: Samsung: Rename dma_ops by samsung_dma_ops
From: Julien Grall <julien.grall@linaro.org> The commit "arm: introduce a global dma_ops pointer" introduce compilation issue when CONFIG_SND_SOC_SAMSUNG is enabled. sound/soc/samsung/dma.c:345:27: error: conflicting types for ''dma_ops'' /local/home/julien/works/arndale/linux/arch/arm/include/asm/dma-mapping.h:16:28: note: previous declaration of ''dma_ops'' was here Signed-off-by: Julien Grall <julien.grall@linaro.org> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- sound/soc/samsung/dma.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/sound/soc/samsung/dma.c b/sound/soc/samsung/dma.c index 21b7926..827c0d8 100644 --- a/sound/soc/samsung/dma.c +++ b/sound/soc/samsung/dma.c @@ -341,7 +341,7 @@ static int dma_mmap(struct snd_pcm_substream *substream, runtime->dma_bytes); } -static struct snd_pcm_ops dma_ops = { +static struct snd_pcm_ops samsung_dma_ops = { .open = dma_open, .close = dma_close, .ioctl = snd_pcm_lib_ioctl, @@ -428,7 +428,7 @@ out: } static struct snd_soc_platform_driver samsung_asoc_platform = { - .ops = &dma_ops, + .ops = &samsung_dma_ops, .pcm_new = dma_new, .pcm_free = dma_free_dma_buffers, }; -- 1.7.2.5
Stefano Stabellini
2013-Sep-27 16:10 UTC
[PATCH v6 14/19] swiotlb: print a warning when the swiotlb is full
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- drivers/xen/swiotlb-xen.c | 1 + lib/swiotlb.c | 1 + 2 files changed, 2 insertions(+), 0 deletions(-) diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c index 96ad316..790c2eb 100644 --- a/drivers/xen/swiotlb-xen.c +++ b/drivers/xen/swiotlb-xen.c @@ -674,6 +674,7 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, sg->length, dir); if (map == SWIOTLB_MAP_ERROR) { + pr_warn("swiotlb buffer is full\n"); /* Don''t panic here, we expect map_sg users to do proper error handling. */ xen_swiotlb_unmap_sg_attrs(hwdev, sgl, i, dir, diff --git a/lib/swiotlb.c b/lib/swiotlb.c index eb45d17..f06da0d 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -502,6 +502,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, not_found: spin_unlock_irqrestore(&io_tlb_lock, flags); + pr_warn("swiotlb buffer is full\n"); return SWIOTLB_MAP_ERROR; found: spin_unlock_irqrestore(&io_tlb_lock, flags); -- 1.7.2.5
Stefano Stabellini
2013-Sep-27 16:10 UTC
[PATCH v6 15/19] swiotlb-xen: call dma_capable only if dev->dma_mask is allocated
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- drivers/xen/swiotlb-xen.c | 6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c index 790c2eb..3011736 100644 --- a/drivers/xen/swiotlb-xen.c +++ b/drivers/xen/swiotlb-xen.c @@ -512,7 +512,7 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page, * buffering it. */ if (!xen_feature(XENFEAT_auto_translated_physmap) && - dma_capable(dev, dev_addr, size) && + dev->dma_mask && dma_capable(dev, dev_addr, size) && !range_straddles_page_boundary(phys, size) && !swiotlb_force) return dev_addr; @@ -532,7 +532,7 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page, /* * Ensure that the address returned is DMA''ble */ - if (!dma_capable(dev, dev_addr, size)) { + if (dev->dma_mask && !dma_capable(dev, dev_addr, size)) { swiotlb_tbl_unmap_single(dev, map, size, dir); dev_addr = 0; } @@ -660,7 +660,7 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, if (swiotlb_force || xen_feature(XENFEAT_auto_translated_physmap) || - !dma_capable(hwdev, dev_addr, sg->length) || + (hwdev->dma_mask && !dma_capable(hwdev, dev_addr, sg->length)) || range_straddles_page_boundary(paddr, sg->length)) { /* * Pass the dma_addr of the first slab in the iotlb buffer as -- 1.7.2.5
Stefano Stabellini
2013-Sep-27 16:10 UTC
[PATCH v6 16/19] arm,arm64: do not always merge biovec if we are running on Xen
This is similar to what it is done on X86: biovecs are prevented from merging otherwise every dma requests would be forced to bounce on the swiotlb buffer. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- arch/arm/include/asm/io.h | 8 ++++++++ arch/arm64/include/asm/io.h | 9 +++++++++ drivers/xen/biomerge.c | 4 +++- 3 files changed, 20 insertions(+), 1 deletions(-) diff --git a/arch/arm/include/asm/io.h b/arch/arm/include/asm/io.h index d070741..c45effc 100644 --- a/arch/arm/include/asm/io.h +++ b/arch/arm/include/asm/io.h @@ -24,9 +24,11 @@ #ifdef __KERNEL__ #include <linux/types.h> +#include <linux/blk_types.h> #include <asm/byteorder.h> #include <asm/memory.h> #include <asm-generic/pci_iomap.h> +#include <xen/xen.h> /* * ISA I/O bus memory addresses are 1:1 with the physical address. @@ -372,6 +374,12 @@ extern void pci_iounmap(struct pci_dev *dev, void __iomem *addr); #define BIOVEC_MERGEABLE(vec1, vec2) \ ((bvec_to_phys((vec1)) + (vec1)->bv_len) == bvec_to_phys((vec2))) +extern bool xen_biovec_phys_mergeable(const struct bio_vec *vec1, + const struct bio_vec *vec2); +#define BIOVEC_PHYS_MERGEABLE(vec1, vec2) \ + (__BIOVEC_PHYS_MERGEABLE(vec1, vec2) && \ + (!xen_domain() || xen_biovec_phys_mergeable(vec1, vec2))) + #ifdef CONFIG_MMU #define ARCH_HAS_VALID_PHYS_ADDR_RANGE extern int valid_phys_addr_range(phys_addr_t addr, size_t size); diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h index 1d12f89..c163287b 100644 --- a/arch/arm64/include/asm/io.h +++ b/arch/arm64/include/asm/io.h @@ -22,11 +22,14 @@ #ifdef __KERNEL__ #include <linux/types.h> +#include <linux/blk_types.h> #include <asm/byteorder.h> #include <asm/barrier.h> #include <asm/pgtable.h> +#include <xen/xen.h> + /* * Generic IO read/write. These perform native-endian accesses. */ @@ -263,5 +266,11 @@ extern int devmem_is_allowed(unsigned long pfn); */ #define xlate_dev_kmem_ptr(p) p +extern bool xen_biovec_phys_mergeable(const struct bio_vec *vec1, + const struct bio_vec *vec2); +#define BIOVEC_PHYS_MERGEABLE(vec1, vec2) \ + (__BIOVEC_PHYS_MERGEABLE(vec1, vec2) && \ + (!xen_domain() || xen_biovec_phys_mergeable(vec1, vec2))) + #endif /* __KERNEL__ */ #endif /* __ASM_IO_H */ diff --git a/drivers/xen/biomerge.c b/drivers/xen/biomerge.c index 0edb91c..f323a2d 100644 --- a/drivers/xen/biomerge.c +++ b/drivers/xen/biomerge.c @@ -2,6 +2,7 @@ #include <linux/io.h> #include <linux/export.h> #include <xen/page.h> +#include <xen/features.h> bool xen_biovec_phys_mergeable(const struct bio_vec *vec1, const struct bio_vec *vec2) @@ -9,7 +10,8 @@ bool xen_biovec_phys_mergeable(const struct bio_vec *vec1, unsigned long mfn1 = pfn_to_mfn(page_to_pfn(vec1->bv_page)); unsigned long mfn2 = pfn_to_mfn(page_to_pfn(vec2->bv_page)); - return __BIOVEC_PHYS_MERGEABLE(vec1, vec2) && + return !xen_feature(XENFEAT_auto_translated_physmap) && + __BIOVEC_PHYS_MERGEABLE(vec1, vec2) && ((mfn1 == mfn2) || ((mfn1+1) == mfn2)); } EXPORT_SYMBOL(xen_biovec_phys_mergeable); -- 1.7.2.5
Introduce a new hypercall to pin one or more pages whose machine addresses respect a dma_mask passed as an argument Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- arch/arm/xen/mm.c | 16 ++++++++++++++++ arch/x86/xen/mmu.c | 7 +++++++ include/xen/interface/memory.h | 32 ++++++++++++++++++++++++++++++++ include/xen/xen-ops.h | 1 + 4 files changed, 56 insertions(+), 0 deletions(-) diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c index b305b94..146c1c3 100644 --- a/arch/arm/xen/mm.c +++ b/arch/arm/xen/mm.c @@ -55,6 +55,22 @@ static int xen_exchange_memory(xen_ulong_t extents_in, return success; } +int xen_pin_page(xen_pfn_t *in_frame, unsigned int address_bits) +{ + struct xen_pin pin = { + .in = { + .nr_extents = 1, + .extent_order = 0, + .domid = DOMID_SELF, + .address_bits = address_bits + }, + }; + set_xen_guest_handle(pin.in.extent_start, in_frame); + + return HYPERVISOR_memory_op(XENMEM_pin, &pin); +} +EXPORT_SYMBOL_GPL(xen_pin_page); + int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order, unsigned int address_bits, dma_addr_t *dma_handle) diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c index 8830883..8f76ce2 100644 --- a/arch/x86/xen/mmu.c +++ b/arch/x86/xen/mmu.c @@ -2568,3 +2568,10 @@ int xen_unmap_domain_mfn_range(struct vm_area_struct *vma, return -EINVAL; } EXPORT_SYMBOL_GPL(xen_unmap_domain_mfn_range); + +int xen_pin_page(xen_pfn_t *in_frame, unsigned int address_bits) +{ + return -ENOSYS; +} +EXPORT_SYMBOL_GPL(xen_pin_page); + diff --git a/include/xen/interface/memory.h b/include/xen/interface/memory.h index 49db252..66ab578 100644 --- a/include/xen/interface/memory.h +++ b/include/xen/interface/memory.h @@ -314,4 +314,36 @@ struct xen_unpin { }; DEFINE_GUEST_HANDLE_STRUCT(xen_unpin); +/* + * XENMEM_pin pins a set of pages to make sure that the hypervisor does + * not change the p2m mappings for them. + * + */ +#define XENMEM_pin 28 +struct xen_pin { + /* + * [IN/OUT] Details of memory extents to be pinned (GMFN bases). + * Xen copies back the MFNs corresponding to the GMFNs passed in as + * argument. + * @in.address_bits contains the maximum number of bits addressable + * by the caller. If the machine addresses of the pages to be pinned + * are not addressable according to @in.address_bits, the hypercall + * fails and returns an errors. The pages are not pinned. Otherwise + * the hypercall succeeds. + */ + struct xen_memory_reservation in; + + /* + * [OUT] Number of input extents that were successfully pinned. + * 1. The first @nr_pinned input extents were successfully + * pinned. + * 2. All other input extents are untouched. + * 3. If not all input extents are pinned then the return code of this + * command will be non-zero. + */ + xen_ulong_t nr_pinned; +}; +DEFINE_GUEST_HANDLE_STRUCT(xen_pin); + + #endif /* __XEN_PUBLIC_MEMORY_H__ */ diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h index fb2ea8f..4cf4fc5 100644 --- a/include/xen/xen-ops.h +++ b/include/xen/xen-ops.h @@ -22,6 +22,7 @@ extern unsigned long *xen_contiguous_bitmap; int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order, unsigned int address_bits, dma_addr_t *dma_handle); +int xen_pin_page(xen_pfn_t *in_frame, unsigned int address_bits); void xen_destroy_contiguous_region(phys_addr_t pstart, unsigned int order); -- 1.7.2.5
Stefano Stabellini
2013-Sep-27 16:10 UTC
[PATCH v6 18/19] swiotlb-xen: introduce a rbtree to track phys to bus mappings
Introduce a second red-back tree to track phys to bus mappings created after the initialization of the swiotlb buffer. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- drivers/xen/swiotlb-xen.c | 99 +++++++++++++++++++++++++++++++++++++------- 1 files changed, 83 insertions(+), 16 deletions(-) diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c index 3011736..022bcaf 100644 --- a/drivers/xen/swiotlb-xen.c +++ b/drivers/xen/swiotlb-xen.c @@ -79,7 +79,8 @@ struct xen_dma_info { dma_addr_t dma_addr; phys_addr_t phys_addr; size_t size; - struct rb_node rbnode; + struct rb_node rbnode_dma; + struct rb_node rbnode_phys; }; /* @@ -96,8 +97,13 @@ static struct xen_dma_info *xen_dma_seg; * mappings. */ static struct rb_root bus_to_phys = RB_ROOT; +/* + * This tree keeps track of physical address to bus address + * mappings apart from the ones belonging to the initial swiotlb buffer. + */ +static struct rb_root phys_to_bus = RB_ROOT; -static int xen_dma_add_entry(struct xen_dma_info *new) +static int xen_dma_add_entry_bus(struct xen_dma_info *new) { struct rb_node **link = &bus_to_phys.rb_node; struct rb_node *parent = NULL; @@ -106,7 +112,7 @@ static int xen_dma_add_entry(struct xen_dma_info *new) while (*link) { parent = *link; - entry = rb_entry(parent, struct xen_dma_info, rbnode); + entry = rb_entry(parent, struct xen_dma_info, rbnode_dma); if (new->dma_addr == entry->dma_addr) goto err_out; @@ -118,8 +124,41 @@ static int xen_dma_add_entry(struct xen_dma_info *new) else link = &(*link)->rb_right; } - rb_link_node(&new->rbnode, parent, link); - rb_insert_color(&new->rbnode, &bus_to_phys); + rb_link_node(&new->rbnode_dma, parent, link); + rb_insert_color(&new->rbnode_dma, &bus_to_phys); + goto out; + +err_out: + rc = -EINVAL; + pr_warn("%s: cannot add phys=%pa -> dma=%pa: phys=%pa -> dma=%pa already exists\n", + __func__, &new->phys_addr, &new->dma_addr, &entry->phys_addr, &entry->dma_addr); +out: + return rc; +} + +static int xen_dma_add_entry_phys(struct xen_dma_info *new) +{ + struct rb_node **link = &phys_to_bus.rb_node; + struct rb_node *parent = NULL; + struct xen_dma_info *entry; + int rc = 0; + + while (*link) { + parent = *link; + entry = rb_entry(parent, struct xen_dma_info, rbnode_phys); + + if (new->dma_addr == entry->dma_addr) + goto err_out; + if (new->phys_addr == entry->phys_addr) + goto err_out; + + if (new->phys_addr < entry->phys_addr) + link = &(*link)->rb_left; + else + link = &(*link)->rb_right; + } + rb_link_node(&new->rbnode_phys, parent, link); + rb_insert_color(&new->rbnode_phys, &phys_to_bus); goto out; err_out: @@ -130,13 +169,22 @@ out: return rc; } +static int xen_dma_add_entry(struct xen_dma_info *new) +{ + int rc; + if ((rc = xen_dma_add_entry_bus(new) < 0) || + (rc = xen_dma_add_entry_phys(new) < 0)) + return rc; + return 0; +} + static struct xen_dma_info *xen_get_dma_info_from_dma(dma_addr_t dma_addr) { struct rb_node *n = bus_to_phys.rb_node; struct xen_dma_info *entry; while (n) { - entry = rb_entry(n, struct xen_dma_info, rbnode); + entry = rb_entry(n, struct xen_dma_info, rbnode_dma); if (entry->dma_addr <= dma_addr && entry->dma_addr + entry->size > dma_addr) { return entry; @@ -150,11 +198,30 @@ static struct xen_dma_info *xen_get_dma_info_from_dma(dma_addr_t dma_addr) return NULL; } -static dma_addr_t xen_phys_to_bus(phys_addr_t paddr) +static struct xen_dma_info *xen_get_dma_info_from_phys(phys_addr_t phys) { - int nr_seg; - unsigned long offset; - char *vaddr; + struct rb_node *n = phys_to_bus.rb_node; + struct xen_dma_info *entry; + + while (n) { + entry = rb_entry(n, struct xen_dma_info, rbnode_phys); + if (entry->phys_addr <= phys && + entry->phys_addr + entry->size > phys) { + return entry; + } + if (phys < entry->phys_addr) + n = n->rb_left; + else + n = n->rb_right; + } + + return NULL; +} + +/* Only looks into the initial buffer allocation in case of + * XENFEAT_auto_translated_physmap guests. */ +static dma_addr_t xen_phys_to_bus_quick(phys_addr_t paddr) { int nr_seg; + unsigned long offset; char *vaddr; if (!xen_feature(XENFEAT_auto_translated_physmap)) return phys_to_machine(XPADDR(paddr)).maddr; @@ -184,7 +251,7 @@ static phys_addr_t xen_bus_to_phys(dma_addr_t baddr) static dma_addr_t xen_virt_to_bus(void *address) { - return xen_phys_to_bus(virt_to_phys(address)); + return xen_phys_to_bus_quick(virt_to_phys(address)); } static int check_pages_physically_contiguous(unsigned long pfn, @@ -424,7 +491,7 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size, * Do not use virt_to_phys(ret) because on ARM it doesn''t correspond * to *dma_handle. */ phys = *dma_handle; - dev_addr = xen_phys_to_bus(phys); + dev_addr = xen_phys_to_bus_quick(phys); if (!xen_feature(XENFEAT_auto_translated_physmap) && ((dev_addr + size - 1 <= dma_mask)) && !range_straddles_page_boundary(phys, size)) @@ -503,7 +570,7 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page, struct dma_attrs *attrs) { phys_addr_t map, phys = page_to_phys(page) + offset; - dma_addr_t dev_addr = xen_phys_to_bus(phys); + dma_addr_t dev_addr = xen_phys_to_bus_quick(phys); BUG_ON(dir == DMA_NONE); /* @@ -527,7 +594,7 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page, if (map == SWIOTLB_MAP_ERROR) return DMA_ERROR_CODE; - dev_addr = xen_phys_to_bus(map); + dev_addr = xen_phys_to_bus_quick(map); /* * Ensure that the address returned is DMA''ble @@ -656,7 +723,7 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, for_each_sg(sgl, sg, nelems, i) { phys_addr_t paddr = sg_phys(sg); - dma_addr_t dev_addr = xen_phys_to_bus(paddr); + dma_addr_t dev_addr = xen_phys_to_bus_quick(paddr); if (swiotlb_force || xen_feature(XENFEAT_auto_translated_physmap) || @@ -682,7 +749,7 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, sg_dma_len(sgl) = 0; return DMA_ERROR_CODE; } - sg->dma_address = xen_phys_to_bus(map); + sg->dma_address = xen_phys_to_bus_quick(map); } else sg->dma_address = dev_addr; sg_dma_len(sg) = sg->length; -- 1.7.2.5
Stefano Stabellini
2013-Sep-27 16:10 UTC
[PATCH v6 19/19] swiotlb-xen: instead of bouncing on the swiotlb, pin single pages
If we are dealing with single page mappings that don''t cross page boundaries, we can try to pin the page and get the corresponding mfn, using xen_pin_page. This avoids going through the swiotlb bounce buffer. If xen_pin_page fails (because the underlying mfn doesn''t respect the dma_mask) fall back to the swiotlb bounce buffer. Add a ref count to xen_dma_info, so that we can avoid pinnig pages that are already pinned. Use a spinlock to protect accesses, insertions and deletions in the rbtrees. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- drivers/xen/swiotlb-xen.c | 152 ++++++++++++++++++++++++++++++++++++++++++--- 1 files changed, 143 insertions(+), 9 deletions(-) diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c index 022bcaf..6f94285 100644 --- a/drivers/xen/swiotlb-xen.c +++ b/drivers/xen/swiotlb-xen.c @@ -57,6 +57,8 @@ #define NR_DMA_SEGS ((xen_io_tlb_nslabs + IO_TLB_SEGSIZE - 1) / IO_TLB_SEGSIZE) static char *xen_io_tlb_start, *xen_io_tlb_end; static unsigned long xen_io_tlb_nslabs; +spinlock_t swiotlb_lock; + /* * Quick lookup value of the bus address of the IOTLB. */ @@ -79,6 +81,7 @@ struct xen_dma_info { dma_addr_t dma_addr; phys_addr_t phys_addr; size_t size; + atomic_t refs; struct rb_node rbnode_dma; struct rb_node rbnode_phys; }; @@ -254,6 +257,48 @@ static dma_addr_t xen_virt_to_bus(void *address) return xen_phys_to_bus_quick(virt_to_phys(address)); } +static int xen_pin_dev_page(struct device *dev, + phys_addr_t phys, + dma_addr_t *dev_addr) +{ + u64 dma_mask = DMA_BIT_MASK(32); + xen_pfn_t in; + struct xen_dma_info *dma_info = xen_get_dma_info_from_phys(phys); + + if (dma_info != NULL) { + atomic_inc(&dma_info->refs); + *dev_addr = dma_info->dma_addr + (phys - dma_info->phys_addr); + return 0; + } + + if (dev && dev->coherent_dma_mask) + dma_mask = dma_alloc_coherent_mask(dev, GFP_KERNEL); + + in = phys >> PAGE_SHIFT; + if (!xen_pin_page(&in, fls64(dma_mask))) { + *dev_addr = in << PAGE_SHIFT; + dma_info = kzalloc(sizeof(struct xen_dma_info), GFP_NOWAIT); + if (!dma_info) { + pr_warn("cannot allocate xen_dma_info\n"); + xen_destroy_contiguous_region(phys & PAGE_MASK, 0); + return -ENOMEM; + } + dma_info->phys_addr = phys & PAGE_MASK; + dma_info->size = PAGE_SIZE; + dma_info->dma_addr = *dev_addr; + if (xen_dma_add_entry(dma_info)) { + pr_warn("cannot add new entry to bus_to_phys\n"); + xen_destroy_contiguous_region(phys & PAGE_MASK, 0); + kfree(dma_info); + return -EFAULT; + } + atomic_set(&dma_info->refs, 1); + *dev_addr += (phys & ~PAGE_MASK); + return 0; + } + return -EFAULT; +} + static int check_pages_physically_contiguous(unsigned long pfn, unsigned int offset, size_t length) @@ -434,6 +479,7 @@ retry: rc = 0; } else rc = swiotlb_late_init_with_tbl(xen_io_tlb_start, xen_io_tlb_nslabs); + spin_lock_init(&swiotlb_lock); return rc; error: if (repeat--) { @@ -461,6 +507,7 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size, phys_addr_t phys; dma_addr_t dev_addr; struct xen_dma_info *dma_info = NULL; + unsigned long irqflags; /* * Ignore region specifiers - the kernel''s ideas of @@ -497,7 +544,7 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size, !range_straddles_page_boundary(phys, size)) *dma_handle = dev_addr; else { - if (xen_create_contiguous_region(phys, order, + if (xen_create_contiguous_region(phys & PAGE_MASK, order, fls64(dma_mask), dma_handle) != 0) { xen_free_coherent_pages(hwdev, size, ret, (dma_addr_t)phys, attrs); return NULL; @@ -509,15 +556,19 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size, xen_destroy_contiguous_region(phys, order); return NULL; } - dma_info->phys_addr = phys; - dma_info->size = size; + dma_info->phys_addr = phys & PAGE_MASK; + dma_info->size = (1U << order) << PAGE_SHIFT; dma_info->dma_addr = *dma_handle; + atomic_set(&dma_info->refs, 1); + spin_lock_irqsave(&swiotlb_lock, irqflags); if (xen_dma_add_entry(dma_info)) { + spin_unlock_irqrestore(&swiotlb_lock, irqflags); pr_warn("cannot add new entry to bus_to_phys\n"); xen_destroy_contiguous_region(phys, order); kfree(dma_info); return NULL; } + spin_unlock_irqrestore(&swiotlb_lock, irqflags); } memset(ret, 0, size); return ret; @@ -532,6 +583,7 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr, phys_addr_t phys; u64 dma_mask = DMA_BIT_MASK(32); struct xen_dma_info *dma_info = NULL; + unsigned long flags; if (dma_release_from_coherent(hwdev, order, vaddr)) return; @@ -539,6 +591,7 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr, if (hwdev && hwdev->coherent_dma_mask) dma_mask = hwdev->coherent_dma_mask; + spin_lock_irqsave(&swiotlb_lock, flags); /* do not use virt_to_phys because on ARM it doesn''t return you the * physical address */ phys = xen_bus_to_phys(dev_addr); @@ -546,12 +599,16 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr, if (xen_feature(XENFEAT_auto_translated_physmap) || (((dev_addr + size - 1 > dma_mask)) || range_straddles_page_boundary(phys, size))) { - xen_destroy_contiguous_region(phys, order); dma_info = xen_get_dma_info_from_dma(dev_addr); - rb_erase(&dma_info->rbnode, &bus_to_phys); - kfree(dma_info); + if (atomic_dec_and_test(&dma_info->refs)) { + xen_destroy_contiguous_region(phys & PAGE_MASK, order); + rb_erase(&dma_info->rbnode_dma, &bus_to_phys); + rb_erase(&dma_info->rbnode_phys, &phys_to_bus); + kfree(dma_info); + } } + spin_unlock_irqrestore(&swiotlb_lock, flags); xen_free_coherent_pages(hwdev, size, vaddr, (dma_addr_t)phys, attrs); } EXPORT_SYMBOL_GPL(xen_swiotlb_free_coherent); @@ -583,6 +640,23 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page, !range_straddles_page_boundary(phys, size) && !swiotlb_force) return dev_addr; + if (xen_feature(XENFEAT_auto_translated_physmap) && + size <= PAGE_SIZE && + !range_straddles_page_boundary(phys, size) && + !swiotlb_force) { + unsigned long flags; + int rc; + + spin_lock_irqsave(&swiotlb_lock, flags); + rc = xen_pin_dev_page(dev, phys, &dev_addr); + spin_unlock_irqrestore(&swiotlb_lock, flags); + + if (!rc) { + dma_mark_clean(phys_to_virt(phys), size); + return dev_addr; + } + } + /* * Oh well, have to allocate and map a bounce buffer. * Pass the dma_addr of the first slab in the iotlb buffer as @@ -618,10 +692,37 @@ EXPORT_SYMBOL_GPL(xen_swiotlb_map_page); static void xen_unmap_single(struct device *hwdev, dma_addr_t dev_addr, size_t size, enum dma_data_direction dir) { - phys_addr_t paddr = xen_bus_to_phys(dev_addr); + struct xen_dma_info *dma_info; + phys_addr_t paddr = DMA_ERROR_CODE; + char *vaddr = NULL; + unsigned long flags; BUG_ON(dir == DMA_NONE); + spin_lock_irqsave(&swiotlb_lock, flags); + dma_info = xen_get_dma_info_from_dma(dev_addr); + if (dma_info != NULL) { + paddr = dma_info->phys_addr + (dev_addr - dma_info->dma_addr); + vaddr = phys_to_virt(paddr); + } + + if (xen_feature(XENFEAT_auto_translated_physmap) && + paddr != DMA_ERROR_CODE && + !(vaddr >= xen_io_tlb_start && vaddr < xen_io_tlb_end) && + !swiotlb_force) { + if (atomic_dec_and_test(&dma_info->refs)) { + xen_destroy_contiguous_region(paddr & PAGE_MASK, 0); + rb_erase(&dma_info->rbnode_dma, &bus_to_phys); + rb_erase(&dma_info->rbnode_phys, &phys_to_bus); + kfree(dma_info); + } + spin_unlock_irqrestore(&swiotlb_lock, flags); + if ((dir == DMA_FROM_DEVICE) || (dir == DMA_BIDIRECTIONAL)) + dma_mark_clean(vaddr, size); + return; + } + spin_unlock_irqrestore(&swiotlb_lock, flags); + /* NOTE: We use dev_addr here, not paddr! */ if (is_xen_swiotlb_buffer(dev_addr)) { swiotlb_tbl_unmap_single(hwdev, paddr, size, dir); @@ -664,9 +765,19 @@ xen_swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr, enum dma_sync_target target) { phys_addr_t paddr = xen_bus_to_phys(dev_addr); + char *vaddr = phys_to_virt(paddr); BUG_ON(dir == DMA_NONE); + if (xen_feature(XENFEAT_auto_translated_physmap) && + paddr != DMA_ERROR_CODE && + size <= PAGE_SIZE && + !(vaddr >= xen_io_tlb_start && vaddr < xen_io_tlb_end) && + !range_straddles_page_boundary(paddr, size) && !swiotlb_force) { + dma_mark_clean(vaddr, size); + return; + } + /* NOTE: We use dev_addr here, not paddr! */ if (is_xen_swiotlb_buffer(dev_addr)) { swiotlb_tbl_sync_single(hwdev, paddr, size, dir, target); @@ -717,13 +828,36 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, struct dma_attrs *attrs) { struct scatterlist *sg; - int i; + int i, rc; + u64 dma_mask = DMA_BIT_MASK(32); + unsigned long flags; BUG_ON(dir == DMA_NONE); + if (hwdev && hwdev->coherent_dma_mask) + dma_mask = dma_alloc_coherent_mask(hwdev, GFP_KERNEL); + for_each_sg(sgl, sg, nelems, i) { phys_addr_t paddr = sg_phys(sg); - dma_addr_t dev_addr = xen_phys_to_bus_quick(paddr); + dma_addr_t dev_addr; + + if (xen_feature(XENFEAT_auto_translated_physmap) && + !range_straddles_page_boundary(paddr, sg->length) && + sg->length <= PAGE_SIZE && + !swiotlb_force) { + + spin_lock_irqsave(&swiotlb_lock, flags); + rc = xen_pin_dev_page(hwdev, paddr, &dev_addr); + spin_unlock_irqrestore(&swiotlb_lock, flags); + + if (!rc) { + dma_mark_clean(phys_to_virt(paddr), sg->length); + sg_dma_len(sg) = sg->length; + sg->dma_address = dev_addr; + continue; + } + } + dev_addr = xen_phys_to_bus_quick(paddr); if (swiotlb_force || xen_feature(XENFEAT_auto_translated_physmap) || -- 1.7.2.5
On Fri, Sep 27, 2013 at 05:09:50PM +0100, Stefano Stabellini wrote:> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > CC: catalin.marinas@arm.com > CC: will.deacon@arm.comAcked-by: Catalin Marinas <catalin.marinas@arm.com>
Catalin Marinas
2013-Sep-30 10:48 UTC
Re: [PATCH v6 09/19] arm64/xen: get_dma_ops: return xen_dma_ops if we are running on Xen
On Fri, Sep 27, 2013 at 05:09:57PM +0100, Stefano Stabellini wrote:> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > Suggested-by: Catalin Marinas <catalin.marinas@arm.com> > CC: catalin.marinas@arm.com > CC: will.deacon@arm.comAcked-by: Catalin Marinas <catalin.marinas@arm.com>
Konrad Rzeszutek Wilk
2013-Sep-30 14:54 UTC
Re: [PATCH v6 03/19] xen: introduce XENMEM_exchange_and_pin and XENMEM_unpin
On Fri, Sep 27, 2013 at 05:09:51PM +0100, Stefano Stabellini wrote:> XENMEM_exchange can''t be used by autotranslate guests because of two > severe limitations: > > - it does not copy back the mfns into the out field for autotranslate > guests; > > - it does not guarantee that the hypervisor won''t change the p2m > mappings for the exchanged pages while the guest is using them. Xen > never promises to keep the p2m mapping stable for autotranslate guests > in general. In practice it won''t happen unless one uses uncommon > features like memory sharing or paging. > > To overcome these problems I am introducing two new hypercalls. > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>> > > Changes in v6: > - update the comments and the hypercalls structures. > > Changes in v5: > - better comment for XENMEM_exchange_and_pin return codes; > > Changes in v4: > - rename XENMEM_get_dma_buf to XENMEM_exchange_and_pin; > - rename XENMEM_put_dma_buf to XENMEM_unpin; > - improve the documentation of the new hypercalls; > - add a note about out.address_bits for XENMEM_exchange. > --- > include/xen/interface/memory.h | 51 ++++++++++++++++++++++++++++++++++++++++ > 1 files changed, 51 insertions(+), 0 deletions(-) > > diff --git a/include/xen/interface/memory.h b/include/xen/interface/memory.h > index 2ecfe4f..49db252 100644 > --- a/include/xen/interface/memory.h > +++ b/include/xen/interface/memory.h > @@ -66,6 +66,9 @@ struct xen_memory_exchange { > /* > * [IN] Details of memory extents to be exchanged (GMFN bases). > * Note that @in.address_bits is ignored and unused. > + * @out.address_bits contains the maximum number of bits addressable > + * by the caller. The addresses of the newly allocated pages have to > + * meet this restriction. > */ > struct xen_memory_reservation in; > > @@ -263,4 +266,52 @@ struct xen_remove_from_physmap { > }; > DEFINE_GUEST_HANDLE_STRUCT(xen_remove_from_physmap); > > +/* > + * This hypercall is similar to XENMEM_exchange: it takes the same > + * struct as an argument and it exchanges the pages passed in with a new > + * set of pages. The new pages are going to be "pinned": it''s guaranteed > + * that their p2m mapping won''t be changed until explicitly "unpinned". > + * Only normal guest r/w memory can be pinned: no granted pages or > + * ballooned pages. > + * If return code is zero then @out.extent_list provides the frame > + * numbers of the newly-allocated memory. > + * On X86 the frame numbers are machine frame numbers (mfns). > + * On ARMv7 and ARMv8 the frame numbers are machine frame numbers (mfns). > + * Returns zero on complete success, otherwise a negative error code. > + * The most common error codes are: > + * -ENOSYS if not implemented > + * -EPERM if the domain is not privileged for this operation > + * -EBUSY if the page is already pinned > + * -EFAULT if an internal error occurs > + * On complete success then always @nr_exchanged == @in.nr_extents. On > + * partial success @nr_exchanged indicates how much work was done and a > + * negative error code is returned. > + */ > +#define XENMEM_exchange_and_pin 26 > + > +/* > + * XENMEM_unpin unpins a set of pages, previously pinned by > + * XENMEM_exchange_and_pin. After this call the p2m mapping of the pages can > + * be transparently changed by the hypervisor, as usual. The pages are > + * still accessible from the guest. > + */ > +#define XENMEM_unpin 27 > +struct xen_unpin { > + /* > + * [IN] Details of memory extents to be unpinned (GMFN bases). > + * Note that @in.address_bits is ignored and unused. > + */ > + struct xen_memory_reservation in; > + /* > + * [OUT] Number of input extents that were successfully unpinned. > + * 1. The first @nr_unpinned input extents were successfully > + * unpinned. > + * 2. All other input extents are untouched. > + * 3. If not all input extents are unpinned then the return code of this > + * command will be non-zero. > + */ > + xen_ulong_t nr_unpinned; > +}; > +DEFINE_GUEST_HANDLE_STRUCT(xen_unpin); > + > #endif /* __XEN_PUBLIC_MEMORY_H__ */ > -- > 1.7.2.5 >
Konrad Rzeszutek Wilk
2013-Sep-30 15:14 UTC
Re: [PATCH v6 06/19] xen/arm,arm64: enable SWIOTLB_XEN
On Fri, Sep 27, 2013 at 05:09:54PM +0100, Stefano Stabellini wrote:> Xen on arm and arm64 needs SWIOTLB_XEN: when running on Xen we need to > program the hardware with mfns rather than pfns for dma addresses. > Remove SWIOTLB_XEN dependency on X86 and PCI and make XEN select > SWIOTLB_XEN on arm and arm64. > > At the moment always rely on swiotlb-xen, but when Xen starts supporting > hardware IOMMUs we''ll be able to avoid it conditionally on the presence > of an IOMMU on the platform. > > Implement xen_create_contiguous_region on arm and arm64 by using > XENMEM_exchange_and_pin. > > Initialize the xen-swiotlb from xen_early_init (before the native > dma_ops are initialized), set xen_dma_ops to &xen_swiotlb_dma_ops. > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > > Changes in v6: > - introduce and export xen_dma_ops; > - call xen_mm_init from as arch_initcall. > > Changes in v4: > - remove redefinition of DMA_ERROR_CODE; > - update the code to use XENMEM_exchange_and_pin and XENMEM_unpin; > - add a note about hardware IOMMU in the commit message. > > Changes in v3: > - code style changes; > - warn on XENMEM_put_dma_buf failures. > --- > arch/arm/Kconfig | 1 + > arch/arm/include/asm/xen/hypervisor.h | 2 + > arch/arm/include/asm/xen/page.h | 2 + > arch/arm/xen/Makefile | 2 +- > arch/arm/xen/mm.c | 121 +++++++++++++++++++++++++++++++++ > arch/arm64/Kconfig | 1 + > arch/arm64/xen/Makefile | 2 +- > drivers/xen/Kconfig | 1 - > drivers/xen/swiotlb-xen.c | 16 +++++ > 9 files changed, 145 insertions(+), 3 deletions(-) > create mode 100644 arch/arm/xen/mm.c > > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig > index c0bfb33..2c9d112 100644 > --- a/arch/arm/Kconfig > +++ b/arch/arm/Kconfig > @@ -1848,6 +1848,7 @@ config XEN > depends on CPU_V7 && !CPU_V6 > depends on !GENERIC_ATOMIC64 > select ARM_PSCI > + select SWIOTLB_XEN > help > Say Y if you want to run Linux in a Virtual Machine on Xen on ARM. > > diff --git a/arch/arm/include/asm/xen/hypervisor.h b/arch/arm/include/asm/xen/hypervisor.h > index d7ab99a..1317ee4 100644 > --- a/arch/arm/include/asm/xen/hypervisor.h > +++ b/arch/arm/include/asm/xen/hypervisor.h > @@ -16,4 +16,6 @@ static inline enum paravirt_lazy_mode paravirt_get_lazy_mode(void) > return PARAVIRT_LAZY_NONE; > } > > +extern struct dma_map_ops *xen_dma_ops; > + > #endif /* _ASM_ARM_XEN_HYPERVISOR_H */ > diff --git a/arch/arm/include/asm/xen/page.h b/arch/arm/include/asm/xen/page.h > index 359a7b5..b0f7150 100644 > --- a/arch/arm/include/asm/xen/page.h > +++ b/arch/arm/include/asm/xen/page.h > @@ -6,12 +6,14 @@ > > #include <linux/pfn.h> > #include <linux/types.h> > +#include <linux/dma-mapping.h> > > #include <xen/interface/grant_table.h> > > #define pfn_to_mfn(pfn) (pfn) > #define phys_to_machine_mapping_valid(pfn) (1) > #define mfn_to_pfn(mfn) (mfn) > +#define mfn_to_local_pfn(m) (mfn_to_pfn(m)) > #define mfn_to_virt(m) (__va(mfn_to_pfn(m) << PAGE_SHIFT)) > > #define pte_mfn pte_pfn > diff --git a/arch/arm/xen/Makefile b/arch/arm/xen/Makefile > index 4384103..66fc35d 100644 > --- a/arch/arm/xen/Makefile > +++ b/arch/arm/xen/Makefile > @@ -1 +1 @@ > -obj-y := enlighten.o hypercall.o grant-table.o > +obj-y := enlighten.o hypercall.o grant-table.o mm.o > diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c > new file mode 100644 > index 0000000..b065c98 > --- /dev/null > +++ b/arch/arm/xen/mm.c > @@ -0,0 +1,121 @@ > +#include <linux/bootmem.h> > +#include <linux/gfp.h> > +#include <linux/export.h> > +#include <linux/slab.h> > +#include <linux/types.h> > +#include <linux/dma-mapping.h> > +#include <linux/vmalloc.h> > +#include <linux/swiotlb.h> > + > +#include <xen/xen.h> > +#include <xen/interface/memory.h> > +#include <xen/swiotlb-xen.h> > + > +#include <asm/cacheflush.h> > +#include <asm/xen/page.h> > +#include <asm/xen/hypercall.h> > +#include <asm/xen/interface.h> > + > +static int xen_exchange_memory(xen_ulong_t extents_in, > + unsigned int order_in, > + xen_pfn_t *pfns_in, > + xen_ulong_t extents_out, > + unsigned int order_out, > + xen_pfn_t *mfns_out, > + unsigned int address_bits) > +{ > + long rc; > + int success; > + > + struct xen_memory_exchange exchange = { > + .in = { > + .nr_extents = extents_in, > + .extent_order = order_in, > + .domid = DOMID_SELF > + }, > + .out = { > + .nr_extents = extents_out, > + .extent_order = order_out, > + .address_bits = address_bits, > + .domid = DOMID_SELF > + }I think you need to set .nr_exchange = 0 just in case there is garbage on the stack and the hypercall is -ENOSYS.> + }; > + set_xen_guest_handle(exchange.in.extent_start, pfns_in); > + set_xen_guest_handle(exchange.out.extent_start, mfns_out); > + > + BUG_ON(extents_in << order_in != extents_out << order_out); > + > + > + rc = HYPERVISOR_memory_op(XENMEM_exchange_and_pin, &exchange); > + success = (exchange.nr_exchanged == extents_in); > + > + BUG_ON(!success && ((exchange.nr_exchanged != 0) || (rc == 0)));I think you need to set nr_exchange = 0 before you make the hypercall. Ohterwise you can get this: a). Say rc = -ENOSYS, in which case the exchage.nr_exchange won''t be touched. success = 0 BUG(!0 && (<garbage> != 0)) <= BOOM. If the hypercall failed (say -EBUSY), so we have: success = 0 BUG(!0 && (1)) <= BOOM Which I thought would be more of a normal error - and we would just return the number of succesfull operations.> + BUG_ON(success && (rc != 0)); > + > + return success; > +} > + > +int xen_create_contiguous_region(unsigned long vstart, unsigned int order, > + unsigned int address_bits, > + dma_addr_t *dma_handle) > +{ > + phys_addr_t pstart = __pa(vstart); > + xen_pfn_t in_frame, out_frame; > + int success; > + > + /* Get a new contiguous memory extent. */ > + in_frame = out_frame = pstart >> PAGE_SHIFT; > + success = xen_exchange_memory(1, order, &in_frame, > + 1, order, &out_frame, > + address_bits); > + > + if (!success) > + return -ENOMEM; > + > + *dma_handle = out_frame << PAGE_SHIFT; > + > + return success ? 0 : -ENOMEM; > +} > +EXPORT_SYMBOL_GPL(xen_create_contiguous_region); > + > +void xen_destroy_contiguous_region(unsigned long vstart, unsigned int order) > +{ > + xen_pfn_t in_frame = __pa(vstart) >> PAGE_SHIFT; > + struct xen_unpin unpin = { > + .in = { > + .nr_extents = 1, > + .extent_order = order, > + .domid = DOMID_SELF > + }, > + }; > + set_xen_guest_handle(unpin.in.extent_start, &in_frame); > + > + WARN_ON(HYPERVISOR_memory_op(XENMEM_unpin, &unpin)); > +} > +EXPORT_SYMBOL_GPL(xen_destroy_contiguous_region); > + > +struct dma_map_ops *xen_dma_ops; > +EXPORT_SYMBOL_GPL(xen_dma_ops); > + > +static struct dma_map_ops xen_swiotlb_dma_ops = { > + .mapping_error = xen_swiotlb_dma_mapping_error, > + .alloc = xen_swiotlb_alloc_coherent, > + .free = xen_swiotlb_free_coherent, > + .sync_single_for_cpu = xen_swiotlb_sync_single_for_cpu, > + .sync_single_for_device = xen_swiotlb_sync_single_for_device, > + .sync_sg_for_cpu = xen_swiotlb_sync_sg_for_cpu, > + .sync_sg_for_device = xen_swiotlb_sync_sg_for_device, > + .map_sg = xen_swiotlb_map_sg_attrs, > + .unmap_sg = xen_swiotlb_unmap_sg_attrs, > + .map_page = xen_swiotlb_map_page, > + .unmap_page = xen_swiotlb_unmap_page, > + .dma_supported = xen_swiotlb_dma_supported, > +}; > + > +int __init xen_mm_init(void) > +{ > + xen_swiotlb_init(1, false); > + xen_dma_ops = &xen_swiotlb_dma_ops; > + return 0; > +} > +arch_initcall(xen_mm_init); > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig > index 9737e97..aa1f6fb 100644 > --- a/arch/arm64/Kconfig > +++ b/arch/arm64/Kconfig > @@ -209,6 +209,7 @@ config XEN_DOM0 > config XEN > bool "Xen guest support on ARM64 (EXPERIMENTAL)" > depends on ARM64 && OF > + select SWIOTLB_XEN > help > Say Y if you want to run Linux in a Virtual Machine on Xen on ARM64. > > diff --git a/arch/arm64/xen/Makefile b/arch/arm64/xen/Makefile > index be24040..0ef9637 100644 > --- a/arch/arm64/xen/Makefile > +++ b/arch/arm64/xen/Makefile > @@ -1,2 +1,2 @@ > -xen-arm-y += $(addprefix ../../arm/xen/, enlighten.o grant-table.o) > +xen-arm-y += $(addprefix ../../arm/xen/, enlighten.o grant-table.o mm.o) > obj-y := xen-arm.o hypercall.o > diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig > index 9e02d60..7e83688 100644 > --- a/drivers/xen/Kconfig > +++ b/drivers/xen/Kconfig > @@ -140,7 +140,6 @@ config XEN_GRANT_DEV_ALLOC > > config SWIOTLB_XEN > def_bool y > - depends on PCI && X86 > select SWIOTLB > > config XEN_TMEM > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c > index 84aef43..a1cb0f4 100644 > --- a/drivers/xen/swiotlb-xen.c > +++ b/drivers/xen/swiotlb-xen.c > @@ -45,6 +45,8 @@ > #include <xen/xen-ops.h> > #include <xen/hvc-console.h> > #include <xen/features.h> > +#include <asm/dma-mapping.h> > + > /* > * Used to do a quick range check in swiotlb_tbl_unmap_single and > * swiotlb_tbl_sync_single_*, to see if the memory was in fact allocated by this > @@ -58,6 +60,20 @@ static unsigned long xen_io_tlb_nslabs; > * Quick lookup value of the bus address of the IOTLB. > */ > > +#ifndef CONFIG_X86 > +static unsigned long dma_alloc_coherent_mask(struct device *dev, > + gfp_t gfp) > +{ > + unsigned long dma_mask = 0; > + > + dma_mask = dev->coherent_dma_mask; > + if (!dma_mask) > + dma_mask = (gfp & GFP_DMA) ? DMA_BIT_MASK(24) : DMA_BIT_MASK(32); > + > + return dma_mask; > +} > +#endif > + > struct xen_dma_info { > dma_addr_t dma_addr; > phys_addr_t phys_addr; > -- > 1.7.2.5 >
Konrad Rzeszutek Wilk
2013-Sep-30 15:17 UTC
Re: [PATCH v6 08/19] arm/xen: get_dma_ops: return xen_dma_ops if we are running on Xen
On Fri, Sep 27, 2013 at 05:09:56PM +0100, Stefano Stabellini wrote:> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > Suggested-by: Catalin Marinas <catalin.marinas@arm.com> > CC: will.deacon@arm.com > CC: linux@arm.linux.org.uk> --- > arch/arm/include/asm/dma-mapping.h | 13 ++++++++++++- > 1 files changed, 12 insertions(+), 1 deletions(-) > > diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h > index 8807124..b638d6d 100644 > --- a/arch/arm/include/asm/dma-mapping.h > +++ b/arch/arm/include/asm/dma-mapping.h > @@ -12,17 +12,28 @@ > #include <asm/memory.h> > #include <asm/cacheflush.h> > > +#include <xen/xen.h> > +#include <asm/xen/hypervisor.h> > + > #define DMA_ERROR_CODE (~0) > extern struct dma_map_ops arm_dma_ops; > extern struct dma_map_ops arm_coherent_dma_ops; > > -static inline struct dma_map_ops *get_dma_ops(struct device *dev) > +static inline struct dma_map_ops *__get_dma_ops(struct device *dev)How about calling it __generic_dma_ops ?> { > if (dev && dev->archdata.dma_ops) > return dev->archdata.dma_ops; > return &arm_dma_ops; > } > > +static inline struct dma_map_ops *get_dma_ops(struct device *dev) > +{ > + if (xen_domain()) > + return xen_dma_ops; > + else > + return __get_dma_ops(dev);Could you explan (hopefully in the git commit description) why we cannot over-write arm_dma_ops with xen_dma_ops? Thank you.> +} > + > static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops) > { > BUG_ON(!dev); > -- > 1.7.2.5 >
Konrad Rzeszutek Wilk
2013-Sep-30 15:31 UTC
Re: [PATCH v6 10/19] xen: introduce xen_alloc/free_coherent_pages
On Fri, Sep 27, 2013 at 05:09:58PM +0100, Stefano Stabellini wrote:> xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu > and devices. On native x86 and ARMv8 is sufficient to call > __get_free_pages in order to get a coherent buffer, while on ARM we need > to call the native dma_ops->alloc implementation. > > When arm64 stops using the swiotlb by default and starts having multiple > dma_ops implementations, we''ll use __get_dma_ops there too.I presume this is a future TODO, not some further patch (in which case you should say in here the title of it). If it is a TODO could you stick that in the sentence here somewhere to make it crytal clear that it is not implemented. Thank you.> > Introduce xen_alloc_coherent_pages to abstract the arch specific buffer > allocation. > > Similarly introduce xen_free_coherent_pages to free a coherent buffer: > on x86 and ARM64 is simply a call to free_pages while on ARM is > arm_dma_ops.free. > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > > Changes in v6: > - call __get_dma_ops to get the native dma_ops pointer on arm. > --- > arch/arm/include/asm/xen/page-coherent.h | 22 ++++++++++++++++++++++ > arch/arm64/include/asm/xen/page-coherent.h | 24 ++++++++++++++++++++++++ > arch/ia64/include/asm/xen/page-coherent.h | 24 ++++++++++++++++++++++++ > arch/x86/include/asm/xen/page-coherent.h | 24 ++++++++++++++++++++++++ > 4 files changed, 94 insertions(+), 0 deletions(-) > create mode 100644 arch/arm/include/asm/xen/page-coherent.h > create mode 100644 arch/arm64/include/asm/xen/page-coherent.h > create mode 100644 arch/ia64/include/asm/xen/page-coherent.h > create mode 100644 arch/x86/include/asm/xen/page-coherent.h > > diff --git a/arch/arm/include/asm/xen/page-coherent.h b/arch/arm/include/asm/xen/page-coherent.h > new file mode 100644 > index 0000000..92b7a8a > --- /dev/null > +++ b/arch/arm/include/asm/xen/page-coherent.h > @@ -0,0 +1,22 @@ > +#ifndef _ASM_ARM_XEN_PAGE_COHERENT_H > +#define _ASM_ARM_XEN_PAGE_COHERENT_H > + > +#include <asm/page.h> > +#include <linux/dma-attrs.h> > +#include <linux/dma-mapping.h> > + > +static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size, > + dma_addr_t *dma_handle, gfp_t flags, > + struct dma_attrs *attrs) > +{ > + return __get_dma_ops(hwdev)->alloc(hwdev, size, dma_handle, flags, attrs); > +} > + > +static inline void xen_free_coherent_pages(struct device *hwdev, size_t size, > + void *cpu_addr, dma_addr_t dma_handle, > + struct dma_attrs *attrs) > +{ > + __get_dma_ops(hwdev)->free(hwdev, size, cpu_addr, dma_handle, attrs); > +} > + > +#endif /* _ASM_ARM_XEN_PAGE_COHERENT_H */ > diff --git a/arch/arm64/include/asm/xen/page-coherent.h b/arch/arm64/include/asm/xen/page-coherent.h > new file mode 100644 > index 0000000..0d6ad25 > --- /dev/null > +++ b/arch/arm64/include/asm/xen/page-coherent.h > @@ -0,0 +1,24 @@ > +#ifndef _ASM_ARM64_XEN_PAGE_COHERENT_H > +#define _ASM_ARM64_XEN_PAGE_COHERENT_H > + > +#include <asm/page.h> > +#include <linux/dma-attrs.h> > +#include <linux/dma-mapping.h> > + > +static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size, > + dma_addr_t *dma_handle, gfp_t flags, > + struct dma_attrs *attrs) > +{ > + void *vstart = (void*)__get_free_pages(flags, get_order(size)); > + *dma_handle = virt_to_phys(vstart); > + return vstart; > +} > + > +static inline void xen_free_coherent_pages(struct device *hwdev, size_t size, > + void *cpu_addr, dma_addr_t dma_handle, > + struct dma_attrs *attrs) > +{ > + free_pages((unsigned long) cpu_addr, get_order(size)); > +} > + > +#endif /* _ASM_ARM64_XEN_PAGE_COHERENT_H */ > diff --git a/arch/ia64/include/asm/xen/page-coherent.h b/arch/ia64/include/asm/xen/page-coherent.h > new file mode 100644 > index 0000000..37b929c > --- /dev/null > +++ b/arch/ia64/include/asm/xen/page-coherent.h > @@ -0,0 +1,24 @@ > +#ifndef _ASM_IA64_XEN_PAGE_COHERENT_H > +#define _ASM_IA64_XEN_PAGE_COHERENT_H > + > +#include <asm/page.h> > +#include <linux/dma-attrs.h> > +#include <linux/dma-mapping.h> > + > +static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size, > + dma_addr_t *dma_handle, gfp_t flags, > + struct dma_attrs *attrs) > +{ > + void *vstart = (void*)__get_free_pages(flags, get_order(size)); > + *dma_handle = virt_to_phys(vstart); > + return vstart; > +} > + > +static inline void xen_free_coherent_pages(struct device *hwdev, size_t size, > + void *cpu_addr, dma_addr_t dma_handle, > + struct dma_attrs *attrs) > +{ > + free_pages((unsigned long) cpu_addr, get_order(size)); > +} > + > +#endif /* _ASM_IA64_XEN_PAGE_COHERENT_H */ > diff --git a/arch/x86/include/asm/xen/page-coherent.h b/arch/x86/include/asm/xen/page-coherent.h > new file mode 100644 > index 0000000..31de2e0 > --- /dev/null > +++ b/arch/x86/include/asm/xen/page-coherent.h > @@ -0,0 +1,24 @@ > +#ifndef _ASM_X86_XEN_PAGE_COHERENT_H > +#define _ASM_X86_XEN_PAGE_COHERENT_H > + > +#include <asm/page.h> > +#include <linux/dma-attrs.h> > +#include <linux/dma-mapping.h> > + > +static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size, > + dma_addr_t *dma_handle, gfp_t flags, > + struct dma_attrs *attrs) > +{ > + void *vstart = (void*)__get_free_pages(flags, get_order(size)); > + *dma_handle = virt_to_phys(vstart); > + return vstart; > +} > + > +static inline void xen_free_coherent_pages(struct device *hwdev, size_t size, > + void *cpu_addr, dma_addr_t dma_handle, > + struct dma_attrs *attrs) > +{ > + free_pages((unsigned long) cpu_addr, get_order(size)); > +} > + > +#endif /* _ASM_X86_XEN_PAGE_COHERENT_H */ > -- > 1.7.2.5 >
Konrad Rzeszutek Wilk
2013-Sep-30 15:34 UTC
Re: [PATCH v6 11/19] swiotlb-xen: use xen_alloc/free_coherent_pages
On Fri, Sep 27, 2013 at 05:09:59PM +0100, Stefano Stabellini wrote:> Use xen_alloc_coherent_pages and xen_free_coherent_pages to allocate or > free coherent pages. > > We need to be careful handling the pointer returned by > xen_alloc_coherent_pages, because on ARM the pointer is not equal to > phys_to_virt(*dma_handle). In fact virt_to_phys only works for kernel > direct mapped RAM memory. > In ARM case the pointer could be an ioremap address, therefore passing > it to virt_to_phys would give you another physical address that doesn''t > correspond to it. > > Make xen_create_contiguous_region take a phys_addr_t as start parameter to > avoid the virt_to_phys calls which would be incorrect. > > Changes in v6: > - remove extra spaces. > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>> --- > arch/arm/xen/mm.c | 7 +++---- > arch/x86/xen/mmu.c | 7 +++++-- > drivers/xen/swiotlb-xen.c | 31 +++++++++++++++++++++---------- > include/xen/xen-ops.h | 4 ++-- > 4 files changed, 31 insertions(+), 18 deletions(-) > > diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c > index 4330b15..b305b94 100644 > --- a/arch/arm/xen/mm.c > +++ b/arch/arm/xen/mm.c > @@ -55,11 +55,10 @@ static int xen_exchange_memory(xen_ulong_t extents_in, > return success; > } > > -int xen_create_contiguous_region(unsigned long vstart, unsigned int order, > +int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order, > unsigned int address_bits, > dma_addr_t *dma_handle) > { > - phys_addr_t pstart = __pa(vstart); > xen_pfn_t in_frame, out_frame; > int success; > > @@ -78,9 +77,9 @@ int xen_create_contiguous_region(unsigned long vstart, unsigned int order, > } > EXPORT_SYMBOL_GPL(xen_create_contiguous_region); > > -void xen_destroy_contiguous_region(unsigned long vstart, unsigned int order) > +void xen_destroy_contiguous_region(phys_addr_t pstart, unsigned int order) > { > - xen_pfn_t in_frame = __pa(vstart) >> PAGE_SHIFT; > + xen_pfn_t in_frame = pstart >> PAGE_SHIFT; > struct xen_unpin unpin = { > .in = { > .nr_extents = 1, > diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c > index 6c34d7c..8830883 100644 > --- a/arch/x86/xen/mmu.c > +++ b/arch/x86/xen/mmu.c > @@ -2328,13 +2328,14 @@ static int xen_exchange_memory(unsigned long extents_in, unsigned int order_in, > return success; > } > > -int xen_create_contiguous_region(unsigned long vstart, unsigned int order, > +int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order, > unsigned int address_bits, > dma_addr_t *dma_handle) > { > unsigned long *in_frames = discontig_frames, out_frame; > unsigned long flags; > int success; > + unsigned long vstart = (unsigned long)phys_to_virt(pstart); > > /* > * Currently an auto-translated guest will not perform I/O, nor will > @@ -2374,11 +2375,12 @@ int xen_create_contiguous_region(unsigned long vstart, unsigned int order, > } > EXPORT_SYMBOL_GPL(xen_create_contiguous_region); > > -void xen_destroy_contiguous_region(unsigned long vstart, unsigned int order) > +void xen_destroy_contiguous_region(phys_addr_t pstart, unsigned int order) > { > unsigned long *out_frames = discontig_frames, in_frame; > unsigned long flags; > int success; > + unsigned long vstart; > > if (xen_feature(XENFEAT_auto_translated_physmap)) > return; > @@ -2386,6 +2388,7 @@ void xen_destroy_contiguous_region(unsigned long vstart, unsigned int order) > if (unlikely(order > MAX_CONTIG_ORDER)) > return; > > + vstart = (unsigned long)phys_to_virt(pstart); > memset((void *) vstart, 0, PAGE_SIZE << order); > > spin_lock_irqsave(&xen_reservation_lock, flags); > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c > index deb9131..96ad316 100644 > --- a/drivers/xen/swiotlb-xen.c > +++ b/drivers/xen/swiotlb-xen.c > @@ -46,6 +46,7 @@ > #include <xen/hvc-console.h> > #include <xen/features.h> > #include <asm/dma-mapping.h> > +#include <asm/xen/page-coherent.h> > > /* > * Used to do a quick range check in swiotlb_tbl_unmap_single and > @@ -244,6 +245,7 @@ xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs) > { > int i, j, rc; > int dma_bits; > + phys_addr_t p = virt_to_phys(buf); > > dma_bits = get_order(IO_TLB_SEGSIZE << IO_TLB_SHIFT) + PAGE_SHIFT; > > @@ -253,7 +255,7 @@ xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs) > > do { > rc = xen_create_contiguous_region( > - (unsigned long)buf + (i << IO_TLB_SHIFT), > + p + (i << IO_TLB_SHIFT), > get_order(slabs << IO_TLB_SHIFT), > dma_bits, &xen_dma_seg[j].dma_addr); > } while (rc && dma_bits++ < max_dma_bits); > @@ -389,7 +391,6 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size, > void *ret; > int order = get_order(size); > u64 dma_mask = DMA_BIT_MASK(32); > - unsigned long vstart; > phys_addr_t phys; > dma_addr_t dev_addr; > struct xen_dma_info *dma_info = NULL; > @@ -405,8 +406,12 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size, > if (dma_alloc_from_coherent(hwdev, size, dma_handle, &ret)) > return ret; > > - vstart = __get_free_pages(flags, order); > - ret = (void *)vstart; > + /* On ARM this function returns an ioremap''ped virtual address for > + * which virt_to_phys doesn''t return the corresponding physical > + * address. In fact on ARM virt_to_phys only works for kernel direct > + * mapped RAM memory. Also see comment below. > + */ > + ret = xen_alloc_coherent_pages(hwdev, size, dma_handle, flags, attrs); > > if (!ret) > return ret; > @@ -414,16 +419,20 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size, > if (hwdev && hwdev->coherent_dma_mask) > dma_mask = dma_alloc_coherent_mask(hwdev, flags); > > - phys = virt_to_phys(ret); > + /* At this point dma_handle is the physical address, next we are > + * going to set it to the machine address. > + * Do not use virt_to_phys(ret) because on ARM it doesn''t correspond > + * to *dma_handle. */ > + phys = *dma_handle; > dev_addr = xen_phys_to_bus(phys); > if (!xen_feature(XENFEAT_auto_translated_physmap) && > ((dev_addr + size - 1 <= dma_mask)) && > !range_straddles_page_boundary(phys, size)) > *dma_handle = dev_addr; > else { > - if (xen_create_contiguous_region(vstart, order, > + if (xen_create_contiguous_region(phys, order, > fls64(dma_mask), dma_handle) != 0) { > - free_pages(vstart, order); > + xen_free_coherent_pages(hwdev, size, ret, (dma_addr_t)phys, attrs); > return NULL; > } > > @@ -463,18 +472,20 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr, > if (hwdev && hwdev->coherent_dma_mask) > dma_mask = hwdev->coherent_dma_mask; > > - phys = virt_to_phys(vaddr); > + /* do not use virt_to_phys because on ARM it doesn''t return you the > + * physical address */ > + phys = xen_bus_to_phys(dev_addr); > > if (xen_feature(XENFEAT_auto_translated_physmap) || > (((dev_addr + size - 1 > dma_mask)) || > range_straddles_page_boundary(phys, size))) { > - xen_destroy_contiguous_region((unsigned long)vaddr, order); > + xen_destroy_contiguous_region(phys, order); > dma_info = xen_get_dma_info_from_dma(dev_addr); > rb_erase(&dma_info->rbnode, &bus_to_phys); > kfree(dma_info); > } > > - free_pages((unsigned long)vaddr, order); > + xen_free_coherent_pages(hwdev, size, vaddr, (dma_addr_t)phys, attrs); > } > EXPORT_SYMBOL_GPL(xen_swiotlb_free_coherent); > > diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h > index 9ef704d..fb2ea8f 100644 > --- a/include/xen/xen-ops.h > +++ b/include/xen/xen-ops.h > @@ -19,11 +19,11 @@ void xen_arch_resume(void); > int xen_setup_shutdown_event(void); > > extern unsigned long *xen_contiguous_bitmap; > -int xen_create_contiguous_region(unsigned long vstart, unsigned int order, > +int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order, > unsigned int address_bits, > dma_addr_t *dma_handle); > > -void xen_destroy_contiguous_region(unsigned long vstart, unsigned int order); > +void xen_destroy_contiguous_region(phys_addr_t pstart, unsigned int order); > > struct vm_area_struct; > int xen_remap_domain_mfn_range(struct vm_area_struct *vma, > -- > 1.7.2.5 >
Konrad Rzeszutek Wilk
2013-Sep-30 15:56 UTC
Re: [PATCH v6 12/19] swiotlb: don''t assume that io_tlb_start-io_tlb_end is coherent
On Fri, Sep 27, 2013 at 05:10:00PM +0100, Stefano Stabellini wrote:> The swiotlb code has appropriate calls to dma_mark_clean in place for > buffers passed to swiotlb_map_page as an argument. However it assumes > that the swiotlb bounce buffer (io_tlb_start-io_tlb_end) is already > coherent and doesn''t need any calls to dma_mark_clean. > > On ARM the swiotlb bounce buffer is not coherent (the memory is > writealloc while it should be bufferable) and therefore we need to call > dma_mark_clean appropriately on the bounce buffer code paths too. > > Note that most architecures have an empty dma_mark_clean implementation > anyway.The other architecture that uses swiotlb is IA64 and that does have an implementation where it touches on page attributes. Which means I have to figure out why my HP zx6000 won''t boot with 3.11 now :-(> > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > --- > lib/swiotlb.c | 13 ++++++++++--- > 1 files changed, 10 insertions(+), 3 deletions(-) > > diff --git a/lib/swiotlb.c b/lib/swiotlb.c > index 4e8686c..eb45d17 100644 > --- a/lib/swiotlb.c > +++ b/lib/swiotlb.c > @@ -515,6 +515,7 @@ found: > io_tlb_orig_addr[index+i] = orig_addr + (i << IO_TLB_SHIFT); > if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL) > swiotlb_bounce(orig_addr, tlb_addr, size, DMA_TO_DEVICE); > + dma_mark_clean(phys_to_virt(tlb_addr), size); > > return tlb_addr; > } > @@ -547,7 +548,10 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr, > * First, sync the memory before unmapping the entry > */ > if (orig_addr && ((dir == DMA_FROM_DEVICE) || (dir == DMA_BIDIRECTIONAL))) > + { > + dma_mark_clean(phys_to_virt(tlb_addr), size); > swiotlb_bounce(orig_addr, tlb_addr, size, DMA_FROM_DEVICE); > + } > > /* > * Return the buffer to the free list by setting the corresponding > @@ -587,17 +591,20 @@ void swiotlb_tbl_sync_single(struct device *hwdev, phys_addr_t tlb_addr, > > switch (target) { > case SYNC_FOR_CPU: > - if (likely(dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL)) > + if (likely(dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL)) { > + dma_mark_clean(phys_to_virt(tlb_addr), size); > swiotlb_bounce(orig_addr, tlb_addr, > size, DMA_FROM_DEVICE); > + } > else > BUG_ON(dir != DMA_TO_DEVICE); > break; > case SYNC_FOR_DEVICE: > - if (likely(dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)) > + if (likely(dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)) { > swiotlb_bounce(orig_addr, tlb_addr, > size, DMA_TO_DEVICE); > - else > + dma_mark_clean(phys_to_virt(tlb_addr), size); > + } else > BUG_ON(dir != DMA_FROM_DEVICE); > break; > default: > -- > 1.7.2.5 >
Konrad Rzeszutek Wilk
2013-Sep-30 15:58 UTC
Re: [PATCH v6 14/19] swiotlb: print a warning when the swiotlb is full
On Fri, Sep 27, 2013 at 05:10:02PM +0100, Stefano Stabellini wrote:> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > --- > drivers/xen/swiotlb-xen.c | 1 + > lib/swiotlb.c | 1 + > 2 files changed, 2 insertions(+), 0 deletions(-) > > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c > index 96ad316..790c2eb 100644 > --- a/drivers/xen/swiotlb-xen.c > +++ b/drivers/xen/swiotlb-xen.c > @@ -674,6 +674,7 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, > sg->length, > dir); > if (map == SWIOTLB_MAP_ERROR) { > + pr_warn("swiotlb buffer is full\n");It would be beneficial to use dev_warn instead. And perhaps even call debug_dma_dump_mappings to help in diagnosing a problem?> /* Don''t panic here, we expect map_sg users > to do proper error handling. */ > xen_swiotlb_unmap_sg_attrs(hwdev, sgl, i, dir, > diff --git a/lib/swiotlb.c b/lib/swiotlb.c > index eb45d17..f06da0d 100644 > --- a/lib/swiotlb.c > +++ b/lib/swiotlb.c > @@ -502,6 +502,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, > > not_found: > spin_unlock_irqrestore(&io_tlb_lock, flags); > + pr_warn("swiotlb buffer is full\n"); > return SWIOTLB_MAP_ERROR; > found: > spin_unlock_irqrestore(&io_tlb_lock, flags); > -- > 1.7.2.5 >
Konrad Rzeszutek Wilk
2013-Sep-30 16:02 UTC
Re: [PATCH v6 15/19] swiotlb-xen: call dma_capable only if dev->dma_mask is allocated
On Fri, Sep 27, 2013 at 05:10:03PM +0100, Stefano Stabellini wrote: Why? I am looking at X86 and IA64 and I see: 79 if (!dev->dma_mask) 80 return 0; Why not port dma_capable over to ARM?> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > --- > drivers/xen/swiotlb-xen.c | 6 +++--- > 1 files changed, 3 insertions(+), 3 deletions(-) > > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c > index 790c2eb..3011736 100644 > --- a/drivers/xen/swiotlb-xen.c > +++ b/drivers/xen/swiotlb-xen.c > @@ -512,7 +512,7 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page, > * buffering it. > */ > if (!xen_feature(XENFEAT_auto_translated_physmap) && > - dma_capable(dev, dev_addr, size) && > + dev->dma_mask && dma_capable(dev, dev_addr, size) && > !range_straddles_page_boundary(phys, size) && !swiotlb_force) > return dev_addr; > > @@ -532,7 +532,7 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page, > /* > * Ensure that the address returned is DMA''ble > */ > - if (!dma_capable(dev, dev_addr, size)) { > + if (dev->dma_mask && !dma_capable(dev, dev_addr, size)) { > swiotlb_tbl_unmap_single(dev, map, size, dir); > dev_addr = 0; > } > @@ -660,7 +660,7 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, > > if (swiotlb_force || > xen_feature(XENFEAT_auto_translated_physmap) || > - !dma_capable(hwdev, dev_addr, sg->length) || > + (hwdev->dma_mask && !dma_capable(hwdev, dev_addr, sg->length)) || > range_straddles_page_boundary(paddr, sg->length)) { > /* > * Pass the dma_addr of the first slab in the iotlb buffer as > -- > 1.7.2.5 >
Konrad Rzeszutek Wilk
2013-Sep-30 16:06 UTC
Re: [PATCH v6 16/19] arm, arm64: do not always merge biovec if we are running on Xen
On Fri, Sep 27, 2013 at 05:10:04PM +0100, Stefano Stabellini wrote:> This is similar to what it is done on X86: biovecs are prevented from merging > otherwise every dma requests would be forced to bounce on the swiotlb buffer. > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>This also alters the generic (well, was x86, now it is generic) code. Perhaps that change should be made in a seperate patch and explain what the autotranslate extra check will do for PV, PVH and HVM guests? Wait, does that mean we had not been merging requests on HVM guests? Is that good or bad?> --- > arch/arm/include/asm/io.h | 8 ++++++++ > arch/arm64/include/asm/io.h | 9 +++++++++ > drivers/xen/biomerge.c | 4 +++- > 3 files changed, 20 insertions(+), 1 deletions(-) > > diff --git a/arch/arm/include/asm/io.h b/arch/arm/include/asm/io.h > index d070741..c45effc 100644 > --- a/arch/arm/include/asm/io.h > +++ b/arch/arm/include/asm/io.h > @@ -24,9 +24,11 @@ > #ifdef __KERNEL__ > > #include <linux/types.h> > +#include <linux/blk_types.h> > #include <asm/byteorder.h> > #include <asm/memory.h> > #include <asm-generic/pci_iomap.h> > +#include <xen/xen.h> > > /* > * ISA I/O bus memory addresses are 1:1 with the physical address. > @@ -372,6 +374,12 @@ extern void pci_iounmap(struct pci_dev *dev, void __iomem *addr); > #define BIOVEC_MERGEABLE(vec1, vec2) \ > ((bvec_to_phys((vec1)) + (vec1)->bv_len) == bvec_to_phys((vec2))) > > +extern bool xen_biovec_phys_mergeable(const struct bio_vec *vec1, > + const struct bio_vec *vec2); > +#define BIOVEC_PHYS_MERGEABLE(vec1, vec2) \ > + (__BIOVEC_PHYS_MERGEABLE(vec1, vec2) && \ > + (!xen_domain() || xen_biovec_phys_mergeable(vec1, vec2))) > + > #ifdef CONFIG_MMU > #define ARCH_HAS_VALID_PHYS_ADDR_RANGE > extern int valid_phys_addr_range(phys_addr_t addr, size_t size); > diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h > index 1d12f89..c163287b 100644 > --- a/arch/arm64/include/asm/io.h > +++ b/arch/arm64/include/asm/io.h > @@ -22,11 +22,14 @@ > #ifdef __KERNEL__ > > #include <linux/types.h> > +#include <linux/blk_types.h> > > #include <asm/byteorder.h> > #include <asm/barrier.h> > #include <asm/pgtable.h> > > +#include <xen/xen.h> > + > /* > * Generic IO read/write. These perform native-endian accesses. > */ > @@ -263,5 +266,11 @@ extern int devmem_is_allowed(unsigned long pfn); > */ > #define xlate_dev_kmem_ptr(p) p > > +extern bool xen_biovec_phys_mergeable(const struct bio_vec *vec1, > + const struct bio_vec *vec2); > +#define BIOVEC_PHYS_MERGEABLE(vec1, vec2) \ > + (__BIOVEC_PHYS_MERGEABLE(vec1, vec2) && \ > + (!xen_domain() || xen_biovec_phys_mergeable(vec1, vec2))) > + > #endif /* __KERNEL__ */ > #endif /* __ASM_IO_H */ > diff --git a/drivers/xen/biomerge.c b/drivers/xen/biomerge.c > index 0edb91c..f323a2d 100644 > --- a/drivers/xen/biomerge.c > +++ b/drivers/xen/biomerge.c > @@ -2,6 +2,7 @@ > #include <linux/io.h> > #include <linux/export.h> > #include <xen/page.h> > +#include <xen/features.h> > > bool xen_biovec_phys_mergeable(const struct bio_vec *vec1, > const struct bio_vec *vec2) > @@ -9,7 +10,8 @@ bool xen_biovec_phys_mergeable(const struct bio_vec *vec1, > unsigned long mfn1 = pfn_to_mfn(page_to_pfn(vec1->bv_page)); > unsigned long mfn2 = pfn_to_mfn(page_to_pfn(vec2->bv_page)); > > - return __BIOVEC_PHYS_MERGEABLE(vec1, vec2) && > + return !xen_feature(XENFEAT_auto_translated_physmap) && > + __BIOVEC_PHYS_MERGEABLE(vec1, vec2) && > ((mfn1 == mfn2) || ((mfn1+1) == mfn2)); > } > EXPORT_SYMBOL(xen_biovec_phys_mergeable); > -- > 1.7.2.5 >
Konrad Rzeszutek Wilk
2013-Sep-30 17:22 UTC
Re: [PATCH v6 17/19] xen: introduce XENMEM_pin
On Fri, Sep 27, 2013 at 05:10:05PM +0100, Stefano Stabellini wrote:> Introduce a new hypercall to pin one or more pages whose machine > addresses respect a dma_mask passed as an argument > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > --- > arch/arm/xen/mm.c | 16 ++++++++++++++++ > arch/x86/xen/mmu.c | 7 +++++++ > include/xen/interface/memory.h | 32 ++++++++++++++++++++++++++++++++ > include/xen/xen-ops.h | 1 + > 4 files changed, 56 insertions(+), 0 deletions(-) > > diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c > index b305b94..146c1c3 100644 > --- a/arch/arm/xen/mm.c > +++ b/arch/arm/xen/mm.c > @@ -55,6 +55,22 @@ static int xen_exchange_memory(xen_ulong_t extents_in, > return success; > } > > +int xen_pin_page(xen_pfn_t *in_frame, unsigned int address_bits) > +{ > + struct xen_pin pin = { > + .in = { > + .nr_extents = 1, > + .extent_order = 0, > + .domid = DOMID_SELF, > + .address_bits = address_bits > + }, > + }; > + set_xen_guest_handle(pin.in.extent_start, in_frame); > + > + return HYPERVISOR_memory_op(XENMEM_pin, &pin); > +} > +EXPORT_SYMBOL_GPL(xen_pin_page); > + > int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order, > unsigned int address_bits, > dma_addr_t *dma_handle) > diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c > index 8830883..8f76ce2 100644 > --- a/arch/x86/xen/mmu.c > +++ b/arch/x86/xen/mmu.c > @@ -2568,3 +2568,10 @@ int xen_unmap_domain_mfn_range(struct vm_area_struct *vma, > return -EINVAL; > } > EXPORT_SYMBOL_GPL(xen_unmap_domain_mfn_range); > + > +int xen_pin_page(xen_pfn_t *in_frame, unsigned int address_bits) > +{ > + return -ENOSYS; > +} > +EXPORT_SYMBOL_GPL(xen_pin_page); > + > diff --git a/include/xen/interface/memory.h b/include/xen/interface/memory.h > index 49db252..66ab578 100644 > --- a/include/xen/interface/memory.h > +++ b/include/xen/interface/memory.h > @@ -314,4 +314,36 @@ struct xen_unpin { > }; > DEFINE_GUEST_HANDLE_STRUCT(xen_unpin); > > +/* > + * XENMEM_pin pins a set of pages to make sure that the hypervisor does > + * not change the p2m mappings for them. > + * > + */ > +#define XENMEM_pin 28 > +struct xen_pin { > + /* > + * [IN/OUT] Details of memory extents to be pinned (GMFN bases). > + * Xen copies back the MFNs corresponding to the GMFNs passed in as > + * argument. > + * @in.address_bits contains the maximum number of bits addressable > + * by the caller. If the machine addresses of the pages to be pinned > + * are not addressable according to @in.address_bits, the hypercall > + * fails and returns an errors. The pages are not pinned. Otherwise^^^^^-error. What kind of error? And you should probably join the two sentences together (the "If the ... The pages are not pinned.")> + * the hypercall succeeds.and does it return a number of pages that were pinned or just zero?> + */ > + struct xen_memory_reservation in; > + > + /* > + * [OUT] Number of input extents that were successfully pinned. > + * 1. The first @nr_pinned input extents were successfully > + * pinned. > + * 2. All other input extents are untouched. > + * 3. If not all input extents are pinned then the return code of this > + * command will be non-zero.OK, what return code?> + */ > + xen_ulong_t nr_pinned; > +}; > +DEFINE_GUEST_HANDLE_STRUCT(xen_pin); > + > + > #endif /* __XEN_PUBLIC_MEMORY_H__ */ > diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h > index fb2ea8f..4cf4fc5 100644 > --- a/include/xen/xen-ops.h > +++ b/include/xen/xen-ops.h > @@ -22,6 +22,7 @@ extern unsigned long *xen_contiguous_bitmap; > int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order, > unsigned int address_bits, > dma_addr_t *dma_handle); > +int xen_pin_page(xen_pfn_t *in_frame, unsigned int address_bits); > > void xen_destroy_contiguous_region(phys_addr_t pstart, unsigned int order); > > -- > 1.7.2.5 >
Konrad Rzeszutek Wilk
2013-Sep-30 17:27 UTC
Re: [PATCH v6 18/19] swiotlb-xen: introduce a rbtree to track phys to bus mappings
On Fri, Sep 27, 2013 at 05:10:06PM +0100, Stefano Stabellini wrote:> Introduce a second red-back tree to track phys to bus mappings created after > the initialization of the swiotlb buffer.Could you explain the use case a bit more please? As in: a) why is this needed b) why are we using the rb tree instead of something else (say FIFO queue) c) how long are these in usage? d) do we need locking now?> > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > --- > drivers/xen/swiotlb-xen.c | 99 +++++++++++++++++++++++++++++++++++++------- > 1 files changed, 83 insertions(+), 16 deletions(-) > > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c > index 3011736..022bcaf 100644 > --- a/drivers/xen/swiotlb-xen.c > +++ b/drivers/xen/swiotlb-xen.c > @@ -79,7 +79,8 @@ struct xen_dma_info { > dma_addr_t dma_addr; > phys_addr_t phys_addr; > size_t size; > - struct rb_node rbnode; > + struct rb_node rbnode_dma; > + struct rb_node rbnode_phys; > }; > > /* > @@ -96,8 +97,13 @@ static struct xen_dma_info *xen_dma_seg; > * mappings. > */ > static struct rb_root bus_to_phys = RB_ROOT; > +/* > + * This tree keeps track of physical address to bus address > + * mappings apart from the ones belonging to the initial swiotlb buffer. > + */ > +static struct rb_root phys_to_bus = RB_ROOT; > > -static int xen_dma_add_entry(struct xen_dma_info *new) > +static int xen_dma_add_entry_bus(struct xen_dma_info *new) > { > struct rb_node **link = &bus_to_phys.rb_node; > struct rb_node *parent = NULL; > @@ -106,7 +112,7 @@ static int xen_dma_add_entry(struct xen_dma_info *new) > > while (*link) { > parent = *link; > - entry = rb_entry(parent, struct xen_dma_info, rbnode); > + entry = rb_entry(parent, struct xen_dma_info, rbnode_dma); > > if (new->dma_addr == entry->dma_addr) > goto err_out; > @@ -118,8 +124,41 @@ static int xen_dma_add_entry(struct xen_dma_info *new) > else > link = &(*link)->rb_right; > } > - rb_link_node(&new->rbnode, parent, link); > - rb_insert_color(&new->rbnode, &bus_to_phys); > + rb_link_node(&new->rbnode_dma, parent, link); > + rb_insert_color(&new->rbnode_dma, &bus_to_phys); > + goto out; > + > +err_out: > + rc = -EINVAL; > + pr_warn("%s: cannot add phys=%pa -> dma=%pa: phys=%pa -> dma=%pa already exists\n", > + __func__, &new->phys_addr, &new->dma_addr, &entry->phys_addr, &entry->dma_addr); > +out: > + return rc; > +} > + > +static int xen_dma_add_entry_phys(struct xen_dma_info *new) > +{ > + struct rb_node **link = &phys_to_bus.rb_node; > + struct rb_node *parent = NULL; > + struct xen_dma_info *entry; > + int rc = 0; > + > + while (*link) { > + parent = *link; > + entry = rb_entry(parent, struct xen_dma_info, rbnode_phys); > + > + if (new->dma_addr == entry->dma_addr) > + goto err_out; > + if (new->phys_addr == entry->phys_addr) > + goto err_out; > + > + if (new->phys_addr < entry->phys_addr) > + link = &(*link)->rb_left; > + else > + link = &(*link)->rb_right; > + } > + rb_link_node(&new->rbnode_phys, parent, link); > + rb_insert_color(&new->rbnode_phys, &phys_to_bus); > goto out; > > err_out: > @@ -130,13 +169,22 @@ out: > return rc; > } > > +static int xen_dma_add_entry(struct xen_dma_info *new) > +{ > + int rc; > + if ((rc = xen_dma_add_entry_bus(new) < 0) || > + (rc = xen_dma_add_entry_phys(new) < 0))Please don''t do that. Just do rc = xen_dma_add_entry(bus); if (rc) return rc; rc = xen_dma_add_entry_phys(new); if (rc) { // unwind it somehow? <<== This is important :-) } return rc;> + return rc; > + return 0; > +} > + > static struct xen_dma_info *xen_get_dma_info_from_dma(dma_addr_t dma_addr) > { > struct rb_node *n = bus_to_phys.rb_node; > struct xen_dma_info *entry; > > while (n) { > - entry = rb_entry(n, struct xen_dma_info, rbnode); > + entry = rb_entry(n, struct xen_dma_info, rbnode_dma); > if (entry->dma_addr <= dma_addr && > entry->dma_addr + entry->size > dma_addr) { > return entry; > @@ -150,11 +198,30 @@ static struct xen_dma_info *xen_get_dma_info_from_dma(dma_addr_t dma_addr) > return NULL; > } > > -static dma_addr_t xen_phys_to_bus(phys_addr_t paddr) > +static struct xen_dma_info *xen_get_dma_info_from_phys(phys_addr_t phys) > { > - int nr_seg; > - unsigned long offset; > - char *vaddr; > + struct rb_node *n = phys_to_bus.rb_node; > + struct xen_dma_info *entry; > + > + while (n) { > + entry = rb_entry(n, struct xen_dma_info, rbnode_phys); > + if (entry->phys_addr <= phys && > + entry->phys_addr + entry->size > phys) { > + return entry; > + } > + if (phys < entry->phys_addr) > + n = n->rb_left; > + else > + n = n->rb_right; > + } > + > + return NULL; > +} > + > +/* Only looks into the initial buffer allocation in case of > + * XENFEAT_auto_translated_physmap guests. */ > +static dma_addr_t xen_phys_to_bus_quick(phys_addr_t paddr) { int nr_seg; > + unsigned long offset; char *vaddr; > > if (!xen_feature(XENFEAT_auto_translated_physmap)) > return phys_to_machine(XPADDR(paddr)).maddr; > @@ -184,7 +251,7 @@ static phys_addr_t xen_bus_to_phys(dma_addr_t baddr) > > static dma_addr_t xen_virt_to_bus(void *address) > { > - return xen_phys_to_bus(virt_to_phys(address)); > + return xen_phys_to_bus_quick(virt_to_phys(address)); > } > > static int check_pages_physically_contiguous(unsigned long pfn, > @@ -424,7 +491,7 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size, > * Do not use virt_to_phys(ret) because on ARM it doesn''t correspond > * to *dma_handle. */ > phys = *dma_handle; > - dev_addr = xen_phys_to_bus(phys); > + dev_addr = xen_phys_to_bus_quick(phys); > if (!xen_feature(XENFEAT_auto_translated_physmap) && > ((dev_addr + size - 1 <= dma_mask)) && > !range_straddles_page_boundary(phys, size)) > @@ -503,7 +570,7 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page, > struct dma_attrs *attrs) > { > phys_addr_t map, phys = page_to_phys(page) + offset; > - dma_addr_t dev_addr = xen_phys_to_bus(phys); > + dma_addr_t dev_addr = xen_phys_to_bus_quick(phys); > > BUG_ON(dir == DMA_NONE); > /* > @@ -527,7 +594,7 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page, > if (map == SWIOTLB_MAP_ERROR) > return DMA_ERROR_CODE; > > - dev_addr = xen_phys_to_bus(map); > + dev_addr = xen_phys_to_bus_quick(map); > > /* > * Ensure that the address returned is DMA''ble > @@ -656,7 +723,7 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, > > for_each_sg(sgl, sg, nelems, i) { > phys_addr_t paddr = sg_phys(sg); > - dma_addr_t dev_addr = xen_phys_to_bus(paddr); > + dma_addr_t dev_addr = xen_phys_to_bus_quick(paddr); > > if (swiotlb_force || > xen_feature(XENFEAT_auto_translated_physmap) || > @@ -682,7 +749,7 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, > sg_dma_len(sgl) = 0; > return DMA_ERROR_CODE; > } > - sg->dma_address = xen_phys_to_bus(map); > + sg->dma_address = xen_phys_to_bus_quick(map); > } else > sg->dma_address = dev_addr; > sg_dma_len(sg) = sg->length; > -- > 1.7.2.5 >
Konrad Rzeszutek Wilk
2013-Sep-30 17:39 UTC
Re: [PATCH v6 19/19] swiotlb-xen: instead of bouncing on the swiotlb, pin single pages
On Fri, Sep 27, 2013 at 05:10:07PM +0100, Stefano Stabellini wrote:> If we are dealing with single page mappings that don''t cross page > boundaries, we can try to pin the page and get the corresponding mfn, > using xen_pin_page. This avoids going through the swiotlb bounce > buffer. If xen_pin_page fails (because the underlying mfn doesn''t > respect the dma_mask) fall back to the swiotlb bounce buffer. > Add a ref count to xen_dma_info, so that we can avoid pinnig pages that > are already pinned. > Use a spinlock to protect accesses, insertions and deletions in the > rbtrees. > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > --- > drivers/xen/swiotlb-xen.c | 152 ++++++++++++++++++++++++++++++++++++++++++--- > 1 files changed, 143 insertions(+), 9 deletions(-) > > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c > index 022bcaf..6f94285 100644 > --- a/drivers/xen/swiotlb-xen.c > +++ b/drivers/xen/swiotlb-xen.c > @@ -57,6 +57,8 @@ > #define NR_DMA_SEGS ((xen_io_tlb_nslabs + IO_TLB_SEGSIZE - 1) / IO_TLB_SEGSIZE) > static char *xen_io_tlb_start, *xen_io_tlb_end; > static unsigned long xen_io_tlb_nslabs; > +spinlock_t swiotlb_lock; > + > /* > * Quick lookup value of the bus address of the IOTLB. > */ > @@ -79,6 +81,7 @@ struct xen_dma_info { > dma_addr_t dma_addr; > phys_addr_t phys_addr; > size_t size; > + atomic_t refs; > struct rb_node rbnode_dma; > struct rb_node rbnode_phys; > }; > @@ -254,6 +257,48 @@ static dma_addr_t xen_virt_to_bus(void *address) > return xen_phys_to_bus_quick(virt_to_phys(address)); > } > > +static int xen_pin_dev_page(struct device *dev, > + phys_addr_t phys, > + dma_addr_t *dev_addr)Something is odd with your tabs.> +{ > + u64 dma_mask = DMA_BIT_MASK(32);Why 32?> + xen_pfn_t in; > + struct xen_dma_info *dma_info = xen_get_dma_info_from_phys(phys); > + > + if (dma_info != NULL) { > + atomic_inc(&dma_info->refs); > + *dev_addr = dma_info->dma_addr + (phys - dma_info->phys_addr); > + return 0; > + } > + > + if (dev && dev->coherent_dma_mask) > + dma_mask = dma_alloc_coherent_mask(dev, GFP_KERNEL); > + > + in = phys >> PAGE_SHIFT; > + if (!xen_pin_page(&in, fls64(dma_mask))) {Why not just make xen_pin_page use an phys address and it can also do the appropiate bit shifting in it?> + *dev_addr = in << PAGE_SHIFT; > + dma_info = kzalloc(sizeof(struct xen_dma_info), GFP_NOWAIT); > + if (!dma_info) { > + pr_warn("cannot allocate xen_dma_info\n"); > + xen_destroy_contiguous_region(phys & PAGE_MASK, 0);Perhaps we should add an inline function for that called ''xen_unpin_page'' ?> + return -ENOMEM; > + } > + dma_info->phys_addr = phys & PAGE_MASK; > + dma_info->size = PAGE_SIZE; > + dma_info->dma_addr = *dev_addr; > + if (xen_dma_add_entry(dma_info)) { > + pr_warn("cannot add new entry to bus_to_phys\n"); > + xen_destroy_contiguous_region(phys & PAGE_MASK, 0); > + kfree(dma_info); > + return -EFAULT; > + } > + atomic_set(&dma_info->refs, 1); > + *dev_addr += (phys & ~PAGE_MASK); > + return 0; > + }Don''t you want to the opposite of dma_alloc_coherent_mask ?> + return -EFAULT; > +} > + > static int check_pages_physically_contiguous(unsigned long pfn, > unsigned int offset, > size_t length) > @@ -434,6 +479,7 @@ retry: > rc = 0; > } else > rc = swiotlb_late_init_with_tbl(xen_io_tlb_start, xen_io_tlb_nslabs); > + spin_lock_init(&swiotlb_lock); > return rc; > error: > if (repeat--) { > @@ -461,6 +507,7 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size, > phys_addr_t phys; > dma_addr_t dev_addr; > struct xen_dma_info *dma_info = NULL; > + unsigned long irqflags; > > /* > * Ignore region specifiers - the kernel''s ideas of > @@ -497,7 +544,7 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size, > !range_straddles_page_boundary(phys, size)) > *dma_handle = dev_addr; > else { > - if (xen_create_contiguous_region(phys, order, > + if (xen_create_contiguous_region(phys & PAGE_MASK, order, > fls64(dma_mask), dma_handle) != 0) { > xen_free_coherent_pages(hwdev, size, ret, (dma_addr_t)phys, attrs); > return NULL; > @@ -509,15 +556,19 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size, > xen_destroy_contiguous_region(phys, order); > return NULL; > } > - dma_info->phys_addr = phys; > - dma_info->size = size; > + dma_info->phys_addr = phys & PAGE_MASK; > + dma_info->size = (1U << order) << PAGE_SHIFT; > dma_info->dma_addr = *dma_handle; > + atomic_set(&dma_info->refs, 1); > + spin_lock_irqsave(&swiotlb_lock, irqflags); > if (xen_dma_add_entry(dma_info)) { > + spin_unlock_irqrestore(&swiotlb_lock, irqflags); > pr_warn("cannot add new entry to bus_to_phys\n"); > xen_destroy_contiguous_region(phys, order); > kfree(dma_info); > return NULL; > } > + spin_unlock_irqrestore(&swiotlb_lock, irqflags); > } > memset(ret, 0, size); > return ret; > @@ -532,6 +583,7 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr, > phys_addr_t phys; > u64 dma_mask = DMA_BIT_MASK(32); > struct xen_dma_info *dma_info = NULL; > + unsigned long flags; > > if (dma_release_from_coherent(hwdev, order, vaddr)) > return; > @@ -539,6 +591,7 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr, > if (hwdev && hwdev->coherent_dma_mask) > dma_mask = hwdev->coherent_dma_mask; > > + spin_lock_irqsave(&swiotlb_lock, flags); > /* do not use virt_to_phys because on ARM it doesn''t return you the > * physical address */ > phys = xen_bus_to_phys(dev_addr); > @@ -546,12 +599,16 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr, > if (xen_feature(XENFEAT_auto_translated_physmap) || > (((dev_addr + size - 1 > dma_mask)) || > range_straddles_page_boundary(phys, size))) { > - xen_destroy_contiguous_region(phys, order); > dma_info = xen_get_dma_info_from_dma(dev_addr); > - rb_erase(&dma_info->rbnode, &bus_to_phys); > - kfree(dma_info); > + if (atomic_dec_and_test(&dma_info->refs)) { > + xen_destroy_contiguous_region(phys & PAGE_MASK, order); > + rb_erase(&dma_info->rbnode_dma, &bus_to_phys); > + rb_erase(&dma_info->rbnode_phys, &phys_to_bus); > + kfree(dma_info); > + }If xen_pin_dev_page failed or was not called we would still end up calling this. And we would decrement a potentially garbage value? Or not.> } > > + spin_unlock_irqrestore(&swiotlb_lock, flags); > xen_free_coherent_pages(hwdev, size, vaddr, (dma_addr_t)phys, attrs); > } > EXPORT_SYMBOL_GPL(xen_swiotlb_free_coherent); > @@ -583,6 +640,23 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page, > !range_straddles_page_boundary(phys, size) && !swiotlb_force) > return dev_addr; > > + if (xen_feature(XENFEAT_auto_translated_physmap) && > + size <= PAGE_SIZE && > + !range_straddles_page_boundary(phys, size) && > + !swiotlb_force) { > + unsigned long flags; > + int rc; > + > + spin_lock_irqsave(&swiotlb_lock, flags); > + rc = xen_pin_dev_page(dev, phys, &dev_addr); > + spin_unlock_irqrestore(&swiotlb_lock, flags); > + > + if (!rc) { > + dma_mark_clean(phys_to_virt(phys), size); > + return dev_addr; > + }And if there is an rc you should probably do dev_warn(.., "RC ..") But more importantly - all of this code adds an extra lock on the X86 side which will get -ENOxxx on the xen_pin_dev_page. I am wondering if it makes sense to make most of this code dependent on CONFIG_ARM? As the check for auto-xlat falls flat on X86 + PVH. Thought I have no idea what we want to do with PVH and X86 at this point.> + } > + > /* > * Oh well, have to allocate and map a bounce buffer. > * Pass the dma_addr of the first slab in the iotlb buffer as > @@ -618,10 +692,37 @@ EXPORT_SYMBOL_GPL(xen_swiotlb_map_page); > static void xen_unmap_single(struct device *hwdev, dma_addr_t dev_addr, > size_t size, enum dma_data_direction dir) > { > - phys_addr_t paddr = xen_bus_to_phys(dev_addr); > + struct xen_dma_info *dma_info; > + phys_addr_t paddr = DMA_ERROR_CODE; > + char *vaddr = NULL; > + unsigned long flags; > > BUG_ON(dir == DMA_NONE); > > + spin_lock_irqsave(&swiotlb_lock, flags); > + dma_info = xen_get_dma_info_from_dma(dev_addr); > + if (dma_info != NULL) { > + paddr = dma_info->phys_addr + (dev_addr - dma_info->dma_addr); > + vaddr = phys_to_virt(paddr); > + } > + > + if (xen_feature(XENFEAT_auto_translated_physmap) && > + paddr != DMA_ERROR_CODE && > + !(vaddr >= xen_io_tlb_start && vaddr < xen_io_tlb_end) && > + !swiotlb_force) { > + if (atomic_dec_and_test(&dma_info->refs)) { > + xen_destroy_contiguous_region(paddr & PAGE_MASK, 0); > + rb_erase(&dma_info->rbnode_dma, &bus_to_phys); > + rb_erase(&dma_info->rbnode_phys, &phys_to_bus); > + kfree(dma_info); > + } > + spin_unlock_irqrestore(&swiotlb_lock, flags); > + if ((dir == DMA_FROM_DEVICE) || (dir == DMA_BIDIRECTIONAL)) > + dma_mark_clean(vaddr, size); > + return; > + } > + spin_unlock_irqrestore(&swiotlb_lock, flags); > + > /* NOTE: We use dev_addr here, not paddr! */ > if (is_xen_swiotlb_buffer(dev_addr)) { > swiotlb_tbl_unmap_single(hwdev, paddr, size, dir); > @@ -664,9 +765,19 @@ xen_swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr, > enum dma_sync_target target) > { > phys_addr_t paddr = xen_bus_to_phys(dev_addr); > + char *vaddr = phys_to_virt(paddr); > > BUG_ON(dir == DMA_NONE); > > + if (xen_feature(XENFEAT_auto_translated_physmap) && > + paddr != DMA_ERROR_CODE && > + size <= PAGE_SIZE && > + !(vaddr >= xen_io_tlb_start && vaddr < xen_io_tlb_end) && > + !range_straddles_page_boundary(paddr, size) && !swiotlb_force) { > + dma_mark_clean(vaddr, size); > + return; > + } > + > /* NOTE: We use dev_addr here, not paddr! */ > if (is_xen_swiotlb_buffer(dev_addr)) { > swiotlb_tbl_sync_single(hwdev, paddr, size, dir, target); > @@ -717,13 +828,36 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, > struct dma_attrs *attrs) > { > struct scatterlist *sg; > - int i; > + int i, rc; > + u64 dma_mask = DMA_BIT_MASK(32); > + unsigned long flags; > > BUG_ON(dir == DMA_NONE); > > + if (hwdev && hwdev->coherent_dma_mask) > + dma_mask = dma_alloc_coherent_mask(hwdev, GFP_KERNEL); > + > for_each_sg(sgl, sg, nelems, i) { > phys_addr_t paddr = sg_phys(sg); > - dma_addr_t dev_addr = xen_phys_to_bus_quick(paddr); > + dma_addr_t dev_addr; > + > + if (xen_feature(XENFEAT_auto_translated_physmap) && > + !range_straddles_page_boundary(paddr, sg->length) && > + sg->length <= PAGE_SIZE && > + !swiotlb_force) { > + > + spin_lock_irqsave(&swiotlb_lock, flags); > + rc = xen_pin_dev_page(hwdev, paddr, &dev_addr); > + spin_unlock_irqrestore(&swiotlb_lock, flags); > + > + if (!rc) { > + dma_mark_clean(phys_to_virt(paddr), sg->length); > + sg_dma_len(sg) = sg->length; > + sg->dma_address = dev_addr; > + continue; > + } > + } > + dev_addr = xen_phys_to_bus_quick(paddr); > > if (swiotlb_force || > xen_feature(XENFEAT_auto_translated_physmap) || > -- > 1.7.2.5 >
Catalin Marinas
2013-Oct-01 13:40 UTC
Re: [PATCH v6 10/19] xen: introduce xen_alloc/free_coherent_pages
On 30 Sep 2013, at 16:31, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:> On Fri, Sep 27, 2013 at 05:09:58PM +0100, Stefano Stabellini wrote: >> xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu >> and devices. On native x86 and ARMv8 is sufficient to call >> __get_free_pages in order to get a coherent buffer, while on ARM we need >> to call the native dma_ops->alloc implementation. >> >> When arm64 stops using the swiotlb by default and starts having multiple >> dma_ops implementations, we''ll use __get_dma_ops there too. > > I presume this is a future TODO, not some further patch (in which > case you should say in here the title of it). If it is a TODO could > you stick that in the sentence here somewhere to make it crytal clear that > it is not implemented.Thanks for pointing this out. The arm64 patch shouldn''t be different from arm here. I thought a subsequent patch already added __get_dma_ops() for arm64. Catalin-- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Stefano Stabellini
2013-Oct-02 17:03 UTC
Re: [PATCH v6 10/19] xen: introduce xen_alloc/free_coherent_pages
On Tue, 1 Oct 2013, Catalin Marinas wrote:> On 30 Sep 2013, at 16:31, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote: > > On Fri, Sep 27, 2013 at 05:09:58PM +0100, Stefano Stabellini wrote: > >> xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu > >> and devices. On native x86 and ARMv8 is sufficient to call > >> __get_free_pages in order to get a coherent buffer, while on ARM we need > >> to call the native dma_ops->alloc implementation. > >> > >> When arm64 stops using the swiotlb by default and starts having multiple > >> dma_ops implementations, we''ll use __get_dma_ops there too. > > > > I presume this is a future TODO, not some further patch (in which > > case you should say in here the title of it). If it is a TODO could > > you stick that in the sentence here somewhere to make it crytal clear that > > it is not implemented. > > Thanks for pointing this out. The arm64 patch shouldn''t be different > from arm here. I thought a subsequent patch already added > __get_dma_ops() for arm64.Yes, the arm64 patch already added __get_dma_ops. However I am not using __get_dma_ops to implement xen_alloc_coherent_pages for arm64 because I can just use __get_free_pages for the moment:> diff --git a/arch/arm64/include/asm/xen/page-coherent.h b/arch/arm64/include/asm/xen/page-coherent.h > new file mode 100644 > index 0000000..0d6ad25 > --- /dev/null > +++ b/arch/arm64/include/asm/xen/page-coherent.h > @@ -0,0 +1,24 @@ > +#ifndef _ASM_ARM64_XEN_PAGE_COHERENT_H > +#define _ASM_ARM64_XEN_PAGE_COHERENT_H > + > +#include <asm/page.h> > +#include <linux/dma-attrs.h> > +#include <linux/dma-mapping.h> > + > +static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size, > + dma_addr_t *dma_handle, gfp_t flags, > + struct dma_attrs *attrs) > +{ > + void *vstart = (void*)__get_free_pages(flags, get_order(size)); > + *dma_handle = virt_to_phys(vstart); > + return vstart; > +} > + > +static inline void xen_free_coherent_pages(struct device *hwdev, size_t size, > + void *cpu_addr, dma_addr_t dma_handle, > + struct dma_attrs *attrs) > +{ > + free_pages((unsigned long) cpu_addr, get_order(size)); > +} > + > +#endif /* _ASM_ARM64_XEN_PAGE_COHERENT_H */
Catalin Marinas
2013-Oct-02 17:07 UTC
Re: [PATCH v6 10/19] xen: introduce xen_alloc/free_coherent_pages
On Wed, Oct 02, 2013 at 06:03:46PM +0100, Stefano Stabellini wrote:> On Tue, 1 Oct 2013, Catalin Marinas wrote: > > On 30 Sep 2013, at 16:31, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote: > > > On Fri, Sep 27, 2013 at 05:09:58PM +0100, Stefano Stabellini wrote: > > >> xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu > > >> and devices. On native x86 and ARMv8 is sufficient to call > > >> __get_free_pages in order to get a coherent buffer, while on ARM we need > > >> to call the native dma_ops->alloc implementation. > > >> > > >> When arm64 stops using the swiotlb by default and starts having multiple > > >> dma_ops implementations, we''ll use __get_dma_ops there too. > > > > > > I presume this is a future TODO, not some further patch (in which > > > case you should say in here the title of it). If it is a TODO could > > > you stick that in the sentence here somewhere to make it crytal clear that > > > it is not implemented. > > > > Thanks for pointing this out. The arm64 patch shouldn''t be different > > from arm here. I thought a subsequent patch already added > > __get_dma_ops() for arm64. > > Yes, the arm64 patch already added __get_dma_ops. > > However I am not using __get_dma_ops to implement > xen_alloc_coherent_pages for arm64 because I can just use > __get_free_pages for the moment:So why do the work twice when we''ll get coherency maintenance in the DMA ops? -- Catalin
Stefano Stabellini
2013-Oct-02 17:13 UTC
Re: [PATCH v6 15/19] swiotlb-xen: call dma_capable only if dev->dma_mask is allocated
The real issue is that some devices (xgmac, I am looking at you), don''t set the dma_mask, even though they are perfectly capable of doing dma. Maybe I should modify the arm implementation of dma_capable (see http://marc.info/?l=linux-kernel&m=138029832007821&w=2) to ignore the dma_mask. On Mon, 30 Sep 2013, Konrad Rzeszutek Wilk wrote:> On Fri, Sep 27, 2013 at 05:10:03PM +0100, Stefano Stabellini wrote: > > Why? I am looking at X86 and IA64 and I see: > > 79 if (!dev->dma_mask) > 80 return 0; > > > Why not port dma_capable over to ARM? > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > --- > > drivers/xen/swiotlb-xen.c | 6 +++--- > > 1 files changed, 3 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c > > index 790c2eb..3011736 100644 > > --- a/drivers/xen/swiotlb-xen.c > > +++ b/drivers/xen/swiotlb-xen.c > > @@ -512,7 +512,7 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page, > > * buffering it. > > */ > > if (!xen_feature(XENFEAT_auto_translated_physmap) && > > - dma_capable(dev, dev_addr, size) && > > + dev->dma_mask && dma_capable(dev, dev_addr, size) && > > !range_straddles_page_boundary(phys, size) && !swiotlb_force) > > return dev_addr; > > > > @@ -532,7 +532,7 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page, > > /* > > * Ensure that the address returned is DMA''ble > > */ > > - if (!dma_capable(dev, dev_addr, size)) { > > + if (dev->dma_mask && !dma_capable(dev, dev_addr, size)) { > > swiotlb_tbl_unmap_single(dev, map, size, dir); > > dev_addr = 0; > > } > > @@ -660,7 +660,7 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, > > > > if (swiotlb_force || > > xen_feature(XENFEAT_auto_translated_physmap) || > > - !dma_capable(hwdev, dev_addr, sg->length) || > > + (hwdev->dma_mask && !dma_capable(hwdev, dev_addr, sg->length)) || > > range_straddles_page_boundary(paddr, sg->length)) { > > /* > > * Pass the dma_addr of the first slab in the iotlb buffer as > > -- > > 1.7.2.5 > > >
Stefano Stabellini
2013-Oct-02 17:14 UTC
Re: [PATCH v6 10/19] xen: introduce xen_alloc/free_coherent_pages
On Wed, 2 Oct 2013, Catalin Marinas wrote:> On Wed, Oct 02, 2013 at 06:03:46PM +0100, Stefano Stabellini wrote: > > On Tue, 1 Oct 2013, Catalin Marinas wrote: > > > On 30 Sep 2013, at 16:31, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote: > > > > On Fri, Sep 27, 2013 at 05:09:58PM +0100, Stefano Stabellini wrote: > > > >> xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu > > > >> and devices. On native x86 and ARMv8 is sufficient to call > > > >> __get_free_pages in order to get a coherent buffer, while on ARM we need > > > >> to call the native dma_ops->alloc implementation. > > > >> > > > >> When arm64 stops using the swiotlb by default and starts having multiple > > > >> dma_ops implementations, we''ll use __get_dma_ops there too. > > > > > > > > I presume this is a future TODO, not some further patch (in which > > > > case you should say in here the title of it). If it is a TODO could > > > > you stick that in the sentence here somewhere to make it crytal clear that > > > > it is not implemented. > > > > > > Thanks for pointing this out. The arm64 patch shouldn''t be different > > > from arm here. I thought a subsequent patch already added > > > __get_dma_ops() for arm64. > > > > Yes, the arm64 patch already added __get_dma_ops. > > > > However I am not using __get_dma_ops to implement > > xen_alloc_coherent_pages for arm64 because I can just use > > __get_free_pages for the moment: > > So why do the work twice when we''ll get coherency maintenance in the DMA > ops?Yeah, good point.
Konrad Rzeszutek Wilk
2013-Oct-02 17:22 UTC
Re: [PATCH v6 15/19] swiotlb-xen: call dma_capable only if dev->dma_mask is allocated
On Wed, Oct 02, 2013 at 06:13:35PM +0100, Stefano Stabellini wrote:> The real issue is that some devices (xgmac, I am looking at you), don''t > set the dma_mask, even though they are perfectly capable of doing dma.So this looks like a bug in the drivers. I thought I saw a huge patchset by the ARM maintainer that tries to fix the dma_mask and dma_mask_coherent bugs? And also fix the drivers to use the dma mask?> > Maybe I should modify the arm implementation of dma_capable (see > http://marc.info/?l=linux-kernel&m=138029832007821&w=2) to ignore the > dma_mask.Or fix the ''xgmac''?> > On Mon, 30 Sep 2013, Konrad Rzeszutek Wilk wrote: > > On Fri, Sep 27, 2013 at 05:10:03PM +0100, Stefano Stabellini wrote: > > > > Why? I am looking at X86 and IA64 and I see: > > > > 79 if (!dev->dma_mask) > > 80 return 0; > > > > > > Why not port dma_capable over to ARM? > > > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > > --- > > > drivers/xen/swiotlb-xen.c | 6 +++--- > > > 1 files changed, 3 insertions(+), 3 deletions(-) > > > > > > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c > > > index 790c2eb..3011736 100644 > > > --- a/drivers/xen/swiotlb-xen.c > > > +++ b/drivers/xen/swiotlb-xen.c > > > @@ -512,7 +512,7 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page, > > > * buffering it. > > > */ > > > if (!xen_feature(XENFEAT_auto_translated_physmap) && > > > - dma_capable(dev, dev_addr, size) && > > > + dev->dma_mask && dma_capable(dev, dev_addr, size) && > > > !range_straddles_page_boundary(phys, size) && !swiotlb_force) > > > return dev_addr; > > > > > > @@ -532,7 +532,7 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page, > > > /* > > > * Ensure that the address returned is DMA''ble > > > */ > > > - if (!dma_capable(dev, dev_addr, size)) { > > > + if (dev->dma_mask && !dma_capable(dev, dev_addr, size)) { > > > swiotlb_tbl_unmap_single(dev, map, size, dir); > > > dev_addr = 0; > > > } > > > @@ -660,7 +660,7 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, > > > > > > if (swiotlb_force || > > > xen_feature(XENFEAT_auto_translated_physmap) || > > > - !dma_capable(hwdev, dev_addr, sg->length) || > > > + (hwdev->dma_mask && !dma_capable(hwdev, dev_addr, sg->length)) || > > > range_straddles_page_boundary(paddr, sg->length)) { > > > /* > > > * Pass the dma_addr of the first slab in the iotlb buffer as > > > -- > > > 1.7.2.5 > > > > >
Stefano Stabellini
2013-Oct-02 17:23 UTC
Re: [PATCH v6 18/19] swiotlb-xen: introduce a rbtree to track phys to bus mappings
On Mon, 30 Sep 2013, Konrad Rzeszutek Wilk wrote:> On Fri, Sep 27, 2013 at 05:10:06PM +0100, Stefano Stabellini wrote: > > Introduce a second red-back tree to track phys to bus mappings created after > > the initialization of the swiotlb buffer. > > Could you explain the use case a bit more please? > > As in: > a) why is this neededThe reason I introduced it in this patch series was to keep track of existing physical to dma mappings so that I could start to dynamically pin single pages and avoid bouncing on the swiotlb all the time. See the following patch in the series. However it turns out that memcpying is faster that going to Xen for the gpfn->mfn translation all the time. So I''ll drop "swiotlb-xen: instead of bouncing on the swiotlb, pin single pages" from the series. However with or without that patch, I would still like to keep this second tree to keep track of physical to dma mappings created by gnttab_map_refs: GNTTABOP_map_grant_ref returns the mfn of the granted page and we can exploit it to avoid bouncing on the swiotlb buffer for DMA operations on granted pages.> b) why are we using the rb tree instead of something else (say FIFO queue)Given the type of workload, I though that it would be the best fit. It also happens to be the same data structure used by XenServer in their kernel to achieve something similar. I don''t have hard numbers though.> c) how long are these in usage?During the entire life of the guest VM.> d) do we need locking now?Yes, but only for tree insertions, deletions and lookups.> > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > --- > > drivers/xen/swiotlb-xen.c | 99 +++++++++++++++++++++++++++++++++++++------- > > 1 files changed, 83 insertions(+), 16 deletions(-) > > > > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c > > index 3011736..022bcaf 100644 > > --- a/drivers/xen/swiotlb-xen.c > > +++ b/drivers/xen/swiotlb-xen.c > > @@ -79,7 +79,8 @@ struct xen_dma_info { > > dma_addr_t dma_addr; > > phys_addr_t phys_addr; > > size_t size; > > - struct rb_node rbnode; > > + struct rb_node rbnode_dma; > > + struct rb_node rbnode_phys; > > }; > > > > /* > > @@ -96,8 +97,13 @@ static struct xen_dma_info *xen_dma_seg; > > * mappings. > > */ > > static struct rb_root bus_to_phys = RB_ROOT; > > +/* > > + * This tree keeps track of physical address to bus address > > + * mappings apart from the ones belonging to the initial swiotlb buffer. > > + */ > > +static struct rb_root phys_to_bus = RB_ROOT; > > > > -static int xen_dma_add_entry(struct xen_dma_info *new) > > +static int xen_dma_add_entry_bus(struct xen_dma_info *new) > > { > > struct rb_node **link = &bus_to_phys.rb_node; > > struct rb_node *parent = NULL; > > @@ -106,7 +112,7 @@ static int xen_dma_add_entry(struct xen_dma_info *new) > > > > while (*link) { > > parent = *link; > > - entry = rb_entry(parent, struct xen_dma_info, rbnode); > > + entry = rb_entry(parent, struct xen_dma_info, rbnode_dma); > > > > if (new->dma_addr == entry->dma_addr) > > goto err_out; > > @@ -118,8 +124,41 @@ static int xen_dma_add_entry(struct xen_dma_info *new) > > else > > link = &(*link)->rb_right; > > } > > - rb_link_node(&new->rbnode, parent, link); > > - rb_insert_color(&new->rbnode, &bus_to_phys); > > + rb_link_node(&new->rbnode_dma, parent, link); > > + rb_insert_color(&new->rbnode_dma, &bus_to_phys); > > + goto out; > > + > > +err_out: > > + rc = -EINVAL; > > + pr_warn("%s: cannot add phys=%pa -> dma=%pa: phys=%pa -> dma=%pa already exists\n", > > + __func__, &new->phys_addr, &new->dma_addr, &entry->phys_addr, &entry->dma_addr); > > +out: > > + return rc; > > +} > > + > > +static int xen_dma_add_entry_phys(struct xen_dma_info *new) > > +{ > > + struct rb_node **link = &phys_to_bus.rb_node; > > + struct rb_node *parent = NULL; > > + struct xen_dma_info *entry; > > + int rc = 0; > > + > > + while (*link) { > > + parent = *link; > > + entry = rb_entry(parent, struct xen_dma_info, rbnode_phys); > > + > > + if (new->dma_addr == entry->dma_addr) > > + goto err_out; > > + if (new->phys_addr == entry->phys_addr) > > + goto err_out; > > + > > + if (new->phys_addr < entry->phys_addr) > > + link = &(*link)->rb_left; > > + else > > + link = &(*link)->rb_right; > > + } > > + rb_link_node(&new->rbnode_phys, parent, link); > > + rb_insert_color(&new->rbnode_phys, &phys_to_bus); > > goto out; > > > > err_out: > > @@ -130,13 +169,22 @@ out: > > return rc; > > } > > > > +static int xen_dma_add_entry(struct xen_dma_info *new) > > +{ > > + int rc; > > + if ((rc = xen_dma_add_entry_bus(new) < 0) || > > + (rc = xen_dma_add_entry_phys(new) < 0)) > > Please don''t do that. Just do > > rc = xen_dma_add_entry(bus); > if (rc) > return rc; > rc = xen_dma_add_entry_phys(new); > if (rc) { > // unwind it somehow? <<== This is important :-) > } > return rc; > > > > + return rc; > > + return 0; > > +} > > + > > static struct xen_dma_info *xen_get_dma_info_from_dma(dma_addr_t dma_addr) > > { > > struct rb_node *n = bus_to_phys.rb_node; > > struct xen_dma_info *entry; > > > > while (n) { > > - entry = rb_entry(n, struct xen_dma_info, rbnode); > > + entry = rb_entry(n, struct xen_dma_info, rbnode_dma); > > if (entry->dma_addr <= dma_addr && > > entry->dma_addr + entry->size > dma_addr) { > > return entry; > > @@ -150,11 +198,30 @@ static struct xen_dma_info *xen_get_dma_info_from_dma(dma_addr_t dma_addr) > > return NULL; > > } > > > > -static dma_addr_t xen_phys_to_bus(phys_addr_t paddr) > > +static struct xen_dma_info *xen_get_dma_info_from_phys(phys_addr_t phys) > > { > > - int nr_seg; > > - unsigned long offset; > > - char *vaddr; > > + struct rb_node *n = phys_to_bus.rb_node; > > + struct xen_dma_info *entry; > > + > > + while (n) { > > + entry = rb_entry(n, struct xen_dma_info, rbnode_phys); > > + if (entry->phys_addr <= phys && > > + entry->phys_addr + entry->size > phys) { > > + return entry; > > + } > > + if (phys < entry->phys_addr) > > + n = n->rb_left; > > + else > > + n = n->rb_right; > > + } > > + > > + return NULL; > > +} > > + > > +/* Only looks into the initial buffer allocation in case of > > + * XENFEAT_auto_translated_physmap guests. */ > > +static dma_addr_t xen_phys_to_bus_quick(phys_addr_t paddr) { int nr_seg; > > + unsigned long offset; char *vaddr; > > > > if (!xen_feature(XENFEAT_auto_translated_physmap)) > > return phys_to_machine(XPADDR(paddr)).maddr; > > @@ -184,7 +251,7 @@ static phys_addr_t xen_bus_to_phys(dma_addr_t baddr) > > > > static dma_addr_t xen_virt_to_bus(void *address) > > { > > - return xen_phys_to_bus(virt_to_phys(address)); > > + return xen_phys_to_bus_quick(virt_to_phys(address)); > > } > > > > static int check_pages_physically_contiguous(unsigned long pfn, > > @@ -424,7 +491,7 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size, > > * Do not use virt_to_phys(ret) because on ARM it doesn''t correspond > > * to *dma_handle. */ > > phys = *dma_handle; > > - dev_addr = xen_phys_to_bus(phys); > > + dev_addr = xen_phys_to_bus_quick(phys); > > if (!xen_feature(XENFEAT_auto_translated_physmap) && > > ((dev_addr + size - 1 <= dma_mask)) && > > !range_straddles_page_boundary(phys, size)) > > @@ -503,7 +570,7 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page, > > struct dma_attrs *attrs) > > { > > phys_addr_t map, phys = page_to_phys(page) + offset; > > - dma_addr_t dev_addr = xen_phys_to_bus(phys); > > + dma_addr_t dev_addr = xen_phys_to_bus_quick(phys); > > > > BUG_ON(dir == DMA_NONE); > > /* > > @@ -527,7 +594,7 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page, > > if (map == SWIOTLB_MAP_ERROR) > > return DMA_ERROR_CODE; > > > > - dev_addr = xen_phys_to_bus(map); > > + dev_addr = xen_phys_to_bus_quick(map); > > > > /* > > * Ensure that the address returned is DMA''ble > > @@ -656,7 +723,7 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, > > > > for_each_sg(sgl, sg, nelems, i) { > > phys_addr_t paddr = sg_phys(sg); > > - dma_addr_t dev_addr = xen_phys_to_bus(paddr); > > + dma_addr_t dev_addr = xen_phys_to_bus_quick(paddr); > > > > if (swiotlb_force || > > xen_feature(XENFEAT_auto_translated_physmap) || > > @@ -682,7 +749,7 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, > > sg_dma_len(sgl) = 0; > > return DMA_ERROR_CODE; > > } > > - sg->dma_address = xen_phys_to_bus(map); > > + sg->dma_address = xen_phys_to_bus_quick(map); > > } else > > sg->dma_address = dev_addr; > > sg_dma_len(sg) = sg->length; > > -- > > 1.7.2.5 > > >
Stefano Stabellini
2013-Oct-02 17:31 UTC
Re: [PATCH v6 12/19] swiotlb: don''t assume that io_tlb_start-io_tlb_end is coherent
On Mon, 30 Sep 2013, Konrad Rzeszutek Wilk wrote:> On Fri, Sep 27, 2013 at 05:10:00PM +0100, Stefano Stabellini wrote: > > The swiotlb code has appropriate calls to dma_mark_clean in place for > > buffers passed to swiotlb_map_page as an argument. However it assumes > > that the swiotlb bounce buffer (io_tlb_start-io_tlb_end) is already > > coherent and doesn''t need any calls to dma_mark_clean. > > > > On ARM the swiotlb bounce buffer is not coherent (the memory is > > writealloc while it should be bufferable) and therefore we need to call > > dma_mark_clean appropriately on the bounce buffer code paths too. > > > > Note that most architecures have an empty dma_mark_clean implementation > > anyway. > > The other architecture that uses swiotlb is IA64 and that does have > an implementation where it touches on page attributes. > > Which means I have to figure out why my HP zx6000 won''t boot with 3.11 now :-( >Now this is a very thorny issue. Honestly I don''t like the dma_mark_clean interface very much: it''s one big hammer, when we actually need some finesse to handle coherency. For example on ARM some devices might not need the dma_mark_clean call, while others do. Calling it all the times is at the very best inefficient and incorrect at worst. I am thinking of calling the original map/unmap_page functions instead (arm_dma_map_page or arm_coherent_dma_map_page in the arm case). However in order to do that I would need to add more __get_dma_ops calls in both lib/swiotlb.c and drivers/xen/swiotlb-xen.c> > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > --- > > lib/swiotlb.c | 13 ++++++++++--- > > 1 files changed, 10 insertions(+), 3 deletions(-) > > > > diff --git a/lib/swiotlb.c b/lib/swiotlb.c > > index 4e8686c..eb45d17 100644 > > --- a/lib/swiotlb.c > > +++ b/lib/swiotlb.c > > @@ -515,6 +515,7 @@ found: > > io_tlb_orig_addr[index+i] = orig_addr + (i << IO_TLB_SHIFT); > > if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL) > > swiotlb_bounce(orig_addr, tlb_addr, size, DMA_TO_DEVICE); > > + dma_mark_clean(phys_to_virt(tlb_addr), size); > > > > return tlb_addr; > > } > > @@ -547,7 +548,10 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr, > > * First, sync the memory before unmapping the entry > > */ > > if (orig_addr && ((dir == DMA_FROM_DEVICE) || (dir == DMA_BIDIRECTIONAL))) > > + { > > + dma_mark_clean(phys_to_virt(tlb_addr), size); > > swiotlb_bounce(orig_addr, tlb_addr, size, DMA_FROM_DEVICE); > > + } > > > > /* > > * Return the buffer to the free list by setting the corresponding > > @@ -587,17 +591,20 @@ void swiotlb_tbl_sync_single(struct device *hwdev, phys_addr_t tlb_addr, > > > > switch (target) { > > case SYNC_FOR_CPU: > > - if (likely(dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL)) > > + if (likely(dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL)) { > > + dma_mark_clean(phys_to_virt(tlb_addr), size); > > swiotlb_bounce(orig_addr, tlb_addr, > > size, DMA_FROM_DEVICE); > > + } > > else > > BUG_ON(dir != DMA_TO_DEVICE); > > break; > > case SYNC_FOR_DEVICE: > > - if (likely(dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)) > > + if (likely(dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)) { > > swiotlb_bounce(orig_addr, tlb_addr, > > size, DMA_TO_DEVICE); > > - else > > + dma_mark_clean(phys_to_virt(tlb_addr), size); > > + } else > > BUG_ON(dir != DMA_FROM_DEVICE); > > break; > > default: > > -- > > 1.7.2.5 > > >
Rob Herring
2013-Oct-03 18:35 UTC
Re: [PATCH v6 15/19] swiotlb-xen: call dma_capable only if dev->dma_mask is allocated
On Wed, Oct 2, 2013 at 12:22 PM, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:> On Wed, Oct 02, 2013 at 06:13:35PM +0100, Stefano Stabellini wrote: >> The real issue is that some devices (xgmac, I am looking at you), don''t >> set the dma_mask, even though they are perfectly capable of doing dma. > > So this looks like a bug in the drivers. I thought I saw a huge patchset > by the ARM maintainer that tries to fix the dma_mask and dma_mask_coherent > bugs? And also fix the drivers to use the dma mask?I think Russell only fixed incorrect handling of masks, not missing handling.>> >> Maybe I should modify the arm implementation of dma_capable (see >> http://marc.info/?l=linux-kernel&m=138029832007821&w=2) to ignore the >> dma_mask. > > Or fix the ''xgmac''?There''s really some core changes needed to fix this and I''d guess there are lots of other cases of missing dma_mask. The problem is platform devices don''t create a mask to assign to dma_mask in the first place. I think the right solution is assigning dma_mask to &coherent_dma_mask by default as Russell did for AMBA devices. I don''t know if doing that would break anything or not. Rob>> >> On Mon, 30 Sep 2013, Konrad Rzeszutek Wilk wrote: >> > On Fri, Sep 27, 2013 at 05:10:03PM +0100, Stefano Stabellini wrote: >> > >> > Why? I am looking at X86 and IA64 and I see: >> > >> > 79 if (!dev->dma_mask) >> > 80 return 0; >> > >> > >> > Why not port dma_capable over to ARM? >> > >> > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> >> > > --- >> > > drivers/xen/swiotlb-xen.c | 6 +++--- >> > > 1 files changed, 3 insertions(+), 3 deletions(-) >> > > >> > > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c >> > > index 790c2eb..3011736 100644 >> > > --- a/drivers/xen/swiotlb-xen.c >> > > +++ b/drivers/xen/swiotlb-xen.c >> > > @@ -512,7 +512,7 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page, >> > > * buffering it. >> > > */ >> > > if (!xen_feature(XENFEAT_auto_translated_physmap) && >> > > - dma_capable(dev, dev_addr, size) && >> > > + dev->dma_mask && dma_capable(dev, dev_addr, size) && >> > > !range_straddles_page_boundary(phys, size) && !swiotlb_force) >> > > return dev_addr; >> > > >> > > @@ -532,7 +532,7 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page, >> > > /* >> > > * Ensure that the address returned is DMA''ble >> > > */ >> > > - if (!dma_capable(dev, dev_addr, size)) { >> > > + if (dev->dma_mask && !dma_capable(dev, dev_addr, size)) { >> > > swiotlb_tbl_unmap_single(dev, map, size, dir); >> > > dev_addr = 0; >> > > } >> > > @@ -660,7 +660,7 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, >> > > >> > > if (swiotlb_force || >> > > xen_feature(XENFEAT_auto_translated_physmap) || >> > > - !dma_capable(hwdev, dev_addr, sg->length) || >> > > + (hwdev->dma_mask && !dma_capable(hwdev, dev_addr, sg->length)) || >> > > range_straddles_page_boundary(paddr, sg->length)) { >> > > /* >> > > * Pass the dma_addr of the first slab in the iotlb buffer as >> > > -- >> > > 1.7.2.5 >> > > >> > > > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Konrad Rzeszutek Wilk
2013-Oct-04 13:23 UTC
Re: [PATCH v6 12/19] swiotlb: don''t assume that io_tlb_start-io_tlb_end is coherent
On Wed, Oct 02, 2013 at 06:31:57PM +0100, Stefano Stabellini wrote:> On Mon, 30 Sep 2013, Konrad Rzeszutek Wilk wrote: > > On Fri, Sep 27, 2013 at 05:10:00PM +0100, Stefano Stabellini wrote: > > > The swiotlb code has appropriate calls to dma_mark_clean in place for > > > buffers passed to swiotlb_map_page as an argument. However it assumes > > > that the swiotlb bounce buffer (io_tlb_start-io_tlb_end) is already > > > coherent and doesn''t need any calls to dma_mark_clean. > > > > > > On ARM the swiotlb bounce buffer is not coherent (the memory is > > > writealloc while it should be bufferable) and therefore we need to call > > > dma_mark_clean appropriately on the bounce buffer code paths too. > > > > > > Note that most architecures have an empty dma_mark_clean implementation > > > anyway. > > > > The other architecture that uses swiotlb is IA64 and that does have > > an implementation where it touches on page attributes. > > > > Which means I have to figure out why my HP zx6000 won''t boot with 3.11 now :-( > > > > Now this is a very thorny issue. > > Honestly I don''t like the dma_mark_clean interface very much: it''s one > big hammer, when we actually need some finesse to handle coherency. > > For example on ARM some devices might not need the dma_mark_clean call, > while others do. Calling it all the times is at the very best > inefficient and incorrect at worst. > > I am thinking of calling the original map/unmap_page functions instead > (arm_dma_map_page or arm_coherent_dma_map_page in the arm case). > However in order to do that I would need to add more __get_dma_ops calls in > both lib/swiotlb.c and drivers/xen/swiotlb-xen.cI think that is OK for the Xen-SWIOTLB case. For the lib/swiotlb - would that mean that non-Xen-ARM would use the SWIOTLB? If so, I am OK with that too.> > > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > > --- > > > lib/swiotlb.c | 13 ++++++++++--- > > > 1 files changed, 10 insertions(+), 3 deletions(-) > > > > > > diff --git a/lib/swiotlb.c b/lib/swiotlb.c > > > index 4e8686c..eb45d17 100644 > > > --- a/lib/swiotlb.c > > > +++ b/lib/swiotlb.c > > > @@ -515,6 +515,7 @@ found: > > > io_tlb_orig_addr[index+i] = orig_addr + (i << IO_TLB_SHIFT); > > > if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL) > > > swiotlb_bounce(orig_addr, tlb_addr, size, DMA_TO_DEVICE); > > > + dma_mark_clean(phys_to_virt(tlb_addr), size); > > > > > > return tlb_addr; > > > } > > > @@ -547,7 +548,10 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr, > > > * First, sync the memory before unmapping the entry > > > */ > > > if (orig_addr && ((dir == DMA_FROM_DEVICE) || (dir == DMA_BIDIRECTIONAL))) > > > + { > > > + dma_mark_clean(phys_to_virt(tlb_addr), size); > > > swiotlb_bounce(orig_addr, tlb_addr, size, DMA_FROM_DEVICE); > > > + } > > > > > > /* > > > * Return the buffer to the free list by setting the corresponding > > > @@ -587,17 +591,20 @@ void swiotlb_tbl_sync_single(struct device *hwdev, phys_addr_t tlb_addr, > > > > > > switch (target) { > > > case SYNC_FOR_CPU: > > > - if (likely(dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL)) > > > + if (likely(dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL)) { > > > + dma_mark_clean(phys_to_virt(tlb_addr), size); > > > swiotlb_bounce(orig_addr, tlb_addr, > > > size, DMA_FROM_DEVICE); > > > + } > > > else > > > BUG_ON(dir != DMA_TO_DEVICE); > > > break; > > > case SYNC_FOR_DEVICE: > > > - if (likely(dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)) > > > + if (likely(dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)) { > > > swiotlb_bounce(orig_addr, tlb_addr, > > > size, DMA_TO_DEVICE); > > > - else > > > + dma_mark_clean(phys_to_virt(tlb_addr), size); > > > + } else > > > BUG_ON(dir != DMA_FROM_DEVICE); > > > break; > > > default: > > > -- > > > 1.7.2.5 > > > > >
Stefano Stabellini
2013-Oct-09 17:25 UTC
Re: [PATCH v6 18/19] swiotlb-xen: introduce a rbtree to track phys to bus mappings
On Mon, 30 Sep 2013, Konrad Rzeszutek Wilk wrote:> On Fri, Sep 27, 2013 at 05:10:06PM +0100, Stefano Stabellini wrote: > > Introduce a second red-back tree to track phys to bus mappings created after > > the initialization of the swiotlb buffer. > > Could you explain the use case a bit more please? > > As in: > a) why is this needed > b) why are we using the rb tree instead of something else (say FIFO queue) > c) how long are these in usage? > d) do we need locking now?Given that I was rebuilding a P2M for autotranslate guests in swiotlb-xen, making the code much harder to read, I decided to go the alternative route of providing a real P2M to ARM guest. The next swiotlb series will introduce arch/arm/xen/p2m.c, moving the two rbtrees there. As a consequence we can start using swiotlb-xen on ARM with minimal changes. The whole thing looks much nicer now.> > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > --- > > drivers/xen/swiotlb-xen.c | 99 +++++++++++++++++++++++++++++++++++++------- > > 1 files changed, 83 insertions(+), 16 deletions(-) > > > > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c > > index 3011736..022bcaf 100644 > > --- a/drivers/xen/swiotlb-xen.c > > +++ b/drivers/xen/swiotlb-xen.c > > @@ -79,7 +79,8 @@ struct xen_dma_info { > > dma_addr_t dma_addr; > > phys_addr_t phys_addr; > > size_t size; > > - struct rb_node rbnode; > > + struct rb_node rbnode_dma; > > + struct rb_node rbnode_phys; > > }; > > > > /* > > @@ -96,8 +97,13 @@ static struct xen_dma_info *xen_dma_seg; > > * mappings. > > */ > > static struct rb_root bus_to_phys = RB_ROOT; > > +/* > > + * This tree keeps track of physical address to bus address > > + * mappings apart from the ones belonging to the initial swiotlb buffer. > > + */ > > +static struct rb_root phys_to_bus = RB_ROOT; > > > > -static int xen_dma_add_entry(struct xen_dma_info *new) > > +static int xen_dma_add_entry_bus(struct xen_dma_info *new) > > { > > struct rb_node **link = &bus_to_phys.rb_node; > > struct rb_node *parent = NULL; > > @@ -106,7 +112,7 @@ static int xen_dma_add_entry(struct xen_dma_info *new) > > > > while (*link) { > > parent = *link; > > - entry = rb_entry(parent, struct xen_dma_info, rbnode); > > + entry = rb_entry(parent, struct xen_dma_info, rbnode_dma); > > > > if (new->dma_addr == entry->dma_addr) > > goto err_out; > > @@ -118,8 +124,41 @@ static int xen_dma_add_entry(struct xen_dma_info *new) > > else > > link = &(*link)->rb_right; > > } > > - rb_link_node(&new->rbnode, parent, link); > > - rb_insert_color(&new->rbnode, &bus_to_phys); > > + rb_link_node(&new->rbnode_dma, parent, link); > > + rb_insert_color(&new->rbnode_dma, &bus_to_phys); > > + goto out; > > + > > +err_out: > > + rc = -EINVAL; > > + pr_warn("%s: cannot add phys=%pa -> dma=%pa: phys=%pa -> dma=%pa already exists\n", > > + __func__, &new->phys_addr, &new->dma_addr, &entry->phys_addr, &entry->dma_addr); > > +out: > > + return rc; > > +} > > + > > +static int xen_dma_add_entry_phys(struct xen_dma_info *new) > > +{ > > + struct rb_node **link = &phys_to_bus.rb_node; > > + struct rb_node *parent = NULL; > > + struct xen_dma_info *entry; > > + int rc = 0; > > + > > + while (*link) { > > + parent = *link; > > + entry = rb_entry(parent, struct xen_dma_info, rbnode_phys); > > + > > + if (new->dma_addr == entry->dma_addr) > > + goto err_out; > > + if (new->phys_addr == entry->phys_addr) > > + goto err_out; > > + > > + if (new->phys_addr < entry->phys_addr) > > + link = &(*link)->rb_left; > > + else > > + link = &(*link)->rb_right; > > + } > > + rb_link_node(&new->rbnode_phys, parent, link); > > + rb_insert_color(&new->rbnode_phys, &phys_to_bus); > > goto out; > > > > err_out: > > @@ -130,13 +169,22 @@ out: > > return rc; > > } > > > > +static int xen_dma_add_entry(struct xen_dma_info *new) > > +{ > > + int rc; > > + if ((rc = xen_dma_add_entry_bus(new) < 0) || > > + (rc = xen_dma_add_entry_phys(new) < 0)) > > Please don''t do that. Just do > > rc = xen_dma_add_entry(bus); > if (rc) > return rc; > rc = xen_dma_add_entry_phys(new); > if (rc) { > // unwind it somehow? <<== This is important :-) > } > return rc; > > > > + return rc; > > + return 0; > > +} > > + > > static struct xen_dma_info *xen_get_dma_info_from_dma(dma_addr_t dma_addr) > > { > > struct rb_node *n = bus_to_phys.rb_node; > > struct xen_dma_info *entry; > > > > while (n) { > > - entry = rb_entry(n, struct xen_dma_info, rbnode); > > + entry = rb_entry(n, struct xen_dma_info, rbnode_dma); > > if (entry->dma_addr <= dma_addr && > > entry->dma_addr + entry->size > dma_addr) { > > return entry; > > @@ -150,11 +198,30 @@ static struct xen_dma_info *xen_get_dma_info_from_dma(dma_addr_t dma_addr) > > return NULL; > > } > > > > -static dma_addr_t xen_phys_to_bus(phys_addr_t paddr) > > +static struct xen_dma_info *xen_get_dma_info_from_phys(phys_addr_t phys) > > { > > - int nr_seg; > > - unsigned long offset; > > - char *vaddr; > > + struct rb_node *n = phys_to_bus.rb_node; > > + struct xen_dma_info *entry; > > + > > + while (n) { > > + entry = rb_entry(n, struct xen_dma_info, rbnode_phys); > > + if (entry->phys_addr <= phys && > > + entry->phys_addr + entry->size > phys) { > > + return entry; > > + } > > + if (phys < entry->phys_addr) > > + n = n->rb_left; > > + else > > + n = n->rb_right; > > + } > > + > > + return NULL; > > +} > > + > > +/* Only looks into the initial buffer allocation in case of > > + * XENFEAT_auto_translated_physmap guests. */ > > +static dma_addr_t xen_phys_to_bus_quick(phys_addr_t paddr) { int nr_seg; > > + unsigned long offset; char *vaddr; > > > > if (!xen_feature(XENFEAT_auto_translated_physmap)) > > return phys_to_machine(XPADDR(paddr)).maddr; > > @@ -184,7 +251,7 @@ static phys_addr_t xen_bus_to_phys(dma_addr_t baddr) > > > > static dma_addr_t xen_virt_to_bus(void *address) > > { > > - return xen_phys_to_bus(virt_to_phys(address)); > > + return xen_phys_to_bus_quick(virt_to_phys(address)); > > } > > > > static int check_pages_physically_contiguous(unsigned long pfn, > > @@ -424,7 +491,7 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size, > > * Do not use virt_to_phys(ret) because on ARM it doesn''t correspond > > * to *dma_handle. */ > > phys = *dma_handle; > > - dev_addr = xen_phys_to_bus(phys); > > + dev_addr = xen_phys_to_bus_quick(phys); > > if (!xen_feature(XENFEAT_auto_translated_physmap) && > > ((dev_addr + size - 1 <= dma_mask)) && > > !range_straddles_page_boundary(phys, size)) > > @@ -503,7 +570,7 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page, > > struct dma_attrs *attrs) > > { > > phys_addr_t map, phys = page_to_phys(page) + offset; > > - dma_addr_t dev_addr = xen_phys_to_bus(phys); > > + dma_addr_t dev_addr = xen_phys_to_bus_quick(phys); > > > > BUG_ON(dir == DMA_NONE); > > /* > > @@ -527,7 +594,7 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page, > > if (map == SWIOTLB_MAP_ERROR) > > return DMA_ERROR_CODE; > > > > - dev_addr = xen_phys_to_bus(map); > > + dev_addr = xen_phys_to_bus_quick(map); > > > > /* > > * Ensure that the address returned is DMA''ble > > @@ -656,7 +723,7 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, > > > > for_each_sg(sgl, sg, nelems, i) { > > phys_addr_t paddr = sg_phys(sg); > > - dma_addr_t dev_addr = xen_phys_to_bus(paddr); > > + dma_addr_t dev_addr = xen_phys_to_bus_quick(paddr); > > > > if (swiotlb_force || > > xen_feature(XENFEAT_auto_translated_physmap) || > > @@ -682,7 +749,7 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, > > sg_dma_len(sgl) = 0; > > return DMA_ERROR_CODE; > > } > > - sg->dma_address = xen_phys_to_bus(map); > > + sg->dma_address = xen_phys_to_bus_quick(map); > > } else > > sg->dma_address = dev_addr; > > sg_dma_len(sg) = sg->length; > > -- > > 1.7.2.5 > > >
Stefano Stabellini
2013-Oct-09 17:27 UTC
Re: [PATCH v6 19/19] swiotlb-xen: instead of bouncing on the swiotlb, pin single pages
On Mon, 30 Sep 2013, Konrad Rzeszutek Wilk wrote:> On Fri, Sep 27, 2013 at 05:10:07PM +0100, Stefano Stabellini wrote: > > If we are dealing with single page mappings that don''t cross page > > boundaries, we can try to pin the page and get the corresponding mfn, > > using xen_pin_page. This avoids going through the swiotlb bounce > > buffer. If xen_pin_page fails (because the underlying mfn doesn''t > > respect the dma_mask) fall back to the swiotlb bounce buffer. > > Add a ref count to xen_dma_info, so that we can avoid pinnig pages that > > are already pinned. > > Use a spinlock to protect accesses, insertions and deletions in the > > rbtrees. > > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>Thanks for the review, however I am dropping this patch because it wasn''t improving performances as I was hoping it would be. I am taking a new approach now: I am keeping the 1:1 physical to machine mapping for dom0 and using swiotlb-xen only to handle dma requests involving foreign grants. The code is much nicer, and it runs much faster.> > drivers/xen/swiotlb-xen.c | 152 ++++++++++++++++++++++++++++++++++++++++++--- > > 1 files changed, 143 insertions(+), 9 deletions(-) > > > > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c > > index 022bcaf..6f94285 100644 > > --- a/drivers/xen/swiotlb-xen.c > > +++ b/drivers/xen/swiotlb-xen.c > > @@ -57,6 +57,8 @@ > > #define NR_DMA_SEGS ((xen_io_tlb_nslabs + IO_TLB_SEGSIZE - 1) / IO_TLB_SEGSIZE) > > static char *xen_io_tlb_start, *xen_io_tlb_end; > > static unsigned long xen_io_tlb_nslabs; > > +spinlock_t swiotlb_lock; > > + > > /* > > * Quick lookup value of the bus address of the IOTLB. > > */ > > @@ -79,6 +81,7 @@ struct xen_dma_info { > > dma_addr_t dma_addr; > > phys_addr_t phys_addr; > > size_t size; > > + atomic_t refs; > > struct rb_node rbnode_dma; > > struct rb_node rbnode_phys; > > }; > > @@ -254,6 +257,48 @@ static dma_addr_t xen_virt_to_bus(void *address) > > return xen_phys_to_bus_quick(virt_to_phys(address)); > > } > > > > +static int xen_pin_dev_page(struct device *dev, > > + phys_addr_t phys, > > + dma_addr_t *dev_addr) > > Something is odd with your tabs. > > +{ > > + u64 dma_mask = DMA_BIT_MASK(32); > > Why 32? > > > + xen_pfn_t in; > > + struct xen_dma_info *dma_info = xen_get_dma_info_from_phys(phys); > > + > > + if (dma_info != NULL) { > > + atomic_inc(&dma_info->refs); > > + *dev_addr = dma_info->dma_addr + (phys - dma_info->phys_addr); > > + return 0; > > + } > > + > > + if (dev && dev->coherent_dma_mask) > > + dma_mask = dma_alloc_coherent_mask(dev, GFP_KERNEL); > > + > > + in = phys >> PAGE_SHIFT; > > + if (!xen_pin_page(&in, fls64(dma_mask))) { > > Why not just make xen_pin_page use an phys address and it can also > do the appropiate bit shifting in it? > > > + *dev_addr = in << PAGE_SHIFT; > > + dma_info = kzalloc(sizeof(struct xen_dma_info), GFP_NOWAIT); > > + if (!dma_info) { > > + pr_warn("cannot allocate xen_dma_info\n"); > > + xen_destroy_contiguous_region(phys & PAGE_MASK, 0); > > Perhaps we should add an inline function for that called ''xen_unpin_page'' ? > > > + return -ENOMEM; > > + } > > + dma_info->phys_addr = phys & PAGE_MASK; > > + dma_info->size = PAGE_SIZE; > > + dma_info->dma_addr = *dev_addr; > > + if (xen_dma_add_entry(dma_info)) { > > + pr_warn("cannot add new entry to bus_to_phys\n"); > > + xen_destroy_contiguous_region(phys & PAGE_MASK, 0); > > + kfree(dma_info); > > + return -EFAULT; > > + } > > + atomic_set(&dma_info->refs, 1); > > + *dev_addr += (phys & ~PAGE_MASK); > > + return 0; > > + } > > Don''t you want to the opposite of dma_alloc_coherent_mask ? > > > + return -EFAULT; > > +} > > + > > static int check_pages_physically_contiguous(unsigned long pfn, > > unsigned int offset, > > size_t length) > > @@ -434,6 +479,7 @@ retry: > > rc = 0; > > } else > > rc = swiotlb_late_init_with_tbl(xen_io_tlb_start, xen_io_tlb_nslabs); > > + spin_lock_init(&swiotlb_lock); > > return rc; > > error: > > if (repeat--) { > > @@ -461,6 +507,7 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size, > > phys_addr_t phys; > > dma_addr_t dev_addr; > > struct xen_dma_info *dma_info = NULL; > > + unsigned long irqflags; > > > > /* > > * Ignore region specifiers - the kernel''s ideas of > > @@ -497,7 +544,7 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size, > > !range_straddles_page_boundary(phys, size)) > > *dma_handle = dev_addr; > > else { > > - if (xen_create_contiguous_region(phys, order, > > + if (xen_create_contiguous_region(phys & PAGE_MASK, order, > > fls64(dma_mask), dma_handle) != 0) { > > xen_free_coherent_pages(hwdev, size, ret, (dma_addr_t)phys, attrs); > > return NULL; > > @@ -509,15 +556,19 @@ xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size, > > xen_destroy_contiguous_region(phys, order); > > return NULL; > > } > > - dma_info->phys_addr = phys; > > - dma_info->size = size; > > + dma_info->phys_addr = phys & PAGE_MASK; > > + dma_info->size = (1U << order) << PAGE_SHIFT; > > dma_info->dma_addr = *dma_handle; > > + atomic_set(&dma_info->refs, 1); > > + spin_lock_irqsave(&swiotlb_lock, irqflags); > > if (xen_dma_add_entry(dma_info)) { > > + spin_unlock_irqrestore(&swiotlb_lock, irqflags); > > pr_warn("cannot add new entry to bus_to_phys\n"); > > xen_destroy_contiguous_region(phys, order); > > kfree(dma_info); > > return NULL; > > } > > + spin_unlock_irqrestore(&swiotlb_lock, irqflags); > > } > > memset(ret, 0, size); > > return ret; > > @@ -532,6 +583,7 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr, > > phys_addr_t phys; > > u64 dma_mask = DMA_BIT_MASK(32); > > struct xen_dma_info *dma_info = NULL; > > + unsigned long flags; > > > > if (dma_release_from_coherent(hwdev, order, vaddr)) > > return; > > @@ -539,6 +591,7 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr, > > if (hwdev && hwdev->coherent_dma_mask) > > dma_mask = hwdev->coherent_dma_mask; > > > > + spin_lock_irqsave(&swiotlb_lock, flags); > > /* do not use virt_to_phys because on ARM it doesn''t return you the > > * physical address */ > > phys = xen_bus_to_phys(dev_addr); > > @@ -546,12 +599,16 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr, > > if (xen_feature(XENFEAT_auto_translated_physmap) || > > (((dev_addr + size - 1 > dma_mask)) || > > range_straddles_page_boundary(phys, size))) { > > - xen_destroy_contiguous_region(phys, order); > > dma_info = xen_get_dma_info_from_dma(dev_addr); > > - rb_erase(&dma_info->rbnode, &bus_to_phys); > > - kfree(dma_info); > > + if (atomic_dec_and_test(&dma_info->refs)) { > > + xen_destroy_contiguous_region(phys & PAGE_MASK, order); > > + rb_erase(&dma_info->rbnode_dma, &bus_to_phys); > > + rb_erase(&dma_info->rbnode_phys, &phys_to_bus); > > + kfree(dma_info); > > + } > > If xen_pin_dev_page failed or was not called we would still end up > calling this. And we would decrement a potentially garbage value? Or not. > > } > > > > + spin_unlock_irqrestore(&swiotlb_lock, flags); > > xen_free_coherent_pages(hwdev, size, vaddr, (dma_addr_t)phys, attrs); > > } > > EXPORT_SYMBOL_GPL(xen_swiotlb_free_coherent); > > @@ -583,6 +640,23 @@ dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page, > > !range_straddles_page_boundary(phys, size) && !swiotlb_force) > > return dev_addr; > > > > + if (xen_feature(XENFEAT_auto_translated_physmap) && > > + size <= PAGE_SIZE && > > + !range_straddles_page_boundary(phys, size) && > > + !swiotlb_force) { > > + unsigned long flags; > > + int rc; > > + > > + spin_lock_irqsave(&swiotlb_lock, flags); > > + rc = xen_pin_dev_page(dev, phys, &dev_addr); > > + spin_unlock_irqrestore(&swiotlb_lock, flags); > > + > > + if (!rc) { > > + dma_mark_clean(phys_to_virt(phys), size); > > + return dev_addr; > > + } > > And if there is an rc you should probably do > dev_warn(.., "RC ..") > > > But more importantly - all of this code adds an extra lock on the X86 side > which will get -ENOxxx on the xen_pin_dev_page. > > I am wondering if it makes sense to make most of this code dependent > on CONFIG_ARM? As the check for auto-xlat falls flat on X86 + PVH. Thought > I have no idea what we want to do with PVH and X86 at this point. > > > + } > > + > > /* > > * Oh well, have to allocate and map a bounce buffer. > > * Pass the dma_addr of the first slab in the iotlb buffer as > > @@ -618,10 +692,37 @@ EXPORT_SYMBOL_GPL(xen_swiotlb_map_page); > > static void xen_unmap_single(struct device *hwdev, dma_addr_t dev_addr, > > size_t size, enum dma_data_direction dir) > > { > > - phys_addr_t paddr = xen_bus_to_phys(dev_addr); > > + struct xen_dma_info *dma_info; > > + phys_addr_t paddr = DMA_ERROR_CODE; > > + char *vaddr = NULL; > > + unsigned long flags; > > > > BUG_ON(dir == DMA_NONE); > > > > + spin_lock_irqsave(&swiotlb_lock, flags); > > + dma_info = xen_get_dma_info_from_dma(dev_addr); > > + if (dma_info != NULL) { > > + paddr = dma_info->phys_addr + (dev_addr - dma_info->dma_addr); > > + vaddr = phys_to_virt(paddr); > > + } > > + > > + if (xen_feature(XENFEAT_auto_translated_physmap) && > > + paddr != DMA_ERROR_CODE && > > + !(vaddr >= xen_io_tlb_start && vaddr < xen_io_tlb_end) && > > + !swiotlb_force) { > > + if (atomic_dec_and_test(&dma_info->refs)) { > > + xen_destroy_contiguous_region(paddr & PAGE_MASK, 0); > > + rb_erase(&dma_info->rbnode_dma, &bus_to_phys); > > + rb_erase(&dma_info->rbnode_phys, &phys_to_bus); > > + kfree(dma_info); > > + } > > + spin_unlock_irqrestore(&swiotlb_lock, flags); > > + if ((dir == DMA_FROM_DEVICE) || (dir == DMA_BIDIRECTIONAL)) > > + dma_mark_clean(vaddr, size); > > + return; > > + } > > + spin_unlock_irqrestore(&swiotlb_lock, flags); > > + > > /* NOTE: We use dev_addr here, not paddr! */ > > if (is_xen_swiotlb_buffer(dev_addr)) { > > swiotlb_tbl_unmap_single(hwdev, paddr, size, dir); > > @@ -664,9 +765,19 @@ xen_swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr, > > enum dma_sync_target target) > > { > > phys_addr_t paddr = xen_bus_to_phys(dev_addr); > > + char *vaddr = phys_to_virt(paddr); > > > > BUG_ON(dir == DMA_NONE); > > > > + if (xen_feature(XENFEAT_auto_translated_physmap) && > > + paddr != DMA_ERROR_CODE && > > + size <= PAGE_SIZE && > > + !(vaddr >= xen_io_tlb_start && vaddr < xen_io_tlb_end) && > > + !range_straddles_page_boundary(paddr, size) && !swiotlb_force) { > > + dma_mark_clean(vaddr, size); > > + return; > > + } > > + > > /* NOTE: We use dev_addr here, not paddr! */ > > if (is_xen_swiotlb_buffer(dev_addr)) { > > swiotlb_tbl_sync_single(hwdev, paddr, size, dir, target); > > @@ -717,13 +828,36 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, > > struct dma_attrs *attrs) > > { > > struct scatterlist *sg; > > - int i; > > + int i, rc; > > + u64 dma_mask = DMA_BIT_MASK(32); > > + unsigned long flags; > > > > BUG_ON(dir == DMA_NONE); > > > > + if (hwdev && hwdev->coherent_dma_mask) > > + dma_mask = dma_alloc_coherent_mask(hwdev, GFP_KERNEL); > > + > > for_each_sg(sgl, sg, nelems, i) { > > phys_addr_t paddr = sg_phys(sg); > > - dma_addr_t dev_addr = xen_phys_to_bus_quick(paddr); > > + dma_addr_t dev_addr; > > + > > + if (xen_feature(XENFEAT_auto_translated_physmap) && > > + !range_straddles_page_boundary(paddr, sg->length) && > > + sg->length <= PAGE_SIZE && > > + !swiotlb_force) { > > + > > + spin_lock_irqsave(&swiotlb_lock, flags); > > + rc = xen_pin_dev_page(hwdev, paddr, &dev_addr); > > + spin_unlock_irqrestore(&swiotlb_lock, flags); > > + > > + if (!rc) { > > + dma_mark_clean(phys_to_virt(paddr), sg->length); > > + sg_dma_len(sg) = sg->length; > > + sg->dma_address = dev_addr; > > + continue; > > + } > > + } > > + dev_addr = xen_phys_to_bus_quick(paddr); > > > > if (swiotlb_force || > > xen_feature(XENFEAT_auto_translated_physmap) || > > -- > > 1.7.2.5 > > >
Stefano Stabellini
2013-Oct-09 17:31 UTC
Re: [PATCH v6 14/19] swiotlb: print a warning when the swiotlb is full
On Mon, 30 Sep 2013, Konrad Rzeszutek Wilk wrote:> On Fri, Sep 27, 2013 at 05:10:02PM +0100, Stefano Stabellini wrote: > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > --- > > drivers/xen/swiotlb-xen.c | 1 + > > lib/swiotlb.c | 1 + > > 2 files changed, 2 insertions(+), 0 deletions(-) > > > > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c > > index 96ad316..790c2eb 100644 > > --- a/drivers/xen/swiotlb-xen.c > > +++ b/drivers/xen/swiotlb-xen.c > > @@ -674,6 +674,7 @@ xen_swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, > > sg->length, > > dir); > > if (map == SWIOTLB_MAP_ERROR) { > > + pr_warn("swiotlb buffer is full\n"); > > It would be beneficial to use dev_warn instead.OK> And perhaps even call debug_dma_dump_mappings to help in diagnosing > a problem?I don''t think that would be very beneficial because this is an internal swiotlb issue: we just run out of buffer space.> > /* Don''t panic here, we expect map_sg users > > to do proper error handling. */ > > xen_swiotlb_unmap_sg_attrs(hwdev, sgl, i, dir, > > diff --git a/lib/swiotlb.c b/lib/swiotlb.c > > index eb45d17..f06da0d 100644 > > --- a/lib/swiotlb.c > > +++ b/lib/swiotlb.c > > @@ -502,6 +502,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, > > > > not_found: > > spin_unlock_irqrestore(&io_tlb_lock, flags); > > + pr_warn("swiotlb buffer is full\n"); > > return SWIOTLB_MAP_ERROR; > > found: > > spin_unlock_irqrestore(&io_tlb_lock, flags); > > -- > > 1.7.2.5 > > >
Stefano Stabellini
2013-Oct-09 17:54 UTC
Re: [PATCH v6 16/19] arm,arm64: do not always merge biovec if we are running on Xen
On Mon, 30 Sep 2013, Konrad Rzeszutek Wilk wrote:> On Fri, Sep 27, 2013 at 05:10:04PM +0100, Stefano Stabellini wrote: > > This is similar to what it is done on X86: biovecs are prevented from merging > > otherwise every dma requests would be forced to bounce on the swiotlb buffer. > > > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > > This also alters the generic (well, was x86, now it is generic) code. > Perhaps that change should be made in a seperate patch and explain > what the autotranslate extra check will do for PV, PVH and HVM guests? > > Wait, does that mean we had not been merging requests on HVM guests? > Is that good or bad?With the new introduction of a p2m for xen on arm and arm64, I don''t need the change to drivers/xen/biomerge.c anymore. No more autotranslate extra check.> > --- > > arch/arm/include/asm/io.h | 8 ++++++++ > > arch/arm64/include/asm/io.h | 9 +++++++++ > > drivers/xen/biomerge.c | 4 +++- > > 3 files changed, 20 insertions(+), 1 deletions(-) > > > > diff --git a/arch/arm/include/asm/io.h b/arch/arm/include/asm/io.h > > index d070741..c45effc 100644 > > --- a/arch/arm/include/asm/io.h > > +++ b/arch/arm/include/asm/io.h > > @@ -24,9 +24,11 @@ > > #ifdef __KERNEL__ > > > > #include <linux/types.h> > > +#include <linux/blk_types.h> > > #include <asm/byteorder.h> > > #include <asm/memory.h> > > #include <asm-generic/pci_iomap.h> > > +#include <xen/xen.h> > > > > /* > > * ISA I/O bus memory addresses are 1:1 with the physical address. > > @@ -372,6 +374,12 @@ extern void pci_iounmap(struct pci_dev *dev, void __iomem *addr); > > #define BIOVEC_MERGEABLE(vec1, vec2) \ > > ((bvec_to_phys((vec1)) + (vec1)->bv_len) == bvec_to_phys((vec2))) > > > > +extern bool xen_biovec_phys_mergeable(const struct bio_vec *vec1, > > + const struct bio_vec *vec2); > > +#define BIOVEC_PHYS_MERGEABLE(vec1, vec2) \ > > + (__BIOVEC_PHYS_MERGEABLE(vec1, vec2) && \ > > + (!xen_domain() || xen_biovec_phys_mergeable(vec1, vec2))) > > + > > #ifdef CONFIG_MMU > > #define ARCH_HAS_VALID_PHYS_ADDR_RANGE > > extern int valid_phys_addr_range(phys_addr_t addr, size_t size); > > diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h > > index 1d12f89..c163287b 100644 > > --- a/arch/arm64/include/asm/io.h > > +++ b/arch/arm64/include/asm/io.h > > @@ -22,11 +22,14 @@ > > #ifdef __KERNEL__ > > > > #include <linux/types.h> > > +#include <linux/blk_types.h> > > > > #include <asm/byteorder.h> > > #include <asm/barrier.h> > > #include <asm/pgtable.h> > > > > +#include <xen/xen.h> > > + > > /* > > * Generic IO read/write. These perform native-endian accesses. > > */ > > @@ -263,5 +266,11 @@ extern int devmem_is_allowed(unsigned long pfn); > > */ > > #define xlate_dev_kmem_ptr(p) p > > > > +extern bool xen_biovec_phys_mergeable(const struct bio_vec *vec1, > > + const struct bio_vec *vec2); > > +#define BIOVEC_PHYS_MERGEABLE(vec1, vec2) \ > > + (__BIOVEC_PHYS_MERGEABLE(vec1, vec2) && \ > > + (!xen_domain() || xen_biovec_phys_mergeable(vec1, vec2))) > > + > > #endif /* __KERNEL__ */ > > #endif /* __ASM_IO_H */ > > diff --git a/drivers/xen/biomerge.c b/drivers/xen/biomerge.c > > index 0edb91c..f323a2d 100644 > > --- a/drivers/xen/biomerge.c > > +++ b/drivers/xen/biomerge.c > > @@ -2,6 +2,7 @@ > > #include <linux/io.h> > > #include <linux/export.h> > > #include <xen/page.h> > > +#include <xen/features.h> > > > > bool xen_biovec_phys_mergeable(const struct bio_vec *vec1, > > const struct bio_vec *vec2) > > @@ -9,7 +10,8 @@ bool xen_biovec_phys_mergeable(const struct bio_vec *vec1, > > unsigned long mfn1 = pfn_to_mfn(page_to_pfn(vec1->bv_page)); > > unsigned long mfn2 = pfn_to_mfn(page_to_pfn(vec2->bv_page)); > > > > - return __BIOVEC_PHYS_MERGEABLE(vec1, vec2) && > > + return !xen_feature(XENFEAT_auto_translated_physmap) && > > + __BIOVEC_PHYS_MERGEABLE(vec1, vec2) && > > ((mfn1 == mfn2) || ((mfn1+1) == mfn2)); > > } > > EXPORT_SYMBOL(xen_biovec_phys_mergeable);