Fujita-san et al.,

Attached is a set of fifteen RFC patches that separate the address translation functions (virt_to_phys, virt_to_bus, etc.) from the SWIOTLB library. The idea behind this set of patches is to make it possible to have separate mechanisms for translating virtual to physical or virtual to DMA addresses on platforms which need a SWIOTLB, and where physical != PCI bus address.

One customer of this is the pv-ops project, which can switch between different modes of operation depending on which environment it is running in. One of the usage models is PCI pass-through in a virtualized environment, where an IOMMU is required since the Linux kernel's idea of a physical address is not the real physical address (worse yet, PFNs that look like they are under 4GB could actually be pointing above 4GB, which presents an interesting set of bugs). The pv-ops kernel provides a set of address translation mechanisms to translate physical frame numbers (PFNs) to real-physical frame numbers (Machine Frame Numbers) and vice versa.

For the IOMMU, one solution has been to wholesale copy the SWIOTLB, stick it in arch/x86/xen/swiotlb.c, and modify virt_to_phys, phys_to_virt and others to use the Xen address translation functions. Unfortunately, since the kernel can run on bare metal, there is a large code overlap with the real SWIOTLB.
(git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen.git xen/dom0/swiotlb-new)

Another approach, which this set of patches explores, is to abstract the address translation and address determination functions away from the SWIOTLB book-keeping functions. This way the core SWIOTLB library functions are present in one place, while the address-related functions are in a separate library for the different run-time platforms.

I would very much appreciate input on this idea and the set of patches.

The set of fifteen patches is also accessible at:

git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb-2.6.git swiotlb-rfc-0.2

An example of how this can be utilized in both bare-metal and Xen environments is this git tree:

git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git swiotlb-xen-0.2

Sincerely,

Konrad Rzeszutek Wilk

 include/linux/swiotlb.h |   97 ++++++++++++
 lib/Makefile            |    2 +-
 lib/swiotlb-default.c   |  263 ++++++++++++++++++++++++++++++++
 lib/swiotlb.c           |  389 ++++++++++++++---------------------------------
 4 files changed, 477 insertions(+), 274 deletions(-)
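As a rough illustration of what the abstraction is meant to enable (a sketch only, not part of the series: every xen_* name below is a hypothetical placeholder, and the real implementation lives in the swiotlb-xen-0.2 tree above), a virtualized platform would supply its own translation hooks and register them with the core library:

    /*
     * Sketch under the assumption that Xen's pfn_to_mfn()/mfn_to_pfn()
     * helpers are available; all xen_* symbols here are hypothetical.
     */
    static dma_addr_t xen_phys_to_bus(struct device *hwdev, phys_addr_t paddr)
    {
            /* pseudo-physical frame (PFN) -> machine frame (MFN) */
            return ((dma_addr_t)pfn_to_mfn(paddr >> PAGE_SHIFT) << PAGE_SHIFT) |
                   (paddr & ~PAGE_MASK);
    }

    static phys_addr_t xen_bus_to_phys(struct device *hwdev, dma_addr_t baddr)
    {
            /* machine frame (MFN) -> pseudo-physical frame (PFN) */
            return ((phys_addr_t)mfn_to_pfn(baddr >> PAGE_SHIFT) << PAGE_SHIFT) |
                   (baddr & ~PAGE_MASK);
    }

    static struct swiotlb_engine xen_swiotlb_ops = {
            .name              = "Xen software IO TLB",
            .overflow          = 32 * 1024,
            .release           = xen_swiotlb_release,      /* hypothetical */
            .is_swiotlb_buffer = xen_is_swiotlb_buffer,    /* hypothetical */
            .dma_capable       = xen_dma_capable,          /* hypothetical */
            .phys_to_bus       = xen_phys_to_bus,
            .bus_to_phys       = xen_bus_to_phys,
            .virt_to_bus       = xen_virt_to_bus,          /* hypothetical */
            .bus_to_virt       = xen_bus_to_virt,          /* hypothetical */
    };

    /* from platform setup code, before any DMA mappings are created */
    swiotlb_register_engine(&xen_swiotlb_ops);

The core bounce-buffer book-keeping in lib/swiotlb.c then goes through these hooks instead of calling virt_to_phys()/phys_to_virt() directly.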
Konrad Rzeszutek Wilk
2010-Jan-14 23:00 UTC
[Xen-devel] [PATCH 01/15] [swiotlb] fix: Update 'setup_io_tlb_npages' to accept both arguments in either order.
Before this patch, if you specified 'swiotlb=force,1024' it would ignore
both arguments. This fixes it and allows the user to specify them in
either order (or none at all).

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 lib/swiotlb.c |   20 +++++++++++---------
 1 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/lib/swiotlb.c b/lib/swiotlb.c
index 437eedb..e6d9e32 100644
--- a/lib/swiotlb.c
+++ b/lib/swiotlb.c
@@ -102,16 +102,18 @@ static int late_alloc;
 static int __init
 setup_io_tlb_npages(char *str)
 {
-	if (isdigit(*str)) {
-		io_tlb_nslabs = simple_strtoul(str, &str, 0);
-		/* avoid tail segment of size < IO_TLB_SEGSIZE */
-		io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE);
+	while (*str) {
+		if (isdigit(*str)) {
+			io_tlb_nslabs = simple_strtoul(str, &str, 0);
+			/* avoid tail segment of size < IO_TLB_SEGSIZE */
+			io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE);
+		}
+		if (!strncmp(str, "force", 5))
+			swiotlb_force = 1;
+		str += strcspn(str, ",");
+		if (*str == ',')
+			++str;
 	}
-	if (*str == ',')
-		++str;
-	if (!strcmp(str, "force"))
-		swiotlb_force = 1;
-
 	return 1;
 }
 __setup("swiotlb=", setup_io_tlb_npages);
--
1.6.2.5
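In other words, with this fix all of the following forms are parsed correctly (the slab count shown is only an example value):

    swiotlb=65536,force
    swiotlb=force,65536
    swiotlb=65536
    swiotlb=force

Both orderings set io_tlb_nslabs (rounded up to a multiple of IO_TLB_SEGSIZE) and swiotlb_force as expected.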
Konrad Rzeszutek Wilk
2010-Jan-14 23:00 UTC
[Xen-devel] [PATCH 02/15] [swiotlb] Add swiotlb_engine structure for tracking multiple software IO TLBs.
The structure contains all of the existing variables used in software IO TLB (swiotlb.c) collected within a structure. Additionally a name variable and a deconstructor (release) function variable is defined for API usages. The other set of functions: is_swiotlb_buffer, dma_capable, phys_to_bus, bus_to_phys, virt_to_bus, and bus_to_virt server as a method to abstract them out of the SWIOTLB library. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- include/linux/swiotlb.h | 94 +++++++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 94 insertions(+), 0 deletions(-) diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h index febedcf..781c3aa 100644 --- a/include/linux/swiotlb.h +++ b/include/linux/swiotlb.h @@ -24,6 +24,100 @@ extern int swiotlb_force; extern void swiotlb_init(int verbose); +struct swiotlb_engine { + + /* + * Name of the engine (ie: "Software IO TLB") + */ + const char *name; + + /* + * Used to do a quick range check in unmap_single and + * sync_single_*, to see if the memory was in fact allocated by this + * API. + */ + char *start; + char *end; + + /* + * The number of IO TLB blocks (in groups of 64) betweeen start and + * end. This is command line adjustable via setup_io_tlb_npages. + */ + unsigned long nslabs; + + /* + * When the IOMMU overflows we return a fallback buffer. + * This sets the size. + */ + unsigned long overflow; + + void *overflow_buffer; + + /* + * This is a free list describing the number of free entries available + * from each index + */ + unsigned int *list; + + /* + * Current marker in the start through end location. Is incremented + * on each map and wraps around. + */ + unsigned int index; + + /* + * We need to save away the original address corresponding to a mapped + * entry for the sync operations. + */ + phys_addr_t *orig_addr; + + /* + * IOMMU private data. + */ + void *priv; + /* + * The API call to free a SWIOTLB engine if another wants to register + * (or if want to turn SWIOTLB off altogether). + * It is imperative that this function checks for existing DMA maps + * and not release the IOTLB if there are out-standing maps. + */ + int (*release)(struct swiotlb_engine *); + + /* + * Is the DMA (Bus) address within our bounce buffer (start and end). + */ + int (*is_swiotlb_buffer)(struct swiotlb_engine *, dma_addr_t dev_addr, + phys_addr_t phys); + + /* + * Is the DMA (Bus) address reachable by the PCI device?. + */ + bool (*dma_capable)(struct device *, dma_addr_t, phys_addr_t, size_t); + /* + * Physical to bus (DMA) address translation. On + * most platforms this is an equivalent function. + */ + dma_addr_t (*phys_to_bus)(struct device *hwdev, phys_addr_t paddr); + + /* + * Bus (DMA) to physical address translation. On most + * platforms this is an equivalant function. + */ + phys_addr_t (*bus_to_phys)(struct device *hwdev, dma_addr_t baddr); + + /* + * Virtual to bus (DMA) address translation. On most platforms + * this is a call to __pa(address). + */ + dma_addr_t (*virt_to_bus)(struct device *hwdev, void *address); + + /* + * Bus (DMA) to virtual address translation. On most platforms + * this is a call to __va(address). + */ + void* (*bus_to_virt)(struct device *hwdev, dma_addr_t address); +}; + extern void *swiotlb_alloc_coherent(struct device *hwdev, size_t size, dma_addr_t *dma_handle, gfp_t flags); -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
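To make the hooks concrete: the default bare-metal engine added later in the series (patch 06) wires them up to the existing flat translations, roughly:

    static struct swiotlb_engine swiotlb_ops = {
            .name              = "software IO TLB",
            .overflow          = 32 * 1024,
            .release           = swiotlb_release,
            .is_swiotlb_buffer = is_swiotlb_buffer,
            .dma_capable       = swiotlb_dma_capable,   /* wraps dma_capable() */
            .phys_to_bus       = phys_to_dma,
            .bus_to_phys       = dma_to_phys,
            .virt_to_bus       = swiotlb_virt_to_bus,   /* phys_to_dma(virt_to_phys(v)) */
            .bus_to_virt       = swiotlb_bus_to_virt,   /* phys_to_virt(dma_to_phys(b)) */
    };

A platform where bus != physical address can replace the translation hooks with its own implementations while reusing all of the book-keeping fields above.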
Konrad Rzeszutek Wilk
2010-Jan-14 23:00 UTC
[Xen-devel] [PATCH 03/15] [swiotlb] Add swiotlb_register_engine function.
We set the internal iommu_sw pointer to the passed in swiotlb_engine structure. Obviously we also check if the existing iommu_sw is set and if so, call iommu_sw->release before the switch-over. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- include/linux/swiotlb.h | 2 + lib/swiotlb.c | 48 +++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 50 insertions(+), 0 deletions(-) diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h index 781c3aa..3bc3c42 100644 --- a/include/linux/swiotlb.h +++ b/include/linux/swiotlb.h @@ -118,6 +118,8 @@ struct swiotlb_engine { void* (*bus_to_virt)(struct device *hwdev, dma_addr_t address); }; +int swiotlb_register_engine(struct swiotlb_engine *iommu); + extern void *swiotlb_alloc_coherent(struct device *hwdev, size_t size, dma_addr_t *dma_handle, gfp_t flags); diff --git a/lib/swiotlb.c b/lib/swiotlb.c index e6d9e32..e84f269 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -97,6 +97,12 @@ static phys_addr_t *io_tlb_orig_addr; */ static DEFINE_SPINLOCK(io_tlb_lock); +/* + * The software IOMMU this library will utilize. + */ +struct swiotlb_engine *iommu_sw; +EXPORT_SYMBOL(iommu_sw); + static int late_alloc; static int __init @@ -126,6 +132,48 @@ static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev, return phys_to_dma(hwdev, virt_to_phys(address)); } +/* + * Register a software IO TLB engine. + * + * The registration allows the software IO TLB functions in the + * swiotlb library to function properly. + * + * All the values in the iotlb structure must be set. + * + * If the registration fails, it is assumed that the caller will free + * all of the resources allocated in the swiotlb_engine structure. + */ +int swiotlb_register_engine(struct swiotlb_engine *iommu) +{ + if (!iommu || !iommu->name || !iommu->release) { + printk(KERN_ERR "DMA: Trying to register a SWIOTLB engine" \ + " improperly!"); + return -EINVAL; + } + + if (iommu_sw && iommu_sw->name) { + int retval = -EINVAL; + + /* ''release'' must check for out-standing DMAs and flush them + * out or fail. */ + if (iommu_sw->release) + retval = iommu_sw->release(iommu_sw); + + if (retval) { + printk(KERN_ERR "DMA: %s cannot be released!\n", + iommu_sw->name); + return retval; + } + printk(KERN_INFO "DMA: Replacing [%s] with [%s]\n", + iommu_sw->name, iommu->name); + } + + iommu_sw = iommu; + + return 0; +} +EXPORT_SYMBOL(swiotlb_register_engine); + void swiotlb_print_info(void) { unsigned long bytes = io_tlb_nslabs << IO_TLB_SHIFT; -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
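A caller is expected to check the return value; for instance (using the hypothetical Xen engine from the cover-letter sketch):

    int rc = swiotlb_register_engine(&xen_swiotlb_ops);

    if (rc)
            printk(KERN_ERR "DMA: cannot switch to %s (%d), keeping %s\n",
                   xen_swiotlb_ops.name, rc, iommu_sw->name);

If an engine is already registered, its release hook must first succeed (i.e. report no outstanding DMA mappings) before the new engine is installed.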
Konrad Rzeszutek Wilk
2010-Jan-14 23:00 UTC
[Xen-devel] [PATCH 04/15] [swiotlb] Search and replace s/io_tlb/iommu_sw->/
We also fix the checkpatch.pl errors that surfaced during this conversion. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- lib/swiotlb.c | 204 +++++++++++++++++++++++++++++---------------------------- 1 files changed, 104 insertions(+), 100 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index e84f269..3499001 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -176,14 +176,14 @@ EXPORT_SYMBOL(swiotlb_register_engine); void swiotlb_print_info(void) { - unsigned long bytes = io_tlb_nslabs << IO_TLB_SHIFT; + unsigned long bytes = iommu_sw->nslabs << IO_TLB_SHIFT; phys_addr_t pstart, pend; - pstart = virt_to_phys(io_tlb_start); - pend = virt_to_phys(io_tlb_end); + pstart = virt_to_phys(iommu_sw->start); + pend = virt_to_phys(iommu_sw->end); printk(KERN_INFO "Placing %luMB software IO TLB between %p - %p\n", - bytes >> 20, io_tlb_start, io_tlb_end); + bytes >> 20, iommu_sw->start, iommu_sw->end); printk(KERN_INFO "software IO TLB at phys %#llx - %#llx\n", (unsigned long long)pstart, (unsigned long long)pend); @@ -198,37 +198,38 @@ swiotlb_init_with_default_size(size_t default_size, int verbose) { unsigned long i, bytes; - if (!io_tlb_nslabs) { - io_tlb_nslabs = (default_size >> IO_TLB_SHIFT); - io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE); + if (!iommu_sw->nslabs) { + iommu_sw->nslabs = (default_size >> IO_TLB_SHIFT); + iommu_sw->nslabs = ALIGN(iommu_sw->nslabs, IO_TLB_SEGSIZE); } - bytes = io_tlb_nslabs << IO_TLB_SHIFT; + bytes = iommu_sw->nslabs << IO_TLB_SHIFT; /* * Get IO TLB memory from the low pages */ - io_tlb_start = alloc_bootmem_low_pages(bytes); - if (!io_tlb_start) + iommu_sw->start = alloc_bootmem_low_pages(bytes); + if (!iommu_sw->start) panic("Cannot allocate SWIOTLB buffer"); - io_tlb_end = io_tlb_start + bytes; + iommu_sw->end = iommu_sw->start + bytes; /* * Allocate and initialize the free list array. This array is used * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE - * between io_tlb_start and io_tlb_end. + * between iommu_sw->start and iommu_sw->end. 
*/ - io_tlb_list = alloc_bootmem(io_tlb_nslabs * sizeof(int)); - for (i = 0; i < io_tlb_nslabs; i++) - io_tlb_list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE); - io_tlb_index = 0; - io_tlb_orig_addr = alloc_bootmem(io_tlb_nslabs * sizeof(phys_addr_t)); + iommu_sw->list = alloc_bootmem(iommu_sw->nslabs * sizeof(int)); + for (i = 0; i < iommu_sw->nslabs; i++) + iommu_sw->list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE); + iommu_sw->index = 0; + iommu_sw->orig_addr = alloc_bootmem(iommu_sw->nslabs * + sizeof(phys_addr_t)); /* * Get the overflow emergency buffer */ - io_tlb_overflow_buffer = alloc_bootmem_low(io_tlb_overflow); - if (!io_tlb_overflow_buffer) + iommu_sw->overflow_buffer = alloc_bootmem_low(iommu_sw->overflow); + if (!iommu_sw->overflow_buffer) panic("Cannot allocate SWIOTLB overflow buffer!\n"); if (verbose) swiotlb_print_info(); @@ -248,70 +249,70 @@ swiotlb_init(int verbose) int swiotlb_late_init_with_default_size(size_t default_size) { - unsigned long i, bytes, req_nslabs = io_tlb_nslabs; + unsigned long i, bytes, req_nslabs = iommu_sw->nslabs; unsigned int order; - if (!io_tlb_nslabs) { - io_tlb_nslabs = (default_size >> IO_TLB_SHIFT); - io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE); + if (!iommu_sw->nslabs) { + iommu_sw->nslabs = (default_size >> IO_TLB_SHIFT); + iommu_sw->nslabs = ALIGN(iommu_sw->nslabs, IO_TLB_SEGSIZE); } /* * Get IO TLB memory from the low pages */ - order = get_order(io_tlb_nslabs << IO_TLB_SHIFT); - io_tlb_nslabs = SLABS_PER_PAGE << order; - bytes = io_tlb_nslabs << IO_TLB_SHIFT; + order = get_order(iommu_sw->nslabs << IO_TLB_SHIFT); + iommu_sw->nslabs = SLABS_PER_PAGE << order; + bytes = iommu_sw->nslabs << IO_TLB_SHIFT; while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) { - io_tlb_start = (void *)__get_free_pages(GFP_DMA | __GFP_NOWARN, - order); - if (io_tlb_start) + iommu_sw->start = (void *)__get_free_pages(GFP_DMA | + __GFP_NOWARN, order); + if (iommu_sw->start) break; order--; } - if (!io_tlb_start) + if (!iommu_sw->start) goto cleanup1; if (order != get_order(bytes)) { printk(KERN_WARNING "Warning: only able to allocate %ld MB " "for software IO TLB\n", (PAGE_SIZE << order) >> 20); - io_tlb_nslabs = SLABS_PER_PAGE << order; - bytes = io_tlb_nslabs << IO_TLB_SHIFT; + iommu_sw->nslabs = SLABS_PER_PAGE << order; + bytes = iommu_sw->nslabs << IO_TLB_SHIFT; } - io_tlb_end = io_tlb_start + bytes; - memset(io_tlb_start, 0, bytes); + iommu_sw->end = iommu_sw->start + bytes; + memset(iommu_sw->start, 0, bytes); /* * Allocate and initialize the free list array. This array is used * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE - * between io_tlb_start and io_tlb_end. + * between iommu_sw->start and iommu_sw->end. 
*/ - io_tlb_list = (unsigned int *)__get_free_pages(GFP_KERNEL, - get_order(io_tlb_nslabs * sizeof(int))); - if (!io_tlb_list) + iommu_sw->list = (unsigned int *)__get_free_pages(GFP_KERNEL, + get_order(iommu_sw->nslabs * sizeof(int))); + if (!iommu_sw->list) goto cleanup2; - for (i = 0; i < io_tlb_nslabs; i++) - io_tlb_list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE); - io_tlb_index = 0; + for (i = 0; i < iommu_sw->nslabs; i++) + iommu_sw->list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE); + iommu_sw->index = 0; - io_tlb_orig_addr = (phys_addr_t *) + iommu_sw->orig_addr = (phys_addr_t *) __get_free_pages(GFP_KERNEL, - get_order(io_tlb_nslabs * + get_order(iommu_sw->nslabs * sizeof(phys_addr_t))); - if (!io_tlb_orig_addr) + if (!iommu_sw->orig_addr) goto cleanup3; - memset(io_tlb_orig_addr, 0, io_tlb_nslabs * sizeof(phys_addr_t)); + memset(iommu_sw->orig_addr, 0, iommu_sw->nslabs * sizeof(phys_addr_t)); /* * Get the overflow emergency buffer */ - io_tlb_overflow_buffer = (void *)__get_free_pages(GFP_DMA, - get_order(io_tlb_overflow)); - if (!io_tlb_overflow_buffer) + iommu_sw->overflow_buffer = (void *)__get_free_pages(GFP_DMA, + get_order(iommu_sw->overflow)); + if (!iommu_sw->overflow_buffer) goto cleanup4; swiotlb_print_info(); @@ -321,52 +322,52 @@ swiotlb_late_init_with_default_size(size_t default_size) return 0; cleanup4: - free_pages((unsigned long)io_tlb_orig_addr, - get_order(io_tlb_nslabs * sizeof(phys_addr_t))); - io_tlb_orig_addr = NULL; + free_pages((unsigned long)iommu_sw->orig_addr, + get_order(iommu_sw->nslabs * sizeof(phys_addr_t))); + iommu_sw->orig_addr = NULL; cleanup3: - free_pages((unsigned long)io_tlb_list, get_order(io_tlb_nslabs * + free_pages((unsigned long)iommu_sw->list, get_order(iommu_sw->nslabs * sizeof(int))); - io_tlb_list = NULL; + iommu_sw->list = NULL; cleanup2: - io_tlb_end = NULL; - free_pages((unsigned long)io_tlb_start, order); - io_tlb_start = NULL; + iommu_sw->end = NULL; + free_pages((unsigned long)iommu_sw->start, order); + iommu_sw->start = NULL; cleanup1: - io_tlb_nslabs = req_nslabs; + iommu_sw->nslabs = req_nslabs; return -ENOMEM; } void __init swiotlb_free(void) { - if (!io_tlb_overflow_buffer) + if (!iommu_sw->overflow_buffer) return; if (late_alloc) { - free_pages((unsigned long)io_tlb_overflow_buffer, - get_order(io_tlb_overflow)); - free_pages((unsigned long)io_tlb_orig_addr, - get_order(io_tlb_nslabs * sizeof(phys_addr_t))); - free_pages((unsigned long)io_tlb_list, get_order(io_tlb_nslabs * - sizeof(int))); - free_pages((unsigned long)io_tlb_start, - get_order(io_tlb_nslabs << IO_TLB_SHIFT)); + free_pages((unsigned long)iommu_sw->overflow_buffer, + get_order(iommu_sw->overflow)); + free_pages((unsigned long)iommu_sw->orig_addr, + get_order(iommu_sw->nslabs * sizeof(phys_addr_t))); + free_pages((unsigned long)iommu_sw->list, + get_order(iommu_sw->nslabs * sizeof(int))); + free_pages((unsigned long)iommu_sw->start, + get_order(iommu_sw->nslabs << IO_TLB_SHIFT)); } else { - free_bootmem_late(__pa(io_tlb_overflow_buffer), - io_tlb_overflow); - free_bootmem_late(__pa(io_tlb_orig_addr), - io_tlb_nslabs * sizeof(phys_addr_t)); - free_bootmem_late(__pa(io_tlb_list), - io_tlb_nslabs * sizeof(int)); - free_bootmem_late(__pa(io_tlb_start), - io_tlb_nslabs << IO_TLB_SHIFT); + free_bootmem_late(__pa(iommu_sw->overflow_buffer), + iommu_sw->overflow); + free_bootmem_late(__pa(iommu_sw->orig_addr), + iommu_sw->nslabs * sizeof(phys_addr_t)); + free_bootmem_late(__pa(iommu_sw->list), + iommu_sw->nslabs * sizeof(int)); + 
free_bootmem_late(__pa(iommu_sw->start), + iommu_sw->nslabs << IO_TLB_SHIFT); } } static int is_swiotlb_buffer(phys_addr_t paddr) { - return paddr >= virt_to_phys(io_tlb_start) && - paddr < virt_to_phys(io_tlb_end); + return paddr >= virt_to_phys(iommu_sw->start) && + paddr < virt_to_phys(iommu_sw->end); } /* @@ -426,7 +427,7 @@ map_single(struct device *hwdev, phys_addr_t phys, size_t size, int dir) unsigned long max_slots; mask = dma_get_seg_boundary(hwdev); - start_dma_addr = swiotlb_virt_to_bus(hwdev, io_tlb_start) & mask; + start_dma_addr = swiotlb_virt_to_bus(hwdev, iommu_sw->start) & mask; offset_slots = ALIGN(start_dma_addr, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT; @@ -454,8 +455,8 @@ map_single(struct device *hwdev, phys_addr_t phys, size_t size, int dir) * request and allocate a buffer from that IO TLB pool. */ spin_lock_irqsave(&io_tlb_lock, flags); - index = ALIGN(io_tlb_index, stride); - if (index >= io_tlb_nslabs) + index = ALIGN(iommu_sw->index, stride); + if (index >= iommu_sw->nslabs) index = 0; wrap = index; @@ -463,7 +464,7 @@ map_single(struct device *hwdev, phys_addr_t phys, size_t size, int dir) while (iommu_is_span_boundary(index, nslots, offset_slots, max_slots)) { index += stride; - if (index >= io_tlb_nslabs) + if (index >= iommu_sw->nslabs) index = 0; if (index == wrap) goto not_found; @@ -474,26 +475,27 @@ map_single(struct device *hwdev, phys_addr_t phys, size_t size, int dir) * contiguous buffers, we allocate the buffers from that slot * and mark the entries as ''0'' indicating unavailable. */ - if (io_tlb_list[index] >= nslots) { + if (iommu_sw->list[index] >= nslots) { int count = 0; for (i = index; i < (int) (index + nslots); i++) - io_tlb_list[i] = 0; - for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) != IO_TLB_SEGSIZE - 1) && io_tlb_list[i]; i--) - io_tlb_list[i] = ++count; - dma_addr = io_tlb_start + (index << IO_TLB_SHIFT); + iommu_sw->list[i] = 0; + for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) !+ IO_TLB_SEGSIZE - 1) && iommu_sw->list[i]; i--) + iommu_sw->list[i] = ++count; + dma_addr = iommu_sw->start + (index << IO_TLB_SHIFT); /* * Update the indices to avoid searching in the next * round. */ - io_tlb_index = ((index + nslots) < io_tlb_nslabs + iommu_sw->index = ((index + nslots) < iommu_sw->nslabs ? (index + nslots) : 0); goto found; } index += stride; - if (index >= io_tlb_nslabs) + if (index >= iommu_sw->nslabs) index = 0; } while (index != wrap); @@ -509,7 +511,7 @@ found: * needed. */ for (i = 0; i < nslots; i++) - io_tlb_orig_addr[index+i] = phys + (i << IO_TLB_SHIFT); + iommu_sw->orig_addr[index+i] = phys + (i << IO_TLB_SHIFT); if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL) swiotlb_bounce(phys, dma_addr, size, DMA_TO_DEVICE); @@ -524,8 +526,8 @@ do_unmap_single(struct device *hwdev, char *dma_addr, size_t size, int dir) { unsigned long flags; int i, count, nslots = ALIGN(size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT; - int index = (dma_addr - io_tlb_start) >> IO_TLB_SHIFT; - phys_addr_t phys = io_tlb_orig_addr[index]; + int index = (dma_addr - iommu_sw->start) >> IO_TLB_SHIFT; + phys_addr_t phys = iommu_sw->orig_addr[index]; /* * First, sync the memory before unmapping the entry @@ -542,19 +544,20 @@ do_unmap_single(struct device *hwdev, char *dma_addr, size_t size, int dir) spin_lock_irqsave(&io_tlb_lock, flags); { count = ((index + nslots) < ALIGN(index + 1, IO_TLB_SEGSIZE) ? 
- io_tlb_list[index + nslots] : 0); + iommu_sw->list[index + nslots] : 0); /* * Step 1: return the slots to the free list, merging the * slots with superceeding slots */ for (i = index + nslots - 1; i >= index; i--) - io_tlb_list[i] = ++count; + iommu_sw->list[i] = ++count; /* * Step 2: merge the returned slots with the preceding slots, * if available (non zero) */ - for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) != IO_TLB_SEGSIZE -1) && io_tlb_list[i]; i--) - io_tlb_list[i] = ++count; + for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) !+ IO_TLB_SEGSIZE - 1) && iommu_sw->list[i]; i--) + iommu_sw->list[i] = ++count; } spin_unlock_irqrestore(&io_tlb_lock, flags); } @@ -563,8 +566,8 @@ static void sync_single(struct device *hwdev, char *dma_addr, size_t size, int dir, int target) { - int index = (dma_addr - io_tlb_start) >> IO_TLB_SHIFT; - phys_addr_t phys = io_tlb_orig_addr[index]; + int index = (dma_addr - iommu_sw->start) >> IO_TLB_SHIFT; + phys_addr_t phys = iommu_sw->orig_addr[index]; phys += ((unsigned long)dma_addr & ((1 << IO_TLB_SHIFT) - 1)); @@ -663,7 +666,7 @@ swiotlb_full(struct device *dev, size_t size, int dir, int do_panic) printk(KERN_ERR "DMA: Out of SW-IOMMU space for %zu bytes at " "device %s\n", size, dev ? dev_name(dev) : "?"); - if (size <= io_tlb_overflow || !do_panic) + if (size <= iommu_sw->overflow || !do_panic) return; if (dir == DMA_BIDIRECTIONAL) @@ -705,7 +708,7 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page, map = map_single(dev, phys, size, dir); if (!map) { swiotlb_full(dev, size, dir, 1); - map = io_tlb_overflow_buffer; + map = iommu_sw->overflow_buffer; } dev_addr = swiotlb_virt_to_bus(dev, map); @@ -960,7 +963,8 @@ EXPORT_SYMBOL(swiotlb_sync_sg_for_device); int swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t dma_addr) { - return (dma_addr == swiotlb_virt_to_bus(hwdev, io_tlb_overflow_buffer)); + return (dma_addr == swiotlb_virt_to_bus(hwdev, + iommu_sw->overflow_buffer)); } EXPORT_SYMBOL(swiotlb_dma_mapping_error); @@ -973,6 +977,6 @@ EXPORT_SYMBOL(swiotlb_dma_mapping_error); int swiotlb_dma_supported(struct device *hwdev, u64 mask) { - return swiotlb_virt_to_bus(hwdev, io_tlb_end - 1) <= mask; + return swiotlb_virt_to_bus(hwdev, iommu_sw->end - 1) <= mask; } EXPORT_SYMBOL(swiotlb_dma_supported); -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Jan-14 23:00 UTC
[Xen-devel] [PATCH 05/15] [swiotlb] Respect the io_tlb_nslabs argument value.
The search and replace removed the option to override the amount of slabs via swiotlb=<x> argument. This puts it back in. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- lib/swiotlb.c | 10 ++++++---- 1 files changed, 6 insertions(+), 4 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index 3499001..cf29f03 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -198,10 +198,11 @@ swiotlb_init_with_default_size(size_t default_size, int verbose) { unsigned long i, bytes; - if (!iommu_sw->nslabs) { + if (!io_tlb_nslabs) { iommu_sw->nslabs = (default_size >> IO_TLB_SHIFT); iommu_sw->nslabs = ALIGN(iommu_sw->nslabs, IO_TLB_SEGSIZE); - } + } else + iommu_sw->nslabs = io_tlb_nslabs; bytes = iommu_sw->nslabs << IO_TLB_SHIFT; @@ -252,10 +253,11 @@ swiotlb_late_init_with_default_size(size_t default_size) unsigned long i, bytes, req_nslabs = iommu_sw->nslabs; unsigned int order; - if (!iommu_sw->nslabs) { + if (!io_tlb_nslabs) { iommu_sw->nslabs = (default_size >> IO_TLB_SHIFT); iommu_sw->nslabs = ALIGN(iommu_sw->nslabs, IO_TLB_SEGSIZE); - } + } else + iommu_sw->nslabs = io_tlb_nslabs; /* * Get IO TLB memory from the low pages -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Jan-14 23:00 UTC
[Xen-devel] [PATCH 06/15] [swiotlb] In 'swiotlb_init' take advantage of the default swiotlb_engine support.
For baselevel support we define required functions and fill out variables. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- lib/swiotlb.c | 22 ++++++++++++++++++++++ 1 files changed, 22 insertions(+), 0 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index cf29f03..3c7bd4e 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -132,6 +132,11 @@ static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev, return phys_to_dma(hwdev, virt_to_phys(address)); } +static void *swiotlb_bus_to_virt(struct device *hwdev, dma_addr_t dev_addr) +{ + return phys_to_virt(dma_to_phys(hwdev, dev_addr)); +}; + /* * Register a software IO TLB engine. * @@ -236,9 +241,26 @@ swiotlb_init_with_default_size(size_t default_size, int verbose) swiotlb_print_info(); } +static int swiotlb_release(struct swiotlb_engine *iotlb) +{ + swiotlb_free(); + return 0; +} + +static struct swiotlb_engine swiotlb_ops = { + .name = "software IO TLB", + .overflow = 32 * 1024, + .release = swiotlb_release, + .phys_to_bus = phys_to_dma, + .bus_to_phys = dma_to_phys, + .virt_to_bus = swiotlb_virt_to_bus, + .bus_to_virt = swiotlb_bus_to_virt, +}; + void __init swiotlb_init(int verbose) { + swiotlb_register_engine(&swiotlb_ops); swiotlb_init_with_default_size(64 * (1<<20), verbose); /* default to 64MB */ } -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Jan-14 23:00 UTC
[Xen-devel] [PATCH 07/15] [swiotlb] In 'swiotlb_free' check iommu_sw pointer.
Previous to the usage of the iommu_sw pointer, we would check if io_tlb_overflow_buffer was set and use that to determine whether we had been called. With the iommu_sw introduction, we just need to check whether that pointer is not NULL. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- lib/swiotlb.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index 3c7bd4e..72c9abe 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -364,7 +364,7 @@ cleanup1: void __init swiotlb_free(void) { - if (!iommu_sw->overflow_buffer) + if (!iommu_sw) return; if (late_alloc) { -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Jan-14 23:00 UTC
[Xen-devel] [PATCH 08/15] [swiotlb] Add 'is_swiotlb_buffer' to the swiotlb_ops function declaration.
We move the ''io_swiotlb_buffer'' function before the swiotlb_ops_ structure decleration to avoid compilation problems. Also we replace the calls to is_swiotlb_buffer to go through the iommu_sw->is_swiotlb_buffer function. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- lib/swiotlb.c | 18 ++++++++++-------- 1 files changed, 10 insertions(+), 8 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index 72c9abe..688965d 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -246,11 +246,18 @@ static int swiotlb_release(struct swiotlb_engine *iotlb) swiotlb_free(); return 0; } +static int is_swiotlb_buffer(struct swiotlb_engine *iommu_sw, + dma_addr_t dev_addr, phys_addr_t paddr) +{ + return paddr >= virt_to_phys(iommu_sw->start) && + paddr < virt_to_phys(iommu_sw->end); +} static struct swiotlb_engine swiotlb_ops = { .name = "software IO TLB", .overflow = 32 * 1024, .release = swiotlb_release, + .is_swiotlb_buffer = is_swiotlb_buffer, .phys_to_bus = phys_to_dma, .bus_to_phys = dma_to_phys, .virt_to_bus = swiotlb_virt_to_bus, @@ -388,11 +395,6 @@ void __init swiotlb_free(void) } } -static int is_swiotlb_buffer(phys_addr_t paddr) -{ - return paddr >= virt_to_phys(iommu_sw->start) && - paddr < virt_to_phys(iommu_sw->end); -} /* * Bounce: copy the swiotlb buffer back to the original dma location @@ -669,7 +671,7 @@ swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr, phys_addr_t paddr = dma_to_phys(hwdev, dev_addr); WARN_ON(irqs_disabled()); - if (!is_swiotlb_buffer(paddr)) + if (!iommu_sw->is_swiotlb_buffer(iommu_sw, dev_addr, paddr)) free_pages((unsigned long)vaddr, get_order(size)); else /* DMA_TO_DEVICE to avoid memcpy in unmap_single */ @@ -762,7 +764,7 @@ static void unmap_single(struct device *hwdev, dma_addr_t dev_addr, BUG_ON(dir == DMA_NONE); - if (is_swiotlb_buffer(paddr)) { + if (iommu_sw->is_swiotlb_buffer(iommu_sw, dev_addr, paddr)) { do_unmap_single(hwdev, phys_to_virt(paddr), size, dir); return; } @@ -805,7 +807,7 @@ swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr, BUG_ON(dir == DMA_NONE); - if (is_swiotlb_buffer(paddr)) { + if (iommu_sw->is_swiotlb_buffer(iommu_sw, dev_addr, paddr)) { sync_single(hwdev, phys_to_virt(paddr), size, dir, target); return; } -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Jan-14 23:00 UTC
[Xen-devel] [PATCH 09/15] [swiotlb] Add 'dma_capable' to the swiotlb_ops structure.
And we also replace the ''dma_capable'' with iommu_sw->dma_capable to abstract the functionality of that function. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- lib/swiotlb.c | 14 +++++++++++--- 1 files changed, 11 insertions(+), 3 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index 688965d..4da8151 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -253,10 +253,18 @@ static int is_swiotlb_buffer(struct swiotlb_engine *iommu_sw, paddr < virt_to_phys(iommu_sw->end); } +static bool swiotlb_dma_capable(struct device *hwdev, dma_addr_t dma_addr, + phys_addr_t phys, size_t size) +{ + /* Phys is not neccessary in this case. */ + return dma_capable(hwdev, dma_addr, size); +} + static struct swiotlb_engine swiotlb_ops = { .name = "software IO TLB", .overflow = 32 * 1024, .release = swiotlb_release, + .dma_capable = swiotlb_dma_capable, .is_swiotlb_buffer = is_swiotlb_buffer, .phys_to_bus = phys_to_dma, .bus_to_phys = dma_to_phys, @@ -725,7 +733,7 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page, * we can safely return the device addr and not worry about bounce * buffering it. */ - if (dma_capable(dev, dev_addr, size) && !swiotlb_force) + if (iommu_sw->dma_capable(dev, dev_addr, phys, size) && !swiotlb_force) return dev_addr; /* @@ -742,7 +750,7 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page, /* * Ensure that the address returned is DMA''ble */ - if (!dma_capable(dev, dev_addr, size)) + if (!iommu_sw->dma_capable(dev, dev_addr, phys, size)) panic("map_single: bounce buffer is not DMA''ble"); return dev_addr; @@ -895,7 +903,7 @@ swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, int nelems, dma_addr_t dev_addr = phys_to_dma(hwdev, paddr); if (swiotlb_force || - !dma_capable(hwdev, dev_addr, sg->length)) { + !iommu_sw->dma_capable(hwdev, dev_addr, paddr, sg->length)) { void *map = map_single(hwdev, sg_phys(sg), sg->length, dir); if (!map) { -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
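The extra phys argument is ignored by the default engine (see swiotlb_dma_capable above), but a translating engine can use it, for example, to check that the machine frames backing [phys, phys + size) are contiguous before trusting dev_addr against the device mask. A hypothetical sketch, not part of this series:

    static bool xen_dma_capable(struct device *hwdev, dma_addr_t dev_addr,
                                phys_addr_t phys, size_t size)
    {
            /* xen_range_is_machine_contiguous() is a made-up helper name */
            return xen_range_is_machine_contiguous(phys, size) &&
                   dma_capable(hwdev, dev_addr, size);
    }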
Konrad Rzeszutek Wilk
2010-Jan-14 23:00 UTC
[Xen-devel] [PATCH 10/15] [swiotlb] Replace the [phys, bus]->virt and virt->[bus, phys] functions with iommu_sw calls.
We replace all of the address translation calls to go through the iommu_sw functions. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- lib/swiotlb.c | 26 +++++++++++++------------- 1 files changed, 13 insertions(+), 13 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index 4da8151..075b56c 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -127,7 +127,7 @@ __setup("swiotlb=", setup_io_tlb_npages); /* Note that this doesn''t work with highmem page */ static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev, - volatile void *address) + void *address) { return phys_to_dma(hwdev, virt_to_phys(address)); } @@ -461,7 +461,7 @@ map_single(struct device *hwdev, phys_addr_t phys, size_t size, int dir) unsigned long max_slots; mask = dma_get_seg_boundary(hwdev); - start_dma_addr = swiotlb_virt_to_bus(hwdev, iommu_sw->start) & mask; + start_dma_addr = iommu_sw->virt_to_bus(hwdev, iommu_sw->start) & mask; offset_slots = ALIGN(start_dma_addr, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT; @@ -636,7 +636,7 @@ swiotlb_alloc_coherent(struct device *hwdev, size_t size, dma_mask = hwdev->coherent_dma_mask; ret = (void *)__get_free_pages(flags, order); - if (ret && swiotlb_virt_to_bus(hwdev, ret) + size - 1 > dma_mask) { + if (ret && iommu_sw->virt_to_bus(hwdev, ret) + size - 1 > dma_mask) { /* * The allocated memory isn''t reachable by the device. */ @@ -655,7 +655,7 @@ swiotlb_alloc_coherent(struct device *hwdev, size_t size, } memset(ret, 0, size); - dev_addr = swiotlb_virt_to_bus(hwdev, ret); + dev_addr = iommu_sw->virt_to_bus(hwdev, ret); /* Confirm address can be DMA''d by device */ if (dev_addr + size - 1 > dma_mask) { @@ -676,7 +676,7 @@ void swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr, dma_addr_t dev_addr) { - phys_addr_t paddr = dma_to_phys(hwdev, dev_addr); + phys_addr_t paddr = iommu_sw->bus_to_phys(hwdev, dev_addr); WARN_ON(irqs_disabled()); if (!iommu_sw->is_swiotlb_buffer(iommu_sw, dev_addr, paddr)) @@ -724,7 +724,7 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page, struct dma_attrs *attrs) { phys_addr_t phys = page_to_phys(page) + offset; - dma_addr_t dev_addr = phys_to_dma(dev, phys); + dma_addr_t dev_addr = iommu_sw->phys_to_bus(dev, phys); void *map; BUG_ON(dir == DMA_NONE); @@ -745,7 +745,7 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page, map = iommu_sw->overflow_buffer; } - dev_addr = swiotlb_virt_to_bus(dev, map); + dev_addr = iommu_sw->virt_to_bus(dev, map); /* * Ensure that the address returned is DMA''ble @@ -768,7 +768,7 @@ EXPORT_SYMBOL_GPL(swiotlb_map_page); static void unmap_single(struct device *hwdev, dma_addr_t dev_addr, size_t size, int dir) { - phys_addr_t paddr = dma_to_phys(hwdev, dev_addr); + phys_addr_t paddr = iommu_sw->bus_to_phys(hwdev, dev_addr); BUG_ON(dir == DMA_NONE); @@ -811,7 +811,7 @@ static void swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr, size_t size, int dir, int target) { - phys_addr_t paddr = dma_to_phys(hwdev, dev_addr); + phys_addr_t paddr = iommu_sw->bus_to_phys(hwdev, dev_addr); BUG_ON(dir == DMA_NONE); @@ -900,7 +900,7 @@ swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, int nelems, for_each_sg(sgl, sg, nelems, i) { phys_addr_t paddr = sg_phys(sg); - dma_addr_t dev_addr = phys_to_dma(hwdev, paddr); + dma_addr_t dev_addr = iommu_sw->phys_to_bus(hwdev, paddr); if (swiotlb_force || !iommu_sw->dma_capable(hwdev, dev_addr, paddr, sg->length)) { @@ -915,7 +915,7 @@ swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, int nelems, 
sgl[0].dma_length = 0; return 0; } - sg->dma_address = swiotlb_virt_to_bus(hwdev, map); + sg->dma_address = iommu_sw->virt_to_bus(hwdev, map); } else sg->dma_address = dev_addr; sg->dma_length = sg->length; @@ -997,7 +997,7 @@ EXPORT_SYMBOL(swiotlb_sync_sg_for_device); int swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t dma_addr) { - return (dma_addr == swiotlb_virt_to_bus(hwdev, + return (dma_addr == iommu_sw->virt_to_bus(hwdev, iommu_sw->overflow_buffer)); } EXPORT_SYMBOL(swiotlb_dma_mapping_error); @@ -1011,6 +1011,6 @@ EXPORT_SYMBOL(swiotlb_dma_mapping_error); int swiotlb_dma_supported(struct device *hwdev, u64 mask) { - return swiotlb_virt_to_bus(hwdev, iommu_sw->end - 1) <= mask; + return iommu_sw->virt_to_bus(hwdev, iommu_sw->end - 1) <= mask; } EXPORT_SYMBOL(swiotlb_dma_supported); -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Jan-14 23:01 UTC
[Xen-devel] [PATCH 11/15] [swiotlb] Replace late_alloc with iommu_sw->priv usage.
We utilize the private placeholder to figure out whether we are initialized late or early. Obviously the ->priv can be expanded to point to a structure for more internal data but for right now this all we need. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- lib/swiotlb.c | 11 +++++++---- 1 files changed, 7 insertions(+), 4 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index 075b56c..20df588 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -103,8 +103,6 @@ static DEFINE_SPINLOCK(io_tlb_lock); struct swiotlb_engine *iommu_sw; EXPORT_SYMBOL(iommu_sw); -static int late_alloc; - static int __init setup_io_tlb_npages(char *str) { @@ -239,6 +237,8 @@ swiotlb_init_with_default_size(size_t default_size, int verbose) panic("Cannot allocate SWIOTLB overflow buffer!\n"); if (verbose) swiotlb_print_info(); + + iommu_sw->priv = NULL; } static int swiotlb_release(struct swiotlb_engine *iotlb) @@ -356,7 +356,10 @@ swiotlb_late_init_with_default_size(size_t default_size) swiotlb_print_info(); - late_alloc = 1; + /* We utilize the private field to figure out whether we + * were allocated late or early. + */ + iommu_sw->priv = (void *)1; return 0; @@ -382,7 +385,7 @@ void __init swiotlb_free(void) if (!iommu_sw) return; - if (late_alloc) { + if (iommu_sw->priv) { free_pages((unsigned long)iommu_sw->overflow_buffer, get_order(iommu_sw->overflow)); free_pages((unsigned long)iommu_sw->orig_addr, -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Jan-14 23:01 UTC
[Xen-devel] [PATCH 12/15] [swiotlb] Remove unused static declarations obsoleted by iommu_sw.
This includes the io_tlb_start, io_tlb_end, io_tlb_list, etc. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- lib/swiotlb.c | 29 +---------------------------- 1 files changed, 1 insertions(+), 28 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index 20df588..c11dcb1 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -60,40 +60,13 @@ enum dma_sync_target { int swiotlb_force; /* - * Used to do a quick range check in unmap_single and - * sync_single_*, to see if the memory was in fact allocated by this - * API. - */ -static char *io_tlb_start, *io_tlb_end; - -/* * The number of IO TLB blocks (in groups of 64) betweeen io_tlb_start and * io_tlb_end. This is command line adjustable via setup_io_tlb_npages. */ static unsigned long io_tlb_nslabs; /* - * When the IOMMU overflows we return a fallback buffer. This sets the size. - */ -static unsigned long io_tlb_overflow = 32*1024; - -void *io_tlb_overflow_buffer; - -/* - * This is a free list describing the number of free entries available from - * each index - */ -static unsigned int *io_tlb_list; -static unsigned int io_tlb_index; - -/* - * We need to save away the original address corresponding to a mapped entry - * for the sync operations. - */ -static phys_addr_t *io_tlb_orig_addr; - -/* - * Protect the above data structures in the map and unmap calls + * Protect the iommu_sw data structures in the map and unmap calls */ static DEFINE_SPINLOCK(io_tlb_lock); -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Jan-14 23:01 UTC
[Xen-devel] [PATCH 13/15] [swiotlb] Make io_tlb_nslabs visible outside lib/swiotlb.c and rename it.
We rename it to something more generic: swiotlb_nslabs and make it visible outside the lib/swiotlb.c library. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- include/linux/swiotlb.h | 1 + lib/swiotlb.c | 14 +++++++------- 2 files changed, 8 insertions(+), 7 deletions(-) diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h index 3bc3c42..23739b0 100644 --- a/include/linux/swiotlb.h +++ b/include/linux/swiotlb.h @@ -9,6 +9,7 @@ struct scatterlist; extern int swiotlb_force; +extern unsigned long swiotlb_nslabs; /* * Maximum allowable number of contiguous slabs to map, * must be a power of 2. What is the appropriate value ? diff --git a/lib/swiotlb.c b/lib/swiotlb.c index c11dcb1..8e65cee 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -63,7 +63,7 @@ int swiotlb_force; * The number of IO TLB blocks (in groups of 64) betweeen io_tlb_start and * io_tlb_end. This is command line adjustable via setup_io_tlb_npages. */ -static unsigned long io_tlb_nslabs; +unsigned long swiotlb_nslabs; /* * Protect the iommu_sw data structures in the map and unmap calls @@ -81,9 +81,9 @@ setup_io_tlb_npages(char *str) { while (*str) { if (isdigit(*str)) { - io_tlb_nslabs = simple_strtoul(str, &str, 0); + swiotlb_nslabs = simple_strtoul(str, &str, 0); /* avoid tail segment of size < IO_TLB_SEGSIZE */ - io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE); + swiotlb_nslabs = ALIGN(swiotlb_nslabs, IO_TLB_SEGSIZE); } if (!strncmp(str, "force", 5)) swiotlb_force = 1; @@ -174,11 +174,11 @@ swiotlb_init_with_default_size(size_t default_size, int verbose) { unsigned long i, bytes; - if (!io_tlb_nslabs) { + if (!swiotlb_nslabs) { iommu_sw->nslabs = (default_size >> IO_TLB_SHIFT); iommu_sw->nslabs = ALIGN(iommu_sw->nslabs, IO_TLB_SEGSIZE); } else - iommu_sw->nslabs = io_tlb_nslabs; + iommu_sw->nslabs = swiotlb_nslabs; bytes = iommu_sw->nslabs << IO_TLB_SHIFT; @@ -263,11 +263,11 @@ swiotlb_late_init_with_default_size(size_t default_size) unsigned long i, bytes, req_nslabs = iommu_sw->nslabs; unsigned int order; - if (!io_tlb_nslabs) { + if (!swiotlb_nslabs) { iommu_sw->nslabs = (default_size >> IO_TLB_SHIFT); iommu_sw->nslabs = ALIGN(iommu_sw->nslabs, IO_TLB_SEGSIZE); } else - iommu_sw->nslabs = io_tlb_nslabs; + iommu_sw->nslabs = swiotlb_nslabs; /* * Get IO TLB memory from the low pages -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
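Making the variable non-static lets code outside lib/swiotlb.c - such as the default engine split out into swiotlb-default.c in patch 14, or an alternative engine - honour the user's swiotlb= override when sizing its bounce buffer. A minimal sketch of how an engine would consume it:

    /* fall back to the core's 64MB default when no override was given */
    unsigned long nslabs = swiotlb_nslabs ? swiotlb_nslabs
                                          : ((64UL << 20) >> IO_TLB_SHIFT);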
Konrad Rzeszutek Wilk
2010-Jan-14 23:01 UTC
[Xen-devel] [PATCH 14/15] [swiotlb] Move initialization (swiotlb_init) and its friends in swiotlb-default.c
We move all of the initialization functions and as well all functions defined in the swiotlb_ops to a seperate file. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- lib/Makefile | 2 +- lib/swiotlb-default.c | 242 +++++++++++++++++++++++++++++++++++++++++++++++++ lib/swiotlb.c | 231 +---------------------------------------------- 3 files changed, 245 insertions(+), 230 deletions(-) create mode 100644 lib/swiotlb-default.c diff --git a/lib/Makefile b/lib/Makefile index 347ad8d..fd96891 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -77,7 +77,7 @@ obj-$(CONFIG_TEXTSEARCH_FSM) += ts_fsm.o obj-$(CONFIG_SMP) += percpu_counter.o obj-$(CONFIG_AUDIT_GENERIC) += audit.o -obj-$(CONFIG_SWIOTLB) += swiotlb.o +obj-$(CONFIG_SWIOTLB) += swiotlb.o swiotlb-default.o obj-$(CONFIG_IOMMU_HELPER) += iommu-helper.o obj-$(CONFIG_FAULT_INJECTION) += fault-inject.o diff --git a/lib/swiotlb-default.c b/lib/swiotlb-default.c new file mode 100644 index 0000000..c490fcf --- /dev/null +++ b/lib/swiotlb-default.c @@ -0,0 +1,242 @@ + +#include <linux/dma-mapping.h> +#include <linux/swiotlb.h> +#include <linux/bootmem.h> + + +#define OFFSET(val, align) ((unsigned long) \ + (val) & ((align) - 1)) + +#define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT)) + +/* + * Minimum IO TLB size to bother booting with. Systems with mainly + * 64bit capable cards will only lightly use the swiotlb. If we can''t + * allocate a contiguous 1MB, we''re probably in trouble anyway. + */ +#define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT) + +/* Note that this doesn''t work with highmem page */ +static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev, + void *address) +{ + return phys_to_dma(hwdev, virt_to_phys(address)); +} + +static void *swiotlb_bus_to_virt(struct device *hwdev, dma_addr_t dev_addr) +{ + return phys_to_virt(dma_to_phys(hwdev, dev_addr)); +}; + +/* + * Statically reserve bounce buffer space and initialize bounce buffer data + * structures for the software IO TLB used to implement the DMA API. + */ +void __init +swiotlb_init_with_default_size(struct swiotlb_engine *iommu_sw, + size_t default_size, int verbose) +{ + unsigned long i, bytes; + + if (!swiotlb_nslabs) { + iommu_sw->nslabs = (default_size >> IO_TLB_SHIFT); + iommu_sw->nslabs = ALIGN(iommu_sw->nslabs, IO_TLB_SEGSIZE); + } else + iommu_sw->nslabs = swiotlb_nslabs; + + bytes = iommu_sw->nslabs << IO_TLB_SHIFT; + + /* + * Get IO TLB memory from the low pages + */ + iommu_sw->start = alloc_bootmem_low_pages(bytes); + if (!iommu_sw->start) + panic("Cannot allocate SWIOTLB buffer"); + iommu_sw->end = iommu_sw->start + bytes; + + /* + * Allocate and initialize the free list array. This array is used + * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE + * between iommu_sw->start and iommu_sw->end. 
+ */ + iommu_sw->list = alloc_bootmem(iommu_sw->nslabs * sizeof(int)); + for (i = 0; i < iommu_sw->nslabs; i++) + iommu_sw->list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE); + iommu_sw->index = 0; + iommu_sw->orig_addr = alloc_bootmem(iommu_sw->nslabs * + sizeof(phys_addr_t)); + + /* + * Get the overflow emergency buffer + */ + iommu_sw->overflow_buffer = alloc_bootmem_low(iommu_sw->overflow); + if (!iommu_sw->overflow_buffer) + panic("Cannot allocate SWIOTLB overflow buffer!\n"); + if (verbose) + swiotlb_print_info(); + + iommu_sw->priv = NULL; +} + +int swiotlb_release(struct swiotlb_engine *iommu_sw) +{ + if (!iommu_sw) + return -ENODEV; + + if (iommu_sw->priv) { + free_pages((unsigned long)iommu_sw->overflow_buffer, + get_order(iommu_sw->overflow)); + free_pages((unsigned long)iommu_sw->orig_addr, + get_order(iommu_sw->nslabs * sizeof(phys_addr_t))); + free_pages((unsigned long)iommu_sw->list, + get_order(iommu_sw->nslabs * sizeof(int))); + free_pages((unsigned long)iommu_sw->start, + get_order(iommu_sw->nslabs << IO_TLB_SHIFT)); + } else { + free_bootmem_late(__pa(iommu_sw->overflow_buffer), + iommu_sw->overflow); + free_bootmem_late(__pa(iommu_sw->orig_addr), + iommu_sw->nslabs * sizeof(phys_addr_t)); + free_bootmem_late(__pa(iommu_sw->list), + iommu_sw->nslabs * sizeof(int)); + free_bootmem_late(__pa(iommu_sw->start), + iommu_sw->nslabs << IO_TLB_SHIFT); + } + return 0; +} + +static int is_swiotlb_buffer(struct swiotlb_engine *iommu_sw, + dma_addr_t dma_addr, phys_addr_t paddr) +{ + return paddr >= virt_to_phys(iommu_sw->start) && + paddr < virt_to_phys(iommu_sw->end); +} + +static bool swiotlb_dma_capable(struct device *hwdev, dma_addr_t dma_addr, + phys_addr_t phys, size_t size) +{ + /* Phys is not neccessary in this case. */ + return dma_capable(hwdev, dma_addr, size); +} +static struct swiotlb_engine swiotlb_ops = { + .name = "software IO TLB", + .overflow = 32 * 1024, + .release = swiotlb_release, + .dma_capable = swiotlb_dma_capable, + .is_swiotlb_buffer = is_swiotlb_buffer, + .phys_to_bus = phys_to_dma, + .bus_to_phys = dma_to_phys, + .virt_to_bus = swiotlb_virt_to_bus, + .bus_to_virt = swiotlb_bus_to_virt, +}; + +void __init +swiotlb_init(int verbose) +{ + swiotlb_register_engine(&swiotlb_ops); + swiotlb_init_with_default_size(&swiotlb_ops, 64 * (1<<20), + verbose); /* default to 64MB */ +} + +/* + * Systems with larger DMA zones (those that don''t support ISA) can + * initialize the swiotlb later using the slab allocator if needed. + * This should be just like above, but with some error catching. 
+ */ +int +swiotlb_late_init_with_default_size(struct swiotlb_engine *iommu_sw, + size_t default_size) +{ + unsigned long i, bytes, req_nslabs = iommu_sw->nslabs; + unsigned int order; + + if (!swiotlb_nslabs) { + iommu_sw->nslabs = (default_size >> IO_TLB_SHIFT); + iommu_sw->nslabs = ALIGN(iommu_sw->nslabs, IO_TLB_SEGSIZE); + } else + iommu_sw->nslabs = swiotlb_nslabs; + + /* + * Get IO TLB memory from the low pages + */ + order = get_order(iommu_sw->nslabs << IO_TLB_SHIFT); + iommu_sw->nslabs = SLABS_PER_PAGE << order; + bytes = iommu_sw->nslabs << IO_TLB_SHIFT; + + while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) { + iommu_sw->start = (void *)__get_free_pages(GFP_DMA | + __GFP_NOWARN, order); + if (iommu_sw->start) + break; + order--; + } + + if (!iommu_sw->start) + goto cleanup1; + + if (order != get_order(bytes)) { + printk(KERN_WARNING "Warning: only able to allocate %ld MB " + "for software IO TLB\n", (PAGE_SIZE << order) >> 20); + iommu_sw->nslabs = SLABS_PER_PAGE << order; + bytes = iommu_sw->nslabs << IO_TLB_SHIFT; + } + iommu_sw->end = iommu_sw->start + bytes; + memset(iommu_sw->start, 0, bytes); + + /* + * Allocate and initialize the free list array. This array is used + * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE + * between iommu_sw->start and iommu_sw->end. + */ + iommu_sw->list = (unsigned int *)__get_free_pages(GFP_KERNEL, + get_order(iommu_sw->nslabs * sizeof(int))); + if (!iommu_sw->list) + goto cleanup2; + + for (i = 0; i < iommu_sw->nslabs; i++) + iommu_sw->list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE); + iommu_sw->index = 0; + + iommu_sw->orig_addr = (phys_addr_t *) + __get_free_pages(GFP_KERNEL, + get_order(iommu_sw->nslabs * + sizeof(phys_addr_t))); + if (!iommu_sw->orig_addr) + goto cleanup3; + + memset(iommu_sw->orig_addr, 0, iommu_sw->nslabs * sizeof(phys_addr_t)); + + /* + * Get the overflow emergency buffer + */ + iommu_sw->overflow_buffer = (void *)__get_free_pages(GFP_DMA, + get_order(iommu_sw->overflow)); + if (!iommu_sw->overflow_buffer) + goto cleanup4; + + swiotlb_print_info(); + + /* We utilize the private field to figure out whether we + * were allocated late or early. + */ + iommu_sw->priv = (void *)1; + + return 0; + +cleanup4: + free_pages((unsigned long)iommu_sw->orig_addr, + get_order(iommu_sw->nslabs * sizeof(phys_addr_t))); + iommu_sw->orig_addr = NULL; +cleanup3: + free_pages((unsigned long)iommu_sw->list, get_order(iommu_sw->nslabs * + sizeof(int))); + iommu_sw->list = NULL; +cleanup2: + iommu_sw->end = NULL; + free_pages((unsigned long)iommu_sw->start, order); + iommu_sw->start = NULL; +cleanup1: + iommu_sw->nslabs = req_nslabs; + return -ENOMEM; +} + diff --git a/lib/swiotlb.c b/lib/swiotlb.c index 8e65cee..9e72d21 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -40,14 +40,6 @@ #define OFFSET(val,align) ((unsigned long) \ ( (val) & ( (align) - 1))) -#define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT)) - -/* - * Minimum IO TLB size to bother booting with. Systems with mainly - * 64bit capable cards will only lightly use the swiotlb. If we can''t - * allocate a contiguous 1MB, we''re probably in trouble anyway. - */ -#define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT) /* * Enumeration for sync targets @@ -96,18 +88,6 @@ setup_io_tlb_npages(char *str) __setup("swiotlb=", setup_io_tlb_npages); /* make io_tlb_overflow tunable too? 
*/ -/* Note that this doesn''t work with highmem page */ -static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev, - void *address) -{ - return phys_to_dma(hwdev, virt_to_phys(address)); -} - -static void *swiotlb_bus_to_virt(struct device *hwdev, dma_addr_t dev_addr) -{ - return phys_to_virt(dma_to_phys(hwdev, dev_addr)); -}; - /* * Register a software IO TLB engine. * @@ -165,220 +145,13 @@ void swiotlb_print_info(void) (unsigned long long)pend); } -/* - * Statically reserve bounce buffer space and initialize bounce buffer data - * structures for the software IO TLB used to implement the DMA API. - */ -void __init -swiotlb_init_with_default_size(size_t default_size, int verbose) -{ - unsigned long i, bytes; - - if (!swiotlb_nslabs) { - iommu_sw->nslabs = (default_size >> IO_TLB_SHIFT); - iommu_sw->nslabs = ALIGN(iommu_sw->nslabs, IO_TLB_SEGSIZE); - } else - iommu_sw->nslabs = swiotlb_nslabs; - - bytes = iommu_sw->nslabs << IO_TLB_SHIFT; - - /* - * Get IO TLB memory from the low pages - */ - iommu_sw->start = alloc_bootmem_low_pages(bytes); - if (!iommu_sw->start) - panic("Cannot allocate SWIOTLB buffer"); - iommu_sw->end = iommu_sw->start + bytes; - - /* - * Allocate and initialize the free list array. This array is used - * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE - * between iommu_sw->start and iommu_sw->end. - */ - iommu_sw->list = alloc_bootmem(iommu_sw->nslabs * sizeof(int)); - for (i = 0; i < iommu_sw->nslabs; i++) - iommu_sw->list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE); - iommu_sw->index = 0; - iommu_sw->orig_addr = alloc_bootmem(iommu_sw->nslabs * - sizeof(phys_addr_t)); - - /* - * Get the overflow emergency buffer - */ - iommu_sw->overflow_buffer = alloc_bootmem_low(iommu_sw->overflow); - if (!iommu_sw->overflow_buffer) - panic("Cannot allocate SWIOTLB overflow buffer!\n"); - if (verbose) - swiotlb_print_info(); - - iommu_sw->priv = NULL; -} - -static int swiotlb_release(struct swiotlb_engine *iotlb) -{ - swiotlb_free(); - return 0; -} -static int is_swiotlb_buffer(struct swiotlb_engine *iommu_sw, - dma_addr_t dev_addr, phys_addr_t paddr) -{ - return paddr >= virt_to_phys(iommu_sw->start) && - paddr < virt_to_phys(iommu_sw->end); -} - -static bool swiotlb_dma_capable(struct device *hwdev, dma_addr_t dma_addr, - phys_addr_t phys, size_t size) -{ - /* Phys is not neccessary in this case. */ - return dma_capable(hwdev, dma_addr, size); -} - -static struct swiotlb_engine swiotlb_ops = { - .name = "software IO TLB", - .overflow = 32 * 1024, - .release = swiotlb_release, - .dma_capable = swiotlb_dma_capable, - .is_swiotlb_buffer = is_swiotlb_buffer, - .phys_to_bus = phys_to_dma, - .bus_to_phys = dma_to_phys, - .virt_to_bus = swiotlb_virt_to_bus, - .bus_to_virt = swiotlb_bus_to_virt, -}; - -void __init -swiotlb_init(int verbose) -{ - swiotlb_register_engine(&swiotlb_ops); - swiotlb_init_with_default_size(64 * (1<<20), verbose); /* default to 64MB */ -} - -/* - * Systems with larger DMA zones (those that don''t support ISA) can - * initialize the swiotlb later using the slab allocator if needed. - * This should be just like above, but with some error catching. 
- */ -int -swiotlb_late_init_with_default_size(size_t default_size) -{ - unsigned long i, bytes, req_nslabs = iommu_sw->nslabs; - unsigned int order; - - if (!swiotlb_nslabs) { - iommu_sw->nslabs = (default_size >> IO_TLB_SHIFT); - iommu_sw->nslabs = ALIGN(iommu_sw->nslabs, IO_TLB_SEGSIZE); - } else - iommu_sw->nslabs = swiotlb_nslabs; - - /* - * Get IO TLB memory from the low pages - */ - order = get_order(iommu_sw->nslabs << IO_TLB_SHIFT); - iommu_sw->nslabs = SLABS_PER_PAGE << order; - bytes = iommu_sw->nslabs << IO_TLB_SHIFT; - - while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) { - iommu_sw->start = (void *)__get_free_pages(GFP_DMA | - __GFP_NOWARN, order); - if (iommu_sw->start) - break; - order--; - } - - if (!iommu_sw->start) - goto cleanup1; - - if (order != get_order(bytes)) { - printk(KERN_WARNING "Warning: only able to allocate %ld MB " - "for software IO TLB\n", (PAGE_SIZE << order) >> 20); - iommu_sw->nslabs = SLABS_PER_PAGE << order; - bytes = iommu_sw->nslabs << IO_TLB_SHIFT; - } - iommu_sw->end = iommu_sw->start + bytes; - memset(iommu_sw->start, 0, bytes); - - /* - * Allocate and initialize the free list array. This array is used - * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE - * between iommu_sw->start and iommu_sw->end. - */ - iommu_sw->list = (unsigned int *)__get_free_pages(GFP_KERNEL, - get_order(iommu_sw->nslabs * sizeof(int))); - if (!iommu_sw->list) - goto cleanup2; - - for (i = 0; i < iommu_sw->nslabs; i++) - iommu_sw->list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE); - iommu_sw->index = 0; - - iommu_sw->orig_addr = (phys_addr_t *) - __get_free_pages(GFP_KERNEL, - get_order(iommu_sw->nslabs * - sizeof(phys_addr_t))); - if (!iommu_sw->orig_addr) - goto cleanup3; - - memset(iommu_sw->orig_addr, 0, iommu_sw->nslabs * sizeof(phys_addr_t)); - - /* - * Get the overflow emergency buffer - */ - iommu_sw->overflow_buffer = (void *)__get_free_pages(GFP_DMA, - get_order(iommu_sw->overflow)); - if (!iommu_sw->overflow_buffer) - goto cleanup4; - - swiotlb_print_info(); - - /* We utilize the private field to figure out whether we - * were allocated late or early. 
- */ - iommu_sw->priv = (void *)1; - - return 0; - -cleanup4: - free_pages((unsigned long)iommu_sw->orig_addr, - get_order(iommu_sw->nslabs * sizeof(phys_addr_t))); - iommu_sw->orig_addr = NULL; -cleanup3: - free_pages((unsigned long)iommu_sw->list, get_order(iommu_sw->nslabs * - sizeof(int))); - iommu_sw->list = NULL; -cleanup2: - iommu_sw->end = NULL; - free_pages((unsigned long)iommu_sw->start, order); - iommu_sw->start = NULL; -cleanup1: - iommu_sw->nslabs = req_nslabs; - return -ENOMEM; -} - void __init swiotlb_free(void) { if (!iommu_sw) return; - if (iommu_sw->priv) { - free_pages((unsigned long)iommu_sw->overflow_buffer, - get_order(iommu_sw->overflow)); - free_pages((unsigned long)iommu_sw->orig_addr, - get_order(iommu_sw->nslabs * sizeof(phys_addr_t))); - free_pages((unsigned long)iommu_sw->list, - get_order(iommu_sw->nslabs * sizeof(int))); - free_pages((unsigned long)iommu_sw->start, - get_order(iommu_sw->nslabs << IO_TLB_SHIFT)); - } else { - free_bootmem_late(__pa(iommu_sw->overflow_buffer), - iommu_sw->overflow); - free_bootmem_late(__pa(iommu_sw->orig_addr), - iommu_sw->nslabs * sizeof(phys_addr_t)); - free_bootmem_late(__pa(iommu_sw->list), - iommu_sw->nslabs * sizeof(int)); - free_bootmem_late(__pa(iommu_sw->start), - iommu_sw->nslabs << IO_TLB_SHIFT); - } -} - + iommu_sw->release(iommu_sw); +}; /* * Bounce: copy the swiotlb buffer back to the original dma location -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Jan-14 23:01 UTC
[Xen-devel] [PATCH 15/15] [swiotlb] Take advantage of iommu_sw->name and add %s to printk''s.
Make the printk usage more generic in the SWIOTLB library. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- lib/swiotlb-default.c | 6 +++--- lib/swiotlb.c | 26 ++++++++++++++++---------- 2 files changed, 19 insertions(+), 13 deletions(-) diff --git a/lib/swiotlb-default.c b/lib/swiotlb-default.c index c490fcf..ebee540 100644 --- a/lib/swiotlb-default.c +++ b/lib/swiotlb-default.c @@ -51,7 +51,7 @@ swiotlb_init_with_default_size(struct swiotlb_engine *iommu_sw, */ iommu_sw->start = alloc_bootmem_low_pages(bytes); if (!iommu_sw->start) - panic("Cannot allocate SWIOTLB buffer"); + panic("Cannot allocate %s buffer", iommu_sw->name); iommu_sw->end = iommu_sw->start + bytes; /* @@ -71,7 +71,7 @@ swiotlb_init_with_default_size(struct swiotlb_engine *iommu_sw, */ iommu_sw->overflow_buffer = alloc_bootmem_low(iommu_sw->overflow); if (!iommu_sw->overflow_buffer) - panic("Cannot allocate SWIOTLB overflow buffer!\n"); + panic("Cannot allocate %s overflow buffer!\n", iommu_sw->name); if (verbose) swiotlb_print_info(); @@ -176,7 +176,7 @@ swiotlb_late_init_with_default_size(struct swiotlb_engine *iommu_sw, if (order != get_order(bytes)) { printk(KERN_WARNING "Warning: only able to allocate %ld MB " - "for software IO TLB\n", (PAGE_SIZE << order) >> 20); + "for %s\n", (PAGE_SIZE << order) >> 20, iommu_sw->name); iommu_sw->nslabs = SLABS_PER_PAGE << order; bytes = iommu_sw->nslabs << IO_TLB_SHIFT; } diff --git a/lib/swiotlb.c b/lib/swiotlb.c index 9e72d21..1f17be0 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -138,9 +138,10 @@ void swiotlb_print_info(void) pstart = virt_to_phys(iommu_sw->start); pend = virt_to_phys(iommu_sw->end); - printk(KERN_INFO "Placing %luMB software IO TLB between %p - %p\n", - bytes >> 20, iommu_sw->start, iommu_sw->end); - printk(KERN_INFO "software IO TLB at phys %#llx - %#llx\n", + printk(KERN_INFO "Placing %luMB %s between %p - %p\n", + bytes >> 20, iommu_sw->name, iommu_sw->start, iommu_sw->end); + printk(KERN_INFO "%s at phys %#llx - %#llx\n", + iommu_sw->name, (unsigned long long)pstart, (unsigned long long)pend); } @@ -408,7 +409,8 @@ swiotlb_alloc_coherent(struct device *hwdev, size_t size, /* Confirm address can be DMA''d by device */ if (dev_addr + size - 1 > dma_mask) { - printk("hwdev DMA mask = 0x%016Lx, dev_addr = 0x%016Lx\n", + printk(KERN_ERR "%s:hwdev DMA mask = 0x%016Lx, " \ + "dev_addr = 0x%016Lx\n", iommu_sw->name, (unsigned long long)dma_mask, (unsigned long long)dev_addr); @@ -446,18 +448,21 @@ swiotlb_full(struct device *dev, size_t size, int dir, int do_panic) * When the mapping is small enough return a static buffer to limit * the damage, or panic when the transfer is too big. */ - printk(KERN_ERR "DMA: Out of SW-IOMMU space for %zu bytes at " - "device %s\n", size, dev ? dev_name(dev) : "?"); + printk(KERN_ERR "%s: Out of space for %zu bytes at " + "device %s\n", iommu_sw->name, size, dev ? 
dev_name(dev) : "?"); if (size <= iommu_sw->overflow || !do_panic) return; if (dir == DMA_BIDIRECTIONAL) - panic("DMA: Random memory could be DMA accessed\n"); + panic("%s: Random memory could be DMA accessed\n", + iommu_sw->name); if (dir == DMA_FROM_DEVICE) - panic("DMA: Random memory could be DMA written\n"); + panic("%s: Random memory could be DMA written\n", + iommu_sw->name); if (dir == DMA_TO_DEVICE) - panic("DMA: Random memory could be DMA read\n"); + panic("%s: Random memory could be DMA read\n", + iommu_sw->name); } /* @@ -500,7 +505,8 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page, * Ensure that the address returned is DMA''ble */ if (!iommu_sw->dma_capable(dev, dev_addr, phys, size)) - panic("map_single: bounce buffer is not DMA''ble"); + panic("%s map_single: bounce buffer is not DMA''ble", + iommu_sw->name); return dev_addr; } -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Chris Wright
2010-Jan-15 01:22 UTC
[Xen-devel] Re: [PATCH 01/15] [swiotlb] fix: Update ''setup_io_tlb_npages'' to accept both arguments in either order.
* Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote:> Before this patch, if you specified ''swiotlb=force,1024'' it would > ignore both arguments. This fixes it and allows the user specify it > in any order (or none at all). > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>Having only one substring of digits makes allowing arbitrary order less useful if more options get added (as in foo,bar,1024,baz,force would make more sense as foo,bar,nslabs=1024,baz,force). Do you think this one is really needed? If so, be useful to update Documentation/kernel-parameters.txt which is slightly out of date now. thanks, -chris _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Chris Wright
2010-Jan-15 01:33 UTC
[Xen-devel] Re: [PATCH 02/15] [swiotlb] Add swiotlb_engine structure for tracking multiple software IO TLBs.
* Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote:> The structure contains all of the existing variables used in > software IO TLB (swiotlb.c) collected within a structure. > > Additionally a name variable and a deconstructor (release) function > variable is defined for API usages. > > The other set of functions: is_swiotlb_buffer, dma_capable, phys_to_bus, > bus_to_phys, virt_to_bus, and bus_to_virt server as a method to abstract > them out of the SWIOTLB library. > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > --- > include/linux/swiotlb.h | 94 +++++++++++++++++++++++++++++++++++++++++++++++ > 1 files changed, 94 insertions(+), 0 deletions(-) > > diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h > index febedcf..781c3aa 100644 > --- a/include/linux/swiotlb.h > +++ b/include/linux/swiotlb.h > @@ -24,6 +24,100 @@ extern int swiotlb_force; > > extern void swiotlb_init(int verbose); >You can move the comments to a kerneldoc section for proper documentation. /** * struct swiotlb_engine - short desc... * @name: Name of the engine... etc> +struct swiotlb_engine { > + > + /* > + * Name of the engine (ie: "Software IO TLB") > + */ > + const char *name; > + > + /* > + * Used to do a quick range check in unmap_single and > + * sync_single_*, to see if the memory was in fact allocated by this > + * API. > + */ > + char *start; > + char *end;Isn''t this still global to swiotlb, not specific to the backend impl.?> + /* > + * The number of IO TLB blocks (in groups of 64) betweeen start and > + * end. This is command line adjustable via setup_io_tlb_npages. > + */ > + unsigned long nslabs;Same here.> + > + /* > + * When the IOMMU overflows we return a fallback buffer. > + * This sets the size. > + */ > + unsigned long overflow; > + > + void *overflow_buffer;And here.> + /* > + * This is a free list describing the number of free entries available > + * from each index > + */ > + unsigned int *list; > + > + /* > + * Current marker in the start through end location. Is incremented > + * on each map and wraps around. > + */ > + unsigned int index; > + > + /* > + * We need to save away the original address corresponding to a mapped > + * entry for the sync operations. > + */ > + phys_addr_t *orig_addr; > + > + /* > + * IOMMU private data. > + */ > + void *priv; > + /* > + * The API call to free a SWIOTLB engine if another wants to register > + * (or if want to turn SWIOTLB off altogether). > + * It is imperative that this function checks for existing DMA maps > + * and not release the IOTLB if there are out-standing maps. > + */ > + int (*release)(struct swiotlb_engine *); > + > + /* > + * Is the DMA (Bus) address within our bounce buffer (start and end). > + */ > + int (*is_swiotlb_buffer)(struct swiotlb_engine *, dma_addr_t dev_addr, > + phys_addr_t phys); > +Why is this implementation specific?> + /* > + * Is the DMA (Bus) address reachable by the PCI device?. > + */ > + bool (*dma_capable)(struct device *, dma_addr_t, phys_addr_t, size_t);This too...> + /* > + * Physical to bus (DMA) address translation. On > + * most platforms this is an equivalent function. > + */ > + dma_addr_t (*phys_to_bus)(struct device *hwdev, phys_addr_t paddr); > + > + /* > + * Bus (DMA) to physical address translation. On most > + * platforms this is an equivalant function. > + */ > + phys_addr_t (*bus_to_phys)(struct device *hwdev, dma_addr_t baddr); > + > + /* > + * Virtual to bus (DMA) address translation. On most platforms > + * this is a call to __pa(address). 
> + */ > + dma_addr_t (*virt_to_bus)(struct device *hwdev, void *address); > + > + /* > + * Bus (DMA) to virtual address translation. On most platforms > + * this is a call to __va(address). > + */ > + void* (*bus_to_virt)(struct device *hwdev, dma_addr_t address); > +}; > + > extern void > *swiotlb_alloc_coherent(struct device *hwdev, size_t size, > dma_addr_t *dma_handle, gfp_t flags); > -- > 1.6.2.5_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
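For reference, the kerneldoc block Chris suggests at the top of this review could look roughly like the following, reusing the descriptions already in the patch (abridged, and the wording is only illustrative):

/**
 * struct swiotlb_engine - a software IO TLB engine
 * @name:		name of the engine (e.g. "software IO TLB")
 * @start:		start of the bounce buffer, used for range checks
 * @end:		end of the bounce buffer
 * @nslabs:		number of IO TLB blocks between @start and @end
 * @overflow:		size of the fallback overflow buffer
 * @release:		tear-down hook; must fail if DMA maps are outstanding
 * @is_swiotlb_buffer:	is the bus address within the bounce buffer?
 * @dma_capable:	is the bus address reachable by the device?
 * @phys_to_bus:	physical to bus (DMA) address translation
 * @bus_to_phys:	bus (DMA) to physical address translation
 */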
Chris Wright
2010-Jan-15 01:41 UTC
[Xen-devel] Re: [PATCH 03/15] [swiotlb] Add swiotlb_register_engine function.
* Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote:> We set the internal iommu_sw pointer to the passed in swiotlb_engine > structure. Obviously we also check if the existing iommu_sw is > set and if so, call iommu_sw->release before the switch-over. > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > --- > include/linux/swiotlb.h | 2 + > lib/swiotlb.c | 48 +++++++++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 50 insertions(+), 0 deletions(-) > > diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h > index 781c3aa..3bc3c42 100644 > --- a/include/linux/swiotlb.h > +++ b/include/linux/swiotlb.h > @@ -118,6 +118,8 @@ struct swiotlb_engine { > void* (*bus_to_virt)(struct device *hwdev, dma_addr_t address); > }; > > +int swiotlb_register_engine(struct swiotlb_engine *iommu); > + > extern void > *swiotlb_alloc_coherent(struct device *hwdev, size_t size, > dma_addr_t *dma_handle, gfp_t flags); > diff --git a/lib/swiotlb.c b/lib/swiotlb.c > index e6d9e32..e84f269 100644 > --- a/lib/swiotlb.c > +++ b/lib/swiotlb.c > @@ -97,6 +97,12 @@ static phys_addr_t *io_tlb_orig_addr; > */ > static DEFINE_SPINLOCK(io_tlb_lock); > > +/* > + * The software IOMMU this library will utilize. > + */ > +struct swiotlb_engine *iommu_sw; > +EXPORT_SYMBOL(iommu_sw);should be EXPORT_SYMBOL_GPL> static int late_alloc; > > static int __init > @@ -126,6 +132,48 @@ static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev, > return phys_to_dma(hwdev, virt_to_phys(address)); > } > > +/* > + * Register a software IO TLB engine. > + * > + * The registration allows the software IO TLB functions in the > + * swiotlb library to function properly. > + * > + * All the values in the iotlb structure must be set. > + * > + * If the registration fails, it is assumed that the caller will free > + * all of the resources allocated in the swiotlb_engine structure. > + */ > +int swiotlb_register_engine(struct swiotlb_engine *iommu) > +{ > + if (!iommu || !iommu->name || !iommu->release) { > + printk(KERN_ERR "DMA: Trying to register a SWIOTLB engine" \ > + " improperly!"); > + return -EINVAL; > + } > + > + if (iommu_sw && iommu_sw->name) {According to above, you can''t have !iommu_sw->name.> + int retval = -EINVAL; > + > + /* ''release'' must check for out-standing DMAs and flush them > + * out or fail. */ > + if (iommu_sw->release) > + retval = iommu_sw->release(iommu_sw);Same here, you can''t have !iommu_sw->release, just call unconditionally.> + > + if (retval) { > + printk(KERN_ERR "DMA: %s cannot be released!\n", > + iommu_sw->name); > + return retval; > + } > + printk(KERN_INFO "DMA: Replacing [%s] with [%s]\n", > + iommu_sw->name, iommu->name); > + } > + > + iommu_sw = iommu; > + > + return 0; > +} > +EXPORT_SYMBOL(swiotlb_register_engine); > + > void swiotlb_print_info(void) > { > unsigned long bytes = io_tlb_nslabs << IO_TLB_SHIFT;_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Chris Wright
2010-Jan-15 01:43 UTC
[Xen-devel] Re: [PATCH 04/15] [swiotlb] Search and replace s/io_tlb/iommu_sw->/
* Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote:> We also fix the checkpatch.pl errors that surfaced during > this conversion. > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > --- > lib/swiotlb.c | 204 +++++++++++++++++++++++++++++---------------------------- > 1 files changed, 104 insertions(+), 100 deletions(-) > > diff --git a/lib/swiotlb.c b/lib/swiotlb.c > index e84f269..3499001 100644 > --- a/lib/swiotlb.c > +++ b/lib/swiotlb.c > @@ -176,14 +176,14 @@ EXPORT_SYMBOL(swiotlb_register_engine); > > void swiotlb_print_info(void) > { > - unsigned long bytes = io_tlb_nslabs << IO_TLB_SHIFT; > + unsigned long bytes = iommu_sw->nslabs << IO_TLB_SHIFT;This isn''t a bisect friendly way to do this...this would oops w/ iommu_sw == NULL. thanks, -chris _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Chris Wright
2010-Jan-15 01:47 UTC
[Xen-devel] Re: [PATCH 05/15] [swiotlb] Respect the io_tlb_nslabs argument value.
* Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote:> The search and replace removed the option to override > the amount of slabs via swiotlb=<x> argument. This puts > it back in.Should just fold a change like this in w/ prior one. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Chris Wright
2010-Jan-15 01:57 UTC
[Xen-devel] Re: [PATCH 06/15] [swiotlb] In ''swiotlb_init'' take advantage of the default swiotlb_engine support.
* Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote:> For baselevel support we define required functions and fill out > variables. > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > --- > lib/swiotlb.c | 22 ++++++++++++++++++++++ > 1 files changed, 22 insertions(+), 0 deletions(-) > > diff --git a/lib/swiotlb.c b/lib/swiotlb.c > index cf29f03..3c7bd4e 100644 > --- a/lib/swiotlb.c > +++ b/lib/swiotlb.c > @@ -132,6 +132,11 @@ static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev, > return phys_to_dma(hwdev, virt_to_phys(address)); > } > > +static void *swiotlb_bus_to_virt(struct device *hwdev, dma_addr_t dev_addr) > +{ > + return phys_to_virt(dma_to_phys(hwdev, dev_addr)); > +}; > + > /* > * Register a software IO TLB engine. > * > @@ -236,9 +241,26 @@ swiotlb_init_with_default_size(size_t default_size, int verbose) > swiotlb_print_info(); > } > > +static int swiotlb_release(struct swiotlb_engine *iotlb) > +{ > + swiotlb_free(); > + return 0;Do you ever expect a failure case here? You mentioned wait and flush or fail, but that''s not done here. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Chris Wright
2010-Jan-15 02:02 UTC
[Xen-devel] Re: [PATCH 07/15] [swiotlb] In ''swiotlb_free'' check iommu_sw pointer.
* Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote:> --- a/lib/swiotlb.c > +++ b/lib/swiotlb.c > @@ -364,7 +364,7 @@ cleanup1: > > void __init swiotlb_free(void) > { > - if (!iommu_sw->overflow_buffer) > + if (!iommu_sw) > return; >Sure this is safe for the case where allocation failed? Wouldn''t this do free_late_bootmem(__pa(0))? thanks, -chris _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Chris Wright
2010-Jan-15 02:14 UTC
[Xen-devel] Re: [PATCH 14/15] [swiotlb] Move initialization (swiotlb_init) and its friends in swiotlb-default.c
* Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote:> We move all of the initialization functions and as well > all functions defined in the swiotlb_ops to a seperate file. > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > --- > lib/Makefile | 2 +- > lib/swiotlb-default.c | 242 +++++++++++++++++++++++++++++++++++++++++++++++++ > lib/swiotlb.c | 231 +---------------------------------------------- > 3 files changed, 245 insertions(+), 230 deletions(-) > create mode 100644 lib/swiotlb-default.c > > diff --git a/lib/Makefile b/lib/Makefile > index 347ad8d..fd96891 100644 > --- a/lib/Makefile > +++ b/lib/Makefile > @@ -77,7 +77,7 @@ obj-$(CONFIG_TEXTSEARCH_FSM) += ts_fsm.o > obj-$(CONFIG_SMP) += percpu_counter.o > obj-$(CONFIG_AUDIT_GENERIC) += audit.o > > -obj-$(CONFIG_SWIOTLB) += swiotlb.o > +obj-$(CONFIG_SWIOTLB) += swiotlb.o swiotlb-default.o > obj-$(CONFIG_IOMMU_HELPER) += iommu-helper.o > obj-$(CONFIG_FAULT_INJECTION) += fault-inject.o > > diff --git a/lib/swiotlb-default.c b/lib/swiotlb-default.c > new file mode 100644 > index 0000000..c490fcf > --- /dev/null > +++ b/lib/swiotlb-default.c > @@ -0,0 +1,242 @@ > + > +#include <linux/dma-mapping.h> > +#include <linux/swiotlb.h> > +#include <linux/bootmem.h> > + > + > +#define OFFSET(val, align) ((unsigned long) \ > + (val) & ((align) - 1)) > + > +#define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT)) > + > +/* > + * Minimum IO TLB size to bother booting with. Systems with mainly > + * 64bit capable cards will only lightly use the swiotlb. If we can''t > + * allocate a contiguous 1MB, we''re probably in trouble anyway. > + */ > +#define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT) > + > +/* Note that this doesn''t work with highmem page */ > +static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev, > + void *address) > +{ > + return phys_to_dma(hwdev, virt_to_phys(address)); > +} > + > +static void *swiotlb_bus_to_virt(struct device *hwdev, dma_addr_t dev_addr) > +{ > + return phys_to_virt(dma_to_phys(hwdev, dev_addr)); > +}; > + > +/* > + * Statically reserve bounce buffer space and initialize bounce buffer data > + * structures for the software IO TLB used to implement the DMA API. > + */ > +void __init > +swiotlb_init_with_default_size(struct swiotlb_engine *iommu_sw, > + size_t default_size, int verbose) > +{ > + unsigned long i, bytes; > + > + if (!swiotlb_nslabs) { > + iommu_sw->nslabs = (default_size >> IO_TLB_SHIFT); > + iommu_sw->nslabs = ALIGN(iommu_sw->nslabs, IO_TLB_SEGSIZE); > + } else > + iommu_sw->nslabs = swiotlb_nslabs; > + > + bytes = iommu_sw->nslabs << IO_TLB_SHIFT; > + > + /* > + * Get IO TLB memory from the low pages > + */ > + iommu_sw->start = alloc_bootmem_low_pages(bytes); > + if (!iommu_sw->start) > + panic("Cannot allocate SWIOTLB buffer"); > + iommu_sw->end = iommu_sw->start + bytes; > + > + /* > + * Allocate and initialize the free list array. This array is used > + * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE > + * between iommu_sw->start and iommu_sw->end. 
> + */ > + iommu_sw->list = alloc_bootmem(iommu_sw->nslabs * sizeof(int)); > + for (i = 0; i < iommu_sw->nslabs; i++) > + iommu_sw->list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE); > + iommu_sw->index = 0; > + iommu_sw->orig_addr = alloc_bootmem(iommu_sw->nslabs * > + sizeof(phys_addr_t)); > + > + /* > + * Get the overflow emergency buffer > + */ > + iommu_sw->overflow_buffer = alloc_bootmem_low(iommu_sw->overflow); > + if (!iommu_sw->overflow_buffer) > + panic("Cannot allocate SWIOTLB overflow buffer!\n"); > + if (verbose) > + swiotlb_print_info(); > + > + iommu_sw->priv = NULL; > +} > + > +int swiotlb_release(struct swiotlb_engine *iommu_sw) > +{ > + if (!iommu_sw) > + return -ENODEV; > + > + if (iommu_sw->priv) { > + free_pages((unsigned long)iommu_sw->overflow_buffer, > + get_order(iommu_sw->overflow)); > + free_pages((unsigned long)iommu_sw->orig_addr, > + get_order(iommu_sw->nslabs * sizeof(phys_addr_t))); > + free_pages((unsigned long)iommu_sw->list, > + get_order(iommu_sw->nslabs * sizeof(int))); > + free_pages((unsigned long)iommu_sw->start, > + get_order(iommu_sw->nslabs << IO_TLB_SHIFT)); > + } else { > + free_bootmem_late(__pa(iommu_sw->overflow_buffer), > + iommu_sw->overflow); > + free_bootmem_late(__pa(iommu_sw->orig_addr), > + iommu_sw->nslabs * sizeof(phys_addr_t)); > + free_bootmem_late(__pa(iommu_sw->list), > + iommu_sw->nslabs * sizeof(int)); > + free_bootmem_late(__pa(iommu_sw->start), > + iommu_sw->nslabs << IO_TLB_SHIFT); > + } > + return 0; > +} > + > +static int is_swiotlb_buffer(struct swiotlb_engine *iommu_sw, > + dma_addr_t dma_addr, phys_addr_t paddr) > +{ > + return paddr >= virt_to_phys(iommu_sw->start) && > + paddr < virt_to_phys(iommu_sw->end); > +} > + > +static bool swiotlb_dma_capable(struct device *hwdev, dma_addr_t dma_addr, > + phys_addr_t phys, size_t size) > +{ > + /* Phys is not neccessary in this case. */ > + return dma_capable(hwdev, dma_addr, size); > +} > +static struct swiotlb_engine swiotlb_ops = { > + .name = "software IO TLB", > + .overflow = 32 * 1024, > + .release = swiotlb_release, > + .dma_capable = swiotlb_dma_capable, > + .is_swiotlb_buffer = is_swiotlb_buffer, > + .phys_to_bus = phys_to_dma, > + .bus_to_phys = dma_to_phys, > + .virt_to_bus = swiotlb_virt_to_bus, > + .bus_to_virt = swiotlb_bus_to_virt, > +}; > + > +void __init > +swiotlb_init(int verbose) > +{ > + swiotlb_register_engine(&swiotlb_ops); > + swiotlb_init_with_default_size(&swiotlb_ops, 64 * (1<<20), > + verbose); /* default to 64MB */ > +}I''d expect the swiotlb-default file to have only private impl. of the swiotlb_engine. Shouldn''t this and the init stay in swiotlb.c? Also, would you ever call swiotlb_init w/out register_engine, why not move register to the swiotlb_init? thanks, -chris _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
* Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote:> Another approach, which this set of patches explores, is to abstract the > address translation and address determination functions away from the > SWIOTLB book-keeping functions. This way the core SWIOTLB library functions > are present in one place, while the address related functions are in > a separate library for different run-time platforms. I would very much > appreciate input on this idea and the set of patches.It seems like it still needs some refinement, since the Xen implementation is hooking into two layers. Both: + swiotlb_register_engine(&xen_ops); and +static struct dma_map_ops xen_swiotlb_dma_ops = { Wouldn''t the idea be to get to the point that you''d use common swiotlb and keep the hooks to one layer? Also, it''s unclear when some of the prior global to swiotlb variables would actually be useful to a private implementation. For example, overflow, which is just 32 * 1024 in both cases. Are those really needed to be private to a swiotlb engine? Do you think you can reduce the swiotlb_engine to just the relevant ops? thanks, -chris _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Jan-19 17:25 UTC
[Xen-devel] Re: [PATCH 03/15] [swiotlb] Add swiotlb_register_engine function.
> > +EXPORT_SYMBOL(iommu_sw); > > should be EXPORT_SYMBOL_GPLYup.> > > static int late_alloc; > > > > static int __init > > @@ -126,6 +132,48 @@ static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev, > > return phys_to_dma(hwdev, virt_to_phys(address)); > > } > > > > +/* > > + * Register a software IO TLB engine. > > + * > > + * The registration allows the software IO TLB functions in the > > + * swiotlb library to function properly. > > + * > > + * All the values in the iotlb structure must be set. > > + * > > + * If the registration fails, it is assumed that the caller will free > > + * all of the resources allocated in the swiotlb_engine structure. > > + */ > > +int swiotlb_register_engine(struct swiotlb_engine *iommu) > > +{ > > + if (!iommu || !iommu->name || !iommu->release) { > > + printk(KERN_ERR "DMA: Trying to register a SWIOTLB engine" \ > > + " improperly!"); > > + return -EINVAL; > > + } > > + > > + if (iommu_sw && iommu_sw->name) { > > According to above, you can''t have !iommu_sw->name.Yup. Artificats of previous implementation.> > > + int retval = -EINVAL; > > + > > + /* ''release'' must check for out-standing DMAs and flush them > > + * out or fail. */ > > + if (iommu_sw->release) > > + retval = iommu_sw->release(iommu_sw); > > Same here, you can''t have !iommu_sw->release, just call unconditionally.Ok. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
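Folding in these acks (EXPORT_SYMBOL_GPL, dropping the redundant name check, and calling release unconditionally), the registration path would shrink to something like this; a sketch of the planned respin, not a posted patch:

int swiotlb_register_engine(struct swiotlb_engine *iommu)
{
	if (!iommu || !iommu->name || !iommu->release) {
		printk(KERN_ERR "DMA: Trying to register a SWIOTLB engine"
		       " improperly!");
		return -EINVAL;
	}

	if (iommu_sw) {
		/* Every registered engine has a name and a release hook,
		 * so both can be used unconditionally here. */
		int retval = iommu_sw->release(iommu_sw);

		if (retval) {
			printk(KERN_ERR "DMA: %s cannot be released!\n",
			       iommu_sw->name);
			return retval;
		}
		printk(KERN_INFO "DMA: Replacing [%s] with [%s]\n",
		       iommu_sw->name, iommu->name);
	}

	iommu_sw = iommu;

	return 0;
}
EXPORT_SYMBOL_GPL(swiotlb_register_engine);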
Konrad Rzeszutek Wilk
2010-Jan-19 17:45 UTC
[Xen-devel] Re: [PATCH 14/15] [swiotlb] Move initialization (swiotlb_init) and its friends in swiotlb-default.c
> > +void __init > > +swiotlb_init(int verbose) > > +{ > > + swiotlb_register_engine(&swiotlb_ops); > > + swiotlb_init_with_default_size(&swiotlb_ops, 64 * (1<<20), > > + verbose); /* default to 64MB */ > > +} > > I''d expect the swiotlb-default file to have only private impl. of the > > swiotlb_engine. Shouldn''t this and the init stay in swiotlb.c? Also, Hmm, were you thinking that it might make sense to pass in a swiotlb_ops to swiotlb_init so that it can make the right assignments? The reason I got stuck here was that the swiotlb_ops needed to be visible to this function, and having it in swiotlb.c would mean it must now include the header definition for swiotlb-default.h.> would you ever call swiotlb_init w/out register_engine, why not move > register to the swiotlb_init? In essence combine swiotlb_register_engine with swiotlb_init_with_default_size? There would still be a need for a late-call mechanism. Perhaps having two variants of swiotlb_init?: swiotlb_early_init(struct swiotlb_engine *swiotlb_ops) and swiotlb_late_init(struct swiotlb_engine *swiotlb_ops)? Or perhaps just pass in an argument: swiotlb_init(int late)? Furthermore have this new swiotlb_init detect if some of the fields (start, end, overflow_buffer) have been allocated and if so skip the default allocation altogether? _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
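One way to read that proposal, as a sketch only (the name swiotlb_early_init is hypothetical, and the signatures are not from the series):

/* Hypothetical early variant: register + default-size setup in one call. */
void __init swiotlb_early_init(struct swiotlb_engine *ops, size_t default_size,
			       int verbose)
{
	swiotlb_register_engine(ops);
	/* Skip the default allocation if the engine already set up its
	 * own start/end/overflow_buffer, as suggested above. */
	if (!ops->start)
		swiotlb_init_with_default_size(ops, default_size, verbose);
}

/* A late (slab) counterpart would return an error code instead:
 *	int swiotlb_late_init(struct swiotlb_engine *ops, size_t default_size);
 */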
Konrad Rzeszutek Wilk
2010-Jan-19 17:45 UTC
[Xen-devel] Re: [PATCH 07/15] [swiotlb] In ''swiotlb_free'' check iommu_sw pointer.
On Thu, Jan 14, 2010 at 06:02:40PM -0800, Chris Wright wrote:> * Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote: > > --- a/lib/swiotlb.c > > +++ b/lib/swiotlb.c > > @@ -364,7 +364,7 @@ cleanup1: > > > > void __init swiotlb_free(void) > > { > > - if (!iommu_sw->overflow_buffer) > > + if (!iommu_sw) > > return; > > > > Sure this is safe for the case where allocation failed? Wouldn''t this > do free_late_bootmem(__pa(0))?That would indeed fail, but alloc_bootmem_low_pages (___alloc_bootmem) panics the machine if it can''t allocate the buffer. So we would never actually get to swiotlb_free if we failed to allocate the buffers for SWIOTLB. But for the case where the SWIOTLB allocation happens when using swiotlb_late_init_with_default_size, and it fails, this check is not sufficient. I will add a check for that or just make swiotlb_late_init_with_default_size set iommu_sw to NULL when the allocation fails. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Jan-19 17:45 UTC
[Xen-devel] Re: [PATCH 06/15] [swiotlb] In ''swiotlb_init'' take advantage of the default swiotlb_engine support.
> > +static int swiotlb_release(struct swiotlb_engine *iotlb) > > +{ > > + swiotlb_free(); > > + return 0; > > Do you ever expect a failure case here? You mentioned wait and flush or > fail, but that''s not done here.I was thinking to extend that in the next batch of patches. But I can roll that in here. The check is pretty simple - it checks to see if the index value has been changed, and if so returns a failure. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
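A sketch of the check being described, assuming a non-zero index is taken to mean that mappings may still be outstanding (the eventual patch may track this more precisely):

static int swiotlb_release(struct swiotlb_engine *iotlb)
{
	/* map_single advances the index for every bounce buffer handed
	 * out; if it ever moved, refuse to tear the buffer down. */
	if (iotlb->index != 0)
		return -EBUSY;

	swiotlb_free();
	return 0;
}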
Konrad Rzeszutek Wilk
2010-Jan-19 17:45 UTC
[Xen-devel] Re: [PATCH 04/15] [swiotlb] Search and replace s/io_tlb/iommu_sw->/
On Thu, Jan 14, 2010 at 05:43:30PM -0800, Chris Wright wrote:> * Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote: > > We also fix the checkpatch.pl errors that surfaced during > > this conversion. > > > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > --- > > lib/swiotlb.c | 204 +++++++++++++++++++++++++++++---------------------------- > > 1 files changed, 104 insertions(+), 100 deletions(-) > > > > diff --git a/lib/swiotlb.c b/lib/swiotlb.c > > index e84f269..3499001 100644 > > --- a/lib/swiotlb.c > > +++ b/lib/swiotlb.c > > @@ -176,14 +176,14 @@ EXPORT_SYMBOL(swiotlb_register_engine); > > > > void swiotlb_print_info(void) > > { > > - unsigned long bytes = io_tlb_nslabs << IO_TLB_SHIFT; > > + unsigned long bytes = iommu_sw->nslabs << IO_TLB_SHIFT; > > This isn''t a bisect friendly way to do this...this would oops w/ > iommu_sw == NULL.Yes. Let me redo this in a friendlier manner. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Jan-19 17:46 UTC
[Xen-devel] Re: [PATCH 02/15] [swiotlb] Add swiotlb_engine structure for tracking multiple software IO TLBs.
. snip..> You can move the comments to a kerneldoc section for proper > documentation. > > /** > * struct swiotlb_engine - short desc... > * @name: Name of the engine... > etc<nods>. .. snip ..> > + char *end; > > Isn''t this still global to swiotlb, not specific to the backend impl.?Yes and no. Without the start/end, the "is_swiotlb_buffer" would not be able to determine if the passed in address is within the SWIOTLB buffer.> > > + /* > > + * The number of IO TLB blocks (in groups of 64) betweeen start and > > + * end. This is command line adjustable via setup_io_tlb_npages. > > + */ > > + unsigned long nslabs; > > Same here. >That one can be put back (make it part of lib/swiotlb.c)> > + > > + /* > > + * When the IOMMU overflows we return a fallback buffer. > > + * This sets the size. > > + */ > > + unsigned long overflow; > > + > > + void *overflow_buffer; > > And here.Ditto. ..snip ..> > + * Is the DMA (Bus) address within our bounce buffer (start and end). > > + */ > > + int (*is_swiotlb_buffer)(struct swiotlb_engine *, dma_addr_t dev_addr, > > + phys_addr_t phys); > > + > > Why is this implementation specific?In the current implementation, they both use the physical address and do a simple check: return paddr >= virt_to_phys(io_tlb_start) && paddr < virt_to_phys(io_tlb_end); That for virtualized environments where a PCI device is passed in would work too. Unfortunately the problem is when we provide a channel of communication with another domain and we end up doing DMA on behalf of another guest. The short description of the problem is that a page of memory is shared with another domain and the mapping in our domain is correct (bus->physical) the other way (virt->physical->bus) is incorrect for the duration of this page being shared. Hence we need to verify that the page is local to our domain, and for that we need the bus address to verify that the addr == physical->bus(bus->physical(addr)) where addr is the bus address (dma_addr_t). If it is not local (shared with another domain) we MUST not consider it as a SWIOTLB buffer as that can lead to panics and possible corruptions. The trick here is that the phys->virt address can fall within the SWIOTLB buffer for pages that are shared with another domain and we need the DMA address to do an extra check. The long description of the problem is: You are the domain doing some DMA on behalf of another domain. The simple example is you are servicing a block device to the other guests. One way to implement this is to present a one page ring buffer where both domains move the producer and consumer indexes around. Once you get a request (READ/WRITE), you use the virtualized channels to "share" that page into your domain. For this you have a buffer (2MB or bigger) wherein for pages that shared in to you, you over-write the phys->bus mapping. That means that the phys->phys translation is altered for the duration of this request being out-standing. Once it is completed, the phys->bus translation is restored. Here is a little diagram of what happens when a page is shared (and lets assume that we have a situation where virt #1 == virt #2, which means that phys #1 == phys #2). (domain 2) virt#1->phys#1---\ +- bus #1 (domain 3) virt#2->phys#2 ---/ (phys#1 points to bus #1, and phys#2 points to bus #1 too). The converse of the above picture is not true: /---> phys #1-> virt #1. (domain 2). bus#1 + \---> phys #2-> virt #2. (domain 3). phys #1 != phys #2 and hence virt #1 != virt #2. 
When a page is not shared: (domain 2) virt #1->phys #1--> bus #1 (domain 3) virt #2->phys #2--> bus #2 bus #1 -> phys #1 -> virt #1 (domain 2) bus #2 -> phys #2 -> virt #2 (domain 3) The bus #1 != bus #2, but phys #1 could be same as phys #2 (since there are just PFNs). And virt #1 == virt #2. The reason for these is that each domain has its own memory layout where the memory starts at pfn 0, not at some higher number. So each domain sees the physical address identically, but the bus address MUST point to different areas (except when sharing) otherwise one domain would over-write another domain, ouch. Furthermore when a domain is allocated, the pages for the domain are not guaranteed to be linearly contiguous so we can''t guarantee that phys == bus. So to guard against the situation in which phys #1 ->virt comes out with an address that looks to be within our SWIOTLB buffer we need to do the extra check: addr == physical->bus(bus->physical(addr)) where addr is the bus address And for scenarios where this is not true (page belongs to another domain), that page is not in the SWIOTLB (even thought the virtual and physical address point to it).> > + /* > > + * Is the DMA (Bus) address reachable by the PCI device?. > > + */ > > + bool (*dma_capable)(struct device *, dma_addr_t, phys_addr_t, size_t);I mentioned in the previous explanation that when a domain is allocated, the pages are not guaranteed to be linearly contiguous. For bare-metal that is not the case and ''dma_capable'' just checks the device DMA mask against the bus address. For virtualized environment we do need to check if the pages are linearly contiguous for the size request. For that we need the physical address to iterate over them doing the phys->bus#1 translation and checking whether the (phys+1)->bus#2 bus#1 == bus#2 + 1. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
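Putting the above into code, the Xen-side hooks would look roughly like this; xen_phys_to_bus()/xen_bus_to_phys() stand in for the pv-ops translation helpers and, like the page-walking details, are illustrative only:

static int xen_is_swiotlb_buffer(struct swiotlb_engine *iommu_sw,
				 dma_addr_t dev_addr, phys_addr_t paddr)
{
	/* A page granted in from another domain has a valid bus->phys
	 * mapping, but its phys->bus mapping points elsewhere for the
	 * duration of the grant, so the round trip only matches for
	 * pages that are local to this domain. */
	if (xen_phys_to_bus(xen_bus_to_phys(dev_addr)) != dev_addr)
		return 0;

	return paddr >= virt_to_phys(iommu_sw->start) &&
	       paddr < virt_to_phys(iommu_sw->end);
}

static bool xen_dma_capable(struct device *hwdev, dma_addr_t dma_addr,
			    phys_addr_t phys, size_t size)
{
	phys_addr_t next = (phys & PAGE_MASK) + PAGE_SIZE;

	/* Machine frames backing a domain need not be contiguous, so
	 * check every page boundary the buffer crosses. */
	while (next < phys + size) {
		if (xen_phys_to_bus(next) != dma_addr + (next - phys))
			return false;
		next += PAGE_SIZE;
	}

	return dma_capable(hwdev, dma_addr, size);
}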
Konrad Rzeszutek Wilk
2010-Jan-19 17:47 UTC
[Xen-devel] Re: [PATCH 01/15] [swiotlb] fix: Update ''setup_io_tlb_npages'' to accept both arguments in either order.
On Thu, Jan 14, 2010 at 05:22:13PM -0800, Chris Wright wrote:> * Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote: > > Before this patch, if you specified ''swiotlb=force,1024'' it would > > ignore both arguments. This fixes it and allows the user specify it > > in any order (or none at all). > > > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > Having only one substring of digits makes allowing arbitrary order > less useful if more options get added (as in foo,bar,1024,baz,force > would make more sense as foo,bar,nslabs=1024,baz,force). Do you > think this one is really needed? If so, be useful to updateI got caught a couple of times where I needed to provide both arguments and could not figure out why it did not work. Switching the arguments around fixed it. Thought that it might make sense to remove this potential trap from other folks by this patch. Your point about more options got me thinking about the overflow buffer. I could also provide an over-ride for that, maybe: "swiotlb=force,overflow=32,slabs=1024" (Not sure about the syntax?)> Documentation/kernel-parameters.txt which is slightly out of date now.Oh, good catch. Will roll the patch for that file as well.> > thanks, > -chris_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On Thu, Jan 14, 2010 at 06:25:10PM -0800, Chris Wright wrote:> * Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote: > > Another approach, which this set of patches explores, is to abstract the > > address translation and address determination functions away from the > > SWIOTLB book-keeping functions. This way the core SWIOTLB library functions > > are present in one place, while the address related functions are in > > a separate library for different run-time platforms. I would very much > > appreciate input on this idea and the set of patches. > > It seems like it still needs some refinement, since the XenOh yes.> implementation is hooking into two layers. Both: > > + swiotlb_register_engine(&xen_ops); > > and > > +static struct dma_map_ops xen_swiotlb_dma_ops = { > > Wouldn''t the idea be to get to the point that you''d use common swiotlb > and keep the hooks to one layer?I would love to. Maybe I can extend those two functions (alloc_coherent and free_coherent) to make an extra call after they have allocated/de-allocated a page? The reason is that in virtualized environments I MUST guarantee that those buffers are linearly contiguous. Meaning I need to post-processing of this buffer: ret = (void *)__get_free_pages(flags, order) If that can''t be done, then I need a mix of DMA ops where the majority is SWIOTLB with the exception of the alloc_coherent and free_coherent). Hmm, I should follow the lead of what x86_swiotlb_alloc_coherent does and just make an extra call to ''is_swiotlb_buffer'' on the return address and if not found to be within that SWIOTLB, do the fixup to make sure the pages are linearly contiguous.> > Also, it''s unclear when some of the prior global to swiotlb variables > would actually be useful to a private implementation. For example, overflow, > which is just 32 * 1024 in both cases. Are those really needed to be > private to a swiotlb engine?Unfortunately yes. The same reason as mentioned above: MUST guarantee that those buffers (start, overflow) are linearly contiguous. For that I was doing something like: void __init xen_swiotlb_init(int verbose) { int rc = 0; swiotlb_register_engine(&xen_ops); swiotlb_init_with_default_size(&xen_ops, 64 * (1<<20), 0); if ((rc = xen_swiotlb_fixup(xen_ops.start, xen_ops.nslabs << IO_TLB_SHIFT, xen_ops.nslabs))) goto error; if ((rc = xen_swiotlb_fixup(xen_ops.overflow_buffer, xen_ops.overflow, xen_ops.overflow >> IO_TLB_SHIFT))) goto error; so that I can "fix" the start and overflow_buffer pages.> > Do you think you can reduce the swiotlb_engine to just the relevant ops?Yes. Let me reduce them.> > thanks, > -chris_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
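Sketching that last idea: the helpers xen_addr_in_swiotlb() and xen_make_contiguous() below do not exist in this series and are stand-ins for whatever does the range check and the machine-contiguity fixup:

void *xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
				 dma_addr_t *dma_handle, gfp_t flags)
{
	void *ret = swiotlb_alloc_coherent(hwdev, size, dma_handle, flags);

	/* The bounce pool was already made machine-contiguous at init
	 * time; only buffers that came straight from the page allocator
	 * may need the extra fixup before handing out a bus address. */
	if (ret && !xen_addr_in_swiotlb(ret))
		*dma_handle = xen_make_contiguous(ret, size);

	return ret;
}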
Chris Wright
2010-Jan-19 18:23 UTC
[Xen-devel] Re: [PATCH 07/15] [swiotlb] In ''swiotlb_free'' check iommu_sw pointer.
* Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote:> On Thu, Jan 14, 2010 at 06:02:40PM -0800, Chris Wright wrote: > > * Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote: > > > --- a/lib/swiotlb.c > > > +++ b/lib/swiotlb.c > > > @@ -364,7 +364,7 @@ cleanup1: > > > > > > void __init swiotlb_free(void) > > > { > > > - if (!iommu_sw->overflow_buffer) > > > + if (!iommu_sw) > > > return; > > > > > > > Sure this is safe for the case where allocation failed? Wouldn''t this > > do free_late_bootmem(__pa(0))? > > That would indeed fail, but alloc_bootmem_low_pages (___alloc_bootmem) > panics the machine if it can''t allocate the buffer. So we would never > actually get to swiotlb_free if we failed to allocate the buffers for > SWIOTLB.Ah, right.> But for the case where the SWIOTLB allocation happens when using > swiotlb_late_init_with_default_size, and it fails, this check > is not sufficient. I will add a check for that or just make > swiotlb_late_init_with_default_size set iommu_sw to NULL when > the allocation fails.That one is ok, since kfree(NULL) is safe. thanks, -chris _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Chris Wright
2010-Jan-19 18:43 UTC
[Xen-devel] Re: [PATCH 02/15] [swiotlb] Add swiotlb_engine structure for tracking multiple software IO TLBs.
* Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote:> > > + * Is the DMA (Bus) address within our bounce buffer (start and end). > > > + */ > > > + int (*is_swiotlb_buffer)(struct swiotlb_engine *, dma_addr_t dev_addr, > > > + phys_addr_t phys); > > > + > > > > Why is this implementation specific? > > In the current implementation, they both use the physical address and > do a simple check: > > return paddr >= virt_to_phys(io_tlb_start) && > paddr < virt_to_phys(io_tlb_end); > > That for virtualized environments where a PCI device is passed in would > work too. > > Unfortunately the problem is when we provide a channel of communication > with another domain and we end up doing DMA on behalf of another guest. > The short description of the problem is that a page of memory is shared > with another domain and the mapping in our domain is correct (bus->physical) > the other way (virt->physical->bus) is incorrect for the duration of this page > being shared. Hence we need to verify that the page is local to our > domain, and for that we need the bus address to verify that the > addr == physical->bus(bus->physical(addr)) where addr is the bus > address (dma_addr_t). If it is not local (shared with another domain) > we MUST not consider it as a SWIOTLB buffer as that can lead to > panics and possible corruptions. The trick here is that the phys->virt > address can fall within the SWIOTLB buffer for pages that are > shared with another domain and we need the DMA address to do an extra check. > > The long description of the problem is: > > You are the domain doing some DMA on behalf of another domain. The > simple example is you are servicing a block device to the other guests. > One way to implement this is to present a one page ring buffer where > both domains move the producer and consumer indexes around. Once you get > a request (READ/WRITE), you use the virtualized channels to "share" that page > into your domain. For this you have a buffer (2MB or bigger) wherein for > pages that shared in to you, you over-write the phys->bus mapping. > That means that the phys->phys translation is altered for the duration > of this request being out-standing. Once it is completed, the phys->bus > translation is restored. > > Here is a little diagram of what happens when a page is shared (and lets > assume that we have a situation where virt #1 == virt #2, which means > that phys #1 == phys #2). > > (domain 2) virt#1->phys#1---\ > +- bus #1 > (domain 3) virt#2->phys#2 ---/ > > (phys#1 points to bus #1, and phys#2 points to bus #1 too). > > The converse of the above picture is not true: > > /---> phys #1-> virt #1. (domain 2). > bus#1 + > \---> phys #2-> virt #2. (domain 3). > > phys #1 != phys #2 and hence virt #1 != virt #2. > > When a page is not shared: > > (domain 2) virt #1->phys #1--> bus #1 > (domain 3) virt #2->phys #2--> bus #2 > > bus #1 -> phys #1 -> virt #1 (domain 2) > bus #2 -> phys #2 -> virt #2 (domain 3) > > The bus #1 != bus #2, but phys #1 could be same as phys #2 (since > there are just PFNs). And virt #1 == virt #2. > > The reason for these is that each domain has its own memory layout where > the memory starts at pfn 0, not at some higher number. So each domain > sees the physical address identically, but the bus address MUST point > to different areas (except when sharing) otherwise one domain would > over-write another domain, ouch. 
> > Furthermore when a domain is allocated, the pages for the domain are not > guaranteed to be linearly contiguous so we can''t guarantee that phys == bus. > > So to guard against the situation in which phys #1 ->virt comes out with > an address that looks to be within our SWIOTLB buffer we need to do the > extra check: > > addr == physical->bus(bus->physical(addr)) where addr is the bus > address > > And for scenarios where this is not true (page belongs to another > domain), that page is not in the SWIOTLB (even thought the virtual and > physical address point to it). > > > > + /* > > > + * Is the DMA (Bus) address reachable by the PCI device?. > > > + */ > > > + bool (*dma_capable)(struct device *, dma_addr_t, phys_addr_t, size_t); > > I mentioned in the previous explanation that when a domain is allocated, > the pages are not guaranteed to be linearly contiguous. > > For bare-metal that is not the case and ''dma_capable'' just checks > the device DMA mask against the bus address. > > For virtualized environment we do need to check if the pages are linearly > contiguous for the size request. > > For that we need the physical address to iterate over them doing the > phys->bus#1 translation and checking whether the (phys+1)->bus#2 > bus#1 == bus#2 + 1.Right, for both of those cases I was thinking you could make that the base logic and the existing helpers to do addr translation would be enough. But that makes more sense when compiling for a specific arch (i.e. the checks would be noops and compile away when !xen) as opposed to a dynamic setup like this. thanks, -chris _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Chris Wright
2010-Jan-19 18:55 UTC
[Xen-devel] Re: [PATCH 14/15] [swiotlb] Move initialization (swiotlb_init) and its friends in swiotlb-default.c
* Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote:> > > +void __init > > > +swiotlb_init(int verbose) > > > +{ > > > + swiotlb_register_engine(&swiotlb_ops); > > > + swiotlb_init_with_default_size(&swiotlb_ops, 64 * (1<<20), > > > + verbose); /* default to 64MB */ > > > +} > > > > I''d expect the swiotlb-default file to have only private impl. of the > > swiotlb_engine. Shouldn''t this and the init stay in swiotlb.c? Also, > > Hmm, were you thinking that it might make sense to pass in > a swiotlb_ops to swiotlb_init so that it can make the right assignments?Yeah, something like that.> The reason why I stuck here was that the swiotlb_ops needed to be > visible to this function, and having it in swiotlb.c would mean it must > now include the header definition for swiotlb-defualt.h.And part of that need is because the allocator (effectively common/global) is writing to impl. private data, like ->nslabs. But if you move that back, then this may not be an issue.> > would you ever call swiotlb_init w/out register_engine, why not move > > register to the swiotlb_init? > > In essence combine swiotlb_register_engine with swiotlb_init_with_default_size?Yep.> There would still be a need for late call mechanism. > Perhaps having two variants of swiotlb_init?: swiotlb_early_init(struct > swiotlb_engine *swiotlb_ops) and swiotlb_late_init(struct swiotlb_engine > *swiotlb_ops)?That''s basically what we have now, swiotlb{,_late}_init_with_default_size, so seems reasonable to me.> Or perhaps just pass in an argument: swiotlb_init(int late)? > > Furthermore have this new swiotlb_init detect if some of the fields > (start ,end, overflow_buffer) have been allocated and if so skip the > default allocation altogether?That would keep the allocate, ->release, allocate cycle from happening (which seems odd when it''s the same sizes and same core allocation). Part of why I thought there was too much moved into the impl. private engine structure. thanks, -chris _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Chris Wright
2010-Jan-19 19:00 UTC
[Xen-devel] Re: [PATCH 01/15] [swiotlb] fix: Update ''setup_io_tlb_npages'' to accept both arguments in either order.
* Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote:> On Thu, Jan 14, 2010 at 05:22:13PM -0800, Chris Wright wrote: > > * Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote: > > > Before this patch, if you specified ''swiotlb=force,1024'' it would > > > ignore both arguments. This fixes it and allows the user specify it > > > in any order (or none at all). > > > > > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > > > Having only one substring of digits makes allowing arbitrary order > > less useful if more options get added (as in foo,bar,1024,baz,force > > would make more sense as foo,bar,nslabs=1024,baz,force). Do you > > think this one is really needed? If so, be useful to update > > I got caught a couple of times where I needed to provide both arguments > and could not figure out why it did not work. Switching the arguments > around fixed it. Thought that it might make sense to remove this > potential trap from other folks by this patch. > > Your point about more options got me thinking about the overflow buffer. > I could also provide an over-ride for that, maybe: > > "swiotlb=force,overflow=32,slabs=1024"Right, in which case would the is_digit() check remain ahead of the loop to protect the "legacy" format (swiotlb=1024,force), forcing mixing like you did to the new format (swiotlb=force,slabs=1024 or swiotlb=slabs=1024,force)?> (Not sure about the syntax?) > > > Documentation/kernel-parameters.txt which is slightly out of date now. > > Oh, good catch. Will roll the patch for that file as well.thanks, -chris _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Jan-19 19:39 UTC
[Xen-devel] Re: [PATCH 01/15] [swiotlb] fix: Update ''setup_io_tlb_npages'' to accept both arguments in either order.
> > "swiotlb=force,overflow=32,slabs=1024" > > Right, in which case would the is_digit() check remain ahead of the loop > to protect the "legacy" format (swiotlb=1024,force), forcing mixing likeSo, currently the patch that I posted works fine with the "legacy" format.> you did to the new format (swiotlb=force,slabs=1024 or > swiotlb=slabs=1024,force)?I will make sure that the new patch, which will have follow the format you mentioned, also work with the "legacy" format. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
FUJITA Tomonori
2010-Jan-22 01:51 UTC
[Xen-devel] Re: [PATCH 02/15] [swiotlb] Add swiotlb_engine structure for tracking multiple software IO TLBs.
On Thu, 14 Jan 2010 18:00:51 -0500 Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:> The structure contains all of the existing variables used in > software IO TLB (swiotlb.c) collected within a structure. > > Additionally a name variable and a deconstructor (release) function > variable is defined for API usages. > > The other set of functions: is_swiotlb_buffer, dma_capable, phys_to_bus, > bus_to_phys, virt_to_bus, and bus_to_virt server as a method to abstract > them out of the SWIOTLB library. > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > --- > include/linux/swiotlb.h | 94 +++++++++++++++++++++++++++++++++++++++++++++++ > 1 files changed, 94 insertions(+), 0 deletions(-) > > diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h > index febedcf..781c3aa 100644 > --- a/include/linux/swiotlb.h > +++ b/include/linux/swiotlb.h > @@ -24,6 +24,100 @@ extern int swiotlb_force; > > extern void swiotlb_init(int verbose); > > +struct swiotlb_engine { > + > + /* > + * Name of the engine (ie: "Software IO TLB") > + */ > + const char *name;Please don''t add another ''layer'' to the dma path. This leads to too many indirect function calls. Some people concern about the overhead of even the current code. Create something like libswiotlb or whatever (as I said before) to export functions. Then call them directly. btw, please send swiotlb patches to lkml. Thanks, _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Jan-26 16:20 UTC
[Xen-devel] Re: [PATCH 02/15] [swiotlb] Add swiotlb_engine structure for tracking multiple software IO TLBs.
> Please don''t add another ''layer'' to the dma path. This leads to too > many indirect function calls. Some people concern about the overhead > of even the current code.Any ideas on a good benchmarking tool for that?> > Create something like libswiotlb or whatever (as I said before) to > export functions. Then call them directly.I think what you are suggesting is that I abstract all of the address translation functions (''phys_to_dma'', ''bus_to_phys'', etc.) from the swiotlb.c layer. Specifically, to altogether remove the phys_to_dma, bus_to_phys, etc. calls from the swiotlb.c file. In place of them, have the results of those functions (both physical and bus address) be passed in to the map_single and unmap_single calls. The layer that calls map_single/unmap_single would then be responsible for translating the phys/bus/virtual addresses. Implementation-wise, I could split the swiotlb.c into two files: a) /lib/swiotlb-core.c and b) /lib/swiotlb-default.c. The ''a)'' would have the swiotlb_init_*, swiotlb_free, swiotlb_bounce, sync_single, map_single, and unmap_single. The ''b)'' would contain the DMA functions, such as swiotlb_sync_single_*, swiotlb_[map|unmap]_[page|sg]_* and swiotlb_[alloc|free]_coherent. Naturally it would do the phys,bus->bus,phys translations. The Xen complement to ''b)'' would be in a different library/file (/lib/swiotlb-xen.c?). Is that close to what you were thinking? > > btw, please send swiotlb patches to lkml.Once we''ve nailed down the structure of the changes I''ll definitely send it out to LKML. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
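In that scheme the core bounce-buffer entry points would take the already-translated addresses from the caller; a rough sketch of the shape (these signatures are illustrative, not from the series):

/* The dma_ops layer does the phys/bus translation and passes the results in. */
extern void *swiotlb_map_single(struct device *hwdev, phys_addr_t phys,
				dma_addr_t tbl_dma_addr, size_t size, int dir);
extern void swiotlb_unmap_single(struct device *hwdev, char *dma_addr,
				 size_t size, int dir);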
FUJITA Tomonori
2010-Feb-03 02:04 UTC
[Xen-devel] Re: [PATCH 02/15] [swiotlb] Add swiotlb_engine structure for tracking multiple software IO TLBs.
On Tue, 26 Jan 2010 11:20:43 -0500 Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:> > Please don''t add another ''layer'' to the dma path. This leads to too > > many indirect function calls. Some people concern about the overhead > > of even the current code. > > Any ideas on a good benchmarking tool for that?Any tools with a system capable of fast I/Os like hundreds of SSDs?> > Create something like libswiotlb or whatever (as I said before) to > > export functions. Then call them directly. > > I think what you suggesting is that that I abstract all of the address > translation functions (''phys_to_dma'', ''bus_to_phys'', etc.) from the swiotlb.c > layer. > > Specifically, to altogether remove the phys_to_dma, bus_to_phys, etc. calls > from the swiotlb.c file. In place of them, have the results of those functions > (both physical and bus address) be passed in to the map_single and unmap_single > calls. The layer that calls map_single/unmap_single would then be responsible > for translating the phys/bus/virtual addresses. > > Implementation wise, I could split the swiotlb.c in two files: > a) /lib/swiotlb-core.c and b) /lib/swiotlb-default.c. > > The ''a)'' would have the swiotlb_init_*, swiotlb_free, swiotlb_bounce, > sync_single, map_single, and unmap_single.I just want core functions to manage the swiotlb buffer (io_tlb) there. Don''t pass the address translation functions to swiotlb_map_*, etc. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
> > Any ideas on a good benchmarking tool for that?
>
> Any tools with a system capable of fast I/Os like hundreds of SSDs?

That will take some time, but I think I will be able to get some time on
Oracle's Exadata box.

> I just want core functions to manage the swiotlb buffer (io_tlb)
> there. Don't pass the address translation functions to swiotlb_map_*,
> etc.

Attached is a set of eleven RFC patches that split the SWIOTLB library into
two layers: core, and dma_ops-related functions.

The set of eleven patches is also accessible on:

git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb-2.6.git swiotlb-rfc-0.4

An example of how this can be utilized in both bare-metal and Xen
environments is this git tree:

git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git swiotlb-xen-0.4

Sincerely,

Konrad Rzeszutek Wilk

P.S. The diffstat:

 Documentation/x86/x86_64/boot-options.txt |    6 +-
 include/linux/swiotlb.h                   |   45 ++-
 lib/Makefile                              |    2 +-
 lib/swiotlb-core.c                        |  572 ++++++++++++++++++++++++
 lib/swiotlb.c                             |  579 +-----------------------
 5 files changed, 637 insertions(+), 567 deletions(-)

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Feb-03 17:08 UTC
[Xen-devel] [PATCH 01/11] [swiotlb] fix: Update 'setup_io_tlb_npages' to accept both arguments in either order.
Before this patch, if you specified 'swiotlb=force,1024' it would ignore
both arguments. This fixes it and allows the user to specify it in any
order (or none at all).

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 lib/swiotlb.c |   20 +++++++++++---------
 1 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/lib/swiotlb.c b/lib/swiotlb.c
index 437eedb..e6d9e32 100644
--- a/lib/swiotlb.c
+++ b/lib/swiotlb.c
@@ -102,16 +102,18 @@ static int late_alloc;
 static int __init
 setup_io_tlb_npages(char *str)
 {
-	if (isdigit(*str)) {
-		io_tlb_nslabs = simple_strtoul(str, &str, 0);
-		/* avoid tail segment of size < IO_TLB_SEGSIZE */
-		io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE);
+	while (*str) {
+		if (isdigit(*str)) {
+			io_tlb_nslabs = simple_strtoul(str, &str, 0);
+			/* avoid tail segment of size < IO_TLB_SEGSIZE */
+			io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE);
+		}
+		if (!strncmp(str, "force", 5))
+			swiotlb_force = 1;
+		str += strcspn(str, ",");
+		if (*str == ',')
+			++str;
 	}
-	if (*str == ',')
-		++str;
-	if (!strcmp(str, "force"))
-		swiotlb_force = 1;
-
 	return 1;
 }
 __setup("swiotlb=", setup_io_tlb_npages);
-- 
1.6.2.5

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
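For reference, taking the example from the commit message, after this patch
both orderings of the same arguments should be parsed:

	swiotlb=force,1024
	swiotlb=1024,force

(both reserve 1024 slabs and force bouncing), whereas before only the
number-first form '<pages>[,force]' was honoured.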
Konrad Rzeszutek Wilk
2010-Feb-03 17:08 UTC
[Xen-devel] [PATCH 02/11] [swiotlb] Make 'setup_io_tlb_npages' accept new 'swiotlb=' syntax.
The old syntax for ''swiotlb'' is still in effect, and we extend it now to include the overflow buffer size. The syntax is now: swiotlb=[force,][nslabs=<pages>,][overflow=<size>] or more commonly know as: swiotlb=[force] swiotlb=[nslabs=<pages>] swiotlb=[overflow=<size>] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- Documentation/x86/x86_64/boot-options.txt | 6 ++++- lib/swiotlb.c | 36 +++++++++++++++++++++++++--- 2 files changed, 37 insertions(+), 5 deletions(-) diff --git a/Documentation/x86/x86_64/boot-options.txt b/Documentation/x86/x86_64/boot-options.txt index 29a6ff8..81f9b94 100644 --- a/Documentation/x86/x86_64/boot-options.txt +++ b/Documentation/x86/x86_64/boot-options.txt @@ -267,10 +267,14 @@ IOMMU (input/output memory management unit) iommu options only relevant to the software bounce buffering (SWIOTLB) IOMMU implementation: - swiotlb=<pages>[,force] + swiotlb=[npages=<pages>] + swiotlb=[force] + swiotlb=[overflow=<size>] + <pages> Prereserve that many 128K pages for the software IO bounce buffering. force Force all IO through the software TLB. + <size> Size in bytes of the overflow buffer. Settings for the IBM Calgary hardware IOMMU currently found in IBM pSeries and xSeries machines: diff --git a/lib/swiotlb.c b/lib/swiotlb.c index e6d9e32..0663879 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -102,7 +102,27 @@ static int late_alloc; static int __init setup_io_tlb_npages(char *str) { + int get_value(const char *token, char *str, char **endp) + { + ssize_t len; + int val = 0; + + len = strlen(token); + if (!strncmp(str, token, len)) { + str += len; + if (*str == ''='') + ++str; + if (*str != ''\0'') + val = simple_strtoul(str, endp, 0); + } + *endp = str; + return val; + } + + int val; + while (*str) { + /* The old syntax */ if (isdigit(*str)) { io_tlb_nslabs = simple_strtoul(str, &str, 0); /* avoid tail segment of size < IO_TLB_SEGSIZE */ @@ -110,14 +130,22 @@ setup_io_tlb_npages(char *str) } if (!strncmp(str, "force", 5)) swiotlb_force = 1; - str += strcspn(str, ","); - if (*str == '','') - ++str; + /* The new syntax: swiotlb=nslabs=16384,overflow=32768,force */ + val = get_value("nslabs", str, &str); + if (val) + io_tlb_nslabs = ALIGN(val, IO_TLB_SEGSIZE); + + val = get_value("overflow", str, &str); + if (val) + io_tlb_overflow = val; + str = strpbrk(str, ","); + if (!str) + break; + str++; /* skip '','' */ } return 1; } __setup("swiotlb=", setup_io_tlb_npages); -/* make io_tlb_overflow tunable too? */ /* Note that this doesn''t work with highmem page */ static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev, -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
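As a quick reference (the values below are only examples), the parser after
this patch should accept both forms on the kernel command line:

	swiotlb=nslabs=16384,overflow=32768,force	(new syntax)
	swiotlb=16384,force				(old syntax, still parsed)
	swiotlb=overflow=65536				(only change the overflow buffer)

with nslabs rounded up to a multiple of IO_TLB_SEGSIZE and overflow given
in bytes.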
Konrad Rzeszutek Wilk
2010-Feb-03 17:08 UTC
[Xen-devel] [PATCH 03/11] [swiotlb] Normalize the swiotlb_init_* function's naming syntax.
The previous function names were misleading by including "with_default_size"
even though the size is passed in as an argument. Change the function names
to be clear about what they do.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 lib/swiotlb.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/swiotlb.c b/lib/swiotlb.c
index 0663879..7b66fc3 100644
--- a/lib/swiotlb.c
+++ b/lib/swiotlb.c
@@ -174,7 +174,7 @@ void swiotlb_print_info(void)
  * structures for the software IO TLB used to implement the DMA API.
  */
 void __init
-swiotlb_init_with_default_size(size_t default_size, int verbose)
+swiotlb_init_early(size_t default_size, int verbose)
 {
 	unsigned long i, bytes;
 
@@ -217,7 +217,7 @@ swiotlb_init_with_default_size(size_t default_size, int verbose)
 void __init
 swiotlb_init(int verbose)
 {
-	swiotlb_init_with_default_size(64 * (1<<20), verbose); /* default to 64MB */
+	swiotlb_init_early(64 * (1<<20), verbose); /* default to 64MB */
 }
 
 /*
@@ -226,7 +226,7 @@ swiotlb_init(int verbose)
  * This should be just like above, but with some error catching.
  */
 int
-swiotlb_late_init_with_default_size(size_t default_size)
+swiotlb_init_late(size_t default_size)
 {
 	unsigned long i, bytes, req_nslabs = io_tlb_nslabs;
 	unsigned int order;
-- 
1.6.2.5

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Feb-03 17:08 UTC
[Xen-devel] [PATCH 04/11] [swiotlb] Make printk's use same prefix and include dev_err when possible.
Various printk''s had the prefix ''DMA'' in them, but not all of them. This makes all of the printk''s have the ''DMA'' in them and for error cases use the ''dev_err'' macro. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- lib/swiotlb.c | 20 ++++++++++---------- 1 files changed, 10 insertions(+), 10 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index 7b66fc3..0eb64d7 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -162,9 +162,9 @@ void swiotlb_print_info(void) pstart = virt_to_phys(io_tlb_start); pend = virt_to_phys(io_tlb_end); - printk(KERN_INFO "Placing %luMB software IO TLB between %p - %p\n", + printk(KERN_INFO "DMA: Placing %luMB software IO TLB between %p - %p\n", bytes >> 20, io_tlb_start, io_tlb_end); - printk(KERN_INFO "software IO TLB at phys %#llx - %#llx\n", + printk(KERN_INFO "DMA: software IO TLB at phys %#llx - %#llx\n", (unsigned long long)pstart, (unsigned long long)pend); } @@ -190,7 +190,7 @@ swiotlb_init_early(size_t default_size, int verbose) */ io_tlb_start = alloc_bootmem_low_pages(bytes); if (!io_tlb_start) - panic("Cannot allocate SWIOTLB buffer"); + panic("DMA: Cannot allocate SWIOTLB buffer"); io_tlb_end = io_tlb_start + bytes; /* @@ -209,7 +209,7 @@ swiotlb_init_early(size_t default_size, int verbose) */ io_tlb_overflow_buffer = alloc_bootmem_low(io_tlb_overflow); if (!io_tlb_overflow_buffer) - panic("Cannot allocate SWIOTLB overflow buffer!\n"); + panic("DMA: Cannot allocate SWIOTLB overflow buffer!\n"); if (verbose) swiotlb_print_info(); } @@ -255,8 +255,8 @@ swiotlb_init_late(size_t default_size) goto cleanup1; if (order != get_order(bytes)) { - printk(KERN_WARNING "Warning: only able to allocate %ld MB " - "for software IO TLB\n", (PAGE_SIZE << order) >> 20); + printk(KERN_WARNING "DMA: Warning: only able to allocate %ld MB" + " for software IO TLB\n", (PAGE_SIZE << order) >> 20); io_tlb_nslabs = SLABS_PER_PAGE << order; bytes = io_tlb_nslabs << IO_TLB_SHIFT; } @@ -602,7 +602,8 @@ swiotlb_alloc_coherent(struct device *hwdev, size_t size, /* Confirm address can be DMA''d by device */ if (dev_addr + size - 1 > dma_mask) { - printk("hwdev DMA mask = 0x%016Lx, dev_addr = 0x%016Lx\n", + dev_err(hwdev, "DMA: hwdev DMA mask = 0x%016Lx, " \ + "dev_addr = 0x%016Lx\n", (unsigned long long)dma_mask, (unsigned long long)dev_addr); @@ -640,8 +641,7 @@ swiotlb_full(struct device *dev, size_t size, int dir, int do_panic) * When the mapping is small enough return a static buffer to limit * the damage, or panic when the transfer is too big. */ - printk(KERN_ERR "DMA: Out of SW-IOMMU space for %zu bytes at " - "device %s\n", size, dev ? dev_name(dev) : "?"); + dev_err(dev, "DMA: Out of SW-IOMMU space for %zu bytes.", size); if (size <= io_tlb_overflow || !do_panic) return; @@ -694,7 +694,7 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page, * Ensure that the address returned is DMA''ble */ if (!dma_capable(dev, dev_addr, size)) - panic("map_single: bounce buffer is not DMA''ble"); + panic("DMA: swiotlb_map_single: bounce buffer is not DMA''ble"); return dev_addr; } -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Feb-03 17:08 UTC
[Xen-devel] [PATCH 05/11] [swiotlb] Make internal bookkeeping functions have 'do_' prefix.
The functions that operate on io_tlb_list/io_tlb_start/io_tlb_orig_addr have the prefix ''do_'' now. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- lib/swiotlb.c | 20 ++++++++++---------- 1 files changed, 10 insertions(+), 10 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index 0eb64d7..9085eab 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -60,8 +60,8 @@ enum dma_sync_target { int swiotlb_force; /* - * Used to do a quick range check in unmap_single and - * sync_single_*, to see if the memory was in fact allocated by this + * Used to do a quick range check in do_unmap_single and + * do_sync_single_*, to see if the memory was in fact allocated by this * API. */ static char *io_tlb_start, *io_tlb_end; @@ -394,7 +394,7 @@ static void swiotlb_bounce(phys_addr_t phys, char *dma_addr, size_t size, * Allocates bounce buffer and returns its kernel virtual address. */ static void * -map_single(struct device *hwdev, phys_addr_t phys, size_t size, int dir) +do_map_single(struct device *hwdev, phys_addr_t phys, size_t size, int dir) { unsigned long flags; char *dma_addr; @@ -540,7 +540,7 @@ do_unmap_single(struct device *hwdev, char *dma_addr, size_t size, int dir) } static void -sync_single(struct device *hwdev, char *dma_addr, size_t size, +do_sync_single(struct device *hwdev, char *dma_addr, size_t size, int dir, int target) { int index = (dma_addr - io_tlb_start) >> IO_TLB_SHIFT; @@ -589,10 +589,10 @@ swiotlb_alloc_coherent(struct device *hwdev, size_t size, if (!ret) { /* * We are either out of memory or the device can''t DMA - * to GFP_DMA memory; fall back on map_single(), which + * to GFP_DMA memory; fall back on do_map_single(), which * will grab memory from the lowest available address range. */ - ret = map_single(hwdev, 0, size, DMA_FROM_DEVICE); + ret = do_map_single(hwdev, 0, size, DMA_FROM_DEVICE); if (!ret) return NULL; } @@ -626,7 +626,7 @@ swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr, if (!is_swiotlb_buffer(paddr)) free_pages((unsigned long)vaddr, get_order(size)); else - /* DMA_TO_DEVICE to avoid memcpy in unmap_single */ + /* DMA_TO_DEVICE to avoid memcpy in do_unmap_single */ do_unmap_single(hwdev, vaddr, size, DMA_TO_DEVICE); } EXPORT_SYMBOL(swiotlb_free_coherent); @@ -682,7 +682,7 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page, /* * Oh well, have to allocate and map a bounce buffer. */ - map = map_single(dev, phys, size, dir); + map = do_map_single(dev, phys, size, dir); if (!map) { swiotlb_full(dev, size, dir, 1); map = io_tlb_overflow_buffer; @@ -759,7 +759,7 @@ swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr, BUG_ON(dir == DMA_NONE); if (is_swiotlb_buffer(paddr)) { - sync_single(hwdev, phys_to_virt(paddr), size, dir, target); + do_sync_single(hwdev, phys_to_virt(paddr), size, dir, target); return; } @@ -847,7 +847,7 @@ swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, int nelems, if (swiotlb_force || !dma_capable(hwdev, dev_addr, sg->length)) { - void *map = map_single(hwdev, sg_phys(sg), + void *map = do_map_single(hwdev, sg_phys(sg), sg->length, dir); if (!map) { /* Don''t panic here, we expect map_sg users -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Feb-03 17:08 UTC
[Xen-devel] [PATCH 06/11] [swiotlb] do_map_single: abstract swiotlb_virt_to_bus calls out.
We want to move that function out of do_map_single so that the caller of this function does the virt->phys->bus address translation. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- lib/swiotlb.c | 23 +++++++++++++++-------- 1 files changed, 15 insertions(+), 8 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index 9085eab..4ab3885 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -394,20 +394,19 @@ static void swiotlb_bounce(phys_addr_t phys, char *dma_addr, size_t size, * Allocates bounce buffer and returns its kernel virtual address. */ static void * -do_map_single(struct device *hwdev, phys_addr_t phys, size_t size, int dir) +do_map_single(struct device *hwdev, phys_addr_t phys, + unsigned long start_dma_addr, size_t size, int dir) { unsigned long flags; char *dma_addr; unsigned int nslots, stride, index, wrap; int i; - unsigned long start_dma_addr; unsigned long mask; unsigned long offset_slots; unsigned long max_slots; mask = dma_get_seg_boundary(hwdev); - start_dma_addr = swiotlb_virt_to_bus(hwdev, io_tlb_start) & mask; - + start_dma_addr = start_dma_addr & mask; offset_slots = ALIGN(start_dma_addr, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT; /* @@ -574,6 +573,7 @@ swiotlb_alloc_coherent(struct device *hwdev, size_t size, void *ret; int order = get_order(size); u64 dma_mask = DMA_BIT_MASK(32); + unsigned long start_dma_addr; if (hwdev && hwdev->coherent_dma_mask) dma_mask = hwdev->coherent_dma_mask; @@ -592,7 +592,9 @@ swiotlb_alloc_coherent(struct device *hwdev, size_t size, * to GFP_DMA memory; fall back on do_map_single(), which * will grab memory from the lowest available address range. */ - ret = do_map_single(hwdev, 0, size, DMA_FROM_DEVICE); + start_dma_addr = swiotlb_virt_to_bus(hwdev, io_tlb_start); + ret = do_map_single(hwdev, 0, start_dma_addr, size, + DMA_FROM_DEVICE); if (!ret) return NULL; } @@ -607,7 +609,7 @@ swiotlb_alloc_coherent(struct device *hwdev, size_t size, (unsigned long long)dma_mask, (unsigned long long)dev_addr); - /* DMA_TO_DEVICE to avoid memcpy in unmap_single */ + /* DMA_TO_DEVICE to avoid memcpy in do_unmap_single */ do_unmap_single(hwdev, ret, size, DMA_TO_DEVICE); return NULL; } @@ -666,6 +668,7 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page, enum dma_data_direction dir, struct dma_attrs *attrs) { + unsigned long start_dma_addr; phys_addr_t phys = page_to_phys(page) + offset; dma_addr_t dev_addr = phys_to_dma(dev, phys); void *map; @@ -682,7 +685,8 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page, /* * Oh well, have to allocate and map a bounce buffer. 
*/ - map = do_map_single(dev, phys, size, dir); + start_dma_addr = swiotlb_virt_to_bus(dev, io_tlb_start); + map = do_map_single(dev, phys, start_dma_addr, size, dir); if (!map) { swiotlb_full(dev, size, dir, 1); map = io_tlb_overflow_buffer; @@ -836,11 +840,13 @@ int swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, int nelems, enum dma_data_direction dir, struct dma_attrs *attrs) { + unsigned long start_dma_addr; struct scatterlist *sg; int i; BUG_ON(dir == DMA_NONE); + start_dma_addr = swiotlb_virt_to_bus(hwdev, io_tlb_start); for_each_sg(sgl, sg, nelems, i) { phys_addr_t paddr = sg_phys(sg); dma_addr_t dev_addr = phys_to_dma(hwdev, paddr); @@ -848,7 +854,8 @@ swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, int nelems, if (swiotlb_force || !dma_capable(hwdev, dev_addr, sg->length)) { void *map = do_map_single(hwdev, sg_phys(sg), - sg->length, dir); + start_dma_addr, + sg->length, dir); if (!map) { /* Don''t panic here, we expect map_sg users to do proper error handling. */ -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Feb-03 17:08 UTC
[Xen-devel] [PATCH 07/11] [swiotlb] Fix checkpatch warnings.
From: Konrad Rzeszutek <konrad@t42p-lan.dumpdata.com> I''ve fixed most of the checkpatch warnings except these three: a). WARNING: consider using strict_strtoul in preference to simple_strtoul 115: FILE: swiotlb.c:115: + val = simple_strtoul(str, endp, 0); b). WARNING: consider using strict_strtoul in preference to simple_strtoul 126: FILE: swiotlb.c:126: + io_tlb_nslabs = simple_strtoul(str, &str, 0); c).WARNING: Use of volatile is usually wrong: see Documentation/volatile-considered-harmful.txt 151: FILE: swiotlb.c:151: + volatile void *address) total: 0 errors, 3 warnings, 965 lines checked As a) and b) are OK, we MUST use simple_strtoul. For c) the ''volatile-consider*'' document outlines that it is OK for pointers to data structrues in coherent memory which this certainly could be, hence not fixing that. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- lib/swiotlb.c | 38 +++++++++++++++++++------------------- 1 files changed, 19 insertions(+), 19 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index 4ab3885..80a2306 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -29,16 +29,15 @@ #include <linux/ctype.h> #include <linux/highmem.h> -#include <asm/io.h> +#include <linux/io.h> #include <asm/dma.h> -#include <asm/scatterlist.h> +#include <linux/scatterlist.h> #include <linux/init.h> #include <linux/bootmem.h> #include <linux/iommu-helper.h> -#define OFFSET(val,align) ((unsigned long) \ - ( (val) & ( (align) - 1))) +#define OFFSET(val, align) ((unsigned long) ((val) & ((align) - 1))) #define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT)) @@ -200,7 +199,7 @@ swiotlb_init_early(size_t default_size, int verbose) */ io_tlb_list = alloc_bootmem(io_tlb_nslabs * sizeof(int)); for (i = 0; i < io_tlb_nslabs; i++) - io_tlb_list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE); + io_tlb_list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE); io_tlb_index = 0; io_tlb_orig_addr = alloc_bootmem(io_tlb_nslabs * sizeof(phys_addr_t)); @@ -269,18 +268,16 @@ swiotlb_init_late(size_t default_size) * between io_tlb_start and io_tlb_end. */ io_tlb_list = (unsigned int *)__get_free_pages(GFP_KERNEL, - get_order(io_tlb_nslabs * sizeof(int))); + get_order(io_tlb_nslabs * sizeof(int))); if (!io_tlb_list) goto cleanup2; for (i = 0; i < io_tlb_nslabs; i++) - io_tlb_list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE); + io_tlb_list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE); io_tlb_index = 0; - io_tlb_orig_addr = (phys_addr_t *) - __get_free_pages(GFP_KERNEL, - get_order(io_tlb_nslabs * - sizeof(phys_addr_t))); + io_tlb_orig_addr = (phys_addr_t *) __get_free_pages(GFP_KERNEL, + get_order(io_tlb_nslabs * sizeof(phys_addr_t))); if (!io_tlb_orig_addr) goto cleanup3; @@ -290,7 +287,7 @@ swiotlb_init_late(size_t default_size) * Get the overflow emergency buffer */ io_tlb_overflow_buffer = (void *)__get_free_pages(GFP_DMA, - get_order(io_tlb_overflow)); + get_order(io_tlb_overflow)); if (!io_tlb_overflow_buffer) goto cleanup4; @@ -305,8 +302,8 @@ cleanup4: get_order(io_tlb_nslabs * sizeof(phys_addr_t))); io_tlb_orig_addr = NULL; cleanup3: - free_pages((unsigned long)io_tlb_list, get_order(io_tlb_nslabs * - sizeof(int))); + free_pages((unsigned long)io_tlb_list, + get_order(io_tlb_nslabs * sizeof(int))); io_tlb_list = NULL; cleanup2: io_tlb_end = NULL; @@ -410,8 +407,8 @@ do_map_single(struct device *hwdev, phys_addr_t phys, offset_slots = ALIGN(start_dma_addr, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT; /* - * Carefully handle integer overflow which can occur when mask == ~0UL. 
- */ + * Carefully handle integer overflow which can occur when mask == ~0UL. + */ max_slots = mask + 1 ? ALIGN(mask + 1, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT : 1UL << (BITS_PER_LONG - IO_TLB_SHIFT); @@ -458,7 +455,8 @@ do_map_single(struct device *hwdev, phys_addr_t phys, for (i = index; i < (int) (index + nslots); i++) io_tlb_list[i] = 0; - for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) != IO_TLB_SEGSIZE - 1) && io_tlb_list[i]; i--) + for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) + != IO_TLB_SEGSIZE - 1) && io_tlb_list[i]; i--) io_tlb_list[i] = ++count; dma_addr = io_tlb_start + (index << IO_TLB_SHIFT); @@ -532,7 +530,8 @@ do_unmap_single(struct device *hwdev, char *dma_addr, size_t size, int dir) * Step 2: merge the returned slots with the preceding slots, * if available (non zero) */ - for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) != IO_TLB_SEGSIZE -1) && io_tlb_list[i]; i--) + for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) !+ IO_TLB_SEGSIZE - 1) && io_tlb_list[i]; i--) io_tlb_list[i] = ++count; } spin_unlock_irqrestore(&io_tlb_lock, flags); @@ -888,7 +887,8 @@ EXPORT_SYMBOL(swiotlb_map_sg); */ void swiotlb_unmap_sg_attrs(struct device *hwdev, struct scatterlist *sgl, - int nelems, enum dma_data_direction dir, struct dma_attrs *attrs) + int nelems, enum dma_data_direction dir, + struct dma_attrs *attrs) { struct scatterlist *sg; int i; -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Feb-03 17:08 UTC
[Xen-devel] [PATCH 08/11] [swiotlb] Re-order the function declarations.
Move the function declarations dealing with startup/shutdown of the SWIOTLB
to the top of the header. This is in preparation for the next set of
patches, which split the swiotlb.c file in two; the bookkeeping, init and
free related functions will be declared at the beginning and the dma_ops
functions at the end of the header file.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 include/linux/swiotlb.h |   15 ++++++++-------
 1 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index febedcf..84e7a53 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -23,7 +23,15 @@ extern int swiotlb_force;
 
 #define IO_TLB_SHIFT 11
 
 extern void swiotlb_init(int verbose);
+#ifdef CONFIG_SWIOTLB
+extern void __init swiotlb_free(void);
+#else
+static inline void swiotlb_free(void) { }
+#endif
+extern void swiotlb_print_info(void);
+
+/* IOMMU functions. */
 extern void *swiotlb_alloc_coherent(struct device *hwdev, size_t size,
 				    dma_addr_t *dma_handle, gfp_t flags);
 
@@ -89,11 +97,4 @@ swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t dma_addr);
 extern int
 swiotlb_dma_supported(struct device *hwdev, u64 mask);
 
-#ifdef CONFIG_SWIOTLB
-extern void __init swiotlb_free(void);
-#else
-static inline void swiotlb_free(void) { }
-#endif
-
-extern void swiotlb_print_info(void);
 #endif /* __LINUX_SWIOTLB_H */
-- 
1.6.2.5

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Feb-03 17:08 UTC
[Xen-devel] [PATCH 09/11] [swiotlb] Make swiotlb bookkeeping functions visible in the header file.
We put the init, free, and functions dealing with the operations on the SWIOTLB buffer at the top of the header. Also we export some of the variables that are used by the dma_ops functions. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- include/linux/swiotlb.h | 31 ++++++++++++++++++++++++++++++- lib/swiotlb.c | 24 ++++++++---------------- 2 files changed, 38 insertions(+), 17 deletions(-) diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h index 84e7a53..af66473 100644 --- a/include/linux/swiotlb.h +++ b/include/linux/swiotlb.h @@ -30,8 +30,37 @@ static inline void swiotlb_free(void) { } #endif extern void swiotlb_print_info(void); +/* Internal book-keeping functions. Must be linked against the library + * to take advantage of them.*/ +#ifdef CONFIG_SWIOTLB +/* + * Enumeration for sync targets + */ +enum dma_sync_target { + SYNC_FOR_CPU = 0, + SYNC_FOR_DEVICE = 1, +}; +extern char *io_tlb_start; +extern char *io_tlb_end; +extern unsigned long io_tlb_nslabs; +extern void *io_tlb_overflow_buffer; +extern unsigned long io_tlb_overflow; +extern int is_swiotlb_buffer(phys_addr_t paddr); +extern void swiotlb_bounce(phys_addr_t phys, char *dma_addr, size_t size, + enum dma_data_direction dir); +extern void *do_map_single(struct device *hwdev, phys_addr_t phys, + unsigned long start_dma_addr, size_t size, int dir); + +extern void do_unmap_single(struct device *hwdev, char *dma_addr, size_t size, + int dir); + +extern void do_sync_single(struct device *hwdev, char *dma_addr, size_t size, + int dir, int target); +extern void swiotlb_full(struct device *dev, size_t size, int dir, int do_panic); +extern void __init swiotlb_init_early(size_t default_size, int verbose); +#endif -/* IOMMU functions. */ +/* swiotlb.c: dma_ops functions. */ extern void *swiotlb_alloc_coherent(struct device *hwdev, size_t size, dma_addr_t *dma_handle, gfp_t flags); diff --git a/lib/swiotlb.c b/lib/swiotlb.c index 80a2306..c982d33 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -48,14 +48,6 @@ */ #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT) -/* - * Enumeration for sync targets - */ -enum dma_sync_target { - SYNC_FOR_CPU = 0, - SYNC_FOR_DEVICE = 1, -}; - int swiotlb_force; /* @@ -63,18 +55,18 @@ int swiotlb_force; * do_sync_single_*, to see if the memory was in fact allocated by this * API. */ -static char *io_tlb_start, *io_tlb_end; +char *io_tlb_start, *io_tlb_end; /* * The number of IO TLB blocks (in groups of 64) betweeen io_tlb_start and * io_tlb_end. This is command line adjustable via setup_io_tlb_npages. */ -static unsigned long io_tlb_nslabs; +unsigned long io_tlb_nslabs; /* * When the IOMMU overflows we return a fallback buffer. This sets the size. 
*/ -static unsigned long io_tlb_overflow = 32*1024; +unsigned long io_tlb_overflow = 32*1024; void *io_tlb_overflow_buffer; @@ -340,7 +332,7 @@ void __init swiotlb_free(void) } } -static int is_swiotlb_buffer(phys_addr_t paddr) +int is_swiotlb_buffer(phys_addr_t paddr) { return paddr >= virt_to_phys(io_tlb_start) && paddr < virt_to_phys(io_tlb_end); @@ -349,7 +341,7 @@ static int is_swiotlb_buffer(phys_addr_t paddr) /* * Bounce: copy the swiotlb buffer back to the original dma location */ -static void swiotlb_bounce(phys_addr_t phys, char *dma_addr, size_t size, +void swiotlb_bounce(phys_addr_t phys, char *dma_addr, size_t size, enum dma_data_direction dir) { unsigned long pfn = PFN_DOWN(phys); @@ -390,7 +382,7 @@ static void swiotlb_bounce(phys_addr_t phys, char *dma_addr, size_t size, /* * Allocates bounce buffer and returns its kernel virtual address. */ -static void * +void * do_map_single(struct device *hwdev, phys_addr_t phys, unsigned long start_dma_addr, size_t size, int dir) { @@ -496,7 +488,7 @@ found: /* * dma_addr is the kernel virtual address of the bounce buffer to unmap. */ -static void +void do_unmap_single(struct device *hwdev, char *dma_addr, size_t size, int dir) { unsigned long flags; @@ -537,7 +529,7 @@ do_unmap_single(struct device *hwdev, char *dma_addr, size_t size, int dir) spin_unlock_irqrestore(&io_tlb_lock, flags); } -static void +void do_sync_single(struct device *hwdev, char *dma_addr, size_t size, int dir, int target) { -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
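To show what exporting these book-keeping functions is meant to enable,
here is a sketch of how a separate backend linked against the core could
build its own map_page on top of them. This is purely illustrative:
example_virt_to_bus() and example_map_page() are made-up names and are not
part of the series; only the symbols from include/linux/swiotlb.h above are
real.

	#include <linux/mm.h>
	#include <linux/dma-mapping.h>
	#include <linux/swiotlb.h>

	/* Stand-in for a platform-specific virt -> bus translation
	 * (e.g. one that knows about PFN -> MFN mappings); hypothetical. */
	extern dma_addr_t example_virt_to_bus(struct device *dev, void *vaddr);

	static dma_addr_t example_map_page(struct device *dev, struct page *page,
					   unsigned long offset, size_t size,
					   enum dma_data_direction dir,
					   struct dma_attrs *attrs)
	{
		phys_addr_t phys = page_to_phys(page) + offset;
		unsigned long start_dma_addr =
			example_virt_to_bus(dev, io_tlb_start);
		void *map;

		/* The core only manages the io_tlb slots; every address
		 * translation stays in this layer. */
		map = do_map_single(dev, phys, start_dma_addr, size, dir);
		if (!map)
			map = io_tlb_overflow_buffer;	/* error reporting elided */

		return example_virt_to_bus(dev, map);
	}

This mirrors what swiotlb_map_page() in the default library does, except
that the default library uses phys_to_dma()/virt_to_phys() for the
translation.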
Konrad Rzeszutek Wilk
2010-Feb-03 17:08 UTC
[Xen-devel] [PATCH 10/11] [swiotlb] Rename swiotlb.c to swiotlb-core.c
From: Konrad Rzeszutek <konrad@t42p-lan.dumpdata.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- include/linux/swiotlb.h | 5 +- lib/Makefile | 2 +- lib/swiotlb-core.c | 957 +++++++++++++++++++++++++++++++++++++++++++++++ lib/swiotlb.c | 957 ----------------------------------------------- 4 files changed, 961 insertions(+), 960 deletions(-) create mode 100644 lib/swiotlb-core.c delete mode 100644 lib/swiotlb.c diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h index af66473..6ab9b7c 100644 --- a/include/linux/swiotlb.h +++ b/include/linux/swiotlb.h @@ -22,6 +22,7 @@ extern int swiotlb_force; */ #define IO_TLB_SHIFT 11 +/* swiotlb-core.c */ extern void swiotlb_init(int verbose); #ifdef CONFIG_SWIOTLB extern void __init swiotlb_free(void); @@ -30,8 +31,8 @@ static inline void swiotlb_free(void) { } #endif extern void swiotlb_print_info(void); -/* Internal book-keeping functions. Must be linked against the library - * to take advantage of them.*/ +/* swiotlb-core.c: Internal book-keeping functions. + * Must be linked against the library to take advantage of them.*/ #ifdef CONFIG_SWIOTLB /* * Enumeration for sync targets diff --git a/lib/Makefile b/lib/Makefile index 3b0b4a6..40728c5 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -78,7 +78,7 @@ obj-$(CONFIG_TEXTSEARCH_FSM) += ts_fsm.o obj-$(CONFIG_SMP) += percpu_counter.o obj-$(CONFIG_AUDIT_GENERIC) += audit.o -obj-$(CONFIG_SWIOTLB) += swiotlb.o +obj-$(CONFIG_SWIOTLB) += swiotlb-core.o swiotlb.o obj-$(CONFIG_IOMMU_HELPER) += iommu-helper.o obj-$(CONFIG_FAULT_INJECTION) += fault-inject.o diff --git a/lib/swiotlb-core.c b/lib/swiotlb-core.c new file mode 100644 index 0000000..c982d33 --- /dev/null +++ b/lib/swiotlb-core.c @@ -0,0 +1,957 @@ +/* + * Dynamic DMA mapping support. + * + * This implementation is a fallback for platforms that do not support + * I/O TLBs (aka DMA address translation hardware). + * Copyright (C) 2000 Asit Mallick <Asit.K.Mallick@intel.com> + * Copyright (C) 2000 Goutham Rao <goutham.rao@intel.com> + * Copyright (C) 2000, 2003 Hewlett-Packard Co + * David Mosberger-Tang <davidm@hpl.hp.com> + * + * 03/05/07 davidm Switch from PCI-DMA to generic device DMA API. + * 00/12/13 davidm Rename to swiotlb.c and add mark_clean() to avoid + * unnecessary i-cache flushing. + * 04/07/.. ak Better overflow handling. Assorted fixes. + * 05/09/10 linville Add support for syncing ranges, support syncing for + * DMA_BIDIRECTIONAL mappings, miscellaneous cleanup. + * 08/12/11 beckyb Add highmem support + */ + +#include <linux/cache.h> +#include <linux/dma-mapping.h> +#include <linux/mm.h> +#include <linux/module.h> +#include <linux/spinlock.h> +#include <linux/string.h> +#include <linux/swiotlb.h> +#include <linux/pfn.h> +#include <linux/types.h> +#include <linux/ctype.h> +#include <linux/highmem.h> + +#include <linux/io.h> +#include <asm/dma.h> +#include <linux/scatterlist.h> + +#include <linux/init.h> +#include <linux/bootmem.h> +#include <linux/iommu-helper.h> + +#define OFFSET(val, align) ((unsigned long) ((val) & ((align) - 1))) + +#define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT)) + +/* + * Minimum IO TLB size to bother booting with. Systems with mainly + * 64bit capable cards will only lightly use the swiotlb. If we can''t + * allocate a contiguous 1MB, we''re probably in trouble anyway. 
+ */ +#define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT) + +int swiotlb_force; + +/* + * Used to do a quick range check in do_unmap_single and + * do_sync_single_*, to see if the memory was in fact allocated by this + * API. + */ +char *io_tlb_start, *io_tlb_end; + +/* + * The number of IO TLB blocks (in groups of 64) betweeen io_tlb_start and + * io_tlb_end. This is command line adjustable via setup_io_tlb_npages. + */ +unsigned long io_tlb_nslabs; + +/* + * When the IOMMU overflows we return a fallback buffer. This sets the size. + */ +unsigned long io_tlb_overflow = 32*1024; + +void *io_tlb_overflow_buffer; + +/* + * This is a free list describing the number of free entries available from + * each index + */ +static unsigned int *io_tlb_list; +static unsigned int io_tlb_index; + +/* + * We need to save away the original address corresponding to a mapped entry + * for the sync operations. + */ +static phys_addr_t *io_tlb_orig_addr; + +/* + * Protect the above data structures in the map and unmap calls + */ +static DEFINE_SPINLOCK(io_tlb_lock); + +static int late_alloc; + +static int __init +setup_io_tlb_npages(char *str) +{ + int get_value(const char *token, char *str, char **endp) + { + ssize_t len; + int val = 0; + + len = strlen(token); + if (!strncmp(str, token, len)) { + str += len; + if (*str == ''='') + ++str; + if (*str != ''\0'') + val = simple_strtoul(str, endp, 0); + } + *endp = str; + return val; + } + + int val; + + while (*str) { + /* The old syntax */ + if (isdigit(*str)) { + io_tlb_nslabs = simple_strtoul(str, &str, 0); + /* avoid tail segment of size < IO_TLB_SEGSIZE */ + io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE); + } + if (!strncmp(str, "force", 5)) + swiotlb_force = 1; + /* The new syntax: swiotlb=nslabs=16384,overflow=32768,force */ + val = get_value("nslabs", str, &str); + if (val) + io_tlb_nslabs = ALIGN(val, IO_TLB_SEGSIZE); + + val = get_value("overflow", str, &str); + if (val) + io_tlb_overflow = val; + str = strpbrk(str, ","); + if (!str) + break; + str++; /* skip '','' */ + } + return 1; +} +__setup("swiotlb=", setup_io_tlb_npages); + +/* Note that this doesn''t work with highmem page */ +static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev, + volatile void *address) +{ + return phys_to_dma(hwdev, virt_to_phys(address)); +} + +void swiotlb_print_info(void) +{ + unsigned long bytes = io_tlb_nslabs << IO_TLB_SHIFT; + phys_addr_t pstart, pend; + + pstart = virt_to_phys(io_tlb_start); + pend = virt_to_phys(io_tlb_end); + + printk(KERN_INFO "DMA: Placing %luMB software IO TLB between %p - %p\n", + bytes >> 20, io_tlb_start, io_tlb_end); + printk(KERN_INFO "DMA: software IO TLB at phys %#llx - %#llx\n", + (unsigned long long)pstart, + (unsigned long long)pend); +} + +/* + * Statically reserve bounce buffer space and initialize bounce buffer data + * structures for the software IO TLB used to implement the DMA API. + */ +void __init +swiotlb_init_early(size_t default_size, int verbose) +{ + unsigned long i, bytes; + + if (!io_tlb_nslabs) { + io_tlb_nslabs = (default_size >> IO_TLB_SHIFT); + io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE); + } + + bytes = io_tlb_nslabs << IO_TLB_SHIFT; + + /* + * Get IO TLB memory from the low pages + */ + io_tlb_start = alloc_bootmem_low_pages(bytes); + if (!io_tlb_start) + panic("DMA: Cannot allocate SWIOTLB buffer"); + io_tlb_end = io_tlb_start + bytes; + + /* + * Allocate and initialize the free list array. 
This array is used + * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE + * between io_tlb_start and io_tlb_end. + */ + io_tlb_list = alloc_bootmem(io_tlb_nslabs * sizeof(int)); + for (i = 0; i < io_tlb_nslabs; i++) + io_tlb_list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE); + io_tlb_index = 0; + io_tlb_orig_addr = alloc_bootmem(io_tlb_nslabs * sizeof(phys_addr_t)); + + /* + * Get the overflow emergency buffer + */ + io_tlb_overflow_buffer = alloc_bootmem_low(io_tlb_overflow); + if (!io_tlb_overflow_buffer) + panic("DMA: Cannot allocate SWIOTLB overflow buffer!\n"); + if (verbose) + swiotlb_print_info(); +} + +void __init +swiotlb_init(int verbose) +{ + swiotlb_init_early(64 * (1<<20), verbose); /* default to 64MB */ +} + +/* + * Systems with larger DMA zones (those that don''t support ISA) can + * initialize the swiotlb later using the slab allocator if needed. + * This should be just like above, but with some error catching. + */ +int +swiotlb_init_late(size_t default_size) +{ + unsigned long i, bytes, req_nslabs = io_tlb_nslabs; + unsigned int order; + + if (!io_tlb_nslabs) { + io_tlb_nslabs = (default_size >> IO_TLB_SHIFT); + io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE); + } + + /* + * Get IO TLB memory from the low pages + */ + order = get_order(io_tlb_nslabs << IO_TLB_SHIFT); + io_tlb_nslabs = SLABS_PER_PAGE << order; + bytes = io_tlb_nslabs << IO_TLB_SHIFT; + + while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) { + io_tlb_start = (void *)__get_free_pages(GFP_DMA | __GFP_NOWARN, + order); + if (io_tlb_start) + break; + order--; + } + + if (!io_tlb_start) + goto cleanup1; + + if (order != get_order(bytes)) { + printk(KERN_WARNING "DMA: Warning: only able to allocate %ld MB" + " for software IO TLB\n", (PAGE_SIZE << order) >> 20); + io_tlb_nslabs = SLABS_PER_PAGE << order; + bytes = io_tlb_nslabs << IO_TLB_SHIFT; + } + io_tlb_end = io_tlb_start + bytes; + memset(io_tlb_start, 0, bytes); + + /* + * Allocate and initialize the free list array. This array is used + * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE + * between io_tlb_start and io_tlb_end. 
+ */ + io_tlb_list = (unsigned int *)__get_free_pages(GFP_KERNEL, + get_order(io_tlb_nslabs * sizeof(int))); + if (!io_tlb_list) + goto cleanup2; + + for (i = 0; i < io_tlb_nslabs; i++) + io_tlb_list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE); + io_tlb_index = 0; + + io_tlb_orig_addr = (phys_addr_t *) __get_free_pages(GFP_KERNEL, + get_order(io_tlb_nslabs * sizeof(phys_addr_t))); + if (!io_tlb_orig_addr) + goto cleanup3; + + memset(io_tlb_orig_addr, 0, io_tlb_nslabs * sizeof(phys_addr_t)); + + /* + * Get the overflow emergency buffer + */ + io_tlb_overflow_buffer = (void *)__get_free_pages(GFP_DMA, + get_order(io_tlb_overflow)); + if (!io_tlb_overflow_buffer) + goto cleanup4; + + swiotlb_print_info(); + + late_alloc = 1; + + return 0; + +cleanup4: + free_pages((unsigned long)io_tlb_orig_addr, + get_order(io_tlb_nslabs * sizeof(phys_addr_t))); + io_tlb_orig_addr = NULL; +cleanup3: + free_pages((unsigned long)io_tlb_list, + get_order(io_tlb_nslabs * sizeof(int))); + io_tlb_list = NULL; +cleanup2: + io_tlb_end = NULL; + free_pages((unsigned long)io_tlb_start, order); + io_tlb_start = NULL; +cleanup1: + io_tlb_nslabs = req_nslabs; + return -ENOMEM; +} + +void __init swiotlb_free(void) +{ + if (!io_tlb_overflow_buffer) + return; + + if (late_alloc) { + free_pages((unsigned long)io_tlb_overflow_buffer, + get_order(io_tlb_overflow)); + free_pages((unsigned long)io_tlb_orig_addr, + get_order(io_tlb_nslabs * sizeof(phys_addr_t))); + free_pages((unsigned long)io_tlb_list, get_order(io_tlb_nslabs * + sizeof(int))); + free_pages((unsigned long)io_tlb_start, + get_order(io_tlb_nslabs << IO_TLB_SHIFT)); + } else { + free_bootmem_late(__pa(io_tlb_overflow_buffer), + io_tlb_overflow); + free_bootmem_late(__pa(io_tlb_orig_addr), + io_tlb_nslabs * sizeof(phys_addr_t)); + free_bootmem_late(__pa(io_tlb_list), + io_tlb_nslabs * sizeof(int)); + free_bootmem_late(__pa(io_tlb_start), + io_tlb_nslabs << IO_TLB_SHIFT); + } +} + +int is_swiotlb_buffer(phys_addr_t paddr) +{ + return paddr >= virt_to_phys(io_tlb_start) && + paddr < virt_to_phys(io_tlb_end); +} + +/* + * Bounce: copy the swiotlb buffer back to the original dma location + */ +void swiotlb_bounce(phys_addr_t phys, char *dma_addr, size_t size, + enum dma_data_direction dir) +{ + unsigned long pfn = PFN_DOWN(phys); + + if (PageHighMem(pfn_to_page(pfn))) { + /* The buffer does not have a mapping. Map it in and copy */ + unsigned int offset = phys & ~PAGE_MASK; + char *buffer; + unsigned int sz = 0; + unsigned long flags; + + while (size) { + sz = min_t(size_t, PAGE_SIZE - offset, size); + + local_irq_save(flags); + buffer = kmap_atomic(pfn_to_page(pfn), + KM_BOUNCE_READ); + if (dir == DMA_TO_DEVICE) + memcpy(dma_addr, buffer + offset, sz); + else + memcpy(buffer + offset, dma_addr, sz); + kunmap_atomic(buffer, KM_BOUNCE_READ); + local_irq_restore(flags); + + size -= sz; + pfn++; + dma_addr += sz; + offset = 0; + } + } else { + if (dir == DMA_TO_DEVICE) + memcpy(dma_addr, phys_to_virt(phys), size); + else + memcpy(phys_to_virt(phys), dma_addr, size); + } +} + +/* + * Allocates bounce buffer and returns its kernel virtual address. 
+ */ +void * +do_map_single(struct device *hwdev, phys_addr_t phys, + unsigned long start_dma_addr, size_t size, int dir) +{ + unsigned long flags; + char *dma_addr; + unsigned int nslots, stride, index, wrap; + int i; + unsigned long mask; + unsigned long offset_slots; + unsigned long max_slots; + + mask = dma_get_seg_boundary(hwdev); + start_dma_addr = start_dma_addr & mask; + offset_slots = ALIGN(start_dma_addr, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT; + + /* + * Carefully handle integer overflow which can occur when mask == ~0UL. + */ + max_slots = mask + 1 + ? ALIGN(mask + 1, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT + : 1UL << (BITS_PER_LONG - IO_TLB_SHIFT); + + /* + * For mappings greater than a page, we limit the stride (and + * hence alignment) to a page size. + */ + nslots = ALIGN(size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT; + if (size > PAGE_SIZE) + stride = (1 << (PAGE_SHIFT - IO_TLB_SHIFT)); + else + stride = 1; + + BUG_ON(!nslots); + + /* + * Find suitable number of IO TLB entries size that will fit this + * request and allocate a buffer from that IO TLB pool. + */ + spin_lock_irqsave(&io_tlb_lock, flags); + index = ALIGN(io_tlb_index, stride); + if (index >= io_tlb_nslabs) + index = 0; + wrap = index; + + do { + while (iommu_is_span_boundary(index, nslots, offset_slots, + max_slots)) { + index += stride; + if (index >= io_tlb_nslabs) + index = 0; + if (index == wrap) + goto not_found; + } + + /* + * If we find a slot that indicates we have ''nslots'' number of + * contiguous buffers, we allocate the buffers from that slot + * and mark the entries as ''0'' indicating unavailable. + */ + if (io_tlb_list[index] >= nslots) { + int count = 0; + + for (i = index; i < (int) (index + nslots); i++) + io_tlb_list[i] = 0; + for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) + != IO_TLB_SEGSIZE - 1) && io_tlb_list[i]; i--) + io_tlb_list[i] = ++count; + dma_addr = io_tlb_start + (index << IO_TLB_SHIFT); + + /* + * Update the indices to avoid searching in the next + * round. + */ + io_tlb_index = ((index + nslots) < io_tlb_nslabs + ? (index + nslots) : 0); + + goto found; + } + index += stride; + if (index >= io_tlb_nslabs) + index = 0; + } while (index != wrap); + +not_found: + spin_unlock_irqrestore(&io_tlb_lock, flags); + return NULL; +found: + spin_unlock_irqrestore(&io_tlb_lock, flags); + + /* + * Save away the mapping from the original address to the DMA address. + * This is needed when we sync the memory. Then we sync the buffer if + * needed. + */ + for (i = 0; i < nslots; i++) + io_tlb_orig_addr[index+i] = phys + (i << IO_TLB_SHIFT); + if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL) + swiotlb_bounce(phys, dma_addr, size, DMA_TO_DEVICE); + + return dma_addr; +} + +/* + * dma_addr is the kernel virtual address of the bounce buffer to unmap. + */ +void +do_unmap_single(struct device *hwdev, char *dma_addr, size_t size, int dir) +{ + unsigned long flags; + int i, count, nslots = ALIGN(size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT; + int index = (dma_addr - io_tlb_start) >> IO_TLB_SHIFT; + phys_addr_t phys = io_tlb_orig_addr[index]; + + /* + * First, sync the memory before unmapping the entry + */ + if (phys && ((dir == DMA_FROM_DEVICE) || (dir == DMA_BIDIRECTIONAL))) + swiotlb_bounce(phys, dma_addr, size, DMA_FROM_DEVICE); + + /* + * Return the buffer to the free list by setting the corresponding + * entries to indicate the number of contiguous entries available. + * While returning the entries to the free list, we merge the entries + * with slots below and above the pool being returned. 
+ */ + spin_lock_irqsave(&io_tlb_lock, flags); + { + count = ((index + nslots) < ALIGN(index + 1, IO_TLB_SEGSIZE) ? + io_tlb_list[index + nslots] : 0); + /* + * Step 1: return the slots to the free list, merging the + * slots with superceeding slots + */ + for (i = index + nslots - 1; i >= index; i--) + io_tlb_list[i] = ++count; + /* + * Step 2: merge the returned slots with the preceding slots, + * if available (non zero) + */ + for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) !+ IO_TLB_SEGSIZE - 1) && io_tlb_list[i]; i--) + io_tlb_list[i] = ++count; + } + spin_unlock_irqrestore(&io_tlb_lock, flags); +} + +void +do_sync_single(struct device *hwdev, char *dma_addr, size_t size, + int dir, int target) +{ + int index = (dma_addr - io_tlb_start) >> IO_TLB_SHIFT; + phys_addr_t phys = io_tlb_orig_addr[index]; + + phys += ((unsigned long)dma_addr & ((1 << IO_TLB_SHIFT) - 1)); + + switch (target) { + case SYNC_FOR_CPU: + if (likely(dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL)) + swiotlb_bounce(phys, dma_addr, size, DMA_FROM_DEVICE); + else + BUG_ON(dir != DMA_TO_DEVICE); + break; + case SYNC_FOR_DEVICE: + if (likely(dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)) + swiotlb_bounce(phys, dma_addr, size, DMA_TO_DEVICE); + else + BUG_ON(dir != DMA_FROM_DEVICE); + break; + default: + BUG(); + } +} + +void * +swiotlb_alloc_coherent(struct device *hwdev, size_t size, + dma_addr_t *dma_handle, gfp_t flags) +{ + dma_addr_t dev_addr; + void *ret; + int order = get_order(size); + u64 dma_mask = DMA_BIT_MASK(32); + unsigned long start_dma_addr; + + if (hwdev && hwdev->coherent_dma_mask) + dma_mask = hwdev->coherent_dma_mask; + + ret = (void *)__get_free_pages(flags, order); + if (ret && swiotlb_virt_to_bus(hwdev, ret) + size - 1 > dma_mask) { + /* + * The allocated memory isn''t reachable by the device. + */ + free_pages((unsigned long) ret, order); + ret = NULL; + } + if (!ret) { + /* + * We are either out of memory or the device can''t DMA + * to GFP_DMA memory; fall back on do_map_single(), which + * will grab memory from the lowest available address range. + */ + start_dma_addr = swiotlb_virt_to_bus(hwdev, io_tlb_start); + ret = do_map_single(hwdev, 0, start_dma_addr, size, + DMA_FROM_DEVICE); + if (!ret) + return NULL; + } + + memset(ret, 0, size); + dev_addr = swiotlb_virt_to_bus(hwdev, ret); + + /* Confirm address can be DMA''d by device */ + if (dev_addr + size - 1 > dma_mask) { + dev_err(hwdev, "DMA: hwdev DMA mask = 0x%016Lx, " \ + "dev_addr = 0x%016Lx\n", + (unsigned long long)dma_mask, + (unsigned long long)dev_addr); + + /* DMA_TO_DEVICE to avoid memcpy in do_unmap_single */ + do_unmap_single(hwdev, ret, size, DMA_TO_DEVICE); + return NULL; + } + *dma_handle = dev_addr; + return ret; +} +EXPORT_SYMBOL(swiotlb_alloc_coherent); + +void +swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr, + dma_addr_t dev_addr) +{ + phys_addr_t paddr = dma_to_phys(hwdev, dev_addr); + + WARN_ON(irqs_disabled()); + if (!is_swiotlb_buffer(paddr)) + free_pages((unsigned long)vaddr, get_order(size)); + else + /* DMA_TO_DEVICE to avoid memcpy in do_unmap_single */ + do_unmap_single(hwdev, vaddr, size, DMA_TO_DEVICE); +} +EXPORT_SYMBOL(swiotlb_free_coherent); + +static void +swiotlb_full(struct device *dev, size_t size, int dir, int do_panic) +{ + /* + * Ran out of IOMMU space for this operation. This is very bad. + * Unfortunately the drivers cannot handle this operation properly. 
+ * unless they check for dma_mapping_error (most don''t) + * When the mapping is small enough return a static buffer to limit + * the damage, or panic when the transfer is too big. + */ + dev_err(dev, "DMA: Out of SW-IOMMU space for %zu bytes.", size); + + if (size <= io_tlb_overflow || !do_panic) + return; + + if (dir == DMA_BIDIRECTIONAL) + panic("DMA: Random memory could be DMA accessed\n"); + if (dir == DMA_FROM_DEVICE) + panic("DMA: Random memory could be DMA written\n"); + if (dir == DMA_TO_DEVICE) + panic("DMA: Random memory could be DMA read\n"); +} + +/* + * Map a single buffer of the indicated size for DMA in streaming mode. The + * physical address to use is returned. + * + * Once the device is given the dma address, the device owns this memory until + * either swiotlb_unmap_page or swiotlb_dma_sync_single is performed. + */ +dma_addr_t swiotlb_map_page(struct device *dev, struct page *page, + unsigned long offset, size_t size, + enum dma_data_direction dir, + struct dma_attrs *attrs) +{ + unsigned long start_dma_addr; + phys_addr_t phys = page_to_phys(page) + offset; + dma_addr_t dev_addr = phys_to_dma(dev, phys); + void *map; + + BUG_ON(dir == DMA_NONE); + /* + * If the address happens to be in the device''s DMA window, + * we can safely return the device addr and not worry about bounce + * buffering it. + */ + if (dma_capable(dev, dev_addr, size) && !swiotlb_force) + return dev_addr; + + /* + * Oh well, have to allocate and map a bounce buffer. + */ + start_dma_addr = swiotlb_virt_to_bus(dev, io_tlb_start); + map = do_map_single(dev, phys, start_dma_addr, size, dir); + if (!map) { + swiotlb_full(dev, size, dir, 1); + map = io_tlb_overflow_buffer; + } + + dev_addr = swiotlb_virt_to_bus(dev, map); + + /* + * Ensure that the address returned is DMA''ble + */ + if (!dma_capable(dev, dev_addr, size)) + panic("DMA: swiotlb_map_single: bounce buffer is not DMA''ble"); + + return dev_addr; +} +EXPORT_SYMBOL_GPL(swiotlb_map_page); + +/* + * Unmap a single streaming mode DMA translation. The dma_addr and size must + * match what was provided for in a previous swiotlb_map_page call. All + * other usages are undefined. + * + * After this call, reads by the cpu to the buffer are guaranteed to see + * whatever the device wrote there. + */ +static void unmap_single(struct device *hwdev, dma_addr_t dev_addr, + size_t size, int dir) +{ + phys_addr_t paddr = dma_to_phys(hwdev, dev_addr); + + BUG_ON(dir == DMA_NONE); + + if (is_swiotlb_buffer(paddr)) { + do_unmap_single(hwdev, phys_to_virt(paddr), size, dir); + return; + } + + if (dir != DMA_FROM_DEVICE) + return; + + /* + * phys_to_virt doesn''t work with hihgmem page but we could + * call dma_mark_clean() with hihgmem page here. However, we + * are fine since dma_mark_clean() is null on POWERPC. We can + * make dma_mark_clean() take a physical address if necessary. + */ + dma_mark_clean(phys_to_virt(paddr), size); +} + +void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr, + size_t size, enum dma_data_direction dir, + struct dma_attrs *attrs) +{ + unmap_single(hwdev, dev_addr, size, dir); +} +EXPORT_SYMBOL_GPL(swiotlb_unmap_page); + +/* + * Make physical memory consistent for a single streaming mode DMA translation + * after a transfer. + * + * If you perform a swiotlb_map_page() but wish to interrogate the buffer + * using the cpu, yet do not wish to teardown the dma mapping, you must + * call this function before doing so. 
At the next point you give the dma + * address back to the card, you must first perform a + * swiotlb_dma_sync_for_device, and then the device again owns the buffer + */ +static void +swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr, + size_t size, int dir, int target) +{ + phys_addr_t paddr = dma_to_phys(hwdev, dev_addr); + + BUG_ON(dir == DMA_NONE); + + if (is_swiotlb_buffer(paddr)) { + do_sync_single(hwdev, phys_to_virt(paddr), size, dir, target); + return; + } + + if (dir != DMA_FROM_DEVICE) + return; + + dma_mark_clean(phys_to_virt(paddr), size); +} + +void +swiotlb_sync_single_for_cpu(struct device *hwdev, dma_addr_t dev_addr, + size_t size, enum dma_data_direction dir) +{ + swiotlb_sync_single(hwdev, dev_addr, size, dir, SYNC_FOR_CPU); +} +EXPORT_SYMBOL(swiotlb_sync_single_for_cpu); + +void +swiotlb_sync_single_for_device(struct device *hwdev, dma_addr_t dev_addr, + size_t size, enum dma_data_direction dir) +{ + swiotlb_sync_single(hwdev, dev_addr, size, dir, SYNC_FOR_DEVICE); +} +EXPORT_SYMBOL(swiotlb_sync_single_for_device); + +/* + * Same as above, but for a sub-range of the mapping. + */ +static void +swiotlb_sync_single_range(struct device *hwdev, dma_addr_t dev_addr, + unsigned long offset, size_t size, + int dir, int target) +{ + swiotlb_sync_single(hwdev, dev_addr + offset, size, dir, target); +} + +void +swiotlb_sync_single_range_for_cpu(struct device *hwdev, dma_addr_t dev_addr, + unsigned long offset, size_t size, + enum dma_data_direction dir) +{ + swiotlb_sync_single_range(hwdev, dev_addr, offset, size, dir, + SYNC_FOR_CPU); +} +EXPORT_SYMBOL_GPL(swiotlb_sync_single_range_for_cpu); + +void +swiotlb_sync_single_range_for_device(struct device *hwdev, dma_addr_t dev_addr, + unsigned long offset, size_t size, + enum dma_data_direction dir) +{ + swiotlb_sync_single_range(hwdev, dev_addr, offset, size, dir, + SYNC_FOR_DEVICE); +} +EXPORT_SYMBOL_GPL(swiotlb_sync_single_range_for_device); + +/* + * Map a set of buffers described by scatterlist in streaming mode for DMA. + * This is the scatter-gather version of the above swiotlb_map_page + * interface. Here the scatter gather list elements are each tagged with the + * appropriate dma address and length. They are obtained via + * sg_dma_{address,length}(SG). + * + * NOTE: An implementation may be able to use a smaller number of + * DMA address/length pairs than there are SG table elements. + * (for example via virtual mapping capabilities) + * The routine returns the number of addr/length pairs actually + * used, at most nents. + * + * Device ownership issues as mentioned above for swiotlb_map_page are the + * same here. + */ +int +swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, int nelems, + enum dma_data_direction dir, struct dma_attrs *attrs) +{ + unsigned long start_dma_addr; + struct scatterlist *sg; + int i; + + BUG_ON(dir == DMA_NONE); + + start_dma_addr = swiotlb_virt_to_bus(hwdev, io_tlb_start); + for_each_sg(sgl, sg, nelems, i) { + phys_addr_t paddr = sg_phys(sg); + dma_addr_t dev_addr = phys_to_dma(hwdev, paddr); + + if (swiotlb_force || + !dma_capable(hwdev, dev_addr, sg->length)) { + void *map = do_map_single(hwdev, sg_phys(sg), + start_dma_addr, + sg->length, dir); + if (!map) { + /* Don''t panic here, we expect map_sg users + to do proper error handling. 
*/ + swiotlb_full(hwdev, sg->length, dir, 0); + swiotlb_unmap_sg_attrs(hwdev, sgl, i, dir, + attrs); + sgl[0].dma_length = 0; + return 0; + } + sg->dma_address = swiotlb_virt_to_bus(hwdev, map); + } else + sg->dma_address = dev_addr; + sg->dma_length = sg->length; + } + return nelems; +} +EXPORT_SYMBOL(swiotlb_map_sg_attrs); + +int +swiotlb_map_sg(struct device *hwdev, struct scatterlist *sgl, int nelems, + int dir) +{ + return swiotlb_map_sg_attrs(hwdev, sgl, nelems, dir, NULL); +} +EXPORT_SYMBOL(swiotlb_map_sg); + +/* + * Unmap a set of streaming mode DMA translations. Again, cpu read rules + * concerning calls here are the same as for swiotlb_unmap_page() above. + */ +void +swiotlb_unmap_sg_attrs(struct device *hwdev, struct scatterlist *sgl, + int nelems, enum dma_data_direction dir, + struct dma_attrs *attrs) +{ + struct scatterlist *sg; + int i; + + BUG_ON(dir == DMA_NONE); + + for_each_sg(sgl, sg, nelems, i) + unmap_single(hwdev, sg->dma_address, sg->dma_length, dir); + +} +EXPORT_SYMBOL(swiotlb_unmap_sg_attrs); + +void +swiotlb_unmap_sg(struct device *hwdev, struct scatterlist *sgl, int nelems, + int dir) +{ + return swiotlb_unmap_sg_attrs(hwdev, sgl, nelems, dir, NULL); +} +EXPORT_SYMBOL(swiotlb_unmap_sg); + +/* + * Make physical memory consistent for a set of streaming mode DMA translations + * after a transfer. + * + * The same as swiotlb_sync_single_* but for a scatter-gather list, same rules + * and usage. + */ +static void +swiotlb_sync_sg(struct device *hwdev, struct scatterlist *sgl, + int nelems, int dir, int target) +{ + struct scatterlist *sg; + int i; + + for_each_sg(sgl, sg, nelems, i) + swiotlb_sync_single(hwdev, sg->dma_address, + sg->dma_length, dir, target); +} + +void +swiotlb_sync_sg_for_cpu(struct device *hwdev, struct scatterlist *sg, + int nelems, enum dma_data_direction dir) +{ + swiotlb_sync_sg(hwdev, sg, nelems, dir, SYNC_FOR_CPU); +} +EXPORT_SYMBOL(swiotlb_sync_sg_for_cpu); + +void +swiotlb_sync_sg_for_device(struct device *hwdev, struct scatterlist *sg, + int nelems, enum dma_data_direction dir) +{ + swiotlb_sync_sg(hwdev, sg, nelems, dir, SYNC_FOR_DEVICE); +} +EXPORT_SYMBOL(swiotlb_sync_sg_for_device); + +int +swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t dma_addr) +{ + return (dma_addr == swiotlb_virt_to_bus(hwdev, io_tlb_overflow_buffer)); +} +EXPORT_SYMBOL(swiotlb_dma_mapping_error); + +/* + * Return whether the given device DMA address mask can be supported + * properly. For example, if your device can only drive the low 24-bits + * during bus mastering, then you would pass 0x00ffffff as the mask to + * this function. + */ +int +swiotlb_dma_supported(struct device *hwdev, u64 mask) +{ + return swiotlb_virt_to_bus(hwdev, io_tlb_end - 1) <= mask; +} +EXPORT_SYMBOL(swiotlb_dma_supported); diff --git a/lib/swiotlb.c b/lib/swiotlb.c deleted file mode 100644 index c982d33..0000000 --- a/lib/swiotlb.c +++ /dev/null @@ -1,957 +0,0 @@ -/* - * Dynamic DMA mapping support. - * - * This implementation is a fallback for platforms that do not support - * I/O TLBs (aka DMA address translation hardware). - * Copyright (C) 2000 Asit Mallick <Asit.K.Mallick@intel.com> - * Copyright (C) 2000 Goutham Rao <goutham.rao@intel.com> - * Copyright (C) 2000, 2003 Hewlett-Packard Co - * David Mosberger-Tang <davidm@hpl.hp.com> - * - * 03/05/07 davidm Switch from PCI-DMA to generic device DMA API. - * 00/12/13 davidm Rename to swiotlb.c and add mark_clean() to avoid - * unnecessary i-cache flushing. - * 04/07/.. ak Better overflow handling. Assorted fixes. 
- * 05/09/10 linville Add support for syncing ranges, support syncing for - * DMA_BIDIRECTIONAL mappings, miscellaneous cleanup. - * 08/12/11 beckyb Add highmem support - */ - -#include <linux/cache.h> -#include <linux/dma-mapping.h> -#include <linux/mm.h> -#include <linux/module.h> -#include <linux/spinlock.h> -#include <linux/string.h> -#include <linux/swiotlb.h> -#include <linux/pfn.h> -#include <linux/types.h> -#include <linux/ctype.h> -#include <linux/highmem.h> - -#include <linux/io.h> -#include <asm/dma.h> -#include <linux/scatterlist.h> - -#include <linux/init.h> -#include <linux/bootmem.h> -#include <linux/iommu-helper.h> - -#define OFFSET(val, align) ((unsigned long) ((val) & ((align) - 1))) - -#define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT)) - -/* - * Minimum IO TLB size to bother booting with. Systems with mainly - * 64bit capable cards will only lightly use the swiotlb. If we can''t - * allocate a contiguous 1MB, we''re probably in trouble anyway. - */ -#define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT) - -int swiotlb_force; - -/* - * Used to do a quick range check in do_unmap_single and - * do_sync_single_*, to see if the memory was in fact allocated by this - * API. - */ -char *io_tlb_start, *io_tlb_end; - -/* - * The number of IO TLB blocks (in groups of 64) betweeen io_tlb_start and - * io_tlb_end. This is command line adjustable via setup_io_tlb_npages. - */ -unsigned long io_tlb_nslabs; - -/* - * When the IOMMU overflows we return a fallback buffer. This sets the size. - */ -unsigned long io_tlb_overflow = 32*1024; - -void *io_tlb_overflow_buffer; - -/* - * This is a free list describing the number of free entries available from - * each index - */ -static unsigned int *io_tlb_list; -static unsigned int io_tlb_index; - -/* - * We need to save away the original address corresponding to a mapped entry - * for the sync operations. 
- */ -static phys_addr_t *io_tlb_orig_addr; - -/* - * Protect the above data structures in the map and unmap calls - */ -static DEFINE_SPINLOCK(io_tlb_lock); - -static int late_alloc; - -static int __init -setup_io_tlb_npages(char *str) -{ - int get_value(const char *token, char *str, char **endp) - { - ssize_t len; - int val = 0; - - len = strlen(token); - if (!strncmp(str, token, len)) { - str += len; - if (*str == ''='') - ++str; - if (*str != ''\0'') - val = simple_strtoul(str, endp, 0); - } - *endp = str; - return val; - } - - int val; - - while (*str) { - /* The old syntax */ - if (isdigit(*str)) { - io_tlb_nslabs = simple_strtoul(str, &str, 0); - /* avoid tail segment of size < IO_TLB_SEGSIZE */ - io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE); - } - if (!strncmp(str, "force", 5)) - swiotlb_force = 1; - /* The new syntax: swiotlb=nslabs=16384,overflow=32768,force */ - val = get_value("nslabs", str, &str); - if (val) - io_tlb_nslabs = ALIGN(val, IO_TLB_SEGSIZE); - - val = get_value("overflow", str, &str); - if (val) - io_tlb_overflow = val; - str = strpbrk(str, ","); - if (!str) - break; - str++; /* skip '','' */ - } - return 1; -} -__setup("swiotlb=", setup_io_tlb_npages); - -/* Note that this doesn''t work with highmem page */ -static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev, - volatile void *address) -{ - return phys_to_dma(hwdev, virt_to_phys(address)); -} - -void swiotlb_print_info(void) -{ - unsigned long bytes = io_tlb_nslabs << IO_TLB_SHIFT; - phys_addr_t pstart, pend; - - pstart = virt_to_phys(io_tlb_start); - pend = virt_to_phys(io_tlb_end); - - printk(KERN_INFO "DMA: Placing %luMB software IO TLB between %p - %p\n", - bytes >> 20, io_tlb_start, io_tlb_end); - printk(KERN_INFO "DMA: software IO TLB at phys %#llx - %#llx\n", - (unsigned long long)pstart, - (unsigned long long)pend); -} - -/* - * Statically reserve bounce buffer space and initialize bounce buffer data - * structures for the software IO TLB used to implement the DMA API. - */ -void __init -swiotlb_init_early(size_t default_size, int verbose) -{ - unsigned long i, bytes; - - if (!io_tlb_nslabs) { - io_tlb_nslabs = (default_size >> IO_TLB_SHIFT); - io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE); - } - - bytes = io_tlb_nslabs << IO_TLB_SHIFT; - - /* - * Get IO TLB memory from the low pages - */ - io_tlb_start = alloc_bootmem_low_pages(bytes); - if (!io_tlb_start) - panic("DMA: Cannot allocate SWIOTLB buffer"); - io_tlb_end = io_tlb_start + bytes; - - /* - * Allocate and initialize the free list array. This array is used - * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE - * between io_tlb_start and io_tlb_end. - */ - io_tlb_list = alloc_bootmem(io_tlb_nslabs * sizeof(int)); - for (i = 0; i < io_tlb_nslabs; i++) - io_tlb_list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE); - io_tlb_index = 0; - io_tlb_orig_addr = alloc_bootmem(io_tlb_nslabs * sizeof(phys_addr_t)); - - /* - * Get the overflow emergency buffer - */ - io_tlb_overflow_buffer = alloc_bootmem_low(io_tlb_overflow); - if (!io_tlb_overflow_buffer) - panic("DMA: Cannot allocate SWIOTLB overflow buffer!\n"); - if (verbose) - swiotlb_print_info(); -} - -void __init -swiotlb_init(int verbose) -{ - swiotlb_init_early(64 * (1<<20), verbose); /* default to 64MB */ -} - -/* - * Systems with larger DMA zones (those that don''t support ISA) can - * initialize the swiotlb later using the slab allocator if needed. - * This should be just like above, but with some error catching. 
- */ -int -swiotlb_init_late(size_t default_size) -{ - unsigned long i, bytes, req_nslabs = io_tlb_nslabs; - unsigned int order; - - if (!io_tlb_nslabs) { - io_tlb_nslabs = (default_size >> IO_TLB_SHIFT); - io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE); - } - - /* - * Get IO TLB memory from the low pages - */ - order = get_order(io_tlb_nslabs << IO_TLB_SHIFT); - io_tlb_nslabs = SLABS_PER_PAGE << order; - bytes = io_tlb_nslabs << IO_TLB_SHIFT; - - while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) { - io_tlb_start = (void *)__get_free_pages(GFP_DMA | __GFP_NOWARN, - order); - if (io_tlb_start) - break; - order--; - } - - if (!io_tlb_start) - goto cleanup1; - - if (order != get_order(bytes)) { - printk(KERN_WARNING "DMA: Warning: only able to allocate %ld MB" - " for software IO TLB\n", (PAGE_SIZE << order) >> 20); - io_tlb_nslabs = SLABS_PER_PAGE << order; - bytes = io_tlb_nslabs << IO_TLB_SHIFT; - } - io_tlb_end = io_tlb_start + bytes; - memset(io_tlb_start, 0, bytes); - - /* - * Allocate and initialize the free list array. This array is used - * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE - * between io_tlb_start and io_tlb_end. - */ - io_tlb_list = (unsigned int *)__get_free_pages(GFP_KERNEL, - get_order(io_tlb_nslabs * sizeof(int))); - if (!io_tlb_list) - goto cleanup2; - - for (i = 0; i < io_tlb_nslabs; i++) - io_tlb_list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE); - io_tlb_index = 0; - - io_tlb_orig_addr = (phys_addr_t *) __get_free_pages(GFP_KERNEL, - get_order(io_tlb_nslabs * sizeof(phys_addr_t))); - if (!io_tlb_orig_addr) - goto cleanup3; - - memset(io_tlb_orig_addr, 0, io_tlb_nslabs * sizeof(phys_addr_t)); - - /* - * Get the overflow emergency buffer - */ - io_tlb_overflow_buffer = (void *)__get_free_pages(GFP_DMA, - get_order(io_tlb_overflow)); - if (!io_tlb_overflow_buffer) - goto cleanup4; - - swiotlb_print_info(); - - late_alloc = 1; - - return 0; - -cleanup4: - free_pages((unsigned long)io_tlb_orig_addr, - get_order(io_tlb_nslabs * sizeof(phys_addr_t))); - io_tlb_orig_addr = NULL; -cleanup3: - free_pages((unsigned long)io_tlb_list, - get_order(io_tlb_nslabs * sizeof(int))); - io_tlb_list = NULL; -cleanup2: - io_tlb_end = NULL; - free_pages((unsigned long)io_tlb_start, order); - io_tlb_start = NULL; -cleanup1: - io_tlb_nslabs = req_nslabs; - return -ENOMEM; -} - -void __init swiotlb_free(void) -{ - if (!io_tlb_overflow_buffer) - return; - - if (late_alloc) { - free_pages((unsigned long)io_tlb_overflow_buffer, - get_order(io_tlb_overflow)); - free_pages((unsigned long)io_tlb_orig_addr, - get_order(io_tlb_nslabs * sizeof(phys_addr_t))); - free_pages((unsigned long)io_tlb_list, get_order(io_tlb_nslabs * - sizeof(int))); - free_pages((unsigned long)io_tlb_start, - get_order(io_tlb_nslabs << IO_TLB_SHIFT)); - } else { - free_bootmem_late(__pa(io_tlb_overflow_buffer), - io_tlb_overflow); - free_bootmem_late(__pa(io_tlb_orig_addr), - io_tlb_nslabs * sizeof(phys_addr_t)); - free_bootmem_late(__pa(io_tlb_list), - io_tlb_nslabs * sizeof(int)); - free_bootmem_late(__pa(io_tlb_start), - io_tlb_nslabs << IO_TLB_SHIFT); - } -} - -int is_swiotlb_buffer(phys_addr_t paddr) -{ - return paddr >= virt_to_phys(io_tlb_start) && - paddr < virt_to_phys(io_tlb_end); -} - -/* - * Bounce: copy the swiotlb buffer back to the original dma location - */ -void swiotlb_bounce(phys_addr_t phys, char *dma_addr, size_t size, - enum dma_data_direction dir) -{ - unsigned long pfn = PFN_DOWN(phys); - - if (PageHighMem(pfn_to_page(pfn))) { - /* The buffer does not have a 
mapping. Map it in and copy */ - unsigned int offset = phys & ~PAGE_MASK; - char *buffer; - unsigned int sz = 0; - unsigned long flags; - - while (size) { - sz = min_t(size_t, PAGE_SIZE - offset, size); - - local_irq_save(flags); - buffer = kmap_atomic(pfn_to_page(pfn), - KM_BOUNCE_READ); - if (dir == DMA_TO_DEVICE) - memcpy(dma_addr, buffer + offset, sz); - else - memcpy(buffer + offset, dma_addr, sz); - kunmap_atomic(buffer, KM_BOUNCE_READ); - local_irq_restore(flags); - - size -= sz; - pfn++; - dma_addr += sz; - offset = 0; - } - } else { - if (dir == DMA_TO_DEVICE) - memcpy(dma_addr, phys_to_virt(phys), size); - else - memcpy(phys_to_virt(phys), dma_addr, size); - } -} - -/* - * Allocates bounce buffer and returns its kernel virtual address. - */ -void * -do_map_single(struct device *hwdev, phys_addr_t phys, - unsigned long start_dma_addr, size_t size, int dir) -{ - unsigned long flags; - char *dma_addr; - unsigned int nslots, stride, index, wrap; - int i; - unsigned long mask; - unsigned long offset_slots; - unsigned long max_slots; - - mask = dma_get_seg_boundary(hwdev); - start_dma_addr = start_dma_addr & mask; - offset_slots = ALIGN(start_dma_addr, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT; - - /* - * Carefully handle integer overflow which can occur when mask == ~0UL. - */ - max_slots = mask + 1 - ? ALIGN(mask + 1, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT - : 1UL << (BITS_PER_LONG - IO_TLB_SHIFT); - - /* - * For mappings greater than a page, we limit the stride (and - * hence alignment) to a page size. - */ - nslots = ALIGN(size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT; - if (size > PAGE_SIZE) - stride = (1 << (PAGE_SHIFT - IO_TLB_SHIFT)); - else - stride = 1; - - BUG_ON(!nslots); - - /* - * Find suitable number of IO TLB entries size that will fit this - * request and allocate a buffer from that IO TLB pool. - */ - spin_lock_irqsave(&io_tlb_lock, flags); - index = ALIGN(io_tlb_index, stride); - if (index >= io_tlb_nslabs) - index = 0; - wrap = index; - - do { - while (iommu_is_span_boundary(index, nslots, offset_slots, - max_slots)) { - index += stride; - if (index >= io_tlb_nslabs) - index = 0; - if (index == wrap) - goto not_found; - } - - /* - * If we find a slot that indicates we have ''nslots'' number of - * contiguous buffers, we allocate the buffers from that slot - * and mark the entries as ''0'' indicating unavailable. - */ - if (io_tlb_list[index] >= nslots) { - int count = 0; - - for (i = index; i < (int) (index + nslots); i++) - io_tlb_list[i] = 0; - for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) - != IO_TLB_SEGSIZE - 1) && io_tlb_list[i]; i--) - io_tlb_list[i] = ++count; - dma_addr = io_tlb_start + (index << IO_TLB_SHIFT); - - /* - * Update the indices to avoid searching in the next - * round. - */ - io_tlb_index = ((index + nslots) < io_tlb_nslabs - ? (index + nslots) : 0); - - goto found; - } - index += stride; - if (index >= io_tlb_nslabs) - index = 0; - } while (index != wrap); - -not_found: - spin_unlock_irqrestore(&io_tlb_lock, flags); - return NULL; -found: - spin_unlock_irqrestore(&io_tlb_lock, flags); - - /* - * Save away the mapping from the original address to the DMA address. - * This is needed when we sync the memory. Then we sync the buffer if - * needed. - */ - for (i = 0; i < nslots; i++) - io_tlb_orig_addr[index+i] = phys + (i << IO_TLB_SHIFT); - if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL) - swiotlb_bounce(phys, dma_addr, size, DMA_TO_DEVICE); - - return dma_addr; -} - -/* - * dma_addr is the kernel virtual address of the bounce buffer to unmap. 
- */ -void -do_unmap_single(struct device *hwdev, char *dma_addr, size_t size, int dir) -{ - unsigned long flags; - int i, count, nslots = ALIGN(size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT; - int index = (dma_addr - io_tlb_start) >> IO_TLB_SHIFT; - phys_addr_t phys = io_tlb_orig_addr[index]; - - /* - * First, sync the memory before unmapping the entry - */ - if (phys && ((dir == DMA_FROM_DEVICE) || (dir == DMA_BIDIRECTIONAL))) - swiotlb_bounce(phys, dma_addr, size, DMA_FROM_DEVICE); - - /* - * Return the buffer to the free list by setting the corresponding - * entries to indicate the number of contiguous entries available. - * While returning the entries to the free list, we merge the entries - * with slots below and above the pool being returned. - */ - spin_lock_irqsave(&io_tlb_lock, flags); - { - count = ((index + nslots) < ALIGN(index + 1, IO_TLB_SEGSIZE) ? - io_tlb_list[index + nslots] : 0); - /* - * Step 1: return the slots to the free list, merging the - * slots with superceeding slots - */ - for (i = index + nslots - 1; i >= index; i--) - io_tlb_list[i] = ++count; - /* - * Step 2: merge the returned slots with the preceding slots, - * if available (non zero) - */ - for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) !- IO_TLB_SEGSIZE - 1) && io_tlb_list[i]; i--) - io_tlb_list[i] = ++count; - } - spin_unlock_irqrestore(&io_tlb_lock, flags); -} - -void -do_sync_single(struct device *hwdev, char *dma_addr, size_t size, - int dir, int target) -{ - int index = (dma_addr - io_tlb_start) >> IO_TLB_SHIFT; - phys_addr_t phys = io_tlb_orig_addr[index]; - - phys += ((unsigned long)dma_addr & ((1 << IO_TLB_SHIFT) - 1)); - - switch (target) { - case SYNC_FOR_CPU: - if (likely(dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL)) - swiotlb_bounce(phys, dma_addr, size, DMA_FROM_DEVICE); - else - BUG_ON(dir != DMA_TO_DEVICE); - break; - case SYNC_FOR_DEVICE: - if (likely(dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)) - swiotlb_bounce(phys, dma_addr, size, DMA_TO_DEVICE); - else - BUG_ON(dir != DMA_FROM_DEVICE); - break; - default: - BUG(); - } -} - -void * -swiotlb_alloc_coherent(struct device *hwdev, size_t size, - dma_addr_t *dma_handle, gfp_t flags) -{ - dma_addr_t dev_addr; - void *ret; - int order = get_order(size); - u64 dma_mask = DMA_BIT_MASK(32); - unsigned long start_dma_addr; - - if (hwdev && hwdev->coherent_dma_mask) - dma_mask = hwdev->coherent_dma_mask; - - ret = (void *)__get_free_pages(flags, order); - if (ret && swiotlb_virt_to_bus(hwdev, ret) + size - 1 > dma_mask) { - /* - * The allocated memory isn''t reachable by the device. - */ - free_pages((unsigned long) ret, order); - ret = NULL; - } - if (!ret) { - /* - * We are either out of memory or the device can''t DMA - * to GFP_DMA memory; fall back on do_map_single(), which - * will grab memory from the lowest available address range. 
- */ - start_dma_addr = swiotlb_virt_to_bus(hwdev, io_tlb_start); - ret = do_map_single(hwdev, 0, start_dma_addr, size, - DMA_FROM_DEVICE); - if (!ret) - return NULL; - } - - memset(ret, 0, size); - dev_addr = swiotlb_virt_to_bus(hwdev, ret); - - /* Confirm address can be DMA''d by device */ - if (dev_addr + size - 1 > dma_mask) { - dev_err(hwdev, "DMA: hwdev DMA mask = 0x%016Lx, " \ - "dev_addr = 0x%016Lx\n", - (unsigned long long)dma_mask, - (unsigned long long)dev_addr); - - /* DMA_TO_DEVICE to avoid memcpy in do_unmap_single */ - do_unmap_single(hwdev, ret, size, DMA_TO_DEVICE); - return NULL; - } - *dma_handle = dev_addr; - return ret; -} -EXPORT_SYMBOL(swiotlb_alloc_coherent); - -void -swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr, - dma_addr_t dev_addr) -{ - phys_addr_t paddr = dma_to_phys(hwdev, dev_addr); - - WARN_ON(irqs_disabled()); - if (!is_swiotlb_buffer(paddr)) - free_pages((unsigned long)vaddr, get_order(size)); - else - /* DMA_TO_DEVICE to avoid memcpy in do_unmap_single */ - do_unmap_single(hwdev, vaddr, size, DMA_TO_DEVICE); -} -EXPORT_SYMBOL(swiotlb_free_coherent); - -static void -swiotlb_full(struct device *dev, size_t size, int dir, int do_panic) -{ - /* - * Ran out of IOMMU space for this operation. This is very bad. - * Unfortunately the drivers cannot handle this operation properly. - * unless they check for dma_mapping_error (most don''t) - * When the mapping is small enough return a static buffer to limit - * the damage, or panic when the transfer is too big. - */ - dev_err(dev, "DMA: Out of SW-IOMMU space for %zu bytes.", size); - - if (size <= io_tlb_overflow || !do_panic) - return; - - if (dir == DMA_BIDIRECTIONAL) - panic("DMA: Random memory could be DMA accessed\n"); - if (dir == DMA_FROM_DEVICE) - panic("DMA: Random memory could be DMA written\n"); - if (dir == DMA_TO_DEVICE) - panic("DMA: Random memory could be DMA read\n"); -} - -/* - * Map a single buffer of the indicated size for DMA in streaming mode. The - * physical address to use is returned. - * - * Once the device is given the dma address, the device owns this memory until - * either swiotlb_unmap_page or swiotlb_dma_sync_single is performed. - */ -dma_addr_t swiotlb_map_page(struct device *dev, struct page *page, - unsigned long offset, size_t size, - enum dma_data_direction dir, - struct dma_attrs *attrs) -{ - unsigned long start_dma_addr; - phys_addr_t phys = page_to_phys(page) + offset; - dma_addr_t dev_addr = phys_to_dma(dev, phys); - void *map; - - BUG_ON(dir == DMA_NONE); - /* - * If the address happens to be in the device''s DMA window, - * we can safely return the device addr and not worry about bounce - * buffering it. - */ - if (dma_capable(dev, dev_addr, size) && !swiotlb_force) - return dev_addr; - - /* - * Oh well, have to allocate and map a bounce buffer. - */ - start_dma_addr = swiotlb_virt_to_bus(dev, io_tlb_start); - map = do_map_single(dev, phys, start_dma_addr, size, dir); - if (!map) { - swiotlb_full(dev, size, dir, 1); - map = io_tlb_overflow_buffer; - } - - dev_addr = swiotlb_virt_to_bus(dev, map); - - /* - * Ensure that the address returned is DMA''ble - */ - if (!dma_capable(dev, dev_addr, size)) - panic("DMA: swiotlb_map_single: bounce buffer is not DMA''ble"); - - return dev_addr; -} -EXPORT_SYMBOL_GPL(swiotlb_map_page); - -/* - * Unmap a single streaming mode DMA translation. The dma_addr and size must - * match what was provided for in a previous swiotlb_map_page call. All - * other usages are undefined. 
- * - * After this call, reads by the cpu to the buffer are guaranteed to see - * whatever the device wrote there. - */ -static void unmap_single(struct device *hwdev, dma_addr_t dev_addr, - size_t size, int dir) -{ - phys_addr_t paddr = dma_to_phys(hwdev, dev_addr); - - BUG_ON(dir == DMA_NONE); - - if (is_swiotlb_buffer(paddr)) { - do_unmap_single(hwdev, phys_to_virt(paddr), size, dir); - return; - } - - if (dir != DMA_FROM_DEVICE) - return; - - /* - * phys_to_virt doesn''t work with hihgmem page but we could - * call dma_mark_clean() with hihgmem page here. However, we - * are fine since dma_mark_clean() is null on POWERPC. We can - * make dma_mark_clean() take a physical address if necessary. - */ - dma_mark_clean(phys_to_virt(paddr), size); -} - -void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr, - size_t size, enum dma_data_direction dir, - struct dma_attrs *attrs) -{ - unmap_single(hwdev, dev_addr, size, dir); -} -EXPORT_SYMBOL_GPL(swiotlb_unmap_page); - -/* - * Make physical memory consistent for a single streaming mode DMA translation - * after a transfer. - * - * If you perform a swiotlb_map_page() but wish to interrogate the buffer - * using the cpu, yet do not wish to teardown the dma mapping, you must - * call this function before doing so. At the next point you give the dma - * address back to the card, you must first perform a - * swiotlb_dma_sync_for_device, and then the device again owns the buffer - */ -static void -swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr, - size_t size, int dir, int target) -{ - phys_addr_t paddr = dma_to_phys(hwdev, dev_addr); - - BUG_ON(dir == DMA_NONE); - - if (is_swiotlb_buffer(paddr)) { - do_sync_single(hwdev, phys_to_virt(paddr), size, dir, target); - return; - } - - if (dir != DMA_FROM_DEVICE) - return; - - dma_mark_clean(phys_to_virt(paddr), size); -} - -void -swiotlb_sync_single_for_cpu(struct device *hwdev, dma_addr_t dev_addr, - size_t size, enum dma_data_direction dir) -{ - swiotlb_sync_single(hwdev, dev_addr, size, dir, SYNC_FOR_CPU); -} -EXPORT_SYMBOL(swiotlb_sync_single_for_cpu); - -void -swiotlb_sync_single_for_device(struct device *hwdev, dma_addr_t dev_addr, - size_t size, enum dma_data_direction dir) -{ - swiotlb_sync_single(hwdev, dev_addr, size, dir, SYNC_FOR_DEVICE); -} -EXPORT_SYMBOL(swiotlb_sync_single_for_device); - -/* - * Same as above, but for a sub-range of the mapping. - */ -static void -swiotlb_sync_single_range(struct device *hwdev, dma_addr_t dev_addr, - unsigned long offset, size_t size, - int dir, int target) -{ - swiotlb_sync_single(hwdev, dev_addr + offset, size, dir, target); -} - -void -swiotlb_sync_single_range_for_cpu(struct device *hwdev, dma_addr_t dev_addr, - unsigned long offset, size_t size, - enum dma_data_direction dir) -{ - swiotlb_sync_single_range(hwdev, dev_addr, offset, size, dir, - SYNC_FOR_CPU); -} -EXPORT_SYMBOL_GPL(swiotlb_sync_single_range_for_cpu); - -void -swiotlb_sync_single_range_for_device(struct device *hwdev, dma_addr_t dev_addr, - unsigned long offset, size_t size, - enum dma_data_direction dir) -{ - swiotlb_sync_single_range(hwdev, dev_addr, offset, size, dir, - SYNC_FOR_DEVICE); -} -EXPORT_SYMBOL_GPL(swiotlb_sync_single_range_for_device); - -/* - * Map a set of buffers described by scatterlist in streaming mode for DMA. - * This is the scatter-gather version of the above swiotlb_map_page - * interface. Here the scatter gather list elements are each tagged with the - * appropriate dma address and length. 
They are obtained via - * sg_dma_{address,length}(SG). - * - * NOTE: An implementation may be able to use a smaller number of - * DMA address/length pairs than there are SG table elements. - * (for example via virtual mapping capabilities) - * The routine returns the number of addr/length pairs actually - * used, at most nents. - * - * Device ownership issues as mentioned above for swiotlb_map_page are the - * same here. - */ -int -swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, int nelems, - enum dma_data_direction dir, struct dma_attrs *attrs) -{ - unsigned long start_dma_addr; - struct scatterlist *sg; - int i; - - BUG_ON(dir == DMA_NONE); - - start_dma_addr = swiotlb_virt_to_bus(hwdev, io_tlb_start); - for_each_sg(sgl, sg, nelems, i) { - phys_addr_t paddr = sg_phys(sg); - dma_addr_t dev_addr = phys_to_dma(hwdev, paddr); - - if (swiotlb_force || - !dma_capable(hwdev, dev_addr, sg->length)) { - void *map = do_map_single(hwdev, sg_phys(sg), - start_dma_addr, - sg->length, dir); - if (!map) { - /* Don''t panic here, we expect map_sg users - to do proper error handling. */ - swiotlb_full(hwdev, sg->length, dir, 0); - swiotlb_unmap_sg_attrs(hwdev, sgl, i, dir, - attrs); - sgl[0].dma_length = 0; - return 0; - } - sg->dma_address = swiotlb_virt_to_bus(hwdev, map); - } else - sg->dma_address = dev_addr; - sg->dma_length = sg->length; - } - return nelems; -} -EXPORT_SYMBOL(swiotlb_map_sg_attrs); - -int -swiotlb_map_sg(struct device *hwdev, struct scatterlist *sgl, int nelems, - int dir) -{ - return swiotlb_map_sg_attrs(hwdev, sgl, nelems, dir, NULL); -} -EXPORT_SYMBOL(swiotlb_map_sg); - -/* - * Unmap a set of streaming mode DMA translations. Again, cpu read rules - * concerning calls here are the same as for swiotlb_unmap_page() above. - */ -void -swiotlb_unmap_sg_attrs(struct device *hwdev, struct scatterlist *sgl, - int nelems, enum dma_data_direction dir, - struct dma_attrs *attrs) -{ - struct scatterlist *sg; - int i; - - BUG_ON(dir == DMA_NONE); - - for_each_sg(sgl, sg, nelems, i) - unmap_single(hwdev, sg->dma_address, sg->dma_length, dir); - -} -EXPORT_SYMBOL(swiotlb_unmap_sg_attrs); - -void -swiotlb_unmap_sg(struct device *hwdev, struct scatterlist *sgl, int nelems, - int dir) -{ - return swiotlb_unmap_sg_attrs(hwdev, sgl, nelems, dir, NULL); -} -EXPORT_SYMBOL(swiotlb_unmap_sg); - -/* - * Make physical memory consistent for a set of streaming mode DMA translations - * after a transfer. - * - * The same as swiotlb_sync_single_* but for a scatter-gather list, same rules - * and usage. 
- */ -static void -swiotlb_sync_sg(struct device *hwdev, struct scatterlist *sgl, - int nelems, int dir, int target) -{ - struct scatterlist *sg; - int i; - - for_each_sg(sgl, sg, nelems, i) - swiotlb_sync_single(hwdev, sg->dma_address, - sg->dma_length, dir, target); -} - -void -swiotlb_sync_sg_for_cpu(struct device *hwdev, struct scatterlist *sg, - int nelems, enum dma_data_direction dir) -{ - swiotlb_sync_sg(hwdev, sg, nelems, dir, SYNC_FOR_CPU); -} -EXPORT_SYMBOL(swiotlb_sync_sg_for_cpu); - -void -swiotlb_sync_sg_for_device(struct device *hwdev, struct scatterlist *sg, - int nelems, enum dma_data_direction dir) -{ - swiotlb_sync_sg(hwdev, sg, nelems, dir, SYNC_FOR_DEVICE); -} -EXPORT_SYMBOL(swiotlb_sync_sg_for_device); - -int -swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t dma_addr) -{ - return (dma_addr == swiotlb_virt_to_bus(hwdev, io_tlb_overflow_buffer)); -} -EXPORT_SYMBOL(swiotlb_dma_mapping_error); - -/* - * Return whether the given device DMA address mask can be supported - * properly. For example, if your device can only drive the low 24-bits - * during bus mastering, then you would pass 0x00ffffff as the mask to - * this function. - */ -int -swiotlb_dma_supported(struct device *hwdev, u64 mask) -{ - return swiotlb_virt_to_bus(hwdev, io_tlb_end - 1) <= mask; -} -EXPORT_SYMBOL(swiotlb_dma_supported); -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
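For readers coming at this from the driver side, the ownership rules spelled out in the swiotlb_map_page and swiotlb_sync_single comments above boil down to the usual streaming-DMA pattern. Below is a minimal, hypothetical driver-side sketch (the device, buffer and length names are illustrative only and are not part of the patch); it assumes the generic dma_* wrappers of this kernel era, which end up in the swiotlb paths on machines without a hardware IOMMU.

#include <linux/dma-mapping.h>

/* Hypothetical driver round trip through the streaming DMA API that the
 * SWIOTLB backs: map, hand the address to the device, sync before the
 * CPU touches the data, unmap when done. */
static int example_dma_from_device(struct device *dev, void *buf, size_t len)
{
	dma_addr_t handle;

	/* The device owns the buffer once the mapping is handed out. */
	handle = dma_map_single(dev, buf, len, DMA_FROM_DEVICE);
	if (dma_mapping_error(dev, handle))
		return -ENOMEM;	/* e.g. the map fell back to the overflow buffer */

	/* ... program the device with 'handle' and wait for the transfer ... */

	/* Give ownership back to the CPU before reading the data. */
	dma_sync_single_for_cpu(dev, handle, len, DMA_FROM_DEVICE);

	/* ... inspect buf; use dma_sync_single_for_device() before reuse ... */

	dma_unmap_single(dev, handle, len, DMA_FROM_DEVICE);
	return 0;
}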
Konrad Rzeszutek Wilk
2010-Feb-03 17:08 UTC
[Xen-devel] [PATCH 11/11] [swiotlb] move dma_ops functions to swiotlb.c.
From: Konrad Rzeszutek <konrad@t42p-lan.dumpdata.com> In essence, leave in swiotlb-core.c functions dealing with the bookkeeping of the IOMMU. And functions which are declared in dma_ops structures are moved over to swiotlb.c. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- lib/swiotlb-core.c | 385 --------------------------------------------------- lib/swiotlb.c | 391 ++++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 391 insertions(+), 385 deletions(-) create mode 100644 lib/swiotlb.c diff --git a/lib/swiotlb-core.c b/lib/swiotlb-core.c index c982d33..2534d6d 100644 --- a/lib/swiotlb-core.c +++ b/lib/swiotlb-core.c @@ -138,13 +138,6 @@ setup_io_tlb_npages(char *str) } __setup("swiotlb=", setup_io_tlb_npages); -/* Note that this doesn''t work with highmem page */ -static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev, - volatile void *address) -{ - return phys_to_dma(hwdev, virt_to_phys(address)); -} - void swiotlb_print_info(void) { unsigned long bytes = io_tlb_nslabs << IO_TLB_SHIFT; @@ -555,76 +548,7 @@ do_sync_single(struct device *hwdev, char *dma_addr, size_t size, BUG(); } } - -void * -swiotlb_alloc_coherent(struct device *hwdev, size_t size, - dma_addr_t *dma_handle, gfp_t flags) -{ - dma_addr_t dev_addr; - void *ret; - int order = get_order(size); - u64 dma_mask = DMA_BIT_MASK(32); - unsigned long start_dma_addr; - - if (hwdev && hwdev->coherent_dma_mask) - dma_mask = hwdev->coherent_dma_mask; - - ret = (void *)__get_free_pages(flags, order); - if (ret && swiotlb_virt_to_bus(hwdev, ret) + size - 1 > dma_mask) { - /* - * The allocated memory isn''t reachable by the device. - */ - free_pages((unsigned long) ret, order); - ret = NULL; - } - if (!ret) { - /* - * We are either out of memory or the device can''t DMA - * to GFP_DMA memory; fall back on do_map_single(), which - * will grab memory from the lowest available address range. - */ - start_dma_addr = swiotlb_virt_to_bus(hwdev, io_tlb_start); - ret = do_map_single(hwdev, 0, start_dma_addr, size, - DMA_FROM_DEVICE); - if (!ret) - return NULL; - } - - memset(ret, 0, size); - dev_addr = swiotlb_virt_to_bus(hwdev, ret); - - /* Confirm address can be DMA''d by device */ - if (dev_addr + size - 1 > dma_mask) { - dev_err(hwdev, "DMA: hwdev DMA mask = 0x%016Lx, " \ - "dev_addr = 0x%016Lx\n", - (unsigned long long)dma_mask, - (unsigned long long)dev_addr); - - /* DMA_TO_DEVICE to avoid memcpy in do_unmap_single */ - do_unmap_single(hwdev, ret, size, DMA_TO_DEVICE); - return NULL; - } - *dma_handle = dev_addr; - return ret; -} -EXPORT_SYMBOL(swiotlb_alloc_coherent); - void -swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr, - dma_addr_t dev_addr) -{ - phys_addr_t paddr = dma_to_phys(hwdev, dev_addr); - - WARN_ON(irqs_disabled()); - if (!is_swiotlb_buffer(paddr)) - free_pages((unsigned long)vaddr, get_order(size)); - else - /* DMA_TO_DEVICE to avoid memcpy in do_unmap_single */ - do_unmap_single(hwdev, vaddr, size, DMA_TO_DEVICE); -} -EXPORT_SYMBOL(swiotlb_free_coherent); - -static void swiotlb_full(struct device *dev, size_t size, int dir, int do_panic) { /* @@ -646,312 +570,3 @@ swiotlb_full(struct device *dev, size_t size, int dir, int do_panic) if (dir == DMA_TO_DEVICE) panic("DMA: Random memory could be DMA read\n"); } - -/* - * Map a single buffer of the indicated size for DMA in streaming mode. The - * physical address to use is returned. 
- * - * Once the device is given the dma address, the device owns this memory until - * either swiotlb_unmap_page or swiotlb_dma_sync_single is performed. - */ -dma_addr_t swiotlb_map_page(struct device *dev, struct page *page, - unsigned long offset, size_t size, - enum dma_data_direction dir, - struct dma_attrs *attrs) -{ - unsigned long start_dma_addr; - phys_addr_t phys = page_to_phys(page) + offset; - dma_addr_t dev_addr = phys_to_dma(dev, phys); - void *map; - - BUG_ON(dir == DMA_NONE); - /* - * If the address happens to be in the device''s DMA window, - * we can safely return the device addr and not worry about bounce - * buffering it. - */ - if (dma_capable(dev, dev_addr, size) && !swiotlb_force) - return dev_addr; - - /* - * Oh well, have to allocate and map a bounce buffer. - */ - start_dma_addr = swiotlb_virt_to_bus(dev, io_tlb_start); - map = do_map_single(dev, phys, start_dma_addr, size, dir); - if (!map) { - swiotlb_full(dev, size, dir, 1); - map = io_tlb_overflow_buffer; - } - - dev_addr = swiotlb_virt_to_bus(dev, map); - - /* - * Ensure that the address returned is DMA''ble - */ - if (!dma_capable(dev, dev_addr, size)) - panic("DMA: swiotlb_map_single: bounce buffer is not DMA''ble"); - - return dev_addr; -} -EXPORT_SYMBOL_GPL(swiotlb_map_page); - -/* - * Unmap a single streaming mode DMA translation. The dma_addr and size must - * match what was provided for in a previous swiotlb_map_page call. All - * other usages are undefined. - * - * After this call, reads by the cpu to the buffer are guaranteed to see - * whatever the device wrote there. - */ -static void unmap_single(struct device *hwdev, dma_addr_t dev_addr, - size_t size, int dir) -{ - phys_addr_t paddr = dma_to_phys(hwdev, dev_addr); - - BUG_ON(dir == DMA_NONE); - - if (is_swiotlb_buffer(paddr)) { - do_unmap_single(hwdev, phys_to_virt(paddr), size, dir); - return; - } - - if (dir != DMA_FROM_DEVICE) - return; - - /* - * phys_to_virt doesn''t work with hihgmem page but we could - * call dma_mark_clean() with hihgmem page here. However, we - * are fine since dma_mark_clean() is null on POWERPC. We can - * make dma_mark_clean() take a physical address if necessary. - */ - dma_mark_clean(phys_to_virt(paddr), size); -} - -void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr, - size_t size, enum dma_data_direction dir, - struct dma_attrs *attrs) -{ - unmap_single(hwdev, dev_addr, size, dir); -} -EXPORT_SYMBOL_GPL(swiotlb_unmap_page); - -/* - * Make physical memory consistent for a single streaming mode DMA translation - * after a transfer. - * - * If you perform a swiotlb_map_page() but wish to interrogate the buffer - * using the cpu, yet do not wish to teardown the dma mapping, you must - * call this function before doing so. 
At the next point you give the dma - * address back to the card, you must first perform a - * swiotlb_dma_sync_for_device, and then the device again owns the buffer - */ -static void -swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr, - size_t size, int dir, int target) -{ - phys_addr_t paddr = dma_to_phys(hwdev, dev_addr); - - BUG_ON(dir == DMA_NONE); - - if (is_swiotlb_buffer(paddr)) { - do_sync_single(hwdev, phys_to_virt(paddr), size, dir, target); - return; - } - - if (dir != DMA_FROM_DEVICE) - return; - - dma_mark_clean(phys_to_virt(paddr), size); -} - -void -swiotlb_sync_single_for_cpu(struct device *hwdev, dma_addr_t dev_addr, - size_t size, enum dma_data_direction dir) -{ - swiotlb_sync_single(hwdev, dev_addr, size, dir, SYNC_FOR_CPU); -} -EXPORT_SYMBOL(swiotlb_sync_single_for_cpu); - -void -swiotlb_sync_single_for_device(struct device *hwdev, dma_addr_t dev_addr, - size_t size, enum dma_data_direction dir) -{ - swiotlb_sync_single(hwdev, dev_addr, size, dir, SYNC_FOR_DEVICE); -} -EXPORT_SYMBOL(swiotlb_sync_single_for_device); - -/* - * Same as above, but for a sub-range of the mapping. - */ -static void -swiotlb_sync_single_range(struct device *hwdev, dma_addr_t dev_addr, - unsigned long offset, size_t size, - int dir, int target) -{ - swiotlb_sync_single(hwdev, dev_addr + offset, size, dir, target); -} - -void -swiotlb_sync_single_range_for_cpu(struct device *hwdev, dma_addr_t dev_addr, - unsigned long offset, size_t size, - enum dma_data_direction dir) -{ - swiotlb_sync_single_range(hwdev, dev_addr, offset, size, dir, - SYNC_FOR_CPU); -} -EXPORT_SYMBOL_GPL(swiotlb_sync_single_range_for_cpu); - -void -swiotlb_sync_single_range_for_device(struct device *hwdev, dma_addr_t dev_addr, - unsigned long offset, size_t size, - enum dma_data_direction dir) -{ - swiotlb_sync_single_range(hwdev, dev_addr, offset, size, dir, - SYNC_FOR_DEVICE); -} -EXPORT_SYMBOL_GPL(swiotlb_sync_single_range_for_device); - -/* - * Map a set of buffers described by scatterlist in streaming mode for DMA. - * This is the scatter-gather version of the above swiotlb_map_page - * interface. Here the scatter gather list elements are each tagged with the - * appropriate dma address and length. They are obtained via - * sg_dma_{address,length}(SG). - * - * NOTE: An implementation may be able to use a smaller number of - * DMA address/length pairs than there are SG table elements. - * (for example via virtual mapping capabilities) - * The routine returns the number of addr/length pairs actually - * used, at most nents. - * - * Device ownership issues as mentioned above for swiotlb_map_page are the - * same here. - */ -int -swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, int nelems, - enum dma_data_direction dir, struct dma_attrs *attrs) -{ - unsigned long start_dma_addr; - struct scatterlist *sg; - int i; - - BUG_ON(dir == DMA_NONE); - - start_dma_addr = swiotlb_virt_to_bus(hwdev, io_tlb_start); - for_each_sg(sgl, sg, nelems, i) { - phys_addr_t paddr = sg_phys(sg); - dma_addr_t dev_addr = phys_to_dma(hwdev, paddr); - - if (swiotlb_force || - !dma_capable(hwdev, dev_addr, sg->length)) { - void *map = do_map_single(hwdev, sg_phys(sg), - start_dma_addr, - sg->length, dir); - if (!map) { - /* Don''t panic here, we expect map_sg users - to do proper error handling. 
*/ - swiotlb_full(hwdev, sg->length, dir, 0); - swiotlb_unmap_sg_attrs(hwdev, sgl, i, dir, - attrs); - sgl[0].dma_length = 0; - return 0; - } - sg->dma_address = swiotlb_virt_to_bus(hwdev, map); - } else - sg->dma_address = dev_addr; - sg->dma_length = sg->length; - } - return nelems; -} -EXPORT_SYMBOL(swiotlb_map_sg_attrs); - -int -swiotlb_map_sg(struct device *hwdev, struct scatterlist *sgl, int nelems, - int dir) -{ - return swiotlb_map_sg_attrs(hwdev, sgl, nelems, dir, NULL); -} -EXPORT_SYMBOL(swiotlb_map_sg); - -/* - * Unmap a set of streaming mode DMA translations. Again, cpu read rules - * concerning calls here are the same as for swiotlb_unmap_page() above. - */ -void -swiotlb_unmap_sg_attrs(struct device *hwdev, struct scatterlist *sgl, - int nelems, enum dma_data_direction dir, - struct dma_attrs *attrs) -{ - struct scatterlist *sg; - int i; - - BUG_ON(dir == DMA_NONE); - - for_each_sg(sgl, sg, nelems, i) - unmap_single(hwdev, sg->dma_address, sg->dma_length, dir); - -} -EXPORT_SYMBOL(swiotlb_unmap_sg_attrs); - -void -swiotlb_unmap_sg(struct device *hwdev, struct scatterlist *sgl, int nelems, - int dir) -{ - return swiotlb_unmap_sg_attrs(hwdev, sgl, nelems, dir, NULL); -} -EXPORT_SYMBOL(swiotlb_unmap_sg); - -/* - * Make physical memory consistent for a set of streaming mode DMA translations - * after a transfer. - * - * The same as swiotlb_sync_single_* but for a scatter-gather list, same rules - * and usage. - */ -static void -swiotlb_sync_sg(struct device *hwdev, struct scatterlist *sgl, - int nelems, int dir, int target) -{ - struct scatterlist *sg; - int i; - - for_each_sg(sgl, sg, nelems, i) - swiotlb_sync_single(hwdev, sg->dma_address, - sg->dma_length, dir, target); -} - -void -swiotlb_sync_sg_for_cpu(struct device *hwdev, struct scatterlist *sg, - int nelems, enum dma_data_direction dir) -{ - swiotlb_sync_sg(hwdev, sg, nelems, dir, SYNC_FOR_CPU); -} -EXPORT_SYMBOL(swiotlb_sync_sg_for_cpu); - -void -swiotlb_sync_sg_for_device(struct device *hwdev, struct scatterlist *sg, - int nelems, enum dma_data_direction dir) -{ - swiotlb_sync_sg(hwdev, sg, nelems, dir, SYNC_FOR_DEVICE); -} -EXPORT_SYMBOL(swiotlb_sync_sg_for_device); - -int -swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t dma_addr) -{ - return (dma_addr == swiotlb_virt_to_bus(hwdev, io_tlb_overflow_buffer)); -} -EXPORT_SYMBOL(swiotlb_dma_mapping_error); - -/* - * Return whether the given device DMA address mask can be supported - * properly. For example, if your device can only drive the low 24-bits - * during bus mastering, then you would pass 0x00ffffff as the mask to - * this function. 
- */ -int -swiotlb_dma_supported(struct device *hwdev, u64 mask) -{ - return swiotlb_virt_to_bus(hwdev, io_tlb_end - 1) <= mask; -} -EXPORT_SYMBOL(swiotlb_dma_supported); diff --git a/lib/swiotlb.c b/lib/swiotlb.c new file mode 100644 index 0000000..f6bbcd1 --- /dev/null +++ b/lib/swiotlb.c @@ -0,0 +1,391 @@ + +#include <linux/dma-mapping.h> +#include <linux/module.h> +#include <linux/swiotlb.h> + +#include <asm/scatterlist.h> +#include <linux/iommu-helper.h> + + +/* Note that this doesn''t work with highmem page */ +static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev, + volatile void *address) +{ + return phys_to_dma(hwdev, virt_to_phys(address)); +} +void * +swiotlb_alloc_coherent(struct device *hwdev, size_t size, + dma_addr_t *dma_handle, gfp_t flags) +{ + dma_addr_t dev_addr; + void *ret; + int order = get_order(size); + u64 dma_mask = DMA_BIT_MASK(32); + unsigned long start_dma_addr; + + if (hwdev && hwdev->coherent_dma_mask) + dma_mask = hwdev->coherent_dma_mask; + + ret = (void *)__get_free_pages(flags, order); + if (ret && swiotlb_virt_to_bus(hwdev, ret) + size - 1 > dma_mask) { + /* + * The allocated memory isn''t reachable by the device. + */ + free_pages((unsigned long) ret, order); + ret = NULL; + } + if (!ret) { + /* + * We are either out of memory or the device can''t DMA + * to GFP_DMA memory; fall back on do_map_single(), which + * will grab memory from the lowest available address range. + */ + start_dma_addr = swiotlb_virt_to_bus(hwdev, io_tlb_start); + ret = do_map_single(hwdev, 0, start_dma_addr, size, + DMA_FROM_DEVICE); + if (!ret) + return NULL; + } + + memset(ret, 0, size); + dev_addr = swiotlb_virt_to_bus(hwdev, ret); + + /* Confirm address can be DMA''d by device */ + if (dev_addr + size - 1 > dma_mask) { + dev_err(hwdev, "DMA: hwdev DMA mask = 0x%016Lx, " \ + "dev_addr = 0x%016Lx\n", + (unsigned long long)dma_mask, + (unsigned long long)dev_addr); + + /* DMA_TO_DEVICE to avoid memcpy in do_unmap_single */ + do_unmap_single(hwdev, ret, size, DMA_TO_DEVICE); + return NULL; + } + *dma_handle = dev_addr; + return ret; +} +EXPORT_SYMBOL(swiotlb_alloc_coherent); + +void +swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr, + dma_addr_t dev_addr) +{ + phys_addr_t paddr = dma_to_phys(hwdev, dev_addr); + + WARN_ON(irqs_disabled()); + if (!is_swiotlb_buffer(paddr)) + free_pages((unsigned long)vaddr, get_order(size)); + else + /* DMA_TO_DEVICE to avoid memcpy in do_unmap_single */ + do_unmap_single(hwdev, vaddr, size, DMA_TO_DEVICE); +} +EXPORT_SYMBOL(swiotlb_free_coherent); + +/* + * Map a single buffer of the indicated size for DMA in streaming mode. The + * physical address to use is returned. + * + * Once the device is given the dma address, the device owns this memory until + * either swiotlb_unmap_page or swiotlb_dma_sync_single is performed. + */ +dma_addr_t swiotlb_map_page(struct device *dev, struct page *page, + unsigned long offset, size_t size, + enum dma_data_direction dir, + struct dma_attrs *attrs) +{ + unsigned long start_dma_addr; + phys_addr_t phys = page_to_phys(page) + offset; + dma_addr_t dev_addr = phys_to_dma(dev, phys); + void *map; + + BUG_ON(dir == DMA_NONE); + /* + * If the address happens to be in the device''s DMA window, + * we can safely return the device addr and not worry about bounce + * buffering it. + */ + if (dma_capable(dev, dev_addr, size) && !swiotlb_force) + return dev_addr; + + /* + * Oh well, have to allocate and map a bounce buffer. 
+ */ + start_dma_addr = swiotlb_virt_to_bus(dev, io_tlb_start); + map = do_map_single(dev, phys, start_dma_addr, size, dir); + if (!map) { + swiotlb_full(dev, size, dir, 1); + map = io_tlb_overflow_buffer; + } + + dev_addr = swiotlb_virt_to_bus(dev, map); + + /* + * Ensure that the address returned is DMA''ble + */ + if (!dma_capable(dev, dev_addr, size)) + panic("DMA: swiotlb_map_single: bounce buffer is not DMA''ble"); + + return dev_addr; +} +EXPORT_SYMBOL_GPL(swiotlb_map_page); + +/* + * Unmap a single streaming mode DMA translation. The dma_addr and size must + * match what was provided for in a previous swiotlb_map_page call. All + * other usages are undefined. + * + * After this call, reads by the cpu to the buffer are guaranteed to see + * whatever the device wrote there. + */ +static void unmap_single(struct device *hwdev, dma_addr_t dev_addr, + size_t size, int dir) +{ + phys_addr_t paddr = dma_to_phys(hwdev, dev_addr); + + BUG_ON(dir == DMA_NONE); + + if (is_swiotlb_buffer(paddr)) { + do_unmap_single(hwdev, phys_to_virt(paddr), size, dir); + return; + } + + if (dir != DMA_FROM_DEVICE) + return; + + /* + * phys_to_virt doesn''t work with hihgmem page but we could + * call dma_mark_clean() with hihgmem page here. However, we + * are fine since dma_mark_clean() is null on POWERPC. We can + * make dma_mark_clean() take a physical address if necessary. + */ + dma_mark_clean(phys_to_virt(paddr), size); +} + +void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr, + size_t size, enum dma_data_direction dir, + struct dma_attrs *attrs) +{ + unmap_single(hwdev, dev_addr, size, dir); +} +EXPORT_SYMBOL_GPL(swiotlb_unmap_page); + +/* + * Make physical memory consistent for a single streaming mode DMA translation + * after a transfer. + * + * If you perform a swiotlb_map_page() but wish to interrogate the buffer + * using the cpu, yet do not wish to teardown the dma mapping, you must + * call this function before doing so. At the next point you give the dma + * address back to the card, you must first perform a + * swiotlb_dma_sync_for_device, and then the device again owns the buffer + */ +static void +swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr, + size_t size, int dir, int target) +{ + phys_addr_t paddr = dma_to_phys(hwdev, dev_addr); + + BUG_ON(dir == DMA_NONE); + + if (is_swiotlb_buffer(paddr)) { + do_sync_single(hwdev, phys_to_virt(paddr), size, dir, target); + return; + } + + if (dir != DMA_FROM_DEVICE) + return; + + dma_mark_clean(phys_to_virt(paddr), size); +} + +void +swiotlb_sync_single_for_cpu(struct device *hwdev, dma_addr_t dev_addr, + size_t size, enum dma_data_direction dir) +{ + swiotlb_sync_single(hwdev, dev_addr, size, dir, SYNC_FOR_CPU); +} +EXPORT_SYMBOL(swiotlb_sync_single_for_cpu); + +void +swiotlb_sync_single_for_device(struct device *hwdev, dma_addr_t dev_addr, + size_t size, enum dma_data_direction dir) +{ + swiotlb_sync_single(hwdev, dev_addr, size, dir, SYNC_FOR_DEVICE); +} +EXPORT_SYMBOL(swiotlb_sync_single_for_device); + +/* + * Same as above, but for a sub-range of the mapping. 
+ */ +static void +swiotlb_sync_single_range(struct device *hwdev, dma_addr_t dev_addr, + unsigned long offset, size_t size, + int dir, int target) +{ + swiotlb_sync_single(hwdev, dev_addr + offset, size, dir, target); +} + +void +swiotlb_sync_single_range_for_cpu(struct device *hwdev, dma_addr_t dev_addr, + unsigned long offset, size_t size, + enum dma_data_direction dir) +{ + swiotlb_sync_single_range(hwdev, dev_addr, offset, size, dir, + SYNC_FOR_CPU); +} +EXPORT_SYMBOL_GPL(swiotlb_sync_single_range_for_cpu); + +void +swiotlb_sync_single_range_for_device(struct device *hwdev, dma_addr_t dev_addr, + unsigned long offset, size_t size, + enum dma_data_direction dir) +{ + swiotlb_sync_single_range(hwdev, dev_addr, offset, size, dir, + SYNC_FOR_DEVICE); +} +EXPORT_SYMBOL_GPL(swiotlb_sync_single_range_for_device); + +/* + * Map a set of buffers described by scatterlist in streaming mode for DMA. + * This is the scatter-gather version of the above swiotlb_map_page + * interface. Here the scatter gather list elements are each tagged with the + * appropriate dma address and length. They are obtained via + * sg_dma_{address,length}(SG). + * + * NOTE: An implementation may be able to use a smaller number of + * DMA address/length pairs than there are SG table elements. + * (for example via virtual mapping capabilities) + * The routine returns the number of addr/length pairs actually + * used, at most nents. + * + * Device ownership issues as mentioned above for swiotlb_map_page are the + * same here. + */ +int +swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, int nelems, + enum dma_data_direction dir, struct dma_attrs *attrs) +{ + unsigned long start_dma_addr; + struct scatterlist *sg; + int i; + + BUG_ON(dir == DMA_NONE); + + start_dma_addr = swiotlb_virt_to_bus(hwdev, io_tlb_start); + for_each_sg(sgl, sg, nelems, i) { + phys_addr_t paddr = sg_phys(sg); + dma_addr_t dev_addr = phys_to_dma(hwdev, paddr); + + if (swiotlb_force || + !dma_capable(hwdev, dev_addr, sg->length)) { + void *map = do_map_single(hwdev, sg_phys(sg), + start_dma_addr, + sg->length, dir); + if (!map) { + /* Don''t panic here, we expect map_sg users + to do proper error handling. */ + swiotlb_full(hwdev, sg->length, dir, 0); + swiotlb_unmap_sg_attrs(hwdev, sgl, i, dir, + attrs); + sgl[0].dma_length = 0; + return 0; + } + sg->dma_address = swiotlb_virt_to_bus(hwdev, map); + } else + sg->dma_address = dev_addr; + sg->dma_length = sg->length; + } + return nelems; +} +EXPORT_SYMBOL(swiotlb_map_sg_attrs); + +int +swiotlb_map_sg(struct device *hwdev, struct scatterlist *sgl, int nelems, + int dir) +{ + return swiotlb_map_sg_attrs(hwdev, sgl, nelems, dir, NULL); +} +EXPORT_SYMBOL(swiotlb_map_sg); + +/* + * Unmap a set of streaming mode DMA translations. Again, cpu read rules + * concerning calls here are the same as for swiotlb_unmap_page() above. + */ +void +swiotlb_unmap_sg_attrs(struct device *hwdev, struct scatterlist *sgl, + int nelems, enum dma_data_direction dir, + struct dma_attrs *attrs) +{ + struct scatterlist *sg; + int i; + + BUG_ON(dir == DMA_NONE); + + for_each_sg(sgl, sg, nelems, i) + unmap_single(hwdev, sg->dma_address, sg->dma_length, dir); + +} +EXPORT_SYMBOL(swiotlb_unmap_sg_attrs); + +void +swiotlb_unmap_sg(struct device *hwdev, struct scatterlist *sgl, int nelems, + int dir) +{ + return swiotlb_unmap_sg_attrs(hwdev, sgl, nelems, dir, NULL); +} +EXPORT_SYMBOL(swiotlb_unmap_sg); + +/* + * Make physical memory consistent for a set of streaming mode DMA translations + * after a transfer. 
+ * + * The same as swiotlb_sync_single_* but for a scatter-gather list, same rules + * and usage. + */ +static void +swiotlb_sync_sg(struct device *hwdev, struct scatterlist *sgl, + int nelems, int dir, int target) +{ + struct scatterlist *sg; + int i; + + for_each_sg(sgl, sg, nelems, i) + swiotlb_sync_single(hwdev, sg->dma_address, + sg->dma_length, dir, target); +} + +void +swiotlb_sync_sg_for_cpu(struct device *hwdev, struct scatterlist *sg, + int nelems, enum dma_data_direction dir) +{ + swiotlb_sync_sg(hwdev, sg, nelems, dir, SYNC_FOR_CPU); +} +EXPORT_SYMBOL(swiotlb_sync_sg_for_cpu); + +void +swiotlb_sync_sg_for_device(struct device *hwdev, struct scatterlist *sg, + int nelems, enum dma_data_direction dir) +{ + swiotlb_sync_sg(hwdev, sg, nelems, dir, SYNC_FOR_DEVICE); +} +EXPORT_SYMBOL(swiotlb_sync_sg_for_device); + +int +swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t dma_addr) +{ + return (dma_addr == swiotlb_virt_to_bus(hwdev, io_tlb_overflow_buffer)); +} +EXPORT_SYMBOL(swiotlb_dma_mapping_error); + +/* + * Return whether the given device DMA address mask can be supported + * properly. For example, if your device can only drive the low 24-bits + * during bus mastering, then you would pass 0x00ffffff as the mask to + * this function. + */ +int +swiotlb_dma_supported(struct device *hwdev, u64 mask) +{ + return swiotlb_virt_to_bus(hwdev, io_tlb_end - 1) <= mask; +} +EXPORT_SYMBOL(swiotlb_dma_supported); -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
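The practical effect of this split is that an architecture's dma_map_ops table can point straight at the functions now living in lib/swiotlb.c. The following is only a rough sketch, not part of the series: the field names follow the struct dma_map_ops of this kernel era, and x86's actual table in arch/x86/kernel/pci-swiotlb.c differs slightly (it wraps alloc_coherent in its own helper).

#include <linux/dma-mapping.h>
#include <linux/swiotlb.h>

/* Sketch of a platform hooking the exported swiotlb_* entry points into
 * its DMA operations table. */
static struct dma_map_ops example_swiotlb_dma_ops = {
	.alloc_coherent		= swiotlb_alloc_coherent,
	.free_coherent		= swiotlb_free_coherent,
	.map_page		= swiotlb_map_page,
	.unmap_page		= swiotlb_unmap_page,
	.map_sg			= swiotlb_map_sg_attrs,
	.unmap_sg		= swiotlb_unmap_sg_attrs,
	.sync_single_for_cpu	= swiotlb_sync_single_for_cpu,
	.sync_single_for_device	= swiotlb_sync_single_for_device,
	.sync_sg_for_cpu	= swiotlb_sync_sg_for_cpu,
	.sync_sg_for_device	= swiotlb_sync_sg_for_device,
	.mapping_error		= swiotlb_dma_mapping_error,
	.dma_supported		= swiotlb_dma_supported,
};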
On Wed, 3 Feb 2010 12:08:01 -0500
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:

> Attached is a set of eleven RFC patches that split the SWIOTLB library in
> two layers: core, and dma_ops related functions.

What's the point of splitting swiotlb.c? Why can't you just export some of
the functions in swiotlb.c?

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
On Thu, Feb 04, 2010 at 09:17:31AM +0900, FUJITA Tomonori wrote:
> On Wed, 3 Feb 2010 12:08:01 -0500
> Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
>
> > Attached is a set of eleven RFC patches that split the SWIOTLB library in
> > two layers: core, and dma_ops related functions.
>
> What's the point of splitting swiotlb.c? Why can't you just export
> some of the functions in swiotlb.c?

I was emulating some of the other libraries that are in the kernel, where
the core functionality lives in a -core.c file and its users live in
separate ones (libata, for example).

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
On Wed, Feb 03, 2010 at 10:07:49PM -0500, Konrad Rzeszutek Wilk wrote:
> On Thu, Feb 04, 2010 at 09:17:31AM +0900, FUJITA Tomonori wrote:
> > On Wed, 3 Feb 2010 12:08:01 -0500
> > Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> >
> > > Attached is a set of eleven RFC patches that split the SWIOTLB library in
> > > two layers: core, and dma_ops related functions.
> >
> > What's the point of splitting swiotlb.c? Why can't you just export
> > some of the functions in swiotlb.c?
>
> I was emulating some of the other libraries that are in the kernel, where
> the core functionality lives in a -core.c file and its users live in
> separate ones (libata, for example).

Though there is one thing Jens Axboe mentioned that I hadn't thought of:
keep it as simple, and in as few files, as possible.

I've redone the patches, this time without the split, and have exported
the symbols instead.

The git tree is:

 git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb-2.6.git swiotlb-0.5

And the LKML posting is:

 https://lists.linux-foundation.org/pipermail/iommu/2010-February/002066.html

> _______________________________________________
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linux-foundation.org/mailman/listinfo/iommu

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
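In kernel terms, the "just export the functions" approach that the thread converges on means keeping everything in lib/swiotlb.c and marking the core bookkeeping routines so that a platform backend (such as the Xen variant mentioned at the top of the thread) can call them. The sketch below is only an illustration under the prototypes used in this series; the actual set of exports in the swiotlb-0.5 tree may differ.

/* In lib/swiotlb.c, placed next to the corresponding definitions: the
 * bounce-buffer entry points a platform-specific backend would reuse. */
#include <linux/module.h>

EXPORT_SYMBOL_GPL(do_map_single);
EXPORT_SYMBOL_GPL(do_unmap_single);
EXPORT_SYMBOL_GPL(do_sync_single);
EXPORT_SYMBOL_GPL(swiotlb_bounce);
EXPORT_SYMBOL_GPL(is_swiotlb_buffer);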