Fujita-san et al.,

Attached is a set of fifteen RFC patches that separate the address translation functions (virt_to_phys, virt_to_bus, etc.) from the SWIOTLB library. The idea behind this set of patches is to make it possible to have separate mechanisms for translating virtual to physical or virtual to DMA addresses on platforms which need a SWIOTLB, and where physical != PCI bus address.

One customer of this is the pv-ops project, which can switch between different modes of operation depending on which environment it is running in. One of the usage models is PCI pass-through in a virtualized environment, where an IOMMU is required since the Linux kernel's idea of a physical address is not the real physical address (worse yet, PFNs that look like they are under 4GB could actually be pointing above 4GB, which presents an interesting set of bugs). The pv-ops kernel provides a set of address translation mechanisms to translate physical frame numbers (PFNs) to real-physical frame numbers (Machine Frame Numbers) and vice versa.

For the IOMMU, one solution has been to wholesale copy the SWIOTLB, stick it in arch/x86/xen/swiotlb.c, and modify virt_to_phys, phys_to_virt and others to use the Xen address translation functions. Unfortunately, since the kernel can run on bare metal, there is a large code overlap with the real SWIOTLB.
(git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen.git xen/dom0/swiotlb-new)

Another approach, which this set of patches explores, is to abstract the address translation and address determination functions away from the SWIOTLB book-keeping functions. This way the core SWIOTLB library functions are present in one place, while the address-related functions are in a separate library for the different run-time platforms.

I would very much appreciate input on this idea and the set of patches.

The set of fifteen patches is also accessible at:

git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb-2.6.git swiotlb-rfc-0.2

An example of how this can be utilized in both bare-metal and Xen environments is this git tree:

git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git swiotlb-xen-0.2

Sincerely,

Konrad Rzeszutek Wilk

 include/linux/swiotlb.h |   97 ++++++++++++
 lib/Makefile            |    2 +-
 lib/swiotlb-default.c   |  263 ++++++++++++++++++++++++++++++++
 lib/swiotlb.c           |  389 ++++++++++++++---------------------------------
 4 files changed, 477 insertions(+), 274 deletions(-)
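As a rough illustration of what the abstraction is meant to enable (a sketch only, not part of the series: every xen_* name below is a hypothetical placeholder, and the real implementation lives in the swiotlb-xen-0.2 tree above), a virtualized platform would supply its own translation hooks and register them with the core library:

    /*
     * Sketch under the assumption that Xen's pfn_to_mfn()/mfn_to_pfn()
     * helpers are available; all xen_* symbols here are hypothetical.
     */
    static dma_addr_t xen_phys_to_bus(struct device *hwdev, phys_addr_t paddr)
    {
            /* pseudo-physical frame (PFN) -> machine frame (MFN) */
            return ((dma_addr_t)pfn_to_mfn(paddr >> PAGE_SHIFT) << PAGE_SHIFT) |
                   (paddr & ~PAGE_MASK);
    }

    static phys_addr_t xen_bus_to_phys(struct device *hwdev, dma_addr_t baddr)
    {
            /* machine frame (MFN) -> pseudo-physical frame (PFN) */
            return ((phys_addr_t)mfn_to_pfn(baddr >> PAGE_SHIFT) << PAGE_SHIFT) |
                   (baddr & ~PAGE_MASK);
    }

    static struct swiotlb_engine xen_swiotlb_ops = {
            .name              = "Xen software IO TLB",
            .overflow          = 32 * 1024,
            .release           = xen_swiotlb_release,      /* hypothetical */
            .is_swiotlb_buffer = xen_is_swiotlb_buffer,    /* hypothetical */
            .dma_capable       = xen_dma_capable,          /* hypothetical */
            .phys_to_bus       = xen_phys_to_bus,
            .bus_to_phys       = xen_bus_to_phys,
            .virt_to_bus       = xen_virt_to_bus,          /* hypothetical */
            .bus_to_virt       = xen_bus_to_virt,          /* hypothetical */
    };

    /* from platform setup code, before any DMA mappings are created */
    swiotlb_register_engine(&xen_swiotlb_ops);

The core bounce-buffer book-keeping in lib/swiotlb.c then goes through these hooks instead of calling virt_to_phys()/phys_to_virt() directly.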
Konrad Rzeszutek Wilk
2010-Jan-14 23:00 UTC
[Xen-devel] [PATCH 01/15] [swiotlb] fix: Update 'setup_io_tlb_npages' to accept both arguments in either order.
Before this patch, if you specified 'swiotlb=force,1024' it would ignore
both arguments. This fixes it and allows the user to specify them in
either order (or none at all).

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 lib/swiotlb.c |   20 +++++++++++---------
 1 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/lib/swiotlb.c b/lib/swiotlb.c
index 437eedb..e6d9e32 100644
--- a/lib/swiotlb.c
+++ b/lib/swiotlb.c
@@ -102,16 +102,18 @@ static int late_alloc;
 static int __init
 setup_io_tlb_npages(char *str)
 {
-	if (isdigit(*str)) {
-		io_tlb_nslabs = simple_strtoul(str, &str, 0);
-		/* avoid tail segment of size < IO_TLB_SEGSIZE */
-		io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE);
+	while (*str) {
+		if (isdigit(*str)) {
+			io_tlb_nslabs = simple_strtoul(str, &str, 0);
+			/* avoid tail segment of size < IO_TLB_SEGSIZE */
+			io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE);
+		}
+		if (!strncmp(str, "force", 5))
+			swiotlb_force = 1;
+		str += strcspn(str, ",");
+		if (*str == ',')
+			++str;
 	}
-	if (*str == ',')
-		++str;
-	if (!strcmp(str, "force"))
-		swiotlb_force = 1;
-
 	return 1;
 }
 __setup("swiotlb=", setup_io_tlb_npages);
--
1.6.2.5
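In other words, with this fix all of the following forms are parsed correctly (the slab count shown is only an example value):

    swiotlb=65536,force
    swiotlb=force,65536
    swiotlb=65536
    swiotlb=force

Both orderings set io_tlb_nslabs (rounded up to a multiple of IO_TLB_SEGSIZE) and swiotlb_force as expected.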
Konrad Rzeszutek Wilk
2010-Jan-14 23:00 UTC
[Xen-devel] [PATCH 02/15] [swiotlb] Add swiotlb_engine structure for tracking multiple software IO TLBs.
The structure contains all of the existing variables used in software IO TLB (swiotlb.c) collected within a structure. Additionally a name variable and a deconstructor (release) function variable is defined for API usages. The other set of functions: is_swiotlb_buffer, dma_capable, phys_to_bus, bus_to_phys, virt_to_bus, and bus_to_virt server as a method to abstract them out of the SWIOTLB library. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- include/linux/swiotlb.h | 94 +++++++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 94 insertions(+), 0 deletions(-) diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h index febedcf..781c3aa 100644 --- a/include/linux/swiotlb.h +++ b/include/linux/swiotlb.h @@ -24,6 +24,100 @@ extern int swiotlb_force; extern void swiotlb_init(int verbose); +struct swiotlb_engine { + + /* + * Name of the engine (ie: "Software IO TLB") + */ + const char *name; + + /* + * Used to do a quick range check in unmap_single and + * sync_single_*, to see if the memory was in fact allocated by this + * API. + */ + char *start; + char *end; + + /* + * The number of IO TLB blocks (in groups of 64) betweeen start and + * end. This is command line adjustable via setup_io_tlb_npages. + */ + unsigned long nslabs; + + /* + * When the IOMMU overflows we return a fallback buffer. + * This sets the size. + */ + unsigned long overflow; + + void *overflow_buffer; + + /* + * This is a free list describing the number of free entries available + * from each index + */ + unsigned int *list; + + /* + * Current marker in the start through end location. Is incremented + * on each map and wraps around. + */ + unsigned int index; + + /* + * We need to save away the original address corresponding to a mapped + * entry for the sync operations. + */ + phys_addr_t *orig_addr; + + /* + * IOMMU private data. + */ + void *priv; + /* + * The API call to free a SWIOTLB engine if another wants to register + * (or if want to turn SWIOTLB off altogether). + * It is imperative that this function checks for existing DMA maps + * and not release the IOTLB if there are out-standing maps. + */ + int (*release)(struct swiotlb_engine *); + + /* + * Is the DMA (Bus) address within our bounce buffer (start and end). + */ + int (*is_swiotlb_buffer)(struct swiotlb_engine *, dma_addr_t dev_addr, + phys_addr_t phys); + + /* + * Is the DMA (Bus) address reachable by the PCI device?. + */ + bool (*dma_capable)(struct device *, dma_addr_t, phys_addr_t, size_t); + /* + * Physical to bus (DMA) address translation. On + * most platforms this is an equivalent function. + */ + dma_addr_t (*phys_to_bus)(struct device *hwdev, phys_addr_t paddr); + + /* + * Bus (DMA) to physical address translation. On most + * platforms this is an equivalant function. + */ + phys_addr_t (*bus_to_phys)(struct device *hwdev, dma_addr_t baddr); + + /* + * Virtual to bus (DMA) address translation. On most platforms + * this is a call to __pa(address). + */ + dma_addr_t (*virt_to_bus)(struct device *hwdev, void *address); + + /* + * Bus (DMA) to virtual address translation. On most platforms + * this is a call to __va(address). + */ + void* (*bus_to_virt)(struct device *hwdev, dma_addr_t address); +}; + extern void *swiotlb_alloc_coherent(struct device *hwdev, size_t size, dma_addr_t *dma_handle, gfp_t flags); -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
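To make the hooks concrete: the default bare-metal engine added later in the series (patch 06) wires them up to the existing flat translations, roughly:

    static struct swiotlb_engine swiotlb_ops = {
            .name              = "software IO TLB",
            .overflow          = 32 * 1024,
            .release           = swiotlb_release,
            .is_swiotlb_buffer = is_swiotlb_buffer,
            .dma_capable       = swiotlb_dma_capable,   /* wraps dma_capable() */
            .phys_to_bus       = phys_to_dma,
            .bus_to_phys       = dma_to_phys,
            .virt_to_bus       = swiotlb_virt_to_bus,   /* phys_to_dma(virt_to_phys(v)) */
            .bus_to_virt       = swiotlb_bus_to_virt,   /* phys_to_virt(dma_to_phys(b)) */
    };

A platform where bus != physical address can replace the translation hooks with its own implementations while reusing all of the book-keeping fields above.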
Konrad Rzeszutek Wilk
2010-Jan-14 23:00 UTC
[Xen-devel] [PATCH 03/15] [swiotlb] Add swiotlb_register_engine function.
We set the internal iommu_sw pointer to the passed in swiotlb_engine structure. Obviously we also check if the existing iommu_sw is set and if so, call iommu_sw->release before the switch-over. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- include/linux/swiotlb.h | 2 + lib/swiotlb.c | 48 +++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 50 insertions(+), 0 deletions(-) diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h index 781c3aa..3bc3c42 100644 --- a/include/linux/swiotlb.h +++ b/include/linux/swiotlb.h @@ -118,6 +118,8 @@ struct swiotlb_engine { void* (*bus_to_virt)(struct device *hwdev, dma_addr_t address); }; +int swiotlb_register_engine(struct swiotlb_engine *iommu); + extern void *swiotlb_alloc_coherent(struct device *hwdev, size_t size, dma_addr_t *dma_handle, gfp_t flags); diff --git a/lib/swiotlb.c b/lib/swiotlb.c index e6d9e32..e84f269 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -97,6 +97,12 @@ static phys_addr_t *io_tlb_orig_addr; */ static DEFINE_SPINLOCK(io_tlb_lock); +/* + * The software IOMMU this library will utilize. + */ +struct swiotlb_engine *iommu_sw; +EXPORT_SYMBOL(iommu_sw); + static int late_alloc; static int __init @@ -126,6 +132,48 @@ static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev, return phys_to_dma(hwdev, virt_to_phys(address)); } +/* + * Register a software IO TLB engine. + * + * The registration allows the software IO TLB functions in the + * swiotlb library to function properly. + * + * All the values in the iotlb structure must be set. + * + * If the registration fails, it is assumed that the caller will free + * all of the resources allocated in the swiotlb_engine structure. + */ +int swiotlb_register_engine(struct swiotlb_engine *iommu) +{ + if (!iommu || !iommu->name || !iommu->release) { + printk(KERN_ERR "DMA: Trying to register a SWIOTLB engine" \ + " improperly!"); + return -EINVAL; + } + + if (iommu_sw && iommu_sw->name) { + int retval = -EINVAL; + + /* ''release'' must check for out-standing DMAs and flush them + * out or fail. */ + if (iommu_sw->release) + retval = iommu_sw->release(iommu_sw); + + if (retval) { + printk(KERN_ERR "DMA: %s cannot be released!\n", + iommu_sw->name); + return retval; + } + printk(KERN_INFO "DMA: Replacing [%s] with [%s]\n", + iommu_sw->name, iommu->name); + } + + iommu_sw = iommu; + + return 0; +} +EXPORT_SYMBOL(swiotlb_register_engine); + void swiotlb_print_info(void) { unsigned long bytes = io_tlb_nslabs << IO_TLB_SHIFT; -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
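A caller is expected to check the return value; for instance (using the hypothetical Xen engine from the cover-letter sketch):

    int rc = swiotlb_register_engine(&xen_swiotlb_ops);

    if (rc)
            printk(KERN_ERR "DMA: cannot switch to %s (%d), keeping %s\n",
                   xen_swiotlb_ops.name, rc, iommu_sw->name);

If an engine is already registered, its release hook must first succeed (i.e. report no outstanding DMA mappings) before the new engine is installed.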
Konrad Rzeszutek Wilk
2010-Jan-14 23:00 UTC
[Xen-devel] [PATCH 04/15] [swiotlb] Search and replace s/io_tlb/iommu_sw->/
We also fix the checkpatch.pl errors that surfaced during this conversion. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- lib/swiotlb.c | 204 +++++++++++++++++++++++++++++---------------------------- 1 files changed, 104 insertions(+), 100 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index e84f269..3499001 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -176,14 +176,14 @@ EXPORT_SYMBOL(swiotlb_register_engine); void swiotlb_print_info(void) { - unsigned long bytes = io_tlb_nslabs << IO_TLB_SHIFT; + unsigned long bytes = iommu_sw->nslabs << IO_TLB_SHIFT; phys_addr_t pstart, pend; - pstart = virt_to_phys(io_tlb_start); - pend = virt_to_phys(io_tlb_end); + pstart = virt_to_phys(iommu_sw->start); + pend = virt_to_phys(iommu_sw->end); printk(KERN_INFO "Placing %luMB software IO TLB between %p - %p\n", - bytes >> 20, io_tlb_start, io_tlb_end); + bytes >> 20, iommu_sw->start, iommu_sw->end); printk(KERN_INFO "software IO TLB at phys %#llx - %#llx\n", (unsigned long long)pstart, (unsigned long long)pend); @@ -198,37 +198,38 @@ swiotlb_init_with_default_size(size_t default_size, int verbose) { unsigned long i, bytes; - if (!io_tlb_nslabs) { - io_tlb_nslabs = (default_size >> IO_TLB_SHIFT); - io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE); + if (!iommu_sw->nslabs) { + iommu_sw->nslabs = (default_size >> IO_TLB_SHIFT); + iommu_sw->nslabs = ALIGN(iommu_sw->nslabs, IO_TLB_SEGSIZE); } - bytes = io_tlb_nslabs << IO_TLB_SHIFT; + bytes = iommu_sw->nslabs << IO_TLB_SHIFT; /* * Get IO TLB memory from the low pages */ - io_tlb_start = alloc_bootmem_low_pages(bytes); - if (!io_tlb_start) + iommu_sw->start = alloc_bootmem_low_pages(bytes); + if (!iommu_sw->start) panic("Cannot allocate SWIOTLB buffer"); - io_tlb_end = io_tlb_start + bytes; + iommu_sw->end = iommu_sw->start + bytes; /* * Allocate and initialize the free list array. This array is used * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE - * between io_tlb_start and io_tlb_end. + * between iommu_sw->start and iommu_sw->end. 
*/ - io_tlb_list = alloc_bootmem(io_tlb_nslabs * sizeof(int)); - for (i = 0; i < io_tlb_nslabs; i++) - io_tlb_list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE); - io_tlb_index = 0; - io_tlb_orig_addr = alloc_bootmem(io_tlb_nslabs * sizeof(phys_addr_t)); + iommu_sw->list = alloc_bootmem(iommu_sw->nslabs * sizeof(int)); + for (i = 0; i < iommu_sw->nslabs; i++) + iommu_sw->list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE); + iommu_sw->index = 0; + iommu_sw->orig_addr = alloc_bootmem(iommu_sw->nslabs * + sizeof(phys_addr_t)); /* * Get the overflow emergency buffer */ - io_tlb_overflow_buffer = alloc_bootmem_low(io_tlb_overflow); - if (!io_tlb_overflow_buffer) + iommu_sw->overflow_buffer = alloc_bootmem_low(iommu_sw->overflow); + if (!iommu_sw->overflow_buffer) panic("Cannot allocate SWIOTLB overflow buffer!\n"); if (verbose) swiotlb_print_info(); @@ -248,70 +249,70 @@ swiotlb_init(int verbose) int swiotlb_late_init_with_default_size(size_t default_size) { - unsigned long i, bytes, req_nslabs = io_tlb_nslabs; + unsigned long i, bytes, req_nslabs = iommu_sw->nslabs; unsigned int order; - if (!io_tlb_nslabs) { - io_tlb_nslabs = (default_size >> IO_TLB_SHIFT); - io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE); + if (!iommu_sw->nslabs) { + iommu_sw->nslabs = (default_size >> IO_TLB_SHIFT); + iommu_sw->nslabs = ALIGN(iommu_sw->nslabs, IO_TLB_SEGSIZE); } /* * Get IO TLB memory from the low pages */ - order = get_order(io_tlb_nslabs << IO_TLB_SHIFT); - io_tlb_nslabs = SLABS_PER_PAGE << order; - bytes = io_tlb_nslabs << IO_TLB_SHIFT; + order = get_order(iommu_sw->nslabs << IO_TLB_SHIFT); + iommu_sw->nslabs = SLABS_PER_PAGE << order; + bytes = iommu_sw->nslabs << IO_TLB_SHIFT; while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) { - io_tlb_start = (void *)__get_free_pages(GFP_DMA | __GFP_NOWARN, - order); - if (io_tlb_start) + iommu_sw->start = (void *)__get_free_pages(GFP_DMA | + __GFP_NOWARN, order); + if (iommu_sw->start) break; order--; } - if (!io_tlb_start) + if (!iommu_sw->start) goto cleanup1; if (order != get_order(bytes)) { printk(KERN_WARNING "Warning: only able to allocate %ld MB " "for software IO TLB\n", (PAGE_SIZE << order) >> 20); - io_tlb_nslabs = SLABS_PER_PAGE << order; - bytes = io_tlb_nslabs << IO_TLB_SHIFT; + iommu_sw->nslabs = SLABS_PER_PAGE << order; + bytes = iommu_sw->nslabs << IO_TLB_SHIFT; } - io_tlb_end = io_tlb_start + bytes; - memset(io_tlb_start, 0, bytes); + iommu_sw->end = iommu_sw->start + bytes; + memset(iommu_sw->start, 0, bytes); /* * Allocate and initialize the free list array. This array is used * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE - * between io_tlb_start and io_tlb_end. + * between iommu_sw->start and iommu_sw->end. 
*/ - io_tlb_list = (unsigned int *)__get_free_pages(GFP_KERNEL, - get_order(io_tlb_nslabs * sizeof(int))); - if (!io_tlb_list) + iommu_sw->list = (unsigned int *)__get_free_pages(GFP_KERNEL, + get_order(iommu_sw->nslabs * sizeof(int))); + if (!iommu_sw->list) goto cleanup2; - for (i = 0; i < io_tlb_nslabs; i++) - io_tlb_list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE); - io_tlb_index = 0; + for (i = 0; i < iommu_sw->nslabs; i++) + iommu_sw->list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE); + iommu_sw->index = 0; - io_tlb_orig_addr = (phys_addr_t *) + iommu_sw->orig_addr = (phys_addr_t *) __get_free_pages(GFP_KERNEL, - get_order(io_tlb_nslabs * + get_order(iommu_sw->nslabs * sizeof(phys_addr_t))); - if (!io_tlb_orig_addr) + if (!iommu_sw->orig_addr) goto cleanup3; - memset(io_tlb_orig_addr, 0, io_tlb_nslabs * sizeof(phys_addr_t)); + memset(iommu_sw->orig_addr, 0, iommu_sw->nslabs * sizeof(phys_addr_t)); /* * Get the overflow emergency buffer */ - io_tlb_overflow_buffer = (void *)__get_free_pages(GFP_DMA, - get_order(io_tlb_overflow)); - if (!io_tlb_overflow_buffer) + iommu_sw->overflow_buffer = (void *)__get_free_pages(GFP_DMA, + get_order(iommu_sw->overflow)); + if (!iommu_sw->overflow_buffer) goto cleanup4; swiotlb_print_info(); @@ -321,52 +322,52 @@ swiotlb_late_init_with_default_size(size_t default_size) return 0; cleanup4: - free_pages((unsigned long)io_tlb_orig_addr, - get_order(io_tlb_nslabs * sizeof(phys_addr_t))); - io_tlb_orig_addr = NULL; + free_pages((unsigned long)iommu_sw->orig_addr, + get_order(iommu_sw->nslabs * sizeof(phys_addr_t))); + iommu_sw->orig_addr = NULL; cleanup3: - free_pages((unsigned long)io_tlb_list, get_order(io_tlb_nslabs * + free_pages((unsigned long)iommu_sw->list, get_order(iommu_sw->nslabs * sizeof(int))); - io_tlb_list = NULL; + iommu_sw->list = NULL; cleanup2: - io_tlb_end = NULL; - free_pages((unsigned long)io_tlb_start, order); - io_tlb_start = NULL; + iommu_sw->end = NULL; + free_pages((unsigned long)iommu_sw->start, order); + iommu_sw->start = NULL; cleanup1: - io_tlb_nslabs = req_nslabs; + iommu_sw->nslabs = req_nslabs; return -ENOMEM; } void __init swiotlb_free(void) { - if (!io_tlb_overflow_buffer) + if (!iommu_sw->overflow_buffer) return; if (late_alloc) { - free_pages((unsigned long)io_tlb_overflow_buffer, - get_order(io_tlb_overflow)); - free_pages((unsigned long)io_tlb_orig_addr, - get_order(io_tlb_nslabs * sizeof(phys_addr_t))); - free_pages((unsigned long)io_tlb_list, get_order(io_tlb_nslabs * - sizeof(int))); - free_pages((unsigned long)io_tlb_start, - get_order(io_tlb_nslabs << IO_TLB_SHIFT)); + free_pages((unsigned long)iommu_sw->overflow_buffer, + get_order(iommu_sw->overflow)); + free_pages((unsigned long)iommu_sw->orig_addr, + get_order(iommu_sw->nslabs * sizeof(phys_addr_t))); + free_pages((unsigned long)iommu_sw->list, + get_order(iommu_sw->nslabs * sizeof(int))); + free_pages((unsigned long)iommu_sw->start, + get_order(iommu_sw->nslabs << IO_TLB_SHIFT)); } else { - free_bootmem_late(__pa(io_tlb_overflow_buffer), - io_tlb_overflow); - free_bootmem_late(__pa(io_tlb_orig_addr), - io_tlb_nslabs * sizeof(phys_addr_t)); - free_bootmem_late(__pa(io_tlb_list), - io_tlb_nslabs * sizeof(int)); - free_bootmem_late(__pa(io_tlb_start), - io_tlb_nslabs << IO_TLB_SHIFT); + free_bootmem_late(__pa(iommu_sw->overflow_buffer), + iommu_sw->overflow); + free_bootmem_late(__pa(iommu_sw->orig_addr), + iommu_sw->nslabs * sizeof(phys_addr_t)); + free_bootmem_late(__pa(iommu_sw->list), + iommu_sw->nslabs * sizeof(int)); + 
free_bootmem_late(__pa(iommu_sw->start), + iommu_sw->nslabs << IO_TLB_SHIFT); } } static int is_swiotlb_buffer(phys_addr_t paddr) { - return paddr >= virt_to_phys(io_tlb_start) && - paddr < virt_to_phys(io_tlb_end); + return paddr >= virt_to_phys(iommu_sw->start) && + paddr < virt_to_phys(iommu_sw->end); } /* @@ -426,7 +427,7 @@ map_single(struct device *hwdev, phys_addr_t phys, size_t size, int dir) unsigned long max_slots; mask = dma_get_seg_boundary(hwdev); - start_dma_addr = swiotlb_virt_to_bus(hwdev, io_tlb_start) & mask; + start_dma_addr = swiotlb_virt_to_bus(hwdev, iommu_sw->start) & mask; offset_slots = ALIGN(start_dma_addr, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT; @@ -454,8 +455,8 @@ map_single(struct device *hwdev, phys_addr_t phys, size_t size, int dir) * request and allocate a buffer from that IO TLB pool. */ spin_lock_irqsave(&io_tlb_lock, flags); - index = ALIGN(io_tlb_index, stride); - if (index >= io_tlb_nslabs) + index = ALIGN(iommu_sw->index, stride); + if (index >= iommu_sw->nslabs) index = 0; wrap = index; @@ -463,7 +464,7 @@ map_single(struct device *hwdev, phys_addr_t phys, size_t size, int dir) while (iommu_is_span_boundary(index, nslots, offset_slots, max_slots)) { index += stride; - if (index >= io_tlb_nslabs) + if (index >= iommu_sw->nslabs) index = 0; if (index == wrap) goto not_found; @@ -474,26 +475,27 @@ map_single(struct device *hwdev, phys_addr_t phys, size_t size, int dir) * contiguous buffers, we allocate the buffers from that slot * and mark the entries as ''0'' indicating unavailable. */ - if (io_tlb_list[index] >= nslots) { + if (iommu_sw->list[index] >= nslots) { int count = 0; for (i = index; i < (int) (index + nslots); i++) - io_tlb_list[i] = 0; - for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) != IO_TLB_SEGSIZE - 1) && io_tlb_list[i]; i--) - io_tlb_list[i] = ++count; - dma_addr = io_tlb_start + (index << IO_TLB_SHIFT); + iommu_sw->list[i] = 0; + for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) !+ IO_TLB_SEGSIZE - 1) && iommu_sw->list[i]; i--) + iommu_sw->list[i] = ++count; + dma_addr = iommu_sw->start + (index << IO_TLB_SHIFT); /* * Update the indices to avoid searching in the next * round. */ - io_tlb_index = ((index + nslots) < io_tlb_nslabs + iommu_sw->index = ((index + nslots) < iommu_sw->nslabs ? (index + nslots) : 0); goto found; } index += stride; - if (index >= io_tlb_nslabs) + if (index >= iommu_sw->nslabs) index = 0; } while (index != wrap); @@ -509,7 +511,7 @@ found: * needed. */ for (i = 0; i < nslots; i++) - io_tlb_orig_addr[index+i] = phys + (i << IO_TLB_SHIFT); + iommu_sw->orig_addr[index+i] = phys + (i << IO_TLB_SHIFT); if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL) swiotlb_bounce(phys, dma_addr, size, DMA_TO_DEVICE); @@ -524,8 +526,8 @@ do_unmap_single(struct device *hwdev, char *dma_addr, size_t size, int dir) { unsigned long flags; int i, count, nslots = ALIGN(size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT; - int index = (dma_addr - io_tlb_start) >> IO_TLB_SHIFT; - phys_addr_t phys = io_tlb_orig_addr[index]; + int index = (dma_addr - iommu_sw->start) >> IO_TLB_SHIFT; + phys_addr_t phys = iommu_sw->orig_addr[index]; /* * First, sync the memory before unmapping the entry @@ -542,19 +544,20 @@ do_unmap_single(struct device *hwdev, char *dma_addr, size_t size, int dir) spin_lock_irqsave(&io_tlb_lock, flags); { count = ((index + nslots) < ALIGN(index + 1, IO_TLB_SEGSIZE) ? 
- io_tlb_list[index + nslots] : 0); + iommu_sw->list[index + nslots] : 0); /* * Step 1: return the slots to the free list, merging the * slots with superceeding slots */ for (i = index + nslots - 1; i >= index; i--) - io_tlb_list[i] = ++count; + iommu_sw->list[i] = ++count; /* * Step 2: merge the returned slots with the preceding slots, * if available (non zero) */ - for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) != IO_TLB_SEGSIZE -1) && io_tlb_list[i]; i--) - io_tlb_list[i] = ++count; + for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) !+ IO_TLB_SEGSIZE - 1) && iommu_sw->list[i]; i--) + iommu_sw->list[i] = ++count; } spin_unlock_irqrestore(&io_tlb_lock, flags); } @@ -563,8 +566,8 @@ static void sync_single(struct device *hwdev, char *dma_addr, size_t size, int dir, int target) { - int index = (dma_addr - io_tlb_start) >> IO_TLB_SHIFT; - phys_addr_t phys = io_tlb_orig_addr[index]; + int index = (dma_addr - iommu_sw->start) >> IO_TLB_SHIFT; + phys_addr_t phys = iommu_sw->orig_addr[index]; phys += ((unsigned long)dma_addr & ((1 << IO_TLB_SHIFT) - 1)); @@ -663,7 +666,7 @@ swiotlb_full(struct device *dev, size_t size, int dir, int do_panic) printk(KERN_ERR "DMA: Out of SW-IOMMU space for %zu bytes at " "device %s\n", size, dev ? dev_name(dev) : "?"); - if (size <= io_tlb_overflow || !do_panic) + if (size <= iommu_sw->overflow || !do_panic) return; if (dir == DMA_BIDIRECTIONAL) @@ -705,7 +708,7 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page, map = map_single(dev, phys, size, dir); if (!map) { swiotlb_full(dev, size, dir, 1); - map = io_tlb_overflow_buffer; + map = iommu_sw->overflow_buffer; } dev_addr = swiotlb_virt_to_bus(dev, map); @@ -960,7 +963,8 @@ EXPORT_SYMBOL(swiotlb_sync_sg_for_device); int swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t dma_addr) { - return (dma_addr == swiotlb_virt_to_bus(hwdev, io_tlb_overflow_buffer)); + return (dma_addr == swiotlb_virt_to_bus(hwdev, + iommu_sw->overflow_buffer)); } EXPORT_SYMBOL(swiotlb_dma_mapping_error); @@ -973,6 +977,6 @@ EXPORT_SYMBOL(swiotlb_dma_mapping_error); int swiotlb_dma_supported(struct device *hwdev, u64 mask) { - return swiotlb_virt_to_bus(hwdev, io_tlb_end - 1) <= mask; + return swiotlb_virt_to_bus(hwdev, iommu_sw->end - 1) <= mask; } EXPORT_SYMBOL(swiotlb_dma_supported); -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Jan-14 23:00 UTC
[Xen-devel] [PATCH 05/15] [swiotlb] Respect the io_tlb_nslabs argument value.
The search and replace removed the option to override the amount of slabs via swiotlb=<x> argument. This puts it back in. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- lib/swiotlb.c | 10 ++++++---- 1 files changed, 6 insertions(+), 4 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index 3499001..cf29f03 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -198,10 +198,11 @@ swiotlb_init_with_default_size(size_t default_size, int verbose) { unsigned long i, bytes; - if (!iommu_sw->nslabs) { + if (!io_tlb_nslabs) { iommu_sw->nslabs = (default_size >> IO_TLB_SHIFT); iommu_sw->nslabs = ALIGN(iommu_sw->nslabs, IO_TLB_SEGSIZE); - } + } else + iommu_sw->nslabs = io_tlb_nslabs; bytes = iommu_sw->nslabs << IO_TLB_SHIFT; @@ -252,10 +253,11 @@ swiotlb_late_init_with_default_size(size_t default_size) unsigned long i, bytes, req_nslabs = iommu_sw->nslabs; unsigned int order; - if (!iommu_sw->nslabs) { + if (!io_tlb_nslabs) { iommu_sw->nslabs = (default_size >> IO_TLB_SHIFT); iommu_sw->nslabs = ALIGN(iommu_sw->nslabs, IO_TLB_SEGSIZE); - } + } else + iommu_sw->nslabs = io_tlb_nslabs; /* * Get IO TLB memory from the low pages -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Jan-14 23:00 UTC
[Xen-devel] [PATCH 06/15] [swiotlb] In 'swiotlb_init' take advantage of the default swiotlb_engine support.
For baselevel support we define required functions and fill out variables. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- lib/swiotlb.c | 22 ++++++++++++++++++++++ 1 files changed, 22 insertions(+), 0 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index cf29f03..3c7bd4e 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -132,6 +132,11 @@ static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev, return phys_to_dma(hwdev, virt_to_phys(address)); } +static void *swiotlb_bus_to_virt(struct device *hwdev, dma_addr_t dev_addr) +{ + return phys_to_virt(dma_to_phys(hwdev, dev_addr)); +}; + /* * Register a software IO TLB engine. * @@ -236,9 +241,26 @@ swiotlb_init_with_default_size(size_t default_size, int verbose) swiotlb_print_info(); } +static int swiotlb_release(struct swiotlb_engine *iotlb) +{ + swiotlb_free(); + return 0; +} + +static struct swiotlb_engine swiotlb_ops = { + .name = "software IO TLB", + .overflow = 32 * 1024, + .release = swiotlb_release, + .phys_to_bus = phys_to_dma, + .bus_to_phys = dma_to_phys, + .virt_to_bus = swiotlb_virt_to_bus, + .bus_to_virt = swiotlb_bus_to_virt, +}; + void __init swiotlb_init(int verbose) { + swiotlb_register_engine(&swiotlb_ops); swiotlb_init_with_default_size(64 * (1<<20), verbose); /* default to 64MB */ } -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Jan-14 23:00 UTC
[Xen-devel] [PATCH 07/15] [swiotlb] In 'swiotlb_free' check iommu_sw pointer.
Previous to the usage of the iommu_sw pointer, we would check if io_tlb_overflow_buffer was set and use that to determine whether we had been called. With the iommu_sw introduction, we just need to check whether that pointer is not NULL. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- lib/swiotlb.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index 3c7bd4e..72c9abe 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -364,7 +364,7 @@ cleanup1: void __init swiotlb_free(void) { - if (!iommu_sw->overflow_buffer) + if (!iommu_sw) return; if (late_alloc) { -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Jan-14 23:00 UTC
[Xen-devel] [PATCH 08/15] [swiotlb] Add 'is_swiotlb_buffer' to the swiotlb_ops function declaration.
We move the ''io_swiotlb_buffer'' function before the swiotlb_ops_ structure decleration to avoid compilation problems. Also we replace the calls to is_swiotlb_buffer to go through the iommu_sw->is_swiotlb_buffer function. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- lib/swiotlb.c | 18 ++++++++++-------- 1 files changed, 10 insertions(+), 8 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index 72c9abe..688965d 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -246,11 +246,18 @@ static int swiotlb_release(struct swiotlb_engine *iotlb) swiotlb_free(); return 0; } +static int is_swiotlb_buffer(struct swiotlb_engine *iommu_sw, + dma_addr_t dev_addr, phys_addr_t paddr) +{ + return paddr >= virt_to_phys(iommu_sw->start) && + paddr < virt_to_phys(iommu_sw->end); +} static struct swiotlb_engine swiotlb_ops = { .name = "software IO TLB", .overflow = 32 * 1024, .release = swiotlb_release, + .is_swiotlb_buffer = is_swiotlb_buffer, .phys_to_bus = phys_to_dma, .bus_to_phys = dma_to_phys, .virt_to_bus = swiotlb_virt_to_bus, @@ -388,11 +395,6 @@ void __init swiotlb_free(void) } } -static int is_swiotlb_buffer(phys_addr_t paddr) -{ - return paddr >= virt_to_phys(iommu_sw->start) && - paddr < virt_to_phys(iommu_sw->end); -} /* * Bounce: copy the swiotlb buffer back to the original dma location @@ -669,7 +671,7 @@ swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr, phys_addr_t paddr = dma_to_phys(hwdev, dev_addr); WARN_ON(irqs_disabled()); - if (!is_swiotlb_buffer(paddr)) + if (!iommu_sw->is_swiotlb_buffer(iommu_sw, dev_addr, paddr)) free_pages((unsigned long)vaddr, get_order(size)); else /* DMA_TO_DEVICE to avoid memcpy in unmap_single */ @@ -762,7 +764,7 @@ static void unmap_single(struct device *hwdev, dma_addr_t dev_addr, BUG_ON(dir == DMA_NONE); - if (is_swiotlb_buffer(paddr)) { + if (iommu_sw->is_swiotlb_buffer(iommu_sw, dev_addr, paddr)) { do_unmap_single(hwdev, phys_to_virt(paddr), size, dir); return; } @@ -805,7 +807,7 @@ swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr, BUG_ON(dir == DMA_NONE); - if (is_swiotlb_buffer(paddr)) { + if (iommu_sw->is_swiotlb_buffer(iommu_sw, dev_addr, paddr)) { sync_single(hwdev, phys_to_virt(paddr), size, dir, target); return; } -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Jan-14 23:00 UTC
[Xen-devel] [PATCH 09/15] [swiotlb] Add 'dma_capable' to the swiotlb_ops structure.
And we also replace the ''dma_capable'' with iommu_sw->dma_capable to abstract the functionality of that function. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- lib/swiotlb.c | 14 +++++++++++--- 1 files changed, 11 insertions(+), 3 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index 688965d..4da8151 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -253,10 +253,18 @@ static int is_swiotlb_buffer(struct swiotlb_engine *iommu_sw, paddr < virt_to_phys(iommu_sw->end); } +static bool swiotlb_dma_capable(struct device *hwdev, dma_addr_t dma_addr, + phys_addr_t phys, size_t size) +{ + /* Phys is not neccessary in this case. */ + return dma_capable(hwdev, dma_addr, size); +} + static struct swiotlb_engine swiotlb_ops = { .name = "software IO TLB", .overflow = 32 * 1024, .release = swiotlb_release, + .dma_capable = swiotlb_dma_capable, .is_swiotlb_buffer = is_swiotlb_buffer, .phys_to_bus = phys_to_dma, .bus_to_phys = dma_to_phys, @@ -725,7 +733,7 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page, * we can safely return the device addr and not worry about bounce * buffering it. */ - if (dma_capable(dev, dev_addr, size) && !swiotlb_force) + if (iommu_sw->dma_capable(dev, dev_addr, phys, size) && !swiotlb_force) return dev_addr; /* @@ -742,7 +750,7 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page, /* * Ensure that the address returned is DMA''ble */ - if (!dma_capable(dev, dev_addr, size)) + if (!iommu_sw->dma_capable(dev, dev_addr, phys, size)) panic("map_single: bounce buffer is not DMA''ble"); return dev_addr; @@ -895,7 +903,7 @@ swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, int nelems, dma_addr_t dev_addr = phys_to_dma(hwdev, paddr); if (swiotlb_force || - !dma_capable(hwdev, dev_addr, sg->length)) { + !iommu_sw->dma_capable(hwdev, dev_addr, paddr, sg->length)) { void *map = map_single(hwdev, sg_phys(sg), sg->length, dir); if (!map) { -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
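The extra phys argument is ignored by the default engine (see swiotlb_dma_capable above), but a translating engine can use it, for example, to check that the machine frames backing [phys, phys + size) are contiguous before trusting dev_addr against the device mask. A hypothetical sketch, not part of this series:

    static bool xen_dma_capable(struct device *hwdev, dma_addr_t dev_addr,
                                phys_addr_t phys, size_t size)
    {
            /* xen_range_is_machine_contiguous() is a made-up helper name */
            return xen_range_is_machine_contiguous(phys, size) &&
                   dma_capable(hwdev, dev_addr, size);
    }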
Konrad Rzeszutek Wilk
2010-Jan-14 23:00 UTC
[Xen-devel] [PATCH 10/15] [swiotlb] Replace the [phys, bus]->virt and virt->[bus, phys] functions with iommu_sw calls.
We replace all of the address translation calls to go through the iommu_sw functions. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- lib/swiotlb.c | 26 +++++++++++++------------- 1 files changed, 13 insertions(+), 13 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index 4da8151..075b56c 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -127,7 +127,7 @@ __setup("swiotlb=", setup_io_tlb_npages); /* Note that this doesn''t work with highmem page */ static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev, - volatile void *address) + void *address) { return phys_to_dma(hwdev, virt_to_phys(address)); } @@ -461,7 +461,7 @@ map_single(struct device *hwdev, phys_addr_t phys, size_t size, int dir) unsigned long max_slots; mask = dma_get_seg_boundary(hwdev); - start_dma_addr = swiotlb_virt_to_bus(hwdev, iommu_sw->start) & mask; + start_dma_addr = iommu_sw->virt_to_bus(hwdev, iommu_sw->start) & mask; offset_slots = ALIGN(start_dma_addr, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT; @@ -636,7 +636,7 @@ swiotlb_alloc_coherent(struct device *hwdev, size_t size, dma_mask = hwdev->coherent_dma_mask; ret = (void *)__get_free_pages(flags, order); - if (ret && swiotlb_virt_to_bus(hwdev, ret) + size - 1 > dma_mask) { + if (ret && iommu_sw->virt_to_bus(hwdev, ret) + size - 1 > dma_mask) { /* * The allocated memory isn''t reachable by the device. */ @@ -655,7 +655,7 @@ swiotlb_alloc_coherent(struct device *hwdev, size_t size, } memset(ret, 0, size); - dev_addr = swiotlb_virt_to_bus(hwdev, ret); + dev_addr = iommu_sw->virt_to_bus(hwdev, ret); /* Confirm address can be DMA''d by device */ if (dev_addr + size - 1 > dma_mask) { @@ -676,7 +676,7 @@ void swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr, dma_addr_t dev_addr) { - phys_addr_t paddr = dma_to_phys(hwdev, dev_addr); + phys_addr_t paddr = iommu_sw->bus_to_phys(hwdev, dev_addr); WARN_ON(irqs_disabled()); if (!iommu_sw->is_swiotlb_buffer(iommu_sw, dev_addr, paddr)) @@ -724,7 +724,7 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page, struct dma_attrs *attrs) { phys_addr_t phys = page_to_phys(page) + offset; - dma_addr_t dev_addr = phys_to_dma(dev, phys); + dma_addr_t dev_addr = iommu_sw->phys_to_bus(dev, phys); void *map; BUG_ON(dir == DMA_NONE); @@ -745,7 +745,7 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page, map = iommu_sw->overflow_buffer; } - dev_addr = swiotlb_virt_to_bus(dev, map); + dev_addr = iommu_sw->virt_to_bus(dev, map); /* * Ensure that the address returned is DMA''ble @@ -768,7 +768,7 @@ EXPORT_SYMBOL_GPL(swiotlb_map_page); static void unmap_single(struct device *hwdev, dma_addr_t dev_addr, size_t size, int dir) { - phys_addr_t paddr = dma_to_phys(hwdev, dev_addr); + phys_addr_t paddr = iommu_sw->bus_to_phys(hwdev, dev_addr); BUG_ON(dir == DMA_NONE); @@ -811,7 +811,7 @@ static void swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr, size_t size, int dir, int target) { - phys_addr_t paddr = dma_to_phys(hwdev, dev_addr); + phys_addr_t paddr = iommu_sw->bus_to_phys(hwdev, dev_addr); BUG_ON(dir == DMA_NONE); @@ -900,7 +900,7 @@ swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, int nelems, for_each_sg(sgl, sg, nelems, i) { phys_addr_t paddr = sg_phys(sg); - dma_addr_t dev_addr = phys_to_dma(hwdev, paddr); + dma_addr_t dev_addr = iommu_sw->phys_to_bus(hwdev, paddr); if (swiotlb_force || !iommu_sw->dma_capable(hwdev, dev_addr, paddr, sg->length)) { @@ -915,7 +915,7 @@ swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, int nelems, 
sgl[0].dma_length = 0; return 0; } - sg->dma_address = swiotlb_virt_to_bus(hwdev, map); + sg->dma_address = iommu_sw->virt_to_bus(hwdev, map); } else sg->dma_address = dev_addr; sg->dma_length = sg->length; @@ -997,7 +997,7 @@ EXPORT_SYMBOL(swiotlb_sync_sg_for_device); int swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t dma_addr) { - return (dma_addr == swiotlb_virt_to_bus(hwdev, + return (dma_addr == iommu_sw->virt_to_bus(hwdev, iommu_sw->overflow_buffer)); } EXPORT_SYMBOL(swiotlb_dma_mapping_error); @@ -1011,6 +1011,6 @@ EXPORT_SYMBOL(swiotlb_dma_mapping_error); int swiotlb_dma_supported(struct device *hwdev, u64 mask) { - return swiotlb_virt_to_bus(hwdev, iommu_sw->end - 1) <= mask; + return iommu_sw->virt_to_bus(hwdev, iommu_sw->end - 1) <= mask; } EXPORT_SYMBOL(swiotlb_dma_supported); -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Jan-14 23:01 UTC
[Xen-devel] [PATCH 11/15] [swiotlb] Replace late_alloc with iommu_sw->priv usage.
We utilize the private placeholder to figure out whether we are initialized late or early. Obviously the ->priv can be expanded to point to a structure for more internal data but for right now this all we need. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- lib/swiotlb.c | 11 +++++++---- 1 files changed, 7 insertions(+), 4 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index 075b56c..20df588 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -103,8 +103,6 @@ static DEFINE_SPINLOCK(io_tlb_lock); struct swiotlb_engine *iommu_sw; EXPORT_SYMBOL(iommu_sw); -static int late_alloc; - static int __init setup_io_tlb_npages(char *str) { @@ -239,6 +237,8 @@ swiotlb_init_with_default_size(size_t default_size, int verbose) panic("Cannot allocate SWIOTLB overflow buffer!\n"); if (verbose) swiotlb_print_info(); + + iommu_sw->priv = NULL; } static int swiotlb_release(struct swiotlb_engine *iotlb) @@ -356,7 +356,10 @@ swiotlb_late_init_with_default_size(size_t default_size) swiotlb_print_info(); - late_alloc = 1; + /* We utilize the private field to figure out whether we + * were allocated late or early. + */ + iommu_sw->priv = (void *)1; return 0; @@ -382,7 +385,7 @@ void __init swiotlb_free(void) if (!iommu_sw) return; - if (late_alloc) { + if (iommu_sw->priv) { free_pages((unsigned long)iommu_sw->overflow_buffer, get_order(iommu_sw->overflow)); free_pages((unsigned long)iommu_sw->orig_addr, -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Jan-14 23:01 UTC
[Xen-devel] [PATCH 12/15] [swiotlb] Remove unused static declarations obsoleted by iommu_sw.
This includes the io_tlb_start, io_tlb_end, io_tlb_list, etc. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- lib/swiotlb.c | 29 +---------------------------- 1 files changed, 1 insertions(+), 28 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index 20df588..c11dcb1 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -60,40 +60,13 @@ enum dma_sync_target { int swiotlb_force; /* - * Used to do a quick range check in unmap_single and - * sync_single_*, to see if the memory was in fact allocated by this - * API. - */ -static char *io_tlb_start, *io_tlb_end; - -/* * The number of IO TLB blocks (in groups of 64) betweeen io_tlb_start and * io_tlb_end. This is command line adjustable via setup_io_tlb_npages. */ static unsigned long io_tlb_nslabs; /* - * When the IOMMU overflows we return a fallback buffer. This sets the size. - */ -static unsigned long io_tlb_overflow = 32*1024; - -void *io_tlb_overflow_buffer; - -/* - * This is a free list describing the number of free entries available from - * each index - */ -static unsigned int *io_tlb_list; -static unsigned int io_tlb_index; - -/* - * We need to save away the original address corresponding to a mapped entry - * for the sync operations. - */ -static phys_addr_t *io_tlb_orig_addr; - -/* - * Protect the above data structures in the map and unmap calls + * Protect the iommu_sw data structures in the map and unmap calls */ static DEFINE_SPINLOCK(io_tlb_lock); -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Jan-14 23:01 UTC
[Xen-devel] [PATCH 13/15] [swiotlb] Make io_tlb_nslabs visible outside lib/swiotlb.c and rename it.
We rename it to something more generic: swiotlb_nslabs and make it visible outside the lib/swiotlb.c library. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- include/linux/swiotlb.h | 1 + lib/swiotlb.c | 14 +++++++------- 2 files changed, 8 insertions(+), 7 deletions(-) diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h index 3bc3c42..23739b0 100644 --- a/include/linux/swiotlb.h +++ b/include/linux/swiotlb.h @@ -9,6 +9,7 @@ struct scatterlist; extern int swiotlb_force; +extern unsigned long swiotlb_nslabs; /* * Maximum allowable number of contiguous slabs to map, * must be a power of 2. What is the appropriate value ? diff --git a/lib/swiotlb.c b/lib/swiotlb.c index c11dcb1..8e65cee 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -63,7 +63,7 @@ int swiotlb_force; * The number of IO TLB blocks (in groups of 64) betweeen io_tlb_start and * io_tlb_end. This is command line adjustable via setup_io_tlb_npages. */ -static unsigned long io_tlb_nslabs; +unsigned long swiotlb_nslabs; /* * Protect the iommu_sw data structures in the map and unmap calls @@ -81,9 +81,9 @@ setup_io_tlb_npages(char *str) { while (*str) { if (isdigit(*str)) { - io_tlb_nslabs = simple_strtoul(str, &str, 0); + swiotlb_nslabs = simple_strtoul(str, &str, 0); /* avoid tail segment of size < IO_TLB_SEGSIZE */ - io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE); + swiotlb_nslabs = ALIGN(swiotlb_nslabs, IO_TLB_SEGSIZE); } if (!strncmp(str, "force", 5)) swiotlb_force = 1; @@ -174,11 +174,11 @@ swiotlb_init_with_default_size(size_t default_size, int verbose) { unsigned long i, bytes; - if (!io_tlb_nslabs) { + if (!swiotlb_nslabs) { iommu_sw->nslabs = (default_size >> IO_TLB_SHIFT); iommu_sw->nslabs = ALIGN(iommu_sw->nslabs, IO_TLB_SEGSIZE); } else - iommu_sw->nslabs = io_tlb_nslabs; + iommu_sw->nslabs = swiotlb_nslabs; bytes = iommu_sw->nslabs << IO_TLB_SHIFT; @@ -263,11 +263,11 @@ swiotlb_late_init_with_default_size(size_t default_size) unsigned long i, bytes, req_nslabs = iommu_sw->nslabs; unsigned int order; - if (!io_tlb_nslabs) { + if (!swiotlb_nslabs) { iommu_sw->nslabs = (default_size >> IO_TLB_SHIFT); iommu_sw->nslabs = ALIGN(iommu_sw->nslabs, IO_TLB_SEGSIZE); } else - iommu_sw->nslabs = io_tlb_nslabs; + iommu_sw->nslabs = swiotlb_nslabs; /* * Get IO TLB memory from the low pages -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
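Making the variable non-static lets code outside lib/swiotlb.c - such as the default engine split out into swiotlb-default.c in patch 14, or an alternative engine - honour the user's swiotlb= override when sizing its bounce buffer. A minimal sketch of how an engine would consume it:

    /* fall back to the core's 64MB default when no override was given */
    unsigned long nslabs = swiotlb_nslabs ? swiotlb_nslabs
                                          : ((64UL << 20) >> IO_TLB_SHIFT);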
Konrad Rzeszutek Wilk
2010-Jan-14 23:01 UTC
[Xen-devel] [PATCH 14/15] [swiotlb] Move initialization (swiotlb_init) and its friends in swiotlb-default.c
We move all of the initialization functions and as well all functions defined in the swiotlb_ops to a seperate file. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- lib/Makefile | 2 +- lib/swiotlb-default.c | 242 +++++++++++++++++++++++++++++++++++++++++++++++++ lib/swiotlb.c | 231 +---------------------------------------------- 3 files changed, 245 insertions(+), 230 deletions(-) create mode 100644 lib/swiotlb-default.c diff --git a/lib/Makefile b/lib/Makefile index 347ad8d..fd96891 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -77,7 +77,7 @@ obj-$(CONFIG_TEXTSEARCH_FSM) += ts_fsm.o obj-$(CONFIG_SMP) += percpu_counter.o obj-$(CONFIG_AUDIT_GENERIC) += audit.o -obj-$(CONFIG_SWIOTLB) += swiotlb.o +obj-$(CONFIG_SWIOTLB) += swiotlb.o swiotlb-default.o obj-$(CONFIG_IOMMU_HELPER) += iommu-helper.o obj-$(CONFIG_FAULT_INJECTION) += fault-inject.o diff --git a/lib/swiotlb-default.c b/lib/swiotlb-default.c new file mode 100644 index 0000000..c490fcf --- /dev/null +++ b/lib/swiotlb-default.c @@ -0,0 +1,242 @@ + +#include <linux/dma-mapping.h> +#include <linux/swiotlb.h> +#include <linux/bootmem.h> + + +#define OFFSET(val, align) ((unsigned long) \ + (val) & ((align) - 1)) + +#define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT)) + +/* + * Minimum IO TLB size to bother booting with. Systems with mainly + * 64bit capable cards will only lightly use the swiotlb. If we can''t + * allocate a contiguous 1MB, we''re probably in trouble anyway. + */ +#define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT) + +/* Note that this doesn''t work with highmem page */ +static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev, + void *address) +{ + return phys_to_dma(hwdev, virt_to_phys(address)); +} + +static void *swiotlb_bus_to_virt(struct device *hwdev, dma_addr_t dev_addr) +{ + return phys_to_virt(dma_to_phys(hwdev, dev_addr)); +}; + +/* + * Statically reserve bounce buffer space and initialize bounce buffer data + * structures for the software IO TLB used to implement the DMA API. + */ +void __init +swiotlb_init_with_default_size(struct swiotlb_engine *iommu_sw, + size_t default_size, int verbose) +{ + unsigned long i, bytes; + + if (!swiotlb_nslabs) { + iommu_sw->nslabs = (default_size >> IO_TLB_SHIFT); + iommu_sw->nslabs = ALIGN(iommu_sw->nslabs, IO_TLB_SEGSIZE); + } else + iommu_sw->nslabs = swiotlb_nslabs; + + bytes = iommu_sw->nslabs << IO_TLB_SHIFT; + + /* + * Get IO TLB memory from the low pages + */ + iommu_sw->start = alloc_bootmem_low_pages(bytes); + if (!iommu_sw->start) + panic("Cannot allocate SWIOTLB buffer"); + iommu_sw->end = iommu_sw->start + bytes; + + /* + * Allocate and initialize the free list array. This array is used + * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE + * between iommu_sw->start and iommu_sw->end. 
+ */ + iommu_sw->list = alloc_bootmem(iommu_sw->nslabs * sizeof(int)); + for (i = 0; i < iommu_sw->nslabs; i++) + iommu_sw->list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE); + iommu_sw->index = 0; + iommu_sw->orig_addr = alloc_bootmem(iommu_sw->nslabs * + sizeof(phys_addr_t)); + + /* + * Get the overflow emergency buffer + */ + iommu_sw->overflow_buffer = alloc_bootmem_low(iommu_sw->overflow); + if (!iommu_sw->overflow_buffer) + panic("Cannot allocate SWIOTLB overflow buffer!\n"); + if (verbose) + swiotlb_print_info(); + + iommu_sw->priv = NULL; +} + +int swiotlb_release(struct swiotlb_engine *iommu_sw) +{ + if (!iommu_sw) + return -ENODEV; + + if (iommu_sw->priv) { + free_pages((unsigned long)iommu_sw->overflow_buffer, + get_order(iommu_sw->overflow)); + free_pages((unsigned long)iommu_sw->orig_addr, + get_order(iommu_sw->nslabs * sizeof(phys_addr_t))); + free_pages((unsigned long)iommu_sw->list, + get_order(iommu_sw->nslabs * sizeof(int))); + free_pages((unsigned long)iommu_sw->start, + get_order(iommu_sw->nslabs << IO_TLB_SHIFT)); + } else { + free_bootmem_late(__pa(iommu_sw->overflow_buffer), + iommu_sw->overflow); + free_bootmem_late(__pa(iommu_sw->orig_addr), + iommu_sw->nslabs * sizeof(phys_addr_t)); + free_bootmem_late(__pa(iommu_sw->list), + iommu_sw->nslabs * sizeof(int)); + free_bootmem_late(__pa(iommu_sw->start), + iommu_sw->nslabs << IO_TLB_SHIFT); + } + return 0; +} + +static int is_swiotlb_buffer(struct swiotlb_engine *iommu_sw, + dma_addr_t dma_addr, phys_addr_t paddr) +{ + return paddr >= virt_to_phys(iommu_sw->start) && + paddr < virt_to_phys(iommu_sw->end); +} + +static bool swiotlb_dma_capable(struct device *hwdev, dma_addr_t dma_addr, + phys_addr_t phys, size_t size) +{ + /* Phys is not neccessary in this case. */ + return dma_capable(hwdev, dma_addr, size); +} +static struct swiotlb_engine swiotlb_ops = { + .name = "software IO TLB", + .overflow = 32 * 1024, + .release = swiotlb_release, + .dma_capable = swiotlb_dma_capable, + .is_swiotlb_buffer = is_swiotlb_buffer, + .phys_to_bus = phys_to_dma, + .bus_to_phys = dma_to_phys, + .virt_to_bus = swiotlb_virt_to_bus, + .bus_to_virt = swiotlb_bus_to_virt, +}; + +void __init +swiotlb_init(int verbose) +{ + swiotlb_register_engine(&swiotlb_ops); + swiotlb_init_with_default_size(&swiotlb_ops, 64 * (1<<20), + verbose); /* default to 64MB */ +} + +/* + * Systems with larger DMA zones (those that don''t support ISA) can + * initialize the swiotlb later using the slab allocator if needed. + * This should be just like above, but with some error catching. 
+ */ +int +swiotlb_late_init_with_default_size(struct swiotlb_engine *iommu_sw, + size_t default_size) +{ + unsigned long i, bytes, req_nslabs = iommu_sw->nslabs; + unsigned int order; + + if (!swiotlb_nslabs) { + iommu_sw->nslabs = (default_size >> IO_TLB_SHIFT); + iommu_sw->nslabs = ALIGN(iommu_sw->nslabs, IO_TLB_SEGSIZE); + } else + iommu_sw->nslabs = swiotlb_nslabs; + + /* + * Get IO TLB memory from the low pages + */ + order = get_order(iommu_sw->nslabs << IO_TLB_SHIFT); + iommu_sw->nslabs = SLABS_PER_PAGE << order; + bytes = iommu_sw->nslabs << IO_TLB_SHIFT; + + while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) { + iommu_sw->start = (void *)__get_free_pages(GFP_DMA | + __GFP_NOWARN, order); + if (iommu_sw->start) + break; + order--; + } + + if (!iommu_sw->start) + goto cleanup1; + + if (order != get_order(bytes)) { + printk(KERN_WARNING "Warning: only able to allocate %ld MB " + "for software IO TLB\n", (PAGE_SIZE << order) >> 20); + iommu_sw->nslabs = SLABS_PER_PAGE << order; + bytes = iommu_sw->nslabs << IO_TLB_SHIFT; + } + iommu_sw->end = iommu_sw->start + bytes; + memset(iommu_sw->start, 0, bytes); + + /* + * Allocate and initialize the free list array. This array is used + * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE + * between iommu_sw->start and iommu_sw->end. + */ + iommu_sw->list = (unsigned int *)__get_free_pages(GFP_KERNEL, + get_order(iommu_sw->nslabs * sizeof(int))); + if (!iommu_sw->list) + goto cleanup2; + + for (i = 0; i < iommu_sw->nslabs; i++) + iommu_sw->list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE); + iommu_sw->index = 0; + + iommu_sw->orig_addr = (phys_addr_t *) + __get_free_pages(GFP_KERNEL, + get_order(iommu_sw->nslabs * + sizeof(phys_addr_t))); + if (!iommu_sw->orig_addr) + goto cleanup3; + + memset(iommu_sw->orig_addr, 0, iommu_sw->nslabs * sizeof(phys_addr_t)); + + /* + * Get the overflow emergency buffer + */ + iommu_sw->overflow_buffer = (void *)__get_free_pages(GFP_DMA, + get_order(iommu_sw->overflow)); + if (!iommu_sw->overflow_buffer) + goto cleanup4; + + swiotlb_print_info(); + + /* We utilize the private field to figure out whether we + * were allocated late or early. + */ + iommu_sw->priv = (void *)1; + + return 0; + +cleanup4: + free_pages((unsigned long)iommu_sw->orig_addr, + get_order(iommu_sw->nslabs * sizeof(phys_addr_t))); + iommu_sw->orig_addr = NULL; +cleanup3: + free_pages((unsigned long)iommu_sw->list, get_order(iommu_sw->nslabs * + sizeof(int))); + iommu_sw->list = NULL; +cleanup2: + iommu_sw->end = NULL; + free_pages((unsigned long)iommu_sw->start, order); + iommu_sw->start = NULL; +cleanup1: + iommu_sw->nslabs = req_nslabs; + return -ENOMEM; +} + diff --git a/lib/swiotlb.c b/lib/swiotlb.c index 8e65cee..9e72d21 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -40,14 +40,6 @@ #define OFFSET(val,align) ((unsigned long) \ ( (val) & ( (align) - 1))) -#define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT)) - -/* - * Minimum IO TLB size to bother booting with. Systems with mainly - * 64bit capable cards will only lightly use the swiotlb. If we can''t - * allocate a contiguous 1MB, we''re probably in trouble anyway. - */ -#define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT) /* * Enumeration for sync targets @@ -96,18 +88,6 @@ setup_io_tlb_npages(char *str) __setup("swiotlb=", setup_io_tlb_npages); /* make io_tlb_overflow tunable too? 
*/ -/* Note that this doesn''t work with highmem page */ -static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev, - void *address) -{ - return phys_to_dma(hwdev, virt_to_phys(address)); -} - -static void *swiotlb_bus_to_virt(struct device *hwdev, dma_addr_t dev_addr) -{ - return phys_to_virt(dma_to_phys(hwdev, dev_addr)); -}; - /* * Register a software IO TLB engine. * @@ -165,220 +145,13 @@ void swiotlb_print_info(void) (unsigned long long)pend); } -/* - * Statically reserve bounce buffer space and initialize bounce buffer data - * structures for the software IO TLB used to implement the DMA API. - */ -void __init -swiotlb_init_with_default_size(size_t default_size, int verbose) -{ - unsigned long i, bytes; - - if (!swiotlb_nslabs) { - iommu_sw->nslabs = (default_size >> IO_TLB_SHIFT); - iommu_sw->nslabs = ALIGN(iommu_sw->nslabs, IO_TLB_SEGSIZE); - } else - iommu_sw->nslabs = swiotlb_nslabs; - - bytes = iommu_sw->nslabs << IO_TLB_SHIFT; - - /* - * Get IO TLB memory from the low pages - */ - iommu_sw->start = alloc_bootmem_low_pages(bytes); - if (!iommu_sw->start) - panic("Cannot allocate SWIOTLB buffer"); - iommu_sw->end = iommu_sw->start + bytes; - - /* - * Allocate and initialize the free list array. This array is used - * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE - * between iommu_sw->start and iommu_sw->end. - */ - iommu_sw->list = alloc_bootmem(iommu_sw->nslabs * sizeof(int)); - for (i = 0; i < iommu_sw->nslabs; i++) - iommu_sw->list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE); - iommu_sw->index = 0; - iommu_sw->orig_addr = alloc_bootmem(iommu_sw->nslabs * - sizeof(phys_addr_t)); - - /* - * Get the overflow emergency buffer - */ - iommu_sw->overflow_buffer = alloc_bootmem_low(iommu_sw->overflow); - if (!iommu_sw->overflow_buffer) - panic("Cannot allocate SWIOTLB overflow buffer!\n"); - if (verbose) - swiotlb_print_info(); - - iommu_sw->priv = NULL; -} - -static int swiotlb_release(struct swiotlb_engine *iotlb) -{ - swiotlb_free(); - return 0; -} -static int is_swiotlb_buffer(struct swiotlb_engine *iommu_sw, - dma_addr_t dev_addr, phys_addr_t paddr) -{ - return paddr >= virt_to_phys(iommu_sw->start) && - paddr < virt_to_phys(iommu_sw->end); -} - -static bool swiotlb_dma_capable(struct device *hwdev, dma_addr_t dma_addr, - phys_addr_t phys, size_t size) -{ - /* Phys is not neccessary in this case. */ - return dma_capable(hwdev, dma_addr, size); -} - -static struct swiotlb_engine swiotlb_ops = { - .name = "software IO TLB", - .overflow = 32 * 1024, - .release = swiotlb_release, - .dma_capable = swiotlb_dma_capable, - .is_swiotlb_buffer = is_swiotlb_buffer, - .phys_to_bus = phys_to_dma, - .bus_to_phys = dma_to_phys, - .virt_to_bus = swiotlb_virt_to_bus, - .bus_to_virt = swiotlb_bus_to_virt, -}; - -void __init -swiotlb_init(int verbose) -{ - swiotlb_register_engine(&swiotlb_ops); - swiotlb_init_with_default_size(64 * (1<<20), verbose); /* default to 64MB */ -} - -/* - * Systems with larger DMA zones (those that don''t support ISA) can - * initialize the swiotlb later using the slab allocator if needed. - * This should be just like above, but with some error catching. 
- */ -int -swiotlb_late_init_with_default_size(size_t default_size) -{ - unsigned long i, bytes, req_nslabs = iommu_sw->nslabs; - unsigned int order; - - if (!swiotlb_nslabs) { - iommu_sw->nslabs = (default_size >> IO_TLB_SHIFT); - iommu_sw->nslabs = ALIGN(iommu_sw->nslabs, IO_TLB_SEGSIZE); - } else - iommu_sw->nslabs = swiotlb_nslabs; - - /* - * Get IO TLB memory from the low pages - */ - order = get_order(iommu_sw->nslabs << IO_TLB_SHIFT); - iommu_sw->nslabs = SLABS_PER_PAGE << order; - bytes = iommu_sw->nslabs << IO_TLB_SHIFT; - - while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) { - iommu_sw->start = (void *)__get_free_pages(GFP_DMA | - __GFP_NOWARN, order); - if (iommu_sw->start) - break; - order--; - } - - if (!iommu_sw->start) - goto cleanup1; - - if (order != get_order(bytes)) { - printk(KERN_WARNING "Warning: only able to allocate %ld MB " - "for software IO TLB\n", (PAGE_SIZE << order) >> 20); - iommu_sw->nslabs = SLABS_PER_PAGE << order; - bytes = iommu_sw->nslabs << IO_TLB_SHIFT; - } - iommu_sw->end = iommu_sw->start + bytes; - memset(iommu_sw->start, 0, bytes); - - /* - * Allocate and initialize the free list array. This array is used - * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE - * between iommu_sw->start and iommu_sw->end. - */ - iommu_sw->list = (unsigned int *)__get_free_pages(GFP_KERNEL, - get_order(iommu_sw->nslabs * sizeof(int))); - if (!iommu_sw->list) - goto cleanup2; - - for (i = 0; i < iommu_sw->nslabs; i++) - iommu_sw->list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE); - iommu_sw->index = 0; - - iommu_sw->orig_addr = (phys_addr_t *) - __get_free_pages(GFP_KERNEL, - get_order(iommu_sw->nslabs * - sizeof(phys_addr_t))); - if (!iommu_sw->orig_addr) - goto cleanup3; - - memset(iommu_sw->orig_addr, 0, iommu_sw->nslabs * sizeof(phys_addr_t)); - - /* - * Get the overflow emergency buffer - */ - iommu_sw->overflow_buffer = (void *)__get_free_pages(GFP_DMA, - get_order(iommu_sw->overflow)); - if (!iommu_sw->overflow_buffer) - goto cleanup4; - - swiotlb_print_info(); - - /* We utilize the private field to figure out whether we - * were allocated late or early. 
- */ - iommu_sw->priv = (void *)1; - - return 0; - -cleanup4: - free_pages((unsigned long)iommu_sw->orig_addr, - get_order(iommu_sw->nslabs * sizeof(phys_addr_t))); - iommu_sw->orig_addr = NULL; -cleanup3: - free_pages((unsigned long)iommu_sw->list, get_order(iommu_sw->nslabs * - sizeof(int))); - iommu_sw->list = NULL; -cleanup2: - iommu_sw->end = NULL; - free_pages((unsigned long)iommu_sw->start, order); - iommu_sw->start = NULL; -cleanup1: - iommu_sw->nslabs = req_nslabs; - return -ENOMEM; -} - void __init swiotlb_free(void) { if (!iommu_sw) return; - if (iommu_sw->priv) { - free_pages((unsigned long)iommu_sw->overflow_buffer, - get_order(iommu_sw->overflow)); - free_pages((unsigned long)iommu_sw->orig_addr, - get_order(iommu_sw->nslabs * sizeof(phys_addr_t))); - free_pages((unsigned long)iommu_sw->list, - get_order(iommu_sw->nslabs * sizeof(int))); - free_pages((unsigned long)iommu_sw->start, - get_order(iommu_sw->nslabs << IO_TLB_SHIFT)); - } else { - free_bootmem_late(__pa(iommu_sw->overflow_buffer), - iommu_sw->overflow); - free_bootmem_late(__pa(iommu_sw->orig_addr), - iommu_sw->nslabs * sizeof(phys_addr_t)); - free_bootmem_late(__pa(iommu_sw->list), - iommu_sw->nslabs * sizeof(int)); - free_bootmem_late(__pa(iommu_sw->start), - iommu_sw->nslabs << IO_TLB_SHIFT); - } -} - + iommu_sw->release(iommu_sw); +}; /* * Bounce: copy the swiotlb buffer back to the original dma location -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Jan-14 23:01 UTC
[Xen-devel] [PATCH 15/15] [swiotlb] Take advantage of iommu_sw->name and add %s to printk''s.
Make the printk usage more generic in the SWIOTLB library. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- lib/swiotlb-default.c | 6 +++--- lib/swiotlb.c | 26 ++++++++++++++++---------- 2 files changed, 19 insertions(+), 13 deletions(-) diff --git a/lib/swiotlb-default.c b/lib/swiotlb-default.c index c490fcf..ebee540 100644 --- a/lib/swiotlb-default.c +++ b/lib/swiotlb-default.c @@ -51,7 +51,7 @@ swiotlb_init_with_default_size(struct swiotlb_engine *iommu_sw, */ iommu_sw->start = alloc_bootmem_low_pages(bytes); if (!iommu_sw->start) - panic("Cannot allocate SWIOTLB buffer"); + panic("Cannot allocate %s buffer", iommu_sw->name); iommu_sw->end = iommu_sw->start + bytes; /* @@ -71,7 +71,7 @@ swiotlb_init_with_default_size(struct swiotlb_engine *iommu_sw, */ iommu_sw->overflow_buffer = alloc_bootmem_low(iommu_sw->overflow); if (!iommu_sw->overflow_buffer) - panic("Cannot allocate SWIOTLB overflow buffer!\n"); + panic("Cannot allocate %s overflow buffer!\n", iommu_sw->name); if (verbose) swiotlb_print_info(); @@ -176,7 +176,7 @@ swiotlb_late_init_with_default_size(struct swiotlb_engine *iommu_sw, if (order != get_order(bytes)) { printk(KERN_WARNING "Warning: only able to allocate %ld MB " - "for software IO TLB\n", (PAGE_SIZE << order) >> 20); + "for %s\n", (PAGE_SIZE << order) >> 20, iommu_sw->name); iommu_sw->nslabs = SLABS_PER_PAGE << order; bytes = iommu_sw->nslabs << IO_TLB_SHIFT; } diff --git a/lib/swiotlb.c b/lib/swiotlb.c index 9e72d21..1f17be0 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -138,9 +138,10 @@ void swiotlb_print_info(void) pstart = virt_to_phys(iommu_sw->start); pend = virt_to_phys(iommu_sw->end); - printk(KERN_INFO "Placing %luMB software IO TLB between %p - %p\n", - bytes >> 20, iommu_sw->start, iommu_sw->end); - printk(KERN_INFO "software IO TLB at phys %#llx - %#llx\n", + printk(KERN_INFO "Placing %luMB %s between %p - %p\n", + bytes >> 20, iommu_sw->name, iommu_sw->start, iommu_sw->end); + printk(KERN_INFO "%s at phys %#llx - %#llx\n", + iommu_sw->name, (unsigned long long)pstart, (unsigned long long)pend); } @@ -408,7 +409,8 @@ swiotlb_alloc_coherent(struct device *hwdev, size_t size, /* Confirm address can be DMA''d by device */ if (dev_addr + size - 1 > dma_mask) { - printk("hwdev DMA mask = 0x%016Lx, dev_addr = 0x%016Lx\n", + printk(KERN_ERR "%s:hwdev DMA mask = 0x%016Lx, " \ + "dev_addr = 0x%016Lx\n", iommu_sw->name, (unsigned long long)dma_mask, (unsigned long long)dev_addr); @@ -446,18 +448,21 @@ swiotlb_full(struct device *dev, size_t size, int dir, int do_panic) * When the mapping is small enough return a static buffer to limit * the damage, or panic when the transfer is too big. */ - printk(KERN_ERR "DMA: Out of SW-IOMMU space for %zu bytes at " - "device %s\n", size, dev ? dev_name(dev) : "?"); + printk(KERN_ERR "%s: Out of space for %zu bytes at " + "device %s\n", iommu_sw->name, size, dev ? 
dev_name(dev) : "?"); if (size <= iommu_sw->overflow || !do_panic) return; if (dir == DMA_BIDIRECTIONAL) - panic("DMA: Random memory could be DMA accessed\n"); + panic("%s: Random memory could be DMA accessed\n", + iommu_sw->name); if (dir == DMA_FROM_DEVICE) - panic("DMA: Random memory could be DMA written\n"); + panic("%s: Random memory could be DMA written\n", + iommu_sw->name); if (dir == DMA_TO_DEVICE) - panic("DMA: Random memory could be DMA read\n"); + panic("%s: Random memory could be DMA read\n", + iommu_sw->name); } /* @@ -500,7 +505,8 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page, * Ensure that the address returned is DMA''ble */ if (!iommu_sw->dma_capable(dev, dev_addr, phys, size)) - panic("map_single: bounce buffer is not DMA''ble"); + panic("%s map_single: bounce buffer is not DMA''ble", + iommu_sw->name); return dev_addr; } -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Chris Wright
2010-Jan-15 01:22 UTC
[Xen-devel] Re: [PATCH 01/15] [swiotlb] fix: Update ''setup_io_tlb_npages'' to accept both arguments in either order.
* Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote:> Before this patch, if you specified ''swiotlb=force,1024'' it would > ignore both arguments. This fixes it and allows the user specify it > in any order (or none at all). > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>Having only one substring of digits makes allowing arbitrary order less useful if more options get added (as in foo,bar,1024,baz,force would make more sense as foo,bar,nslabs=1024,baz,force). Do you think this one is really needed? If so, be useful to update Documentation/kernel-parameters.txt which is slightly out of date now. thanks, -chris _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Chris Wright
2010-Jan-15 01:33 UTC
[Xen-devel] Re: [PATCH 02/15] [swiotlb] Add swiotlb_engine structure for tracking multiple software IO TLBs.
* Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote:> The structure contains all of the existing variables used in > software IO TLB (swiotlb.c) collected within a structure. > > Additionally a name variable and a deconstructor (release) function > variable is defined for API usages. > > The other set of functions: is_swiotlb_buffer, dma_capable, phys_to_bus, > bus_to_phys, virt_to_bus, and bus_to_virt server as a method to abstract > them out of the SWIOTLB library. > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > --- > include/linux/swiotlb.h | 94 +++++++++++++++++++++++++++++++++++++++++++++++ > 1 files changed, 94 insertions(+), 0 deletions(-) > > diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h > index febedcf..781c3aa 100644 > --- a/include/linux/swiotlb.h > +++ b/include/linux/swiotlb.h > @@ -24,6 +24,100 @@ extern int swiotlb_force; > > extern void swiotlb_init(int verbose); >You can move the comments to a kerneldoc section for proper documentation. /** * struct swiotlb_engine - short desc... * @name: Name of the engine... etc> +struct swiotlb_engine { > + > + /* > + * Name of the engine (ie: "Software IO TLB") > + */ > + const char *name; > + > + /* > + * Used to do a quick range check in unmap_single and > + * sync_single_*, to see if the memory was in fact allocated by this > + * API. > + */ > + char *start; > + char *end;Isn''t this still global to swiotlb, not specific to the backend impl.?> + /* > + * The number of IO TLB blocks (in groups of 64) betweeen start and > + * end. This is command line adjustable via setup_io_tlb_npages. > + */ > + unsigned long nslabs;Same here.> + > + /* > + * When the IOMMU overflows we return a fallback buffer. > + * This sets the size. > + */ > + unsigned long overflow; > + > + void *overflow_buffer;And here.> + /* > + * This is a free list describing the number of free entries available > + * from each index > + */ > + unsigned int *list; > + > + /* > + * Current marker in the start through end location. Is incremented > + * on each map and wraps around. > + */ > + unsigned int index; > + > + /* > + * We need to save away the original address corresponding to a mapped > + * entry for the sync operations. > + */ > + phys_addr_t *orig_addr; > + > + /* > + * IOMMU private data. > + */ > + void *priv; > + /* > + * The API call to free a SWIOTLB engine if another wants to register > + * (or if want to turn SWIOTLB off altogether). > + * It is imperative that this function checks for existing DMA maps > + * and not release the IOTLB if there are out-standing maps. > + */ > + int (*release)(struct swiotlb_engine *); > + > + /* > + * Is the DMA (Bus) address within our bounce buffer (start and end). > + */ > + int (*is_swiotlb_buffer)(struct swiotlb_engine *, dma_addr_t dev_addr, > + phys_addr_t phys); > +Why is this implementation specific?> + /* > + * Is the DMA (Bus) address reachable by the PCI device?. > + */ > + bool (*dma_capable)(struct device *, dma_addr_t, phys_addr_t, size_t);This too...> + /* > + * Physical to bus (DMA) address translation. On > + * most platforms this is an equivalent function. > + */ > + dma_addr_t (*phys_to_bus)(struct device *hwdev, phys_addr_t paddr); > + > + /* > + * Bus (DMA) to physical address translation. On most > + * platforms this is an equivalant function. > + */ > + phys_addr_t (*bus_to_phys)(struct device *hwdev, dma_addr_t baddr); > + > + /* > + * Virtual to bus (DMA) address translation. On most platforms > + * this is a call to __pa(address). 
> + */ > + dma_addr_t (*virt_to_bus)(struct device *hwdev, void *address); > + > + /* > + * Bus (DMA) to virtual address translation. On most platforms > + * this is a call to __va(address). > + */ > + void* (*bus_to_virt)(struct device *hwdev, dma_addr_t address); > +}; > + > extern void > *swiotlb_alloc_coherent(struct device *hwdev, size_t size, > dma_addr_t *dma_handle, gfp_t flags); > -- > 1.6.2.5_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
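For reference, the kerneldoc block Chris suggests at the top of this review could look roughly like the following, reusing the descriptions already in the patch (abridged, and the wording is only illustrative):

/**
 * struct swiotlb_engine - a software IO TLB engine
 * @name:		name of the engine (e.g. "software IO TLB")
 * @start:		start of the bounce buffer, used for range checks
 * @end:		end of the bounce buffer
 * @nslabs:		number of IO TLB blocks between @start and @end
 * @overflow:		size of the fallback overflow buffer
 * @release:		tear-down hook; must fail if DMA maps are outstanding
 * @is_swiotlb_buffer:	is the bus address within the bounce buffer?
 * @dma_capable:	is the bus address reachable by the device?
 * @phys_to_bus:	physical to bus (DMA) address translation
 * @bus_to_phys:	bus (DMA) to physical address translation
 */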
Chris Wright
2010-Jan-15 01:41 UTC
[Xen-devel] Re: [PATCH 03/15] [swiotlb] Add swiotlb_register_engine function.
* Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote:> We set the internal iommu_sw pointer to the passed in swiotlb_engine > structure. Obviously we also check if the existing iommu_sw is > set and if so, call iommu_sw->release before the switch-over. > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > --- > include/linux/swiotlb.h | 2 + > lib/swiotlb.c | 48 +++++++++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 50 insertions(+), 0 deletions(-) > > diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h > index 781c3aa..3bc3c42 100644 > --- a/include/linux/swiotlb.h > +++ b/include/linux/swiotlb.h > @@ -118,6 +118,8 @@ struct swiotlb_engine { > void* (*bus_to_virt)(struct device *hwdev, dma_addr_t address); > }; > > +int swiotlb_register_engine(struct swiotlb_engine *iommu); > + > extern void > *swiotlb_alloc_coherent(struct device *hwdev, size_t size, > dma_addr_t *dma_handle, gfp_t flags); > diff --git a/lib/swiotlb.c b/lib/swiotlb.c > index e6d9e32..e84f269 100644 > --- a/lib/swiotlb.c > +++ b/lib/swiotlb.c > @@ -97,6 +97,12 @@ static phys_addr_t *io_tlb_orig_addr; > */ > static DEFINE_SPINLOCK(io_tlb_lock); > > +/* > + * The software IOMMU this library will utilize. > + */ > +struct swiotlb_engine *iommu_sw; > +EXPORT_SYMBOL(iommu_sw);should be EXPORT_SYMBOL_GPL> static int late_alloc; > > static int __init > @@ -126,6 +132,48 @@ static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev, > return phys_to_dma(hwdev, virt_to_phys(address)); > } > > +/* > + * Register a software IO TLB engine. > + * > + * The registration allows the software IO TLB functions in the > + * swiotlb library to function properly. > + * > + * All the values in the iotlb structure must be set. > + * > + * If the registration fails, it is assumed that the caller will free > + * all of the resources allocated in the swiotlb_engine structure. > + */ > +int swiotlb_register_engine(struct swiotlb_engine *iommu) > +{ > + if (!iommu || !iommu->name || !iommu->release) { > + printk(KERN_ERR "DMA: Trying to register a SWIOTLB engine" \ > + " improperly!"); > + return -EINVAL; > + } > + > + if (iommu_sw && iommu_sw->name) {According to above, you can''t have !iommu_sw->name.> + int retval = -EINVAL; > + > + /* ''release'' must check for out-standing DMAs and flush them > + * out or fail. */ > + if (iommu_sw->release) > + retval = iommu_sw->release(iommu_sw);Same here, you can''t have !iommu_sw->release, just call unconditionally.> + > + if (retval) { > + printk(KERN_ERR "DMA: %s cannot be released!\n", > + iommu_sw->name); > + return retval; > + } > + printk(KERN_INFO "DMA: Replacing [%s] with [%s]\n", > + iommu_sw->name, iommu->name); > + } > + > + iommu_sw = iommu; > + > + return 0; > +} > +EXPORT_SYMBOL(swiotlb_register_engine); > + > void swiotlb_print_info(void) > { > unsigned long bytes = io_tlb_nslabs << IO_TLB_SHIFT;_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Chris Wright
2010-Jan-15 01:43 UTC
[Xen-devel] Re: [PATCH 04/15] [swiotlb] Search and replace s/io_tlb/iommu_sw->/
* Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote:> We also fix the checkpatch.pl errors that surfaced during > this conversion. > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > --- > lib/swiotlb.c | 204 +++++++++++++++++++++++++++++---------------------------- > 1 files changed, 104 insertions(+), 100 deletions(-) > > diff --git a/lib/swiotlb.c b/lib/swiotlb.c > index e84f269..3499001 100644 > --- a/lib/swiotlb.c > +++ b/lib/swiotlb.c > @@ -176,14 +176,14 @@ EXPORT_SYMBOL(swiotlb_register_engine); > > void swiotlb_print_info(void) > { > - unsigned long bytes = io_tlb_nslabs << IO_TLB_SHIFT; > + unsigned long bytes = iommu_sw->nslabs << IO_TLB_SHIFT;This isn''t a bisect friendly way to do this...this would oops w/ iommu_sw == NULL. thanks, -chris _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Chris Wright
2010-Jan-15 01:47 UTC
[Xen-devel] Re: [PATCH 05/15] [swiotlb] Respect the io_tlb_nslabs argument value.
* Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote:> The search and replace removed the option to override > the amount of slabs via swiotlb=<x> argument. This puts > it back in.Should just fold a change like this in w/ prior one. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Chris Wright
2010-Jan-15 01:57 UTC
[Xen-devel] Re: [PATCH 06/15] [swiotlb] In ''swiotlb_init'' take advantage of the default swiotlb_engine support.
* Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote:> For baselevel support we define required functions and fill out > variables. > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > --- > lib/swiotlb.c | 22 ++++++++++++++++++++++ > 1 files changed, 22 insertions(+), 0 deletions(-) > > diff --git a/lib/swiotlb.c b/lib/swiotlb.c > index cf29f03..3c7bd4e 100644 > --- a/lib/swiotlb.c > +++ b/lib/swiotlb.c > @@ -132,6 +132,11 @@ static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev, > return phys_to_dma(hwdev, virt_to_phys(address)); > } > > +static void *swiotlb_bus_to_virt(struct device *hwdev, dma_addr_t dev_addr) > +{ > + return phys_to_virt(dma_to_phys(hwdev, dev_addr)); > +}; > + > /* > * Register a software IO TLB engine. > * > @@ -236,9 +241,26 @@ swiotlb_init_with_default_size(size_t default_size, int verbose) > swiotlb_print_info(); > } > > +static int swiotlb_release(struct swiotlb_engine *iotlb) > +{ > + swiotlb_free(); > + return 0;Do you ever expect a failure case here? You mentioned wait and flush or fail, but that''s not done here. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Chris Wright
2010-Jan-15 02:02 UTC
[Xen-devel] Re: [PATCH 07/15] [swiotlb] In ''swiotlb_free'' check iommu_sw pointer.
* Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote:> --- a/lib/swiotlb.c > +++ b/lib/swiotlb.c > @@ -364,7 +364,7 @@ cleanup1: > > void __init swiotlb_free(void) > { > - if (!iommu_sw->overflow_buffer) > + if (!iommu_sw) > return; >Sure this is safe for the case where allocation failed? Wouldn''t this do free_late_bootmem(__pa(0))? thanks, -chris _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Chris Wright
2010-Jan-15 02:14 UTC
[Xen-devel] Re: [PATCH 14/15] [swiotlb] Move initialization (swiotlb_init) and its friends in swiotlb-default.c
* Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote:> We move all of the initialization functions and as well > all functions defined in the swiotlb_ops to a seperate file. > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > --- > lib/Makefile | 2 +- > lib/swiotlb-default.c | 242 +++++++++++++++++++++++++++++++++++++++++++++++++ > lib/swiotlb.c | 231 +---------------------------------------------- > 3 files changed, 245 insertions(+), 230 deletions(-) > create mode 100644 lib/swiotlb-default.c > > diff --git a/lib/Makefile b/lib/Makefile > index 347ad8d..fd96891 100644 > --- a/lib/Makefile > +++ b/lib/Makefile > @@ -77,7 +77,7 @@ obj-$(CONFIG_TEXTSEARCH_FSM) += ts_fsm.o > obj-$(CONFIG_SMP) += percpu_counter.o > obj-$(CONFIG_AUDIT_GENERIC) += audit.o > > -obj-$(CONFIG_SWIOTLB) += swiotlb.o > +obj-$(CONFIG_SWIOTLB) += swiotlb.o swiotlb-default.o > obj-$(CONFIG_IOMMU_HELPER) += iommu-helper.o > obj-$(CONFIG_FAULT_INJECTION) += fault-inject.o > > diff --git a/lib/swiotlb-default.c b/lib/swiotlb-default.c > new file mode 100644 > index 0000000..c490fcf > --- /dev/null > +++ b/lib/swiotlb-default.c > @@ -0,0 +1,242 @@ > + > +#include <linux/dma-mapping.h> > +#include <linux/swiotlb.h> > +#include <linux/bootmem.h> > + > + > +#define OFFSET(val, align) ((unsigned long) \ > + (val) & ((align) - 1)) > + > +#define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT)) > + > +/* > + * Minimum IO TLB size to bother booting with. Systems with mainly > + * 64bit capable cards will only lightly use the swiotlb. If we can''t > + * allocate a contiguous 1MB, we''re probably in trouble anyway. > + */ > +#define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT) > + > +/* Note that this doesn''t work with highmem page */ > +static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev, > + void *address) > +{ > + return phys_to_dma(hwdev, virt_to_phys(address)); > +} > + > +static void *swiotlb_bus_to_virt(struct device *hwdev, dma_addr_t dev_addr) > +{ > + return phys_to_virt(dma_to_phys(hwdev, dev_addr)); > +}; > + > +/* > + * Statically reserve bounce buffer space and initialize bounce buffer data > + * structures for the software IO TLB used to implement the DMA API. > + */ > +void __init > +swiotlb_init_with_default_size(struct swiotlb_engine *iommu_sw, > + size_t default_size, int verbose) > +{ > + unsigned long i, bytes; > + > + if (!swiotlb_nslabs) { > + iommu_sw->nslabs = (default_size >> IO_TLB_SHIFT); > + iommu_sw->nslabs = ALIGN(iommu_sw->nslabs, IO_TLB_SEGSIZE); > + } else > + iommu_sw->nslabs = swiotlb_nslabs; > + > + bytes = iommu_sw->nslabs << IO_TLB_SHIFT; > + > + /* > + * Get IO TLB memory from the low pages > + */ > + iommu_sw->start = alloc_bootmem_low_pages(bytes); > + if (!iommu_sw->start) > + panic("Cannot allocate SWIOTLB buffer"); > + iommu_sw->end = iommu_sw->start + bytes; > + > + /* > + * Allocate and initialize the free list array. This array is used > + * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE > + * between iommu_sw->start and iommu_sw->end. 
> + */ > + iommu_sw->list = alloc_bootmem(iommu_sw->nslabs * sizeof(int)); > + for (i = 0; i < iommu_sw->nslabs; i++) > + iommu_sw->list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE); > + iommu_sw->index = 0; > + iommu_sw->orig_addr = alloc_bootmem(iommu_sw->nslabs * > + sizeof(phys_addr_t)); > + > + /* > + * Get the overflow emergency buffer > + */ > + iommu_sw->overflow_buffer = alloc_bootmem_low(iommu_sw->overflow); > + if (!iommu_sw->overflow_buffer) > + panic("Cannot allocate SWIOTLB overflow buffer!\n"); > + if (verbose) > + swiotlb_print_info(); > + > + iommu_sw->priv = NULL; > +} > + > +int swiotlb_release(struct swiotlb_engine *iommu_sw) > +{ > + if (!iommu_sw) > + return -ENODEV; > + > + if (iommu_sw->priv) { > + free_pages((unsigned long)iommu_sw->overflow_buffer, > + get_order(iommu_sw->overflow)); > + free_pages((unsigned long)iommu_sw->orig_addr, > + get_order(iommu_sw->nslabs * sizeof(phys_addr_t))); > + free_pages((unsigned long)iommu_sw->list, > + get_order(iommu_sw->nslabs * sizeof(int))); > + free_pages((unsigned long)iommu_sw->start, > + get_order(iommu_sw->nslabs << IO_TLB_SHIFT)); > + } else { > + free_bootmem_late(__pa(iommu_sw->overflow_buffer), > + iommu_sw->overflow); > + free_bootmem_late(__pa(iommu_sw->orig_addr), > + iommu_sw->nslabs * sizeof(phys_addr_t)); > + free_bootmem_late(__pa(iommu_sw->list), > + iommu_sw->nslabs * sizeof(int)); > + free_bootmem_late(__pa(iommu_sw->start), > + iommu_sw->nslabs << IO_TLB_SHIFT); > + } > + return 0; > +} > + > +static int is_swiotlb_buffer(struct swiotlb_engine *iommu_sw, > + dma_addr_t dma_addr, phys_addr_t paddr) > +{ > + return paddr >= virt_to_phys(iommu_sw->start) && > + paddr < virt_to_phys(iommu_sw->end); > +} > + > +static bool swiotlb_dma_capable(struct device *hwdev, dma_addr_t dma_addr, > + phys_addr_t phys, size_t size) > +{ > + /* Phys is not neccessary in this case. */ > + return dma_capable(hwdev, dma_addr, size); > +} > +static struct swiotlb_engine swiotlb_ops = { > + .name = "software IO TLB", > + .overflow = 32 * 1024, > + .release = swiotlb_release, > + .dma_capable = swiotlb_dma_capable, > + .is_swiotlb_buffer = is_swiotlb_buffer, > + .phys_to_bus = phys_to_dma, > + .bus_to_phys = dma_to_phys, > + .virt_to_bus = swiotlb_virt_to_bus, > + .bus_to_virt = swiotlb_bus_to_virt, > +}; > + > +void __init > +swiotlb_init(int verbose) > +{ > + swiotlb_register_engine(&swiotlb_ops); > + swiotlb_init_with_default_size(&swiotlb_ops, 64 * (1<<20), > + verbose); /* default to 64MB */ > +}I''d expect the swiotlb-default file to have only private impl. of the swiotlb_engine. Shouldn''t this and the init stay in swiotlb.c? Also, would you ever call swiotlb_init w/out register_engine, why not move register to the swiotlb_init? thanks, -chris _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
* Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote:> Another approach, which this set of patches explores, is to abstract the > address translation and address determination functions away from the > SWIOTLB book-keeping functions. This way the core SWIOTLB library functions > are present in one place, while the address related functions are in > a separate library for different run-time platforms. I would very much > appreciate input on this idea and the set of patches.It seems like it still needs some refinement, since the Xen implementation is hooking into two layers. Both: + swiotlb_register_engine(&xen_ops); and +static struct dma_map_ops xen_swiotlb_dma_ops = { Wouldn''t the idea be to get to the point that you''d use common swiotlb and keep the hooks to one layer? Also, it''s unclear when some of the prior global to swiotlb variables would actually be useful to a private implementation. For example, overflow, which is just 32 * 1024 in both cases. Are those really needed to be private to a swiotlb engine? Do you think you can reduce the swiotlb_engine to just the relevant ops? thanks, -chris _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Jan-19 17:25 UTC
[Xen-devel] Re: [PATCH 03/15] [swiotlb] Add swiotlb_register_engine function.
> > +EXPORT_SYMBOL(iommu_sw); > > should be EXPORT_SYMBOL_GPLYup.> > > static int late_alloc; > > > > static int __init > > @@ -126,6 +132,48 @@ static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev, > > return phys_to_dma(hwdev, virt_to_phys(address)); > > } > > > > +/* > > + * Register a software IO TLB engine. > > + * > > + * The registration allows the software IO TLB functions in the > > + * swiotlb library to function properly. > > + * > > + * All the values in the iotlb structure must be set. > > + * > > + * If the registration fails, it is assumed that the caller will free > > + * all of the resources allocated in the swiotlb_engine structure. > > + */ > > +int swiotlb_register_engine(struct swiotlb_engine *iommu) > > +{ > > + if (!iommu || !iommu->name || !iommu->release) { > > + printk(KERN_ERR "DMA: Trying to register a SWIOTLB engine" \ > > + " improperly!"); > > + return -EINVAL; > > + } > > + > > + if (iommu_sw && iommu_sw->name) { > > According to above, you can''t have !iommu_sw->name.Yup. Artificats of previous implementation.> > > + int retval = -EINVAL; > > + > > + /* ''release'' must check for out-standing DMAs and flush them > > + * out or fail. */ > > + if (iommu_sw->release) > > + retval = iommu_sw->release(iommu_sw); > > Same here, you can''t have !iommu_sw->release, just call unconditionally.Ok. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
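Folding in these acks (EXPORT_SYMBOL_GPL, dropping the redundant name check, and calling release unconditionally), the registration path would shrink to something like this; a sketch of the planned respin, not a posted patch:

int swiotlb_register_engine(struct swiotlb_engine *iommu)
{
	if (!iommu || !iommu->name || !iommu->release) {
		printk(KERN_ERR "DMA: Trying to register a SWIOTLB engine"
		       " improperly!");
		return -EINVAL;
	}

	if (iommu_sw) {
		/* Every registered engine has a name and a release hook,
		 * so both can be used unconditionally here. */
		int retval = iommu_sw->release(iommu_sw);

		if (retval) {
			printk(KERN_ERR "DMA: %s cannot be released!\n",
			       iommu_sw->name);
			return retval;
		}
		printk(KERN_INFO "DMA: Replacing [%s] with [%s]\n",
		       iommu_sw->name, iommu->name);
	}

	iommu_sw = iommu;

	return 0;
}
EXPORT_SYMBOL_GPL(swiotlb_register_engine);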
Konrad Rzeszutek Wilk
2010-Jan-19 17:45 UTC
[Xen-devel] Re: [PATCH 14/15] [swiotlb] Move initialization (swiotlb_init) and its friends in swiotlb-default.c
> > +void __init > > +swiotlb_init(int verbose) > > +{ > > + swiotlb_register_engine(&swiotlb_ops); > > + swiotlb_init_with_default_size(&swiotlb_ops, 64 * (1<<20), > > + verbose); /* default to 64MB */ > > +} > > I''d expect the swiotlb-default file to have only private impl. of the > > swiotlb_engine. Shouldn''t this and the init stay in swiotlb.c? Also, Hmm, were you thinking that it might make sense to pass in a swiotlb_ops to swiotlb_init so that it can make the right assignments? The reason I got stuck here was that the swiotlb_ops needed to be visible to this function, and having it in swiotlb.c would mean it must now include the header definition for swiotlb-default.h.> would you ever call swiotlb_init w/out register_engine, why not move > register to the swiotlb_init? In essence combine swiotlb_register_engine with swiotlb_init_with_default_size? There would still be a need for a late-call mechanism. Perhaps having two variants of swiotlb_init?: swiotlb_early_init(struct swiotlb_engine *swiotlb_ops) and swiotlb_late_init(struct swiotlb_engine *swiotlb_ops)? Or perhaps just pass in an argument: swiotlb_init(int late)? Furthermore have this new swiotlb_init detect if some of the fields (start, end, overflow_buffer) have been allocated and if so skip the default allocation altogether? _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
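One way to read that proposal, as a sketch only (the name swiotlb_early_init is hypothetical, and the signatures are not from the series):

/* Hypothetical early variant: register + default-size setup in one call. */
void __init swiotlb_early_init(struct swiotlb_engine *ops, size_t default_size,
			       int verbose)
{
	swiotlb_register_engine(ops);
	/* Skip the default allocation if the engine already set up its
	 * own start/end/overflow_buffer, as suggested above. */
	if (!ops->start)
		swiotlb_init_with_default_size(ops, default_size, verbose);
}

/* A late (slab) counterpart would return an error code instead:
 *	int swiotlb_late_init(struct swiotlb_engine *ops, size_t default_size);
 */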
Konrad Rzeszutek Wilk
2010-Jan-19 17:45 UTC
[Xen-devel] Re: [PATCH 07/15] [swiotlb] In ''swiotlb_free'' check iommu_sw pointer.
On Thu, Jan 14, 2010 at 06:02:40PM -0800, Chris Wright wrote:> * Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote: > > --- a/lib/swiotlb.c > > +++ b/lib/swiotlb.c > > @@ -364,7 +364,7 @@ cleanup1: > > > > void __init swiotlb_free(void) > > { > > - if (!iommu_sw->overflow_buffer) > > + if (!iommu_sw) > > return; > > > > Sure this is safe for the case where allocation failed? Wouldn''t this > do free_late_bootmem(__pa(0))?That would indeed fail, but alloc_bootmem_low_pages (___alloc_bootmem) panics the machine if it can''t allocate the buffer. So we would never actually get to swiotlb_free if we failed to allocate the buffers for SWIOTLB. But for the case where the SWIOTLB allocation happens when using swiotlb_late_init_with_default_size, and it fails, this check is not sufficient. I will add a check for that or just make swiotlb_late_init_with_default_size set iommu_sw to NULL when the allocation fails. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Jan-19 17:45 UTC
[Xen-devel] Re: [PATCH 06/15] [swiotlb] In ''swiotlb_init'' take advantage of the default swiotlb_engine support.
> > +static int swiotlb_release(struct swiotlb_engine *iotlb) > > +{ > > + swiotlb_free(); > > + return 0; > > Do you ever expect a failure case here? You mentioned wait and flush or > fail, but that''s not done here.I was thinking to extend that in the next batch of patches. But I can roll that in here. The check is pretty simple - it checks to see if the index value has been changed, and if so returns a failure. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
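A sketch of the check being described, assuming a non-zero index is taken to mean that mappings may still be outstanding (the eventual patch may track this more precisely):

static int swiotlb_release(struct swiotlb_engine *iotlb)
{
	/* map_single advances the index for every bounce buffer handed
	 * out; if it ever moved, refuse to tear the buffer down. */
	if (iotlb->index != 0)
		return -EBUSY;

	swiotlb_free();
	return 0;
}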
Konrad Rzeszutek Wilk
2010-Jan-19 17:45 UTC
[Xen-devel] Re: [PATCH 04/15] [swiotlb] Search and replace s/io_tlb/iommu_sw->/
On Thu, Jan 14, 2010 at 05:43:30PM -0800, Chris Wright wrote:> * Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote: > > We also fix the checkpatch.pl errors that surfaced during > > this conversion. > > > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > --- > > lib/swiotlb.c | 204 +++++++++++++++++++++++++++++---------------------------- > > 1 files changed, 104 insertions(+), 100 deletions(-) > > > > diff --git a/lib/swiotlb.c b/lib/swiotlb.c > > index e84f269..3499001 100644 > > --- a/lib/swiotlb.c > > +++ b/lib/swiotlb.c > > @@ -176,14 +176,14 @@ EXPORT_SYMBOL(swiotlb_register_engine); > > > > void swiotlb_print_info(void) > > { > > - unsigned long bytes = io_tlb_nslabs << IO_TLB_SHIFT; > > + unsigned long bytes = iommu_sw->nslabs << IO_TLB_SHIFT; > > This isn''t a bisect friendly way to do this...this would oops w/ > iommu_sw == NULL.Yes. Let me redo this in a friendlier manner. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Jan-19 17:46 UTC
[Xen-devel] Re: [PATCH 02/15] [swiotlb] Add swiotlb_engine structure for tracking multiple software IO TLBs.
. snip..> You can move the comments to a kerneldoc section for proper > documentation. > > /** > * struct swiotlb_engine - short desc... > * @name: Name of the engine... > etc<nods>. .. snip ..> > + char *end; > > Isn''t this still global to swiotlb, not specific to the backend impl.?Yes and no. Without the start/end, the "is_swiotlb_buffer" would not be able to determine if the passed in address is within the SWIOTLB buffer.> > > + /* > > + * The number of IO TLB blocks (in groups of 64) betweeen start and > > + * end. This is command line adjustable via setup_io_tlb_npages. > > + */ > > + unsigned long nslabs; > > Same here. >That one can be put back (make it part of lib/swiotlb.c)> > + > > + /* > > + * When the IOMMU overflows we return a fallback buffer. > > + * This sets the size. > > + */ > > + unsigned long overflow; > > + > > + void *overflow_buffer; > > And here.Ditto. ..snip ..> > + * Is the DMA (Bus) address within our bounce buffer (start and end). > > + */ > > + int (*is_swiotlb_buffer)(struct swiotlb_engine *, dma_addr_t dev_addr, > > + phys_addr_t phys); > > + > > Why is this implementation specific?In the current implementation, they both use the physical address and do a simple check: return paddr >= virt_to_phys(io_tlb_start) && paddr < virt_to_phys(io_tlb_end); That for virtualized environments where a PCI device is passed in would work too. Unfortunately the problem is when we provide a channel of communication with another domain and we end up doing DMA on behalf of another guest. The short description of the problem is that a page of memory is shared with another domain and the mapping in our domain is correct (bus->physical) the other way (virt->physical->bus) is incorrect for the duration of this page being shared. Hence we need to verify that the page is local to our domain, and for that we need the bus address to verify that the addr == physical->bus(bus->physical(addr)) where addr is the bus address (dma_addr_t). If it is not local (shared with another domain) we MUST not consider it as a SWIOTLB buffer as that can lead to panics and possible corruptions. The trick here is that the phys->virt address can fall within the SWIOTLB buffer for pages that are shared with another domain and we need the DMA address to do an extra check. The long description of the problem is: You are the domain doing some DMA on behalf of another domain. The simple example is you are servicing a block device to the other guests. One way to implement this is to present a one page ring buffer where both domains move the producer and consumer indexes around. Once you get a request (READ/WRITE), you use the virtualized channels to "share" that page into your domain. For this you have a buffer (2MB or bigger) wherein for pages that shared in to you, you over-write the phys->bus mapping. That means that the phys->phys translation is altered for the duration of this request being out-standing. Once it is completed, the phys->bus translation is restored. Here is a little diagram of what happens when a page is shared (and lets assume that we have a situation where virt #1 == virt #2, which means that phys #1 == phys #2). (domain 2) virt#1->phys#1---\ +- bus #1 (domain 3) virt#2->phys#2 ---/ (phys#1 points to bus #1, and phys#2 points to bus #1 too). The converse of the above picture is not true: /---> phys #1-> virt #1. (domain 2). bus#1 + \---> phys #2-> virt #2. (domain 3). phys #1 != phys #2 and hence virt #1 != virt #2. 
When a page is not shared: (domain 2) virt #1->phys #1--> bus #1 (domain 3) virt #2->phys #2--> bus #2 bus #1 -> phys #1 -> virt #1 (domain 2) bus #2 -> phys #2 -> virt #2 (domain 3) The bus #1 != bus #2, but phys #1 could be same as phys #2 (since there are just PFNs). And virt #1 == virt #2. The reason for these is that each domain has its own memory layout where the memory starts at pfn 0, not at some higher number. So each domain sees the physical address identically, but the bus address MUST point to different areas (except when sharing) otherwise one domain would over-write another domain, ouch. Furthermore when a domain is allocated, the pages for the domain are not guaranteed to be linearly contiguous so we can''t guarantee that phys == bus. So to guard against the situation in which phys #1 ->virt comes out with an address that looks to be within our SWIOTLB buffer we need to do the extra check: addr == physical->bus(bus->physical(addr)) where addr is the bus address And for scenarios where this is not true (page belongs to another domain), that page is not in the SWIOTLB (even thought the virtual and physical address point to it).> > + /* > > + * Is the DMA (Bus) address reachable by the PCI device?. > > + */ > > + bool (*dma_capable)(struct device *, dma_addr_t, phys_addr_t, size_t);I mentioned in the previous explanation that when a domain is allocated, the pages are not guaranteed to be linearly contiguous. For bare-metal that is not the case and ''dma_capable'' just checks the device DMA mask against the bus address. For virtualized environment we do need to check if the pages are linearly contiguous for the size request. For that we need the physical address to iterate over them doing the phys->bus#1 translation and checking whether the (phys+1)->bus#2 bus#1 == bus#2 + 1. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
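Putting the above into code, the Xen-side hooks would look roughly like this; xen_phys_to_bus()/xen_bus_to_phys() stand in for the pv-ops translation helpers and, like the page-walking details, are illustrative only:

static int xen_is_swiotlb_buffer(struct swiotlb_engine *iommu_sw,
				 dma_addr_t dev_addr, phys_addr_t paddr)
{
	/* A page granted in from another domain has a valid bus->phys
	 * mapping, but its phys->bus mapping points elsewhere for the
	 * duration of the grant, so the round trip only matches for
	 * pages that are local to this domain. */
	if (xen_phys_to_bus(xen_bus_to_phys(dev_addr)) != dev_addr)
		return 0;

	return paddr >= virt_to_phys(iommu_sw->start) &&
	       paddr < virt_to_phys(iommu_sw->end);
}

static bool xen_dma_capable(struct device *hwdev, dma_addr_t dma_addr,
			    phys_addr_t phys, size_t size)
{
	phys_addr_t next = (phys & PAGE_MASK) + PAGE_SIZE;

	/* Machine frames backing a domain need not be contiguous, so
	 * check every page boundary the buffer crosses. */
	while (next < phys + size) {
		if (xen_phys_to_bus(next) != dma_addr + (next - phys))
			return false;
		next += PAGE_SIZE;
	}

	return dma_capable(hwdev, dma_addr, size);
}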
Konrad Rzeszutek Wilk
2010-Jan-19 17:47 UTC
[Xen-devel] Re: [PATCH 01/15] [swiotlb] fix: Update ''setup_io_tlb_npages'' to accept both arguments in either order.
On Thu, Jan 14, 2010 at 05:22:13PM -0800, Chris Wright wrote:> * Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote: > > Before this patch, if you specified ''swiotlb=force,1024'' it would > > ignore both arguments. This fixes it and allows the user specify it > > in any order (or none at all). > > > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > Having only one substring of digits makes allowing arbitrary order > less useful if more options get added (as in foo,bar,1024,baz,force > would make more sense as foo,bar,nslabs=1024,baz,force). Do you > think this one is really needed? If so, be useful to updateI got caught a couple of times where I needed to provide both arguments and could not figure out why it did not work. Switching the arguments around fixed it. Thought that it might make sense to remove this potential trap from other folks by this patch. Your point about more options got me thinking about the overflow buffer. I could also provide an over-ride for that, maybe: "swiotlb=force,overflow=32,slabs=1024" (Not sure about the syntax?)> Documentation/kernel-parameters.txt which is slightly out of date now.Oh, good catch. Will roll the patch for that file as well.> > thanks, > -chris_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On Thu, Jan 14, 2010 at 06:25:10PM -0800, Chris Wright wrote:> * Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote: > > Another approach, which this set of patches explores, is to abstract the > > address translation and address determination functions away from the > > SWIOTLB book-keeping functions. This way the core SWIOTLB library functions > > are present in one place, while the address related functions are in > > a separate library for different run-time platforms. I would very much > > appreciate input on this idea and the set of patches. > > It seems like it still needs some refinement, since the XenOh yes.> implementation is hooking into two layers. Both: > > + swiotlb_register_engine(&xen_ops); > > and > > +static struct dma_map_ops xen_swiotlb_dma_ops = { > > Wouldn''t the idea be to get to the point that you''d use common swiotlb > and keep the hooks to one layer?I would love to. Maybe I can extend those two functions (alloc_coherent and free_coherent) to make an extra call after they have allocated/de-allocated a page? The reason is that in virtualized environments I MUST guarantee that those buffers are linearly contiguous. Meaning I need to post-processing of this buffer: ret = (void *)__get_free_pages(flags, order) If that can''t be done, then I need a mix of DMA ops where the majority is SWIOTLB with the exception of the alloc_coherent and free_coherent). Hmm, I should follow the lead of what x86_swiotlb_alloc_coherent does and just make an extra call to ''is_swiotlb_buffer'' on the return address and if not found to be within that SWIOTLB, do the fixup to make sure the pages are linearly contiguous.> > Also, it''s unclear when some of the prior global to swiotlb variables > would actually be useful to a private implementation. For example, overflow, > which is just 32 * 1024 in both cases. Are those really needed to be > private to a swiotlb engine?Unfortunately yes. The same reason as mentioned above: MUST guarantee that those buffers (start, overflow) are linearly contiguous. For that I was doing something like: void __init xen_swiotlb_init(int verbose) { int rc = 0; swiotlb_register_engine(&xen_ops); swiotlb_init_with_default_size(&xen_ops, 64 * (1<<20), 0); if ((rc = xen_swiotlb_fixup(xen_ops.start, xen_ops.nslabs << IO_TLB_SHIFT, xen_ops.nslabs))) goto error; if ((rc = xen_swiotlb_fixup(xen_ops.overflow_buffer, xen_ops.overflow, xen_ops.overflow >> IO_TLB_SHIFT))) goto error; so that I can "fix" the start and overflow_buffer pages.> > Do you think you can reduce the swiotlb_engine to just the relevant ops?Yes. Let me reduce them.> > thanks, > -chris_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
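Sketching that last idea: the helpers xen_addr_in_swiotlb() and xen_make_contiguous() below do not exist in this series and are stand-ins for whatever does the range check and the machine-contiguity fixup:

void *xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
				 dma_addr_t *dma_handle, gfp_t flags)
{
	void *ret = swiotlb_alloc_coherent(hwdev, size, dma_handle, flags);

	/* The bounce pool was already made machine-contiguous at init
	 * time; only buffers that came straight from the page allocator
	 * may need the extra fixup before handing out a bus address. */
	if (ret && !xen_addr_in_swiotlb(ret))
		*dma_handle = xen_make_contiguous(ret, size);

	return ret;
}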
Chris Wright
2010-Jan-19 18:23 UTC
[Xen-devel] Re: [PATCH 07/15] [swiotlb] In ''swiotlb_free'' check iommu_sw pointer.
* Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote:> On Thu, Jan 14, 2010 at 06:02:40PM -0800, Chris Wright wrote: > > * Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote: > > > --- a/lib/swiotlb.c > > > +++ b/lib/swiotlb.c > > > @@ -364,7 +364,7 @@ cleanup1: > > > > > > void __init swiotlb_free(void) > > > { > > > - if (!iommu_sw->overflow_buffer) > > > + if (!iommu_sw) > > > return; > > > > > > > Sure this is safe for the case where allocation failed? Wouldn''t this > > do free_late_bootmem(__pa(0))? > > That would indeed fail, but alloc_bootmem_low_pages (___alloc_bootmem) > panics the machine if it can''t allocate the buffer. So we would never > actually get to swiotlb_free if we failed to allocate the buffers for > SWIOTLB.Ah, right.> But for the case where the SWIOTLB allocation happens when using > swiotlb_late_init_with_default_size, and it fails, this check > is not sufficient. I will add a check for that or just make > swiotlb_late_init_with_default_size set iommu_sw to NULL when > the allocation fails.That one is ok, since kfree(NULL) is safe. thanks, -chris _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Chris Wright
2010-Jan-19 18:43 UTC
[Xen-devel] Re: [PATCH 02/15] [swiotlb] Add swiotlb_engine structure for tracking multiple software IO TLBs.
* Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote:> > > + * Is the DMA (Bus) address within our bounce buffer (start and end). > > > + */ > > > + int (*is_swiotlb_buffer)(struct swiotlb_engine *, dma_addr_t dev_addr, > > > + phys_addr_t phys); > > > + > > > > Why is this implementation specific? > > In the current implementation, they both use the physical address and > do a simple check: > > return paddr >= virt_to_phys(io_tlb_start) && > paddr < virt_to_phys(io_tlb_end); > > That for virtualized environments where a PCI device is passed in would > work too. > > Unfortunately the problem is when we provide a channel of communication > with another domain and we end up doing DMA on behalf of another guest. > The short description of the problem is that a page of memory is shared > with another domain and the mapping in our domain is correct (bus->physical) > the other way (virt->physical->bus) is incorrect for the duration of this page > being shared. Hence we need to verify that the page is local to our > domain, and for that we need the bus address to verify that the > addr == physical->bus(bus->physical(addr)) where addr is the bus > address (dma_addr_t). If it is not local (shared with another domain) > we MUST not consider it as a SWIOTLB buffer as that can lead to > panics and possible corruptions. The trick here is that the phys->virt > address can fall within the SWIOTLB buffer for pages that are > shared with another domain and we need the DMA address to do an extra check. > > The long description of the problem is: > > You are the domain doing some DMA on behalf of another domain. The > simple example is you are servicing a block device to the other guests. > One way to implement this is to present a one page ring buffer where > both domains move the producer and consumer indexes around. Once you get > a request (READ/WRITE), you use the virtualized channels to "share" that page > into your domain. For this you have a buffer (2MB or bigger) wherein for > pages that shared in to you, you over-write the phys->bus mapping. > That means that the phys->phys translation is altered for the duration > of this request being out-standing. Once it is completed, the phys->bus > translation is restored. > > Here is a little diagram of what happens when a page is shared (and lets > assume that we have a situation where virt #1 == virt #2, which means > that phys #1 == phys #2). > > (domain 2) virt#1->phys#1---\ > +- bus #1 > (domain 3) virt#2->phys#2 ---/ > > (phys#1 points to bus #1, and phys#2 points to bus #1 too). > > The converse of the above picture is not true: > > /---> phys #1-> virt #1. (domain 2). > bus#1 + > \---> phys #2-> virt #2. (domain 3). > > phys #1 != phys #2 and hence virt #1 != virt #2. > > When a page is not shared: > > (domain 2) virt #1->phys #1--> bus #1 > (domain 3) virt #2->phys #2--> bus #2 > > bus #1 -> phys #1 -> virt #1 (domain 2) > bus #2 -> phys #2 -> virt #2 (domain 3) > > The bus #1 != bus #2, but phys #1 could be same as phys #2 (since > there are just PFNs). And virt #1 == virt #2. > > The reason for these is that each domain has its own memory layout where > the memory starts at pfn 0, not at some higher number. So each domain > sees the physical address identically, but the bus address MUST point > to different areas (except when sharing) otherwise one domain would > over-write another domain, ouch. 
> > Furthermore when a domain is allocated, the pages for the domain are not > guaranteed to be linearly contiguous so we can''t guarantee that phys == bus. > > So to guard against the situation in which phys #1 ->virt comes out with > an address that looks to be within our SWIOTLB buffer we need to do the > extra check: > > addr == physical->bus(bus->physical(addr)) where addr is the bus > address > > And for scenarios where this is not true (page belongs to another > domain), that page is not in the SWIOTLB (even thought the virtual and > physical address point to it). > > > > + /* > > > + * Is the DMA (Bus) address reachable by the PCI device?. > > > + */ > > > + bool (*dma_capable)(struct device *, dma_addr_t, phys_addr_t, size_t); > > I mentioned in the previous explanation that when a domain is allocated, > the pages are not guaranteed to be linearly contiguous. > > For bare-metal that is not the case and ''dma_capable'' just checks > the device DMA mask against the bus address. > > For virtualized environment we do need to check if the pages are linearly > contiguous for the size request. > > For that we need the physical address to iterate over them doing the > phys->bus#1 translation and checking whether the (phys+1)->bus#2 > bus#1 == bus#2 + 1.Right, for both of those cases I was thinking you could make that the base logic and the existing helpers to do addr translation would be enough. But that makes more sense when compiling for a specific arch (i.e. the checks would be noops and compile away when !xen) as opposed to a dynamic setup like this. thanks, -chris _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Chris Wright
2010-Jan-19 18:55 UTC
[Xen-devel] Re: [PATCH 14/15] [swiotlb] Move initialization (swiotlb_init) and its friends in swiotlb-default.c
* Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote:> > > +void __init > > > +swiotlb_init(int verbose) > > > +{ > > > + swiotlb_register_engine(&swiotlb_ops); > > > + swiotlb_init_with_default_size(&swiotlb_ops, 64 * (1<<20), > > > + verbose); /* default to 64MB */ > > > +} > > > > I''d expect the swiotlb-default file to have only private impl. of the > > swiotlb_engine. Shouldn''t this and the init stay in swiotlb.c? Also, > > Hmm, were you thinking that it might make sense to pass in > a swiotlb_ops to swiotlb_init so that it can make the right assignments?Yeah, something like that.> The reason why I stuck here was that the swiotlb_ops needed to be > visible to this function, and having it in swiotlb.c would mean it must > now include the header definition for swiotlb-defualt.h.And part of that need is because the allocator (effectively common/global) is writing to impl. private data, like ->nslabs. But if you move that back, then this may not be an issue.> > would you ever call swiotlb_init w/out register_engine, why not move > > register to the swiotlb_init? > > In essence combine swiotlb_register_engine with swiotlb_init_with_default_size?Yep.> There would still be a need for late call mechanism. > Perhaps having two variants of swiotlb_init?: swiotlb_early_init(struct > swiotlb_engine *swiotlb_ops) and swiotlb_late_init(struct swiotlb_engine > *swiotlb_ops)?That''s basically what we have now, swiotlb{,_late}_init_with_default_size, so seems reasonable to me.> Or perhaps just pass in an argument: swiotlb_init(int late)? > > Furthermore have this new swiotlb_init detect if some of the fields > (start ,end, overflow_buffer) have been allocated and if so skip the > default allocation altogether?That would keep the allocate, ->release, allocate cycle from happening (which seems odd when it''s the same sizes and same core allocation). Part of why I thought there was too much moved into the impl. private engine structure. thanks, -chris _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Chris Wright
2010-Jan-19 19:00 UTC
[Xen-devel] Re: [PATCH 01/15] [swiotlb] fix: Update ''setup_io_tlb_npages'' to accept both arguments in either order.
* Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote:> On Thu, Jan 14, 2010 at 05:22:13PM -0800, Chris Wright wrote: > > * Konrad Rzeszutek Wilk (konrad.wilk@oracle.com) wrote: > > > Before this patch, if you specified ''swiotlb=force,1024'' it would > > > ignore both arguments. This fixes it and allows the user specify it > > > in any order (or none at all). > > > > > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > > > Having only one substring of digits makes allowing arbitrary order > > less useful if more options get added (as in foo,bar,1024,baz,force > > would make more sense as foo,bar,nslabs=1024,baz,force). Do you > > think this one is really needed? If so, be useful to update > > I got caught a couple of times where I needed to provide both arguments > and could not figure out why it did not work. Switching the arguments > around fixed it. Thought that it might make sense to remove this > potential trap from other folks by this patch. > > Your point about more options got me thinking about the overflow buffer. > I could also provide an over-ride for that, maybe: > > "swiotlb=force,overflow=32,slabs=1024"Right, in which case would the is_digit() check remain ahead of the loop to protect the "legacy" format (swiotlb=1024,force), forcing mixing like you did to the new format (swiotlb=force,slabs=1024 or swiotlb=slabs=1024,force)?> (Not sure about the syntax?) > > > Documentation/kernel-parameters.txt which is slightly out of date now. > > Oh, good catch. Will roll the patch for that file as well.thanks, -chris _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Jan-19 19:39 UTC
[Xen-devel] Re: [PATCH 01/15] [swiotlb] fix: Update ''setup_io_tlb_npages'' to accept both arguments in either order.
> > "swiotlb=force,overflow=32,slabs=1024" > > Right, in which case would the is_digit() check remain ahead of the loop > to protect the "legacy" format (swiotlb=1024,force), forcing mixing likeSo, currently the patch that I posted works fine with the "legacy" format.> you did to the new format (swiotlb=force,slabs=1024 or > swiotlb=slabs=1024,force)?I will make sure that the new patch, which will have follow the format you mentioned, also work with the "legacy" format. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
FUJITA Tomonori
2010-Jan-22 01:51 UTC
[Xen-devel] Re: [PATCH 02/15] [swiotlb] Add swiotlb_engine structure for tracking multiple software IO TLBs.
On Thu, 14 Jan 2010 18:00:51 -0500 Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:> The structure contains all of the existing variables used in > software IO TLB (swiotlb.c) collected within a structure. > > Additionally a name variable and a deconstructor (release) function > variable is defined for API usages. > > The other set of functions: is_swiotlb_buffer, dma_capable, phys_to_bus, > bus_to_phys, virt_to_bus, and bus_to_virt server as a method to abstract > them out of the SWIOTLB library. > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > --- > include/linux/swiotlb.h | 94 +++++++++++++++++++++++++++++++++++++++++++++++ > 1 files changed, 94 insertions(+), 0 deletions(-) > > diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h > index febedcf..781c3aa 100644 > --- a/include/linux/swiotlb.h > +++ b/include/linux/swiotlb.h > @@ -24,6 +24,100 @@ extern int swiotlb_force; > > extern void swiotlb_init(int verbose); > > +struct swiotlb_engine { > + > + /* > + * Name of the engine (ie: "Software IO TLB") > + */ > + const char *name;Please don''t add another ''layer'' to the dma path. This leads to too many indirect function calls. Some people concern about the overhead of even the current code. Create something like libswiotlb or whatever (as I said before) to export functions. Then call them directly. btw, please send swiotlb patches to lkml. Thanks, _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Jan-26 16:20 UTC
[Xen-devel] Re: [PATCH 02/15] [swiotlb] Add swiotlb_engine structure for tracking multiple software IO TLBs.
> Please don''t add another ''layer'' to the dma path. This leads to too > many indirect function calls. Some people concern about the overhead > of even the current code.Any ideas on a good benchmarking tool for that?> > Create something like libswiotlb or whatever (as I said before) to > export functions. Then call them directly.I think what you are suggesting is that I abstract all of the address translation functions (''phys_to_dma'', ''bus_to_phys'', etc.) from the swiotlb.c layer. Specifically, to altogether remove the phys_to_dma, bus_to_phys, etc. calls from the swiotlb.c file. In place of them, have the results of those functions (both physical and bus address) be passed in to the map_single and unmap_single calls. The layer that calls map_single/unmap_single would then be responsible for translating the phys/bus/virtual addresses. Implementation-wise, I could split the swiotlb.c into two files: a) /lib/swiotlb-core.c and b) /lib/swiotlb-default.c. The ''a)'' would have the swiotlb_init_*, swiotlb_free, swiotlb_bounce, sync_single, map_single, and unmap_single. The ''b)'' would contain the DMA functions, such as swiotlb_sync_single_*, swiotlb_[map|unmap]_[page|sg]_* and swiotlb_[alloc|free]_coherent. Naturally it would do the phys,bus->bus,phys translations. The Xen complement to ''b)'' would be in a different library/file (/lib/swiotlb-xen.c?). Is that close to what you were thinking? > > btw, please send swiotlb patches to lkml.Once we''ve nailed down the structure of the changes I''ll definitely send it out to LKML. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
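In that scheme the core bounce-buffer entry points would take the already-translated addresses from the caller; a rough sketch of the shape (these signatures are illustrative, not from the series):

/* The dma_ops layer does the phys/bus translation and passes the results in. */
extern void *swiotlb_map_single(struct device *hwdev, phys_addr_t phys,
				dma_addr_t tbl_dma_addr, size_t size, int dir);
extern void swiotlb_unmap_single(struct device *hwdev, char *dma_addr,
				 size_t size, int dir);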
FUJITA Tomonori
2010-Feb-03 02:04 UTC
[Xen-devel] Re: [PATCH 02/15] [swiotlb] Add swiotlb_engine structure for tracking multiple software IO TLBs.
On Tue, 26 Jan 2010 11:20:43 -0500 Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:> > Please don''t add another ''layer'' to the dma path. This leads to too > > many indirect function calls. Some people concern about the overhead > > of even the current code. > > Any ideas on a good benchmarking tool for that?Any tools with a system capable of fast I/Os like hundreds of SSDs?> > Create something like libswiotlb or whatever (as I said before) to > > export functions. Then call them directly. > > I think what you suggesting is that that I abstract all of the address > translation functions (''phys_to_dma'', ''bus_to_phys'', etc.) from the swiotlb.c > layer. > > Specifically, to altogether remove the phys_to_dma, bus_to_phys, etc. calls > from the swiotlb.c file. In place of them, have the results of those functions > (both physical and bus address) be passed in to the map_single and unmap_single > calls. The layer that calls map_single/unmap_single would then be responsible > for translating the phys/bus/virtual addresses. > > Implementation wise, I could split the swiotlb.c in two files: > a) /lib/swiotlb-core.c and b) /lib/swiotlb-default.c. > > The ''a)'' would have the swiotlb_init_*, swiotlb_free, swiotlb_bounce, > sync_single, map_single, and unmap_single.I just want core functions to manage the swiotlb buffer (io_tlb) there. Don''t pass the address translation functions to swiotlb_map_*, etc. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
> > Any ideas on a good benchmarking tool for that?
>
> Any tools with a system capable of fast I/Os like hundreds of SSDs?

That will take some time, but I think I will be able to get some time on
Oracle's Exadata box.

> I just want core functions to manage the swiotlb buffer (io_tlb)
> there. Don't pass the address translation functions to swiotlb_map_*,
> etc.

Attached is a set of eleven RFC patches that split the SWIOTLB library into
two layers: core, and dma_ops-related functions.

The set of eleven patches is also accessible on:

git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb-2.6.git swiotlb-rfc-0.4

An example of how this can be utilized in both bare-metal and Xen
environments is this git tree:

git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git swiotlb-xen-0.4

Sincerely,

Konrad Rzeszutek Wilk

P.S. The diffstat:

 Documentation/x86/x86_64/boot-options.txt |    6 +-
 include/linux/swiotlb.h                   |   45 ++-
 lib/Makefile                              |    2 +-
 lib/swiotlb-core.c                        |  572 ++++++++++++++++++++++++
 lib/swiotlb.c                             |  579 +-----------------------
 5 files changed, 637 insertions(+), 567 deletions(-)

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Feb-03 17:08 UTC
[Xen-devel] [PATCH 01/11] [swiotlb] fix: Update 'setup_io_tlb_npages' to accept both arguments in either order.
Before this patch, if you specified 'swiotlb=force,1024' it would ignore
both arguments. This fixes it and allows the user to specify it in any
order (or none at all).

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 lib/swiotlb.c |   20 +++++++++++---------
 1 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/lib/swiotlb.c b/lib/swiotlb.c
index 437eedb..e6d9e32 100644
--- a/lib/swiotlb.c
+++ b/lib/swiotlb.c
@@ -102,16 +102,18 @@ static int late_alloc;
 static int __init
 setup_io_tlb_npages(char *str)
 {
-	if (isdigit(*str)) {
-		io_tlb_nslabs = simple_strtoul(str, &str, 0);
-		/* avoid tail segment of size < IO_TLB_SEGSIZE */
-		io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE);
+	while (*str) {
+		if (isdigit(*str)) {
+			io_tlb_nslabs = simple_strtoul(str, &str, 0);
+			/* avoid tail segment of size < IO_TLB_SEGSIZE */
+			io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE);
+		}
+		if (!strncmp(str, "force", 5))
+			swiotlb_force = 1;
+		str += strcspn(str, ",");
+		if (*str == ',')
+			++str;
 	}
-	if (*str == ',')
-		++str;
-	if (!strcmp(str, "force"))
-		swiotlb_force = 1;
-
 	return 1;
 }
 __setup("swiotlb=", setup_io_tlb_npages);
-- 
1.6.2.5

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
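For reference, taking the example from the commit message, after this patch
both orderings of the same arguments should be parsed:

	swiotlb=force,1024
	swiotlb=1024,force

(both reserve 1024 slabs and force bouncing), whereas before only the
number-first form '<pages>[,force]' was honoured.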
Konrad Rzeszutek Wilk
2010-Feb-03 17:08 UTC
[Xen-devel] [PATCH 02/11] [swiotlb] Make 'setup_io_tlb_npages' accept new 'swiotlb=' syntax.
The old syntax for ''swiotlb'' is still in effect, and we extend it now to include the overflow buffer size. The syntax is now: swiotlb=[force,][nslabs=<pages>,][overflow=<size>] or more commonly know as: swiotlb=[force] swiotlb=[nslabs=<pages>] swiotlb=[overflow=<size>] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- Documentation/x86/x86_64/boot-options.txt | 6 ++++- lib/swiotlb.c | 36 +++++++++++++++++++++++++--- 2 files changed, 37 insertions(+), 5 deletions(-) diff --git a/Documentation/x86/x86_64/boot-options.txt b/Documentation/x86/x86_64/boot-options.txt index 29a6ff8..81f9b94 100644 --- a/Documentation/x86/x86_64/boot-options.txt +++ b/Documentation/x86/x86_64/boot-options.txt @@ -267,10 +267,14 @@ IOMMU (input/output memory management unit) iommu options only relevant to the software bounce buffering (SWIOTLB) IOMMU implementation: - swiotlb=<pages>[,force] + swiotlb=[npages=<pages>] + swiotlb=[force] + swiotlb=[overflow=<size>] + <pages> Prereserve that many 128K pages for the software IO bounce buffering. force Force all IO through the software TLB. + <size> Size in bytes of the overflow buffer. Settings for the IBM Calgary hardware IOMMU currently found in IBM pSeries and xSeries machines: diff --git a/lib/swiotlb.c b/lib/swiotlb.c index e6d9e32..0663879 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -102,7 +102,27 @@ static int late_alloc; static int __init setup_io_tlb_npages(char *str) { + int get_value(const char *token, char *str, char **endp) + { + ssize_t len; + int val = 0; + + len = strlen(token); + if (!strncmp(str, token, len)) { + str += len; + if (*str == ''='') + ++str; + if (*str != ''\0'') + val = simple_strtoul(str, endp, 0); + } + *endp = str; + return val; + } + + int val; + while (*str) { + /* The old syntax */ if (isdigit(*str)) { io_tlb_nslabs = simple_strtoul(str, &str, 0); /* avoid tail segment of size < IO_TLB_SEGSIZE */ @@ -110,14 +130,22 @@ setup_io_tlb_npages(char *str) } if (!strncmp(str, "force", 5)) swiotlb_force = 1; - str += strcspn(str, ","); - if (*str == '','') - ++str; + /* The new syntax: swiotlb=nslabs=16384,overflow=32768,force */ + val = get_value("nslabs", str, &str); + if (val) + io_tlb_nslabs = ALIGN(val, IO_TLB_SEGSIZE); + + val = get_value("overflow", str, &str); + if (val) + io_tlb_overflow = val; + str = strpbrk(str, ","); + if (!str) + break; + str++; /* skip '','' */ } return 1; } __setup("swiotlb=", setup_io_tlb_npages); -/* make io_tlb_overflow tunable too? */ /* Note that this doesn''t work with highmem page */ static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev, -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
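As a quick reference (the values below are only examples), the parser after
this patch should accept both forms on the kernel command line:

	swiotlb=nslabs=16384,overflow=32768,force	(new syntax)
	swiotlb=16384,force				(old syntax, still parsed)
	swiotlb=overflow=65536				(only change the overflow buffer)

with nslabs rounded up to a multiple of IO_TLB_SEGSIZE and overflow given
in bytes.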
Konrad Rzeszutek Wilk
2010-Feb-03 17:08 UTC
[Xen-devel] [PATCH 03/11] [swiotlb] Normalize the swiotlb_init_* function's naming syntax.
The previous function names were misleading by including "with_default_size"
even though the size is passed in as an argument. Change the function names
to be clear about what they do.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 lib/swiotlb.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/swiotlb.c b/lib/swiotlb.c
index 0663879..7b66fc3 100644
--- a/lib/swiotlb.c
+++ b/lib/swiotlb.c
@@ -174,7 +174,7 @@ void swiotlb_print_info(void)
  * structures for the software IO TLB used to implement the DMA API.
  */
 void __init
-swiotlb_init_with_default_size(size_t default_size, int verbose)
+swiotlb_init_early(size_t default_size, int verbose)
 {
 	unsigned long i, bytes;
 
@@ -217,7 +217,7 @@ swiotlb_init_with_default_size(size_t default_size, int verbose)
 void __init
 swiotlb_init(int verbose)
 {
-	swiotlb_init_with_default_size(64 * (1<<20), verbose); /* default to 64MB */
+	swiotlb_init_early(64 * (1<<20), verbose); /* default to 64MB */
 }
 
 /*
@@ -226,7 +226,7 @@ swiotlb_init(int verbose)
  * This should be just like above, but with some error catching.
  */
 int
-swiotlb_late_init_with_default_size(size_t default_size)
+swiotlb_init_late(size_t default_size)
 {
 	unsigned long i, bytes, req_nslabs = io_tlb_nslabs;
 	unsigned int order;
-- 
1.6.2.5

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Feb-03 17:08 UTC
[Xen-devel] [PATCH 04/11] [swiotlb] Make printk's use same prefix and include dev_err when possible.
Various printk''s had the prefix ''DMA'' in them, but not all of them. This makes all of the printk''s have the ''DMA'' in them and for error cases use the ''dev_err'' macro. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- lib/swiotlb.c | 20 ++++++++++---------- 1 files changed, 10 insertions(+), 10 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index 7b66fc3..0eb64d7 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -162,9 +162,9 @@ void swiotlb_print_info(void) pstart = virt_to_phys(io_tlb_start); pend = virt_to_phys(io_tlb_end); - printk(KERN_INFO "Placing %luMB software IO TLB between %p - %p\n", + printk(KERN_INFO "DMA: Placing %luMB software IO TLB between %p - %p\n", bytes >> 20, io_tlb_start, io_tlb_end); - printk(KERN_INFO "software IO TLB at phys %#llx - %#llx\n", + printk(KERN_INFO "DMA: software IO TLB at phys %#llx - %#llx\n", (unsigned long long)pstart, (unsigned long long)pend); } @@ -190,7 +190,7 @@ swiotlb_init_early(size_t default_size, int verbose) */ io_tlb_start = alloc_bootmem_low_pages(bytes); if (!io_tlb_start) - panic("Cannot allocate SWIOTLB buffer"); + panic("DMA: Cannot allocate SWIOTLB buffer"); io_tlb_end = io_tlb_start + bytes; /* @@ -209,7 +209,7 @@ swiotlb_init_early(size_t default_size, int verbose) */ io_tlb_overflow_buffer = alloc_bootmem_low(io_tlb_overflow); if (!io_tlb_overflow_buffer) - panic("Cannot allocate SWIOTLB overflow buffer!\n"); + panic("DMA: Cannot allocate SWIOTLB overflow buffer!\n"); if (verbose) swiotlb_print_info(); } @@ -255,8 +255,8 @@ swiotlb_init_late(size_t default_size) goto cleanup1; if (order != get_order(bytes)) { - printk(KERN_WARNING "Warning: only able to allocate %ld MB " - "for software IO TLB\n", (PAGE_SIZE << order) >> 20); + printk(KERN_WARNING "DMA: Warning: only able to allocate %ld MB" + " for software IO TLB\n", (PAGE_SIZE << order) >> 20); io_tlb_nslabs = SLABS_PER_PAGE << order; bytes = io_tlb_nslabs << IO_TLB_SHIFT; } @@ -602,7 +602,8 @@ swiotlb_alloc_coherent(struct device *hwdev, size_t size, /* Confirm address can be DMA''d by device */ if (dev_addr + size - 1 > dma_mask) { - printk("hwdev DMA mask = 0x%016Lx, dev_addr = 0x%016Lx\n", + dev_err(hwdev, "DMA: hwdev DMA mask = 0x%016Lx, " \ + "dev_addr = 0x%016Lx\n", (unsigned long long)dma_mask, (unsigned long long)dev_addr); @@ -640,8 +641,7 @@ swiotlb_full(struct device *dev, size_t size, int dir, int do_panic) * When the mapping is small enough return a static buffer to limit * the damage, or panic when the transfer is too big. */ - printk(KERN_ERR "DMA: Out of SW-IOMMU space for %zu bytes at " - "device %s\n", size, dev ? dev_name(dev) : "?"); + dev_err(dev, "DMA: Out of SW-IOMMU space for %zu bytes.", size); if (size <= io_tlb_overflow || !do_panic) return; @@ -694,7 +694,7 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page, * Ensure that the address returned is DMA''ble */ if (!dma_capable(dev, dev_addr, size)) - panic("map_single: bounce buffer is not DMA''ble"); + panic("DMA: swiotlb_map_single: bounce buffer is not DMA''ble"); return dev_addr; } -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Feb-03 17:08 UTC
[Xen-devel] [PATCH 05/11] [swiotlb] Make internal bookkeeping functions have 'do_' prefix.
The functions that operate on io_tlb_list/io_tlb_start/io_tlb_orig_addr have the prefix ''do_'' now. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- lib/swiotlb.c | 20 ++++++++++---------- 1 files changed, 10 insertions(+), 10 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index 0eb64d7..9085eab 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -60,8 +60,8 @@ enum dma_sync_target { int swiotlb_force; /* - * Used to do a quick range check in unmap_single and - * sync_single_*, to see if the memory was in fact allocated by this + * Used to do a quick range check in do_unmap_single and + * do_sync_single_*, to see if the memory was in fact allocated by this * API. */ static char *io_tlb_start, *io_tlb_end; @@ -394,7 +394,7 @@ static void swiotlb_bounce(phys_addr_t phys, char *dma_addr, size_t size, * Allocates bounce buffer and returns its kernel virtual address. */ static void * -map_single(struct device *hwdev, phys_addr_t phys, size_t size, int dir) +do_map_single(struct device *hwdev, phys_addr_t phys, size_t size, int dir) { unsigned long flags; char *dma_addr; @@ -540,7 +540,7 @@ do_unmap_single(struct device *hwdev, char *dma_addr, size_t size, int dir) } static void -sync_single(struct device *hwdev, char *dma_addr, size_t size, +do_sync_single(struct device *hwdev, char *dma_addr, size_t size, int dir, int target) { int index = (dma_addr - io_tlb_start) >> IO_TLB_SHIFT; @@ -589,10 +589,10 @@ swiotlb_alloc_coherent(struct device *hwdev, size_t size, if (!ret) { /* * We are either out of memory or the device can''t DMA - * to GFP_DMA memory; fall back on map_single(), which + * to GFP_DMA memory; fall back on do_map_single(), which * will grab memory from the lowest available address range. */ - ret = map_single(hwdev, 0, size, DMA_FROM_DEVICE); + ret = do_map_single(hwdev, 0, size, DMA_FROM_DEVICE); if (!ret) return NULL; } @@ -626,7 +626,7 @@ swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr, if (!is_swiotlb_buffer(paddr)) free_pages((unsigned long)vaddr, get_order(size)); else - /* DMA_TO_DEVICE to avoid memcpy in unmap_single */ + /* DMA_TO_DEVICE to avoid memcpy in do_unmap_single */ do_unmap_single(hwdev, vaddr, size, DMA_TO_DEVICE); } EXPORT_SYMBOL(swiotlb_free_coherent); @@ -682,7 +682,7 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page, /* * Oh well, have to allocate and map a bounce buffer. */ - map = map_single(dev, phys, size, dir); + map = do_map_single(dev, phys, size, dir); if (!map) { swiotlb_full(dev, size, dir, 1); map = io_tlb_overflow_buffer; @@ -759,7 +759,7 @@ swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr, BUG_ON(dir == DMA_NONE); if (is_swiotlb_buffer(paddr)) { - sync_single(hwdev, phys_to_virt(paddr), size, dir, target); + do_sync_single(hwdev, phys_to_virt(paddr), size, dir, target); return; } @@ -847,7 +847,7 @@ swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, int nelems, if (swiotlb_force || !dma_capable(hwdev, dev_addr, sg->length)) { - void *map = map_single(hwdev, sg_phys(sg), + void *map = do_map_single(hwdev, sg_phys(sg), sg->length, dir); if (!map) { /* Don''t panic here, we expect map_sg users -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Feb-03 17:08 UTC
[Xen-devel] [PATCH 06/11] [swiotlb] do_map_single: abstract swiotlb_virt_to_bus calls out.
We want to move that function out of do_map_single so that the caller of this function does the virt->phys->bus address translation. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- lib/swiotlb.c | 23 +++++++++++++++-------- 1 files changed, 15 insertions(+), 8 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index 9085eab..4ab3885 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -394,20 +394,19 @@ static void swiotlb_bounce(phys_addr_t phys, char *dma_addr, size_t size, * Allocates bounce buffer and returns its kernel virtual address. */ static void * -do_map_single(struct device *hwdev, phys_addr_t phys, size_t size, int dir) +do_map_single(struct device *hwdev, phys_addr_t phys, + unsigned long start_dma_addr, size_t size, int dir) { unsigned long flags; char *dma_addr; unsigned int nslots, stride, index, wrap; int i; - unsigned long start_dma_addr; unsigned long mask; unsigned long offset_slots; unsigned long max_slots; mask = dma_get_seg_boundary(hwdev); - start_dma_addr = swiotlb_virt_to_bus(hwdev, io_tlb_start) & mask; - + start_dma_addr = start_dma_addr & mask; offset_slots = ALIGN(start_dma_addr, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT; /* @@ -574,6 +573,7 @@ swiotlb_alloc_coherent(struct device *hwdev, size_t size, void *ret; int order = get_order(size); u64 dma_mask = DMA_BIT_MASK(32); + unsigned long start_dma_addr; if (hwdev && hwdev->coherent_dma_mask) dma_mask = hwdev->coherent_dma_mask; @@ -592,7 +592,9 @@ swiotlb_alloc_coherent(struct device *hwdev, size_t size, * to GFP_DMA memory; fall back on do_map_single(), which * will grab memory from the lowest available address range. */ - ret = do_map_single(hwdev, 0, size, DMA_FROM_DEVICE); + start_dma_addr = swiotlb_virt_to_bus(hwdev, io_tlb_start); + ret = do_map_single(hwdev, 0, start_dma_addr, size, + DMA_FROM_DEVICE); if (!ret) return NULL; } @@ -607,7 +609,7 @@ swiotlb_alloc_coherent(struct device *hwdev, size_t size, (unsigned long long)dma_mask, (unsigned long long)dev_addr); - /* DMA_TO_DEVICE to avoid memcpy in unmap_single */ + /* DMA_TO_DEVICE to avoid memcpy in do_unmap_single */ do_unmap_single(hwdev, ret, size, DMA_TO_DEVICE); return NULL; } @@ -666,6 +668,7 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page, enum dma_data_direction dir, struct dma_attrs *attrs) { + unsigned long start_dma_addr; phys_addr_t phys = page_to_phys(page) + offset; dma_addr_t dev_addr = phys_to_dma(dev, phys); void *map; @@ -682,7 +685,8 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page, /* * Oh well, have to allocate and map a bounce buffer. 
*/ - map = do_map_single(dev, phys, size, dir); + start_dma_addr = swiotlb_virt_to_bus(dev, io_tlb_start); + map = do_map_single(dev, phys, start_dma_addr, size, dir); if (!map) { swiotlb_full(dev, size, dir, 1); map = io_tlb_overflow_buffer; @@ -836,11 +840,13 @@ int swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, int nelems, enum dma_data_direction dir, struct dma_attrs *attrs) { + unsigned long start_dma_addr; struct scatterlist *sg; int i; BUG_ON(dir == DMA_NONE); + start_dma_addr = swiotlb_virt_to_bus(hwdev, io_tlb_start); for_each_sg(sgl, sg, nelems, i) { phys_addr_t paddr = sg_phys(sg); dma_addr_t dev_addr = phys_to_dma(hwdev, paddr); @@ -848,7 +854,8 @@ swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, int nelems, if (swiotlb_force || !dma_capable(hwdev, dev_addr, sg->length)) { void *map = do_map_single(hwdev, sg_phys(sg), - sg->length, dir); + start_dma_addr, + sg->length, dir); if (!map) { /* Don''t panic here, we expect map_sg users to do proper error handling. */ -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Feb-03 17:08 UTC
[Xen-devel] [PATCH 07/11] [swiotlb] Fix checkpatch warnings.
From: Konrad Rzeszutek <konrad@t42p-lan.dumpdata.com> I''ve fixed most of the checkpatch warnings except these three: a). WARNING: consider using strict_strtoul in preference to simple_strtoul 115: FILE: swiotlb.c:115: + val = simple_strtoul(str, endp, 0); b). WARNING: consider using strict_strtoul in preference to simple_strtoul 126: FILE: swiotlb.c:126: + io_tlb_nslabs = simple_strtoul(str, &str, 0); c).WARNING: Use of volatile is usually wrong: see Documentation/volatile-considered-harmful.txt 151: FILE: swiotlb.c:151: + volatile void *address) total: 0 errors, 3 warnings, 965 lines checked As a) and b) are OK, we MUST use simple_strtoul. For c) the ''volatile-consider*'' document outlines that it is OK for pointers to data structrues in coherent memory which this certainly could be, hence not fixing that. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- lib/swiotlb.c | 38 +++++++++++++++++++------------------- 1 files changed, 19 insertions(+), 19 deletions(-) diff --git a/lib/swiotlb.c b/lib/swiotlb.c index 4ab3885..80a2306 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -29,16 +29,15 @@ #include <linux/ctype.h> #include <linux/highmem.h> -#include <asm/io.h> +#include <linux/io.h> #include <asm/dma.h> -#include <asm/scatterlist.h> +#include <linux/scatterlist.h> #include <linux/init.h> #include <linux/bootmem.h> #include <linux/iommu-helper.h> -#define OFFSET(val,align) ((unsigned long) \ - ( (val) & ( (align) - 1))) +#define OFFSET(val, align) ((unsigned long) ((val) & ((align) - 1))) #define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT)) @@ -200,7 +199,7 @@ swiotlb_init_early(size_t default_size, int verbose) */ io_tlb_list = alloc_bootmem(io_tlb_nslabs * sizeof(int)); for (i = 0; i < io_tlb_nslabs; i++) - io_tlb_list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE); + io_tlb_list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE); io_tlb_index = 0; io_tlb_orig_addr = alloc_bootmem(io_tlb_nslabs * sizeof(phys_addr_t)); @@ -269,18 +268,16 @@ swiotlb_init_late(size_t default_size) * between io_tlb_start and io_tlb_end. */ io_tlb_list = (unsigned int *)__get_free_pages(GFP_KERNEL, - get_order(io_tlb_nslabs * sizeof(int))); + get_order(io_tlb_nslabs * sizeof(int))); if (!io_tlb_list) goto cleanup2; for (i = 0; i < io_tlb_nslabs; i++) - io_tlb_list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE); + io_tlb_list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE); io_tlb_index = 0; - io_tlb_orig_addr = (phys_addr_t *) - __get_free_pages(GFP_KERNEL, - get_order(io_tlb_nslabs * - sizeof(phys_addr_t))); + io_tlb_orig_addr = (phys_addr_t *) __get_free_pages(GFP_KERNEL, + get_order(io_tlb_nslabs * sizeof(phys_addr_t))); if (!io_tlb_orig_addr) goto cleanup3; @@ -290,7 +287,7 @@ swiotlb_init_late(size_t default_size) * Get the overflow emergency buffer */ io_tlb_overflow_buffer = (void *)__get_free_pages(GFP_DMA, - get_order(io_tlb_overflow)); + get_order(io_tlb_overflow)); if (!io_tlb_overflow_buffer) goto cleanup4; @@ -305,8 +302,8 @@ cleanup4: get_order(io_tlb_nslabs * sizeof(phys_addr_t))); io_tlb_orig_addr = NULL; cleanup3: - free_pages((unsigned long)io_tlb_list, get_order(io_tlb_nslabs * - sizeof(int))); + free_pages((unsigned long)io_tlb_list, + get_order(io_tlb_nslabs * sizeof(int))); io_tlb_list = NULL; cleanup2: io_tlb_end = NULL; @@ -410,8 +407,8 @@ do_map_single(struct device *hwdev, phys_addr_t phys, offset_slots = ALIGN(start_dma_addr, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT; /* - * Carefully handle integer overflow which can occur when mask == ~0UL. 
- */ + * Carefully handle integer overflow which can occur when mask == ~0UL. + */ max_slots = mask + 1 ? ALIGN(mask + 1, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT : 1UL << (BITS_PER_LONG - IO_TLB_SHIFT); @@ -458,7 +455,8 @@ do_map_single(struct device *hwdev, phys_addr_t phys, for (i = index; i < (int) (index + nslots); i++) io_tlb_list[i] = 0; - for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) != IO_TLB_SEGSIZE - 1) && io_tlb_list[i]; i--) + for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) + != IO_TLB_SEGSIZE - 1) && io_tlb_list[i]; i--) io_tlb_list[i] = ++count; dma_addr = io_tlb_start + (index << IO_TLB_SHIFT); @@ -532,7 +530,8 @@ do_unmap_single(struct device *hwdev, char *dma_addr, size_t size, int dir) * Step 2: merge the returned slots with the preceding slots, * if available (non zero) */ - for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) != IO_TLB_SEGSIZE -1) && io_tlb_list[i]; i--) + for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) !+ IO_TLB_SEGSIZE - 1) && io_tlb_list[i]; i--) io_tlb_list[i] = ++count; } spin_unlock_irqrestore(&io_tlb_lock, flags); @@ -888,7 +887,8 @@ EXPORT_SYMBOL(swiotlb_map_sg); */ void swiotlb_unmap_sg_attrs(struct device *hwdev, struct scatterlist *sgl, - int nelems, enum dma_data_direction dir, struct dma_attrs *attrs) + int nelems, enum dma_data_direction dir, + struct dma_attrs *attrs) { struct scatterlist *sg; int i; -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Feb-03 17:08 UTC
[Xen-devel] [PATCH 08/11] [swiotlb] Re-order the function declarations.
Move the function declarations dealing with startup/shutdown of the SWIOTLB
to the top of the header. This is in preparation for the next set of
patches, which split the swiotlb.c file in two; the bookkeeping, init and
free related functions will be declared at the beginning and the dma_ops
functions at the end of the header file.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 include/linux/swiotlb.h |   15 ++++++++-------
 1 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index febedcf..84e7a53 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -23,7 +23,15 @@ extern int swiotlb_force;
 
 #define IO_TLB_SHIFT 11
 
 extern void swiotlb_init(int verbose);
+#ifdef CONFIG_SWIOTLB
+extern void __init swiotlb_free(void);
+#else
+static inline void swiotlb_free(void) { }
+#endif
+extern void swiotlb_print_info(void);
+
+/* IOMMU functions. */
 extern void *swiotlb_alloc_coherent(struct device *hwdev, size_t size,
 				    dma_addr_t *dma_handle, gfp_t flags);
 
@@ -89,11 +97,4 @@ swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t dma_addr);
 extern int
 swiotlb_dma_supported(struct device *hwdev, u64 mask);
 
-#ifdef CONFIG_SWIOTLB
-extern void __init swiotlb_free(void);
-#else
-static inline void swiotlb_free(void) { }
-#endif
-
-extern void swiotlb_print_info(void);
 #endif /* __LINUX_SWIOTLB_H */
-- 
1.6.2.5

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2010-Feb-03 17:08 UTC
[Xen-devel] [PATCH 09/11] [swiotlb] Make swiotlb bookkeeping functions visible in the header file.
We put the init, free, and functions dealing with the operations on the SWIOTLB buffer at the top of the header. Also we export some of the variables that are used by the dma_ops functions. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- include/linux/swiotlb.h | 31 ++++++++++++++++++++++++++++++- lib/swiotlb.c | 24 ++++++++---------------- 2 files changed, 38 insertions(+), 17 deletions(-) diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h index 84e7a53..af66473 100644 --- a/include/linux/swiotlb.h +++ b/include/linux/swiotlb.h @@ -30,8 +30,37 @@ static inline void swiotlb_free(void) { } #endif extern void swiotlb_print_info(void); +/* Internal book-keeping functions. Must be linked against the library + * to take advantage of them.*/ +#ifdef CONFIG_SWIOTLB +/* + * Enumeration for sync targets + */ +enum dma_sync_target { + SYNC_FOR_CPU = 0, + SYNC_FOR_DEVICE = 1, +}; +extern char *io_tlb_start; +extern char *io_tlb_end; +extern unsigned long io_tlb_nslabs; +extern void *io_tlb_overflow_buffer; +extern unsigned long io_tlb_overflow; +extern int is_swiotlb_buffer(phys_addr_t paddr); +extern void swiotlb_bounce(phys_addr_t phys, char *dma_addr, size_t size, + enum dma_data_direction dir); +extern void *do_map_single(struct device *hwdev, phys_addr_t phys, + unsigned long start_dma_addr, size_t size, int dir); + +extern void do_unmap_single(struct device *hwdev, char *dma_addr, size_t size, + int dir); + +extern void do_sync_single(struct device *hwdev, char *dma_addr, size_t size, + int dir, int target); +extern void swiotlb_full(struct device *dev, size_t size, int dir, int do_panic); +extern void __init swiotlb_init_early(size_t default_size, int verbose); +#endif -/* IOMMU functions. */ +/* swiotlb.c: dma_ops functions. */ extern void *swiotlb_alloc_coherent(struct device *hwdev, size_t size, dma_addr_t *dma_handle, gfp_t flags); diff --git a/lib/swiotlb.c b/lib/swiotlb.c index 80a2306..c982d33 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -48,14 +48,6 @@ */ #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT) -/* - * Enumeration for sync targets - */ -enum dma_sync_target { - SYNC_FOR_CPU = 0, - SYNC_FOR_DEVICE = 1, -}; - int swiotlb_force; /* @@ -63,18 +55,18 @@ int swiotlb_force; * do_sync_single_*, to see if the memory was in fact allocated by this * API. */ -static char *io_tlb_start, *io_tlb_end; +char *io_tlb_start, *io_tlb_end; /* * The number of IO TLB blocks (in groups of 64) betweeen io_tlb_start and * io_tlb_end. This is command line adjustable via setup_io_tlb_npages. */ -static unsigned long io_tlb_nslabs; +unsigned long io_tlb_nslabs; /* * When the IOMMU overflows we return a fallback buffer. This sets the size. 
*/ -static unsigned long io_tlb_overflow = 32*1024; +unsigned long io_tlb_overflow = 32*1024; void *io_tlb_overflow_buffer; @@ -340,7 +332,7 @@ void __init swiotlb_free(void) } } -static int is_swiotlb_buffer(phys_addr_t paddr) +int is_swiotlb_buffer(phys_addr_t paddr) { return paddr >= virt_to_phys(io_tlb_start) && paddr < virt_to_phys(io_tlb_end); @@ -349,7 +341,7 @@ static int is_swiotlb_buffer(phys_addr_t paddr) /* * Bounce: copy the swiotlb buffer back to the original dma location */ -static void swiotlb_bounce(phys_addr_t phys, char *dma_addr, size_t size, +void swiotlb_bounce(phys_addr_t phys, char *dma_addr, size_t size, enum dma_data_direction dir) { unsigned long pfn = PFN_DOWN(phys); @@ -390,7 +382,7 @@ static void swiotlb_bounce(phys_addr_t phys, char *dma_addr, size_t size, /* * Allocates bounce buffer and returns its kernel virtual address. */ -static void * +void * do_map_single(struct device *hwdev, phys_addr_t phys, unsigned long start_dma_addr, size_t size, int dir) { @@ -496,7 +488,7 @@ found: /* * dma_addr is the kernel virtual address of the bounce buffer to unmap. */ -static void +void do_unmap_single(struct device *hwdev, char *dma_addr, size_t size, int dir) { unsigned long flags; @@ -537,7 +529,7 @@ do_unmap_single(struct device *hwdev, char *dma_addr, size_t size, int dir) spin_unlock_irqrestore(&io_tlb_lock, flags); } -static void +void do_sync_single(struct device *hwdev, char *dma_addr, size_t size, int dir, int target) { -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
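To show what exporting these book-keeping functions is meant to enable,
here is a sketch of how a separate backend linked against the core could
build its own map_page on top of them. This is purely illustrative:
example_virt_to_bus() and example_map_page() are made-up names and are not
part of the series; only the symbols from include/linux/swiotlb.h above are
real.

	#include <linux/mm.h>
	#include <linux/dma-mapping.h>
	#include <linux/swiotlb.h>

	/* Stand-in for a platform-specific virt -> bus translation
	 * (e.g. one that knows about PFN -> MFN mappings); hypothetical. */
	extern dma_addr_t example_virt_to_bus(struct device *dev, void *vaddr);

	static dma_addr_t example_map_page(struct device *dev, struct page *page,
					   unsigned long offset, size_t size,
					   enum dma_data_direction dir,
					   struct dma_attrs *attrs)
	{
		phys_addr_t phys = page_to_phys(page) + offset;
		unsigned long start_dma_addr =
			example_virt_to_bus(dev, io_tlb_start);
		void *map;

		/* The core only manages the io_tlb slots; every address
		 * translation stays in this layer. */
		map = do_map_single(dev, phys, start_dma_addr, size, dir);
		if (!map)
			map = io_tlb_overflow_buffer;	/* error reporting elided */

		return example_virt_to_bus(dev, map);
	}

This mirrors what swiotlb_map_page() in the default library does, except
that the default library uses phys_to_dma()/virt_to_phys() for the
translation.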
Konrad Rzeszutek Wilk
2010-Feb-03 17:08 UTC
[Xen-devel] [PATCH 10/11] [swiotlb] Rename swiotlb.c to swiotlb-core.c
From: Konrad Rzeszutek <konrad@t42p-lan.dumpdata.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- include/linux/swiotlb.h | 5 +- lib/Makefile | 2 +- lib/swiotlb-core.c | 957 +++++++++++++++++++++++++++++++++++++++++++++++ lib/swiotlb.c | 957 ----------------------------------------------- 4 files changed, 961 insertions(+), 960 deletions(-) create mode 100644 lib/swiotlb-core.c delete mode 100644 lib/swiotlb.c diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h index af66473..6ab9b7c 100644 --- a/include/linux/swiotlb.h +++ b/include/linux/swiotlb.h @@ -22,6 +22,7 @@ extern int swiotlb_force; */ #define IO_TLB_SHIFT 11 +/* swiotlb-core.c */ extern void swiotlb_init(int verbose); #ifdef CONFIG_SWIOTLB extern void __init swiotlb_free(void); @@ -30,8 +31,8 @@ static inline void swiotlb_free(void) { } #endif extern void swiotlb_print_info(void); -/* Internal book-keeping functions. Must be linked against the library - * to take advantage of them.*/ +/* swiotlb-core.c: Internal book-keeping functions. + * Must be linked against the library to take advantage of them.*/ #ifdef CONFIG_SWIOTLB /* * Enumeration for sync targets diff --git a/lib/Makefile b/lib/Makefile index 3b0b4a6..40728c5 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -78,7 +78,7 @@ obj-$(CONFIG_TEXTSEARCH_FSM) += ts_fsm.o obj-$(CONFIG_SMP) += percpu_counter.o obj-$(CONFIG_AUDIT_GENERIC) += audit.o -obj-$(CONFIG_SWIOTLB) += swiotlb.o +obj-$(CONFIG_SWIOTLB) += swiotlb-core.o swiotlb.o obj-$(CONFIG_IOMMU_HELPER) += iommu-helper.o obj-$(CONFIG_FAULT_INJECTION) += fault-inject.o diff --git a/lib/swiotlb-core.c b/lib/swiotlb-core.c new file mode 100644 index 0000000..c982d33 --- /dev/null +++ b/lib/swiotlb-core.c @@ -0,0 +1,957 @@ +/* + * Dynamic DMA mapping support. + * + * This implementation is a fallback for platforms that do not support + * I/O TLBs (aka DMA address translation hardware). + * Copyright (C) 2000 Asit Mallick <Asit.K.Mallick@intel.com> + * Copyright (C) 2000 Goutham Rao <goutham.rao@intel.com> + * Copyright (C) 2000, 2003 Hewlett-Packard Co + * David Mosberger-Tang <davidm@hpl.hp.com> + * + * 03/05/07 davidm Switch from PCI-DMA to generic device DMA API. + * 00/12/13 davidm Rename to swiotlb.c and add mark_clean() to avoid + * unnecessary i-cache flushing. + * 04/07/.. ak Better overflow handling. Assorted fixes. + * 05/09/10 linville Add support for syncing ranges, support syncing for + * DMA_BIDIRECTIONAL mappings, miscellaneous cleanup. + * 08/12/11 beckyb Add highmem support + */ + +#include <linux/cache.h> +#include <linux/dma-mapping.h> +#include <linux/mm.h> +#include <linux/module.h> +#include <linux/spinlock.h> +#include <linux/string.h> +#include <linux/swiotlb.h> +#include <linux/pfn.h> +#include <linux/types.h> +#include <linux/ctype.h> +#include <linux/highmem.h> + +#include <linux/io.h> +#include <asm/dma.h> +#include <linux/scatterlist.h> + +#include <linux/init.h> +#include <linux/bootmem.h> +#include <linux/iommu-helper.h> + +#define OFFSET(val, align) ((unsigned long) ((val) & ((align) - 1))) + +#define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT)) + +/* + * Minimum IO TLB size to bother booting with. Systems with mainly + * 64bit capable cards will only lightly use the swiotlb. If we can''t + * allocate a contiguous 1MB, we''re probably in trouble anyway. 
+ */ +#define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT) + +int swiotlb_force; + +/* + * Used to do a quick range check in do_unmap_single and + * do_sync_single_*, to see if the memory was in fact allocated by this + * API. + */ +char *io_tlb_start, *io_tlb_end; + +/* + * The number of IO TLB blocks (in groups of 64) betweeen io_tlb_start and + * io_tlb_end. This is command line adjustable via setup_io_tlb_npages. + */ +unsigned long io_tlb_nslabs; + +/* + * When the IOMMU overflows we return a fallback buffer. This sets the size. + */ +unsigned long io_tlb_overflow = 32*1024; + +void *io_tlb_overflow_buffer; + +/* + * This is a free list describing the number of free entries available from + * each index + */ +static unsigned int *io_tlb_list; +static unsigned int io_tlb_index; + +/* + * We need to save away the original address corresponding to a mapped entry + * for the sync operations. + */ +static phys_addr_t *io_tlb_orig_addr; + +/* + * Protect the above data structures in the map and unmap calls + */ +static DEFINE_SPINLOCK(io_tlb_lock); + +static int late_alloc; + +static int __init +setup_io_tlb_npages(char *str) +{ + int get_value(const char *token, char *str, char **endp) + { + ssize_t len; + int val = 0; + + len = strlen(token); + if (!strncmp(str, token, len)) { + str += len; + if (*str == ''='') + ++str; + if (*str != ''\0'') + val = simple_strtoul(str, endp, 0); + } + *endp = str; + return val; + } + + int val; + + while (*str) { + /* The old syntax */ + if (isdigit(*str)) { + io_tlb_nslabs = simple_strtoul(str, &str, 0); + /* avoid tail segment of size < IO_TLB_SEGSIZE */ + io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE); + } + if (!strncmp(str, "force", 5)) + swiotlb_force = 1; + /* The new syntax: swiotlb=nslabs=16384,overflow=32768,force */ + val = get_value("nslabs", str, &str); + if (val) + io_tlb_nslabs = ALIGN(val, IO_TLB_SEGSIZE); + + val = get_value("overflow", str, &str); + if (val) + io_tlb_overflow = val; + str = strpbrk(str, ","); + if (!str) + break; + str++; /* skip '','' */ + } + return 1; +} +__setup("swiotlb=", setup_io_tlb_npages); + +/* Note that this doesn''t work with highmem page */ +static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev, + volatile void *address) +{ + return phys_to_dma(hwdev, virt_to_phys(address)); +} + +void swiotlb_print_info(void) +{ + unsigned long bytes = io_tlb_nslabs << IO_TLB_SHIFT; + phys_addr_t pstart, pend; + + pstart = virt_to_phys(io_tlb_start); + pend = virt_to_phys(io_tlb_end); + + printk(KERN_INFO "DMA: Placing %luMB software IO TLB between %p - %p\n", + bytes >> 20, io_tlb_start, io_tlb_end); + printk(KERN_INFO "DMA: software IO TLB at phys %#llx - %#llx\n", + (unsigned long long)pstart, + (unsigned long long)pend); +} + +/* + * Statically reserve bounce buffer space and initialize bounce buffer data + * structures for the software IO TLB used to implement the DMA API. + */ +void __init +swiotlb_init_early(size_t default_size, int verbose) +{ + unsigned long i, bytes; + + if (!io_tlb_nslabs) { + io_tlb_nslabs = (default_size >> IO_TLB_SHIFT); + io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE); + } + + bytes = io_tlb_nslabs << IO_TLB_SHIFT; + + /* + * Get IO TLB memory from the low pages + */ + io_tlb_start = alloc_bootmem_low_pages(bytes); + if (!io_tlb_start) + panic("DMA: Cannot allocate SWIOTLB buffer"); + io_tlb_end = io_tlb_start + bytes; + + /* + * Allocate and initialize the free list array. 
This array is used + * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE + * between io_tlb_start and io_tlb_end. + */ + io_tlb_list = alloc_bootmem(io_tlb_nslabs * sizeof(int)); + for (i = 0; i < io_tlb_nslabs; i++) + io_tlb_list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE); + io_tlb_index = 0; + io_tlb_orig_addr = alloc_bootmem(io_tlb_nslabs * sizeof(phys_addr_t)); + + /* + * Get the overflow emergency buffer + */ + io_tlb_overflow_buffer = alloc_bootmem_low(io_tlb_overflow); + if (!io_tlb_overflow_buffer) + panic("DMA: Cannot allocate SWIOTLB overflow buffer!\n"); + if (verbose) + swiotlb_print_info(); +} + +void __init +swiotlb_init(int verbose) +{ + swiotlb_init_early(64 * (1<<20), verbose); /* default to 64MB */ +} + +/* + * Systems with larger DMA zones (those that don''t support ISA) can + * initialize the swiotlb later using the slab allocator if needed. + * This should be just like above, but with some error catching. + */ +int +swiotlb_init_late(size_t default_size) +{ + unsigned long i, bytes, req_nslabs = io_tlb_nslabs; + unsigned int order; + + if (!io_tlb_nslabs) { + io_tlb_nslabs = (default_size >> IO_TLB_SHIFT); + io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE); + } + + /* + * Get IO TLB memory from the low pages + */ + order = get_order(io_tlb_nslabs << IO_TLB_SHIFT); + io_tlb_nslabs = SLABS_PER_PAGE << order; + bytes = io_tlb_nslabs << IO_TLB_SHIFT; + + while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) { + io_tlb_start = (void *)__get_free_pages(GFP_DMA | __GFP_NOWARN, + order); + if (io_tlb_start) + break; + order--; + } + + if (!io_tlb_start) + goto cleanup1; + + if (order != get_order(bytes)) { + printk(KERN_WARNING "DMA: Warning: only able to allocate %ld MB" + " for software IO TLB\n", (PAGE_SIZE << order) >> 20); + io_tlb_nslabs = SLABS_PER_PAGE << order; + bytes = io_tlb_nslabs << IO_TLB_SHIFT; + } + io_tlb_end = io_tlb_start + bytes; + memset(io_tlb_start, 0, bytes); + + /* + * Allocate and initialize the free list array. This array is used + * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE + * between io_tlb_start and io_tlb_end. 
+ */ + io_tlb_list = (unsigned int *)__get_free_pages(GFP_KERNEL, + get_order(io_tlb_nslabs * sizeof(int))); + if (!io_tlb_list) + goto cleanup2; + + for (i = 0; i < io_tlb_nslabs; i++) + io_tlb_list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE); + io_tlb_index = 0; + + io_tlb_orig_addr = (phys_addr_t *) __get_free_pages(GFP_KERNEL, + get_order(io_tlb_nslabs * sizeof(phys_addr_t))); + if (!io_tlb_orig_addr) + goto cleanup3; + + memset(io_tlb_orig_addr, 0, io_tlb_nslabs * sizeof(phys_addr_t)); + + /* + * Get the overflow emergency buffer + */ + io_tlb_overflow_buffer = (void *)__get_free_pages(GFP_DMA, + get_order(io_tlb_overflow)); + if (!io_tlb_overflow_buffer) + goto cleanup4; + + swiotlb_print_info(); + + late_alloc = 1; + + return 0; + +cleanup4: + free_pages((unsigned long)io_tlb_orig_addr, + get_order(io_tlb_nslabs * sizeof(phys_addr_t))); + io_tlb_orig_addr = NULL; +cleanup3: + free_pages((unsigned long)io_tlb_list, + get_order(io_tlb_nslabs * sizeof(int))); + io_tlb_list = NULL; +cleanup2: + io_tlb_end = NULL; + free_pages((unsigned long)io_tlb_start, order); + io_tlb_start = NULL; +cleanup1: + io_tlb_nslabs = req_nslabs; + return -ENOMEM; +} + +void __init swiotlb_free(void) +{ + if (!io_tlb_overflow_buffer) + return; + + if (late_alloc) { + free_pages((unsigned long)io_tlb_overflow_buffer, + get_order(io_tlb_overflow)); + free_pages((unsigned long)io_tlb_orig_addr, + get_order(io_tlb_nslabs * sizeof(phys_addr_t))); + free_pages((unsigned long)io_tlb_list, get_order(io_tlb_nslabs * + sizeof(int))); + free_pages((unsigned long)io_tlb_start, + get_order(io_tlb_nslabs << IO_TLB_SHIFT)); + } else { + free_bootmem_late(__pa(io_tlb_overflow_buffer), + io_tlb_overflow); + free_bootmem_late(__pa(io_tlb_orig_addr), + io_tlb_nslabs * sizeof(phys_addr_t)); + free_bootmem_late(__pa(io_tlb_list), + io_tlb_nslabs * sizeof(int)); + free_bootmem_late(__pa(io_tlb_start), + io_tlb_nslabs << IO_TLB_SHIFT); + } +} + +int is_swiotlb_buffer(phys_addr_t paddr) +{ + return paddr >= virt_to_phys(io_tlb_start) && + paddr < virt_to_phys(io_tlb_end); +} + +/* + * Bounce: copy the swiotlb buffer back to the original dma location + */ +void swiotlb_bounce(phys_addr_t phys, char *dma_addr, size_t size, + enum dma_data_direction dir) +{ + unsigned long pfn = PFN_DOWN(phys); + + if (PageHighMem(pfn_to_page(pfn))) { + /* The buffer does not have a mapping. Map it in and copy */ + unsigned int offset = phys & ~PAGE_MASK; + char *buffer; + unsigned int sz = 0; + unsigned long flags; + + while (size) { + sz = min_t(size_t, PAGE_SIZE - offset, size); + + local_irq_save(flags); + buffer = kmap_atomic(pfn_to_page(pfn), + KM_BOUNCE_READ); + if (dir == DMA_TO_DEVICE) + memcpy(dma_addr, buffer + offset, sz); + else + memcpy(buffer + offset, dma_addr, sz); + kunmap_atomic(buffer, KM_BOUNCE_READ); + local_irq_restore(flags); + + size -= sz; + pfn++; + dma_addr += sz; + offset = 0; + } + } else { + if (dir == DMA_TO_DEVICE) + memcpy(dma_addr, phys_to_virt(phys), size); + else + memcpy(phys_to_virt(phys), dma_addr, size); + } +} + +/* + * Allocates bounce buffer and returns its kernel virtual address. 
+ */ +void * +do_map_single(struct device *hwdev, phys_addr_t phys, + unsigned long start_dma_addr, size_t size, int dir) +{ + unsigned long flags; + char *dma_addr; + unsigned int nslots, stride, index, wrap; + int i; + unsigned long mask; + unsigned long offset_slots; + unsigned long max_slots; + + mask = dma_get_seg_boundary(hwdev); + start_dma_addr = start_dma_addr & mask; + offset_slots = ALIGN(start_dma_addr, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT; + + /* + * Carefully handle integer overflow which can occur when mask == ~0UL. + */ + max_slots = mask + 1 + ? ALIGN(mask + 1, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT + : 1UL << (BITS_PER_LONG - IO_TLB_SHIFT); + + /* + * For mappings greater than a page, we limit the stride (and + * hence alignment) to a page size. + */ + nslots = ALIGN(size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT; + if (size > PAGE_SIZE) + stride = (1 << (PAGE_SHIFT - IO_TLB_SHIFT)); + else + stride = 1; + + BUG_ON(!nslots); + + /* + * Find suitable number of IO TLB entries size that will fit this + * request and allocate a buffer from that IO TLB pool. + */ + spin_lock_irqsave(&io_tlb_lock, flags); + index = ALIGN(io_tlb_index, stride); + if (index >= io_tlb_nslabs) + index = 0; + wrap = index; + + do { + while (iommu_is_span_boundary(index, nslots, offset_slots, + max_slots)) { + index += stride; + if (index >= io_tlb_nslabs) + index = 0; + if (index == wrap) + goto not_found; + } + + /* + * If we find a slot that indicates we have ''nslots'' number of + * contiguous buffers, we allocate the buffers from that slot + * and mark the entries as ''0'' indicating unavailable. + */ + if (io_tlb_list[index] >= nslots) { + int count = 0; + + for (i = index; i < (int) (index + nslots); i++) + io_tlb_list[i] = 0; + for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) + != IO_TLB_SEGSIZE - 1) && io_tlb_list[i]; i--) + io_tlb_list[i] = ++count; + dma_addr = io_tlb_start + (index << IO_TLB_SHIFT); + + /* + * Update the indices to avoid searching in the next + * round. + */ + io_tlb_index = ((index + nslots) < io_tlb_nslabs + ? (index + nslots) : 0); + + goto found; + } + index += stride; + if (index >= io_tlb_nslabs) + index = 0; + } while (index != wrap); + +not_found: + spin_unlock_irqrestore(&io_tlb_lock, flags); + return NULL; +found: + spin_unlock_irqrestore(&io_tlb_lock, flags); + + /* + * Save away the mapping from the original address to the DMA address. + * This is needed when we sync the memory. Then we sync the buffer if + * needed. + */ + for (i = 0; i < nslots; i++) + io_tlb_orig_addr[index+i] = phys + (i << IO_TLB_SHIFT); + if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL) + swiotlb_bounce(phys, dma_addr, size, DMA_TO_DEVICE); + + return dma_addr; +} + +/* + * dma_addr is the kernel virtual address of the bounce buffer to unmap. + */ +void +do_unmap_single(struct device *hwdev, char *dma_addr, size_t size, int dir) +{ + unsigned long flags; + int i, count, nslots = ALIGN(size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT; + int index = (dma_addr - io_tlb_start) >> IO_TLB_SHIFT; + phys_addr_t phys = io_tlb_orig_addr[index]; + + /* + * First, sync the memory before unmapping the entry + */ + if (phys && ((dir == DMA_FROM_DEVICE) || (dir == DMA_BIDIRECTIONAL))) + swiotlb_bounce(phys, dma_addr, size, DMA_FROM_DEVICE); + + /* + * Return the buffer to the free list by setting the corresponding + * entries to indicate the number of contiguous entries available. + * While returning the entries to the free list, we merge the entries + * with slots below and above the pool being returned. 
+ */ + spin_lock_irqsave(&io_tlb_lock, flags); + { + count = ((index + nslots) < ALIGN(index + 1, IO_TLB_SEGSIZE) ? + io_tlb_list[index + nslots] : 0); + /* + * Step 1: return the slots to the free list, merging the + * slots with superceeding slots + */ + for (i = index + nslots - 1; i >= index; i--) + io_tlb_list[i] = ++count; + /* + * Step 2: merge the returned slots with the preceding slots, + * if available (non zero) + */ + for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) !+ IO_TLB_SEGSIZE - 1) && io_tlb_list[i]; i--) + io_tlb_list[i] = ++count; + } + spin_unlock_irqrestore(&io_tlb_lock, flags); +} + +void +do_sync_single(struct device *hwdev, char *dma_addr, size_t size, + int dir, int target) +{ + int index = (dma_addr - io_tlb_start) >> IO_TLB_SHIFT; + phys_addr_t phys = io_tlb_orig_addr[index]; + + phys += ((unsigned long)dma_addr & ((1 << IO_TLB_SHIFT) - 1)); + + switch (target) { + case SYNC_FOR_CPU: + if (likely(dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL)) + swiotlb_bounce(phys, dma_addr, size, DMA_FROM_DEVICE); + else + BUG_ON(dir != DMA_TO_DEVICE); + break; + case SYNC_FOR_DEVICE: + if (likely(dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)) + swiotlb_bounce(phys, dma_addr, size, DMA_TO_DEVICE); + else + BUG_ON(dir != DMA_FROM_DEVICE); + break; + default: + BUG(); + } +} + +void * +swiotlb_alloc_coherent(struct device *hwdev, size_t size, + dma_addr_t *dma_handle, gfp_t flags) +{ + dma_addr_t dev_addr; + void *ret; + int order = get_order(size); + u64 dma_mask = DMA_BIT_MASK(32); + unsigned long start_dma_addr; + + if (hwdev && hwdev->coherent_dma_mask) + dma_mask = hwdev->coherent_dma_mask; + + ret = (void *)__get_free_pages(flags, order); + if (ret && swiotlb_virt_to_bus(hwdev, ret) + size - 1 > dma_mask) { + /* + * The allocated memory isn''t reachable by the device. + */ + free_pages((unsigned long) ret, order); + ret = NULL; + } + if (!ret) { + /* + * We are either out of memory or the device can''t DMA + * to GFP_DMA memory; fall back on do_map_single(), which + * will grab memory from the lowest available address range. + */ + start_dma_addr = swiotlb_virt_to_bus(hwdev, io_tlb_start); + ret = do_map_single(hwdev, 0, start_dma_addr, size, + DMA_FROM_DEVICE); + if (!ret) + return NULL; + } + + memset(ret, 0, size); + dev_addr = swiotlb_virt_to_bus(hwdev, ret); + + /* Confirm address can be DMA''d by device */ + if (dev_addr + size - 1 > dma_mask) { + dev_err(hwdev, "DMA: hwdev DMA mask = 0x%016Lx, " \ + "dev_addr = 0x%016Lx\n", + (unsigned long long)dma_mask, + (unsigned long long)dev_addr); + + /* DMA_TO_DEVICE to avoid memcpy in do_unmap_single */ + do_unmap_single(hwdev, ret, size, DMA_TO_DEVICE); + return NULL; + } + *dma_handle = dev_addr; + return ret; +} +EXPORT_SYMBOL(swiotlb_alloc_coherent); + +void +swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr, + dma_addr_t dev_addr) +{ + phys_addr_t paddr = dma_to_phys(hwdev, dev_addr); + + WARN_ON(irqs_disabled()); + if (!is_swiotlb_buffer(paddr)) + free_pages((unsigned long)vaddr, get_order(size)); + else + /* DMA_TO_DEVICE to avoid memcpy in do_unmap_single */ + do_unmap_single(hwdev, vaddr, size, DMA_TO_DEVICE); +} +EXPORT_SYMBOL(swiotlb_free_coherent); + +static void +swiotlb_full(struct device *dev, size_t size, int dir, int do_panic) +{ + /* + * Ran out of IOMMU space for this operation. This is very bad. + * Unfortunately the drivers cannot handle this operation properly. 
+ * unless they check for dma_mapping_error (most don''t) + * When the mapping is small enough return a static buffer to limit + * the damage, or panic when the transfer is too big. + */ + dev_err(dev, "DMA: Out of SW-IOMMU space for %zu bytes.", size); + + if (size <= io_tlb_overflow || !do_panic) + return; + + if (dir == DMA_BIDIRECTIONAL) + panic("DMA: Random memory could be DMA accessed\n"); + if (dir == DMA_FROM_DEVICE) + panic("DMA: Random memory could be DMA written\n"); + if (dir == DMA_TO_DEVICE) + panic("DMA: Random memory could be DMA read\n"); +} + +/* + * Map a single buffer of the indicated size for DMA in streaming mode. The + * physical address to use is returned. + * + * Once the device is given the dma address, the device owns this memory until + * either swiotlb_unmap_page or swiotlb_dma_sync_single is performed. + */ +dma_addr_t swiotlb_map_page(struct device *dev, struct page *page, + unsigned long offset, size_t size, + enum dma_data_direction dir, + struct dma_attrs *attrs) +{ + unsigned long start_dma_addr; + phys_addr_t phys = page_to_phys(page) + offset; + dma_addr_t dev_addr = phys_to_dma(dev, phys); + void *map; + + BUG_ON(dir == DMA_NONE); + /* + * If the address happens to be in the device''s DMA window, + * we can safely return the device addr and not worry about bounce + * buffering it. + */ + if (dma_capable(dev, dev_addr, size) && !swiotlb_force) + return dev_addr; + + /* + * Oh well, have to allocate and map a bounce buffer. + */ + start_dma_addr = swiotlb_virt_to_bus(dev, io_tlb_start); + map = do_map_single(dev, phys, start_dma_addr, size, dir); + if (!map) { + swiotlb_full(dev, size, dir, 1); + map = io_tlb_overflow_buffer; + } + + dev_addr = swiotlb_virt_to_bus(dev, map); + + /* + * Ensure that the address returned is DMA''ble + */ + if (!dma_capable(dev, dev_addr, size)) + panic("DMA: swiotlb_map_single: bounce buffer is not DMA''ble"); + + return dev_addr; +} +EXPORT_SYMBOL_GPL(swiotlb_map_page); + +/* + * Unmap a single streaming mode DMA translation. The dma_addr and size must + * match what was provided for in a previous swiotlb_map_page call. All + * other usages are undefined. + * + * After this call, reads by the cpu to the buffer are guaranteed to see + * whatever the device wrote there. + */ +static void unmap_single(struct device *hwdev, dma_addr_t dev_addr, + size_t size, int dir) +{ + phys_addr_t paddr = dma_to_phys(hwdev, dev_addr); + + BUG_ON(dir == DMA_NONE); + + if (is_swiotlb_buffer(paddr)) { + do_unmap_single(hwdev, phys_to_virt(paddr), size, dir); + return; + } + + if (dir != DMA_FROM_DEVICE) + return; + + /* + * phys_to_virt doesn''t work with hihgmem page but we could + * call dma_mark_clean() with hihgmem page here. However, we + * are fine since dma_mark_clean() is null on POWERPC. We can + * make dma_mark_clean() take a physical address if necessary. + */ + dma_mark_clean(phys_to_virt(paddr), size); +} + +void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr, + size_t size, enum dma_data_direction dir, + struct dma_attrs *attrs) +{ + unmap_single(hwdev, dev_addr, size, dir); +} +EXPORT_SYMBOL_GPL(swiotlb_unmap_page); + +/* + * Make physical memory consistent for a single streaming mode DMA translation + * after a transfer. + * + * If you perform a swiotlb_map_page() but wish to interrogate the buffer + * using the cpu, yet do not wish to teardown the dma mapping, you must + * call this function before doing so. 
At the next point you give the dma + * address back to the card, you must first perform a + * swiotlb_dma_sync_for_device, and then the device again owns the buffer + */ +static void +swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr, + size_t size, int dir, int target) +{ + phys_addr_t paddr = dma_to_phys(hwdev, dev_addr); + + BUG_ON(dir == DMA_NONE); + + if (is_swiotlb_buffer(paddr)) { + do_sync_single(hwdev, phys_to_virt(paddr), size, dir, target); + return; + } + + if (dir != DMA_FROM_DEVICE) + return; + + dma_mark_clean(phys_to_virt(paddr), size); +} + +void +swiotlb_sync_single_for_cpu(struct device *hwdev, dma_addr_t dev_addr, + size_t size, enum dma_data_direction dir) +{ + swiotlb_sync_single(hwdev, dev_addr, size, dir, SYNC_FOR_CPU); +} +EXPORT_SYMBOL(swiotlb_sync_single_for_cpu); + +void +swiotlb_sync_single_for_device(struct device *hwdev, dma_addr_t dev_addr, + size_t size, enum dma_data_direction dir) +{ + swiotlb_sync_single(hwdev, dev_addr, size, dir, SYNC_FOR_DEVICE); +} +EXPORT_SYMBOL(swiotlb_sync_single_for_device); + +/* + * Same as above, but for a sub-range of the mapping. + */ +static void +swiotlb_sync_single_range(struct device *hwdev, dma_addr_t dev_addr, + unsigned long offset, size_t size, + int dir, int target) +{ + swiotlb_sync_single(hwdev, dev_addr + offset, size, dir, target); +} + +void +swiotlb_sync_single_range_for_cpu(struct device *hwdev, dma_addr_t dev_addr, + unsigned long offset, size_t size, + enum dma_data_direction dir) +{ + swiotlb_sync_single_range(hwdev, dev_addr, offset, size, dir, + SYNC_FOR_CPU); +} +EXPORT_SYMBOL_GPL(swiotlb_sync_single_range_for_cpu); + +void +swiotlb_sync_single_range_for_device(struct device *hwdev, dma_addr_t dev_addr, + unsigned long offset, size_t size, + enum dma_data_direction dir) +{ + swiotlb_sync_single_range(hwdev, dev_addr, offset, size, dir, + SYNC_FOR_DEVICE); +} +EXPORT_SYMBOL_GPL(swiotlb_sync_single_range_for_device); + +/* + * Map a set of buffers described by scatterlist in streaming mode for DMA. + * This is the scatter-gather version of the above swiotlb_map_page + * interface. Here the scatter gather list elements are each tagged with the + * appropriate dma address and length. They are obtained via + * sg_dma_{address,length}(SG). + * + * NOTE: An implementation may be able to use a smaller number of + * DMA address/length pairs than there are SG table elements. + * (for example via virtual mapping capabilities) + * The routine returns the number of addr/length pairs actually + * used, at most nents. + * + * Device ownership issues as mentioned above for swiotlb_map_page are the + * same here. + */ +int +swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, int nelems, + enum dma_data_direction dir, struct dma_attrs *attrs) +{ + unsigned long start_dma_addr; + struct scatterlist *sg; + int i; + + BUG_ON(dir == DMA_NONE); + + start_dma_addr = swiotlb_virt_to_bus(hwdev, io_tlb_start); + for_each_sg(sgl, sg, nelems, i) { + phys_addr_t paddr = sg_phys(sg); + dma_addr_t dev_addr = phys_to_dma(hwdev, paddr); + + if (swiotlb_force || + !dma_capable(hwdev, dev_addr, sg->length)) { + void *map = do_map_single(hwdev, sg_phys(sg), + start_dma_addr, + sg->length, dir); + if (!map) { + /* Don''t panic here, we expect map_sg users + to do proper error handling. 
*/ + swiotlb_full(hwdev, sg->length, dir, 0); + swiotlb_unmap_sg_attrs(hwdev, sgl, i, dir, + attrs); + sgl[0].dma_length = 0; + return 0; + } + sg->dma_address = swiotlb_virt_to_bus(hwdev, map); + } else + sg->dma_address = dev_addr; + sg->dma_length = sg->length; + } + return nelems; +} +EXPORT_SYMBOL(swiotlb_map_sg_attrs); + +int +swiotlb_map_sg(struct device *hwdev, struct scatterlist *sgl, int nelems, + int dir) +{ + return swiotlb_map_sg_attrs(hwdev, sgl, nelems, dir, NULL); +} +EXPORT_SYMBOL(swiotlb_map_sg); + +/* + * Unmap a set of streaming mode DMA translations. Again, cpu read rules + * concerning calls here are the same as for swiotlb_unmap_page() above. + */ +void +swiotlb_unmap_sg_attrs(struct device *hwdev, struct scatterlist *sgl, + int nelems, enum dma_data_direction dir, + struct dma_attrs *attrs) +{ + struct scatterlist *sg; + int i; + + BUG_ON(dir == DMA_NONE); + + for_each_sg(sgl, sg, nelems, i) + unmap_single(hwdev, sg->dma_address, sg->dma_length, dir); + +} +EXPORT_SYMBOL(swiotlb_unmap_sg_attrs); + +void +swiotlb_unmap_sg(struct device *hwdev, struct scatterlist *sgl, int nelems, + int dir) +{ + return swiotlb_unmap_sg_attrs(hwdev, sgl, nelems, dir, NULL); +} +EXPORT_SYMBOL(swiotlb_unmap_sg); + +/* + * Make physical memory consistent for a set of streaming mode DMA translations + * after a transfer. + * + * The same as swiotlb_sync_single_* but for a scatter-gather list, same rules + * and usage. + */ +static void +swiotlb_sync_sg(struct device *hwdev, struct scatterlist *sgl, + int nelems, int dir, int target) +{ + struct scatterlist *sg; + int i; + + for_each_sg(sgl, sg, nelems, i) + swiotlb_sync_single(hwdev, sg->dma_address, + sg->dma_length, dir, target); +} + +void +swiotlb_sync_sg_for_cpu(struct device *hwdev, struct scatterlist *sg, + int nelems, enum dma_data_direction dir) +{ + swiotlb_sync_sg(hwdev, sg, nelems, dir, SYNC_FOR_CPU); +} +EXPORT_SYMBOL(swiotlb_sync_sg_for_cpu); + +void +swiotlb_sync_sg_for_device(struct device *hwdev, struct scatterlist *sg, + int nelems, enum dma_data_direction dir) +{ + swiotlb_sync_sg(hwdev, sg, nelems, dir, SYNC_FOR_DEVICE); +} +EXPORT_SYMBOL(swiotlb_sync_sg_for_device); + +int +swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t dma_addr) +{ + return (dma_addr == swiotlb_virt_to_bus(hwdev, io_tlb_overflow_buffer)); +} +EXPORT_SYMBOL(swiotlb_dma_mapping_error); + +/* + * Return whether the given device DMA address mask can be supported + * properly. For example, if your device can only drive the low 24-bits + * during bus mastering, then you would pass 0x00ffffff as the mask to + * this function. + */ +int +swiotlb_dma_supported(struct device *hwdev, u64 mask) +{ + return swiotlb_virt_to_bus(hwdev, io_tlb_end - 1) <= mask; +} +EXPORT_SYMBOL(swiotlb_dma_supported); diff --git a/lib/swiotlb.c b/lib/swiotlb.c deleted file mode 100644 index c982d33..0000000 --- a/lib/swiotlb.c +++ /dev/null @@ -1,957 +0,0 @@ -/* - * Dynamic DMA mapping support. - * - * This implementation is a fallback for platforms that do not support - * I/O TLBs (aka DMA address translation hardware). - * Copyright (C) 2000 Asit Mallick <Asit.K.Mallick@intel.com> - * Copyright (C) 2000 Goutham Rao <goutham.rao@intel.com> - * Copyright (C) 2000, 2003 Hewlett-Packard Co - * David Mosberger-Tang <davidm@hpl.hp.com> - * - * 03/05/07 davidm Switch from PCI-DMA to generic device DMA API. - * 00/12/13 davidm Rename to swiotlb.c and add mark_clean() to avoid - * unnecessary i-cache flushing. - * 04/07/.. ak Better overflow handling. Assorted fixes. 
- * 05/09/10 linville Add support for syncing ranges, support syncing for - * DMA_BIDIRECTIONAL mappings, miscellaneous cleanup. - * 08/12/11 beckyb Add highmem support - */ - -#include <linux/cache.h> -#include <linux/dma-mapping.h> -#include <linux/mm.h> -#include <linux/module.h> -#include <linux/spinlock.h> -#include <linux/string.h> -#include <linux/swiotlb.h> -#include <linux/pfn.h> -#include <linux/types.h> -#include <linux/ctype.h> -#include <linux/highmem.h> - -#include <linux/io.h> -#include <asm/dma.h> -#include <linux/scatterlist.h> - -#include <linux/init.h> -#include <linux/bootmem.h> -#include <linux/iommu-helper.h> - -#define OFFSET(val, align) ((unsigned long) ((val) & ((align) - 1))) - -#define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT)) - -/* - * Minimum IO TLB size to bother booting with. Systems with mainly - * 64bit capable cards will only lightly use the swiotlb. If we can''t - * allocate a contiguous 1MB, we''re probably in trouble anyway. - */ -#define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT) - -int swiotlb_force; - -/* - * Used to do a quick range check in do_unmap_single and - * do_sync_single_*, to see if the memory was in fact allocated by this - * API. - */ -char *io_tlb_start, *io_tlb_end; - -/* - * The number of IO TLB blocks (in groups of 64) betweeen io_tlb_start and - * io_tlb_end. This is command line adjustable via setup_io_tlb_npages. - */ -unsigned long io_tlb_nslabs; - -/* - * When the IOMMU overflows we return a fallback buffer. This sets the size. - */ -unsigned long io_tlb_overflow = 32*1024; - -void *io_tlb_overflow_buffer; - -/* - * This is a free list describing the number of free entries available from - * each index - */ -static unsigned int *io_tlb_list; -static unsigned int io_tlb_index; - -/* - * We need to save away the original address corresponding to a mapped entry - * for the sync operations. 
- */ -static phys_addr_t *io_tlb_orig_addr; - -/* - * Protect the above data structures in the map and unmap calls - */ -static DEFINE_SPINLOCK(io_tlb_lock); - -static int late_alloc; - -static int __init -setup_io_tlb_npages(char *str) -{ - int get_value(const char *token, char *str, char **endp) - { - ssize_t len; - int val = 0; - - len = strlen(token); - if (!strncmp(str, token, len)) { - str += len; - if (*str == ''='') - ++str; - if (*str != ''\0'') - val = simple_strtoul(str, endp, 0); - } - *endp = str; - return val; - } - - int val; - - while (*str) { - /* The old syntax */ - if (isdigit(*str)) { - io_tlb_nslabs = simple_strtoul(str, &str, 0); - /* avoid tail segment of size < IO_TLB_SEGSIZE */ - io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE); - } - if (!strncmp(str, "force", 5)) - swiotlb_force = 1; - /* The new syntax: swiotlb=nslabs=16384,overflow=32768,force */ - val = get_value("nslabs", str, &str); - if (val) - io_tlb_nslabs = ALIGN(val, IO_TLB_SEGSIZE); - - val = get_value("overflow", str, &str); - if (val) - io_tlb_overflow = val; - str = strpbrk(str, ","); - if (!str) - break; - str++; /* skip '','' */ - } - return 1; -} -__setup("swiotlb=", setup_io_tlb_npages); - -/* Note that this doesn''t work with highmem page */ -static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev, - volatile void *address) -{ - return phys_to_dma(hwdev, virt_to_phys(address)); -} - -void swiotlb_print_info(void) -{ - unsigned long bytes = io_tlb_nslabs << IO_TLB_SHIFT; - phys_addr_t pstart, pend; - - pstart = virt_to_phys(io_tlb_start); - pend = virt_to_phys(io_tlb_end); - - printk(KERN_INFO "DMA: Placing %luMB software IO TLB between %p - %p\n", - bytes >> 20, io_tlb_start, io_tlb_end); - printk(KERN_INFO "DMA: software IO TLB at phys %#llx - %#llx\n", - (unsigned long long)pstart, - (unsigned long long)pend); -} - -/* - * Statically reserve bounce buffer space and initialize bounce buffer data - * structures for the software IO TLB used to implement the DMA API. - */ -void __init -swiotlb_init_early(size_t default_size, int verbose) -{ - unsigned long i, bytes; - - if (!io_tlb_nslabs) { - io_tlb_nslabs = (default_size >> IO_TLB_SHIFT); - io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE); - } - - bytes = io_tlb_nslabs << IO_TLB_SHIFT; - - /* - * Get IO TLB memory from the low pages - */ - io_tlb_start = alloc_bootmem_low_pages(bytes); - if (!io_tlb_start) - panic("DMA: Cannot allocate SWIOTLB buffer"); - io_tlb_end = io_tlb_start + bytes; - - /* - * Allocate and initialize the free list array. This array is used - * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE - * between io_tlb_start and io_tlb_end. - */ - io_tlb_list = alloc_bootmem(io_tlb_nslabs * sizeof(int)); - for (i = 0; i < io_tlb_nslabs; i++) - io_tlb_list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE); - io_tlb_index = 0; - io_tlb_orig_addr = alloc_bootmem(io_tlb_nslabs * sizeof(phys_addr_t)); - - /* - * Get the overflow emergency buffer - */ - io_tlb_overflow_buffer = alloc_bootmem_low(io_tlb_overflow); - if (!io_tlb_overflow_buffer) - panic("DMA: Cannot allocate SWIOTLB overflow buffer!\n"); - if (verbose) - swiotlb_print_info(); -} - -void __init -swiotlb_init(int verbose) -{ - swiotlb_init_early(64 * (1<<20), verbose); /* default to 64MB */ -} - -/* - * Systems with larger DMA zones (those that don''t support ISA) can - * initialize the swiotlb later using the slab allocator if needed. - * This should be just like above, but with some error catching. 
- */ -int -swiotlb_init_late(size_t default_size) -{ - unsigned long i, bytes, req_nslabs = io_tlb_nslabs; - unsigned int order; - - if (!io_tlb_nslabs) { - io_tlb_nslabs = (default_size >> IO_TLB_SHIFT); - io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE); - } - - /* - * Get IO TLB memory from the low pages - */ - order = get_order(io_tlb_nslabs << IO_TLB_SHIFT); - io_tlb_nslabs = SLABS_PER_PAGE << order; - bytes = io_tlb_nslabs << IO_TLB_SHIFT; - - while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) { - io_tlb_start = (void *)__get_free_pages(GFP_DMA | __GFP_NOWARN, - order); - if (io_tlb_start) - break; - order--; - } - - if (!io_tlb_start) - goto cleanup1; - - if (order != get_order(bytes)) { - printk(KERN_WARNING "DMA: Warning: only able to allocate %ld MB" - " for software IO TLB\n", (PAGE_SIZE << order) >> 20); - io_tlb_nslabs = SLABS_PER_PAGE << order; - bytes = io_tlb_nslabs << IO_TLB_SHIFT; - } - io_tlb_end = io_tlb_start + bytes; - memset(io_tlb_start, 0, bytes); - - /* - * Allocate and initialize the free list array. This array is used - * to find contiguous free memory regions of size up to IO_TLB_SEGSIZE - * between io_tlb_start and io_tlb_end. - */ - io_tlb_list = (unsigned int *)__get_free_pages(GFP_KERNEL, - get_order(io_tlb_nslabs * sizeof(int))); - if (!io_tlb_list) - goto cleanup2; - - for (i = 0; i < io_tlb_nslabs; i++) - io_tlb_list[i] = IO_TLB_SEGSIZE - OFFSET(i, IO_TLB_SEGSIZE); - io_tlb_index = 0; - - io_tlb_orig_addr = (phys_addr_t *) __get_free_pages(GFP_KERNEL, - get_order(io_tlb_nslabs * sizeof(phys_addr_t))); - if (!io_tlb_orig_addr) - goto cleanup3; - - memset(io_tlb_orig_addr, 0, io_tlb_nslabs * sizeof(phys_addr_t)); - - /* - * Get the overflow emergency buffer - */ - io_tlb_overflow_buffer = (void *)__get_free_pages(GFP_DMA, - get_order(io_tlb_overflow)); - if (!io_tlb_overflow_buffer) - goto cleanup4; - - swiotlb_print_info(); - - late_alloc = 1; - - return 0; - -cleanup4: - free_pages((unsigned long)io_tlb_orig_addr, - get_order(io_tlb_nslabs * sizeof(phys_addr_t))); - io_tlb_orig_addr = NULL; -cleanup3: - free_pages((unsigned long)io_tlb_list, - get_order(io_tlb_nslabs * sizeof(int))); - io_tlb_list = NULL; -cleanup2: - io_tlb_end = NULL; - free_pages((unsigned long)io_tlb_start, order); - io_tlb_start = NULL; -cleanup1: - io_tlb_nslabs = req_nslabs; - return -ENOMEM; -} - -void __init swiotlb_free(void) -{ - if (!io_tlb_overflow_buffer) - return; - - if (late_alloc) { - free_pages((unsigned long)io_tlb_overflow_buffer, - get_order(io_tlb_overflow)); - free_pages((unsigned long)io_tlb_orig_addr, - get_order(io_tlb_nslabs * sizeof(phys_addr_t))); - free_pages((unsigned long)io_tlb_list, get_order(io_tlb_nslabs * - sizeof(int))); - free_pages((unsigned long)io_tlb_start, - get_order(io_tlb_nslabs << IO_TLB_SHIFT)); - } else { - free_bootmem_late(__pa(io_tlb_overflow_buffer), - io_tlb_overflow); - free_bootmem_late(__pa(io_tlb_orig_addr), - io_tlb_nslabs * sizeof(phys_addr_t)); - free_bootmem_late(__pa(io_tlb_list), - io_tlb_nslabs * sizeof(int)); - free_bootmem_late(__pa(io_tlb_start), - io_tlb_nslabs << IO_TLB_SHIFT); - } -} - -int is_swiotlb_buffer(phys_addr_t paddr) -{ - return paddr >= virt_to_phys(io_tlb_start) && - paddr < virt_to_phys(io_tlb_end); -} - -/* - * Bounce: copy the swiotlb buffer back to the original dma location - */ -void swiotlb_bounce(phys_addr_t phys, char *dma_addr, size_t size, - enum dma_data_direction dir) -{ - unsigned long pfn = PFN_DOWN(phys); - - if (PageHighMem(pfn_to_page(pfn))) { - /* The buffer does not have a 
mapping. Map it in and copy */ - unsigned int offset = phys & ~PAGE_MASK; - char *buffer; - unsigned int sz = 0; - unsigned long flags; - - while (size) { - sz = min_t(size_t, PAGE_SIZE - offset, size); - - local_irq_save(flags); - buffer = kmap_atomic(pfn_to_page(pfn), - KM_BOUNCE_READ); - if (dir == DMA_TO_DEVICE) - memcpy(dma_addr, buffer + offset, sz); - else - memcpy(buffer + offset, dma_addr, sz); - kunmap_atomic(buffer, KM_BOUNCE_READ); - local_irq_restore(flags); - - size -= sz; - pfn++; - dma_addr += sz; - offset = 0; - } - } else { - if (dir == DMA_TO_DEVICE) - memcpy(dma_addr, phys_to_virt(phys), size); - else - memcpy(phys_to_virt(phys), dma_addr, size); - } -} - -/* - * Allocates bounce buffer and returns its kernel virtual address. - */ -void * -do_map_single(struct device *hwdev, phys_addr_t phys, - unsigned long start_dma_addr, size_t size, int dir) -{ - unsigned long flags; - char *dma_addr; - unsigned int nslots, stride, index, wrap; - int i; - unsigned long mask; - unsigned long offset_slots; - unsigned long max_slots; - - mask = dma_get_seg_boundary(hwdev); - start_dma_addr = start_dma_addr & mask; - offset_slots = ALIGN(start_dma_addr, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT; - - /* - * Carefully handle integer overflow which can occur when mask == ~0UL. - */ - max_slots = mask + 1 - ? ALIGN(mask + 1, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT - : 1UL << (BITS_PER_LONG - IO_TLB_SHIFT); - - /* - * For mappings greater than a page, we limit the stride (and - * hence alignment) to a page size. - */ - nslots = ALIGN(size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT; - if (size > PAGE_SIZE) - stride = (1 << (PAGE_SHIFT - IO_TLB_SHIFT)); - else - stride = 1; - - BUG_ON(!nslots); - - /* - * Find suitable number of IO TLB entries size that will fit this - * request and allocate a buffer from that IO TLB pool. - */ - spin_lock_irqsave(&io_tlb_lock, flags); - index = ALIGN(io_tlb_index, stride); - if (index >= io_tlb_nslabs) - index = 0; - wrap = index; - - do { - while (iommu_is_span_boundary(index, nslots, offset_slots, - max_slots)) { - index += stride; - if (index >= io_tlb_nslabs) - index = 0; - if (index == wrap) - goto not_found; - } - - /* - * If we find a slot that indicates we have ''nslots'' number of - * contiguous buffers, we allocate the buffers from that slot - * and mark the entries as ''0'' indicating unavailable. - */ - if (io_tlb_list[index] >= nslots) { - int count = 0; - - for (i = index; i < (int) (index + nslots); i++) - io_tlb_list[i] = 0; - for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) - != IO_TLB_SEGSIZE - 1) && io_tlb_list[i]; i--) - io_tlb_list[i] = ++count; - dma_addr = io_tlb_start + (index << IO_TLB_SHIFT); - - /* - * Update the indices to avoid searching in the next - * round. - */ - io_tlb_index = ((index + nslots) < io_tlb_nslabs - ? (index + nslots) : 0); - - goto found; - } - index += stride; - if (index >= io_tlb_nslabs) - index = 0; - } while (index != wrap); - -not_found: - spin_unlock_irqrestore(&io_tlb_lock, flags); - return NULL; -found: - spin_unlock_irqrestore(&io_tlb_lock, flags); - - /* - * Save away the mapping from the original address to the DMA address. - * This is needed when we sync the memory. Then we sync the buffer if - * needed. - */ - for (i = 0; i < nslots; i++) - io_tlb_orig_addr[index+i] = phys + (i << IO_TLB_SHIFT); - if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL) - swiotlb_bounce(phys, dma_addr, size, DMA_TO_DEVICE); - - return dma_addr; -} - -/* - * dma_addr is the kernel virtual address of the bounce buffer to unmap. 
- */ -void -do_unmap_single(struct device *hwdev, char *dma_addr, size_t size, int dir) -{ - unsigned long flags; - int i, count, nslots = ALIGN(size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT; - int index = (dma_addr - io_tlb_start) >> IO_TLB_SHIFT; - phys_addr_t phys = io_tlb_orig_addr[index]; - - /* - * First, sync the memory before unmapping the entry - */ - if (phys && ((dir == DMA_FROM_DEVICE) || (dir == DMA_BIDIRECTIONAL))) - swiotlb_bounce(phys, dma_addr, size, DMA_FROM_DEVICE); - - /* - * Return the buffer to the free list by setting the corresponding - * entries to indicate the number of contiguous entries available. - * While returning the entries to the free list, we merge the entries - * with slots below and above the pool being returned. - */ - spin_lock_irqsave(&io_tlb_lock, flags); - { - count = ((index + nslots) < ALIGN(index + 1, IO_TLB_SEGSIZE) ? - io_tlb_list[index + nslots] : 0); - /* - * Step 1: return the slots to the free list, merging the - * slots with superceeding slots - */ - for (i = index + nslots - 1; i >= index; i--) - io_tlb_list[i] = ++count; - /* - * Step 2: merge the returned slots with the preceding slots, - * if available (non zero) - */ - for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) !- IO_TLB_SEGSIZE - 1) && io_tlb_list[i]; i--) - io_tlb_list[i] = ++count; - } - spin_unlock_irqrestore(&io_tlb_lock, flags); -} - -void -do_sync_single(struct device *hwdev, char *dma_addr, size_t size, - int dir, int target) -{ - int index = (dma_addr - io_tlb_start) >> IO_TLB_SHIFT; - phys_addr_t phys = io_tlb_orig_addr[index]; - - phys += ((unsigned long)dma_addr & ((1 << IO_TLB_SHIFT) - 1)); - - switch (target) { - case SYNC_FOR_CPU: - if (likely(dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL)) - swiotlb_bounce(phys, dma_addr, size, DMA_FROM_DEVICE); - else - BUG_ON(dir != DMA_TO_DEVICE); - break; - case SYNC_FOR_DEVICE: - if (likely(dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL)) - swiotlb_bounce(phys, dma_addr, size, DMA_TO_DEVICE); - else - BUG_ON(dir != DMA_FROM_DEVICE); - break; - default: - BUG(); - } -} - -void * -swiotlb_alloc_coherent(struct device *hwdev, size_t size, - dma_addr_t *dma_handle, gfp_t flags) -{ - dma_addr_t dev_addr; - void *ret; - int order = get_order(size); - u64 dma_mask = DMA_BIT_MASK(32); - unsigned long start_dma_addr; - - if (hwdev && hwdev->coherent_dma_mask) - dma_mask = hwdev->coherent_dma_mask; - - ret = (void *)__get_free_pages(flags, order); - if (ret && swiotlb_virt_to_bus(hwdev, ret) + size - 1 > dma_mask) { - /* - * The allocated memory isn''t reachable by the device. - */ - free_pages((unsigned long) ret, order); - ret = NULL; - } - if (!ret) { - /* - * We are either out of memory or the device can''t DMA - * to GFP_DMA memory; fall back on do_map_single(), which - * will grab memory from the lowest available address range. 
- */ - start_dma_addr = swiotlb_virt_to_bus(hwdev, io_tlb_start); - ret = do_map_single(hwdev, 0, start_dma_addr, size, - DMA_FROM_DEVICE); - if (!ret) - return NULL; - } - - memset(ret, 0, size); - dev_addr = swiotlb_virt_to_bus(hwdev, ret); - - /* Confirm address can be DMA''d by device */ - if (dev_addr + size - 1 > dma_mask) { - dev_err(hwdev, "DMA: hwdev DMA mask = 0x%016Lx, " \ - "dev_addr = 0x%016Lx\n", - (unsigned long long)dma_mask, - (unsigned long long)dev_addr); - - /* DMA_TO_DEVICE to avoid memcpy in do_unmap_single */ - do_unmap_single(hwdev, ret, size, DMA_TO_DEVICE); - return NULL; - } - *dma_handle = dev_addr; - return ret; -} -EXPORT_SYMBOL(swiotlb_alloc_coherent); - -void -swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr, - dma_addr_t dev_addr) -{ - phys_addr_t paddr = dma_to_phys(hwdev, dev_addr); - - WARN_ON(irqs_disabled()); - if (!is_swiotlb_buffer(paddr)) - free_pages((unsigned long)vaddr, get_order(size)); - else - /* DMA_TO_DEVICE to avoid memcpy in do_unmap_single */ - do_unmap_single(hwdev, vaddr, size, DMA_TO_DEVICE); -} -EXPORT_SYMBOL(swiotlb_free_coherent); - -static void -swiotlb_full(struct device *dev, size_t size, int dir, int do_panic) -{ - /* - * Ran out of IOMMU space for this operation. This is very bad. - * Unfortunately the drivers cannot handle this operation properly. - * unless they check for dma_mapping_error (most don''t) - * When the mapping is small enough return a static buffer to limit - * the damage, or panic when the transfer is too big. - */ - dev_err(dev, "DMA: Out of SW-IOMMU space for %zu bytes.", size); - - if (size <= io_tlb_overflow || !do_panic) - return; - - if (dir == DMA_BIDIRECTIONAL) - panic("DMA: Random memory could be DMA accessed\n"); - if (dir == DMA_FROM_DEVICE) - panic("DMA: Random memory could be DMA written\n"); - if (dir == DMA_TO_DEVICE) - panic("DMA: Random memory could be DMA read\n"); -} - -/* - * Map a single buffer of the indicated size for DMA in streaming mode. The - * physical address to use is returned. - * - * Once the device is given the dma address, the device owns this memory until - * either swiotlb_unmap_page or swiotlb_dma_sync_single is performed. - */ -dma_addr_t swiotlb_map_page(struct device *dev, struct page *page, - unsigned long offset, size_t size, - enum dma_data_direction dir, - struct dma_attrs *attrs) -{ - unsigned long start_dma_addr; - phys_addr_t phys = page_to_phys(page) + offset; - dma_addr_t dev_addr = phys_to_dma(dev, phys); - void *map; - - BUG_ON(dir == DMA_NONE); - /* - * If the address happens to be in the device''s DMA window, - * we can safely return the device addr and not worry about bounce - * buffering it. - */ - if (dma_capable(dev, dev_addr, size) && !swiotlb_force) - return dev_addr; - - /* - * Oh well, have to allocate and map a bounce buffer. - */ - start_dma_addr = swiotlb_virt_to_bus(dev, io_tlb_start); - map = do_map_single(dev, phys, start_dma_addr, size, dir); - if (!map) { - swiotlb_full(dev, size, dir, 1); - map = io_tlb_overflow_buffer; - } - - dev_addr = swiotlb_virt_to_bus(dev, map); - - /* - * Ensure that the address returned is DMA''ble - */ - if (!dma_capable(dev, dev_addr, size)) - panic("DMA: swiotlb_map_single: bounce buffer is not DMA''ble"); - - return dev_addr; -} -EXPORT_SYMBOL_GPL(swiotlb_map_page); - -/* - * Unmap a single streaming mode DMA translation. The dma_addr and size must - * match what was provided for in a previous swiotlb_map_page call. All - * other usages are undefined. 
- * - * After this call, reads by the cpu to the buffer are guaranteed to see - * whatever the device wrote there. - */ -static void unmap_single(struct device *hwdev, dma_addr_t dev_addr, - size_t size, int dir) -{ - phys_addr_t paddr = dma_to_phys(hwdev, dev_addr); - - BUG_ON(dir == DMA_NONE); - - if (is_swiotlb_buffer(paddr)) { - do_unmap_single(hwdev, phys_to_virt(paddr), size, dir); - return; - } - - if (dir != DMA_FROM_DEVICE) - return; - - /* - * phys_to_virt doesn''t work with hihgmem page but we could - * call dma_mark_clean() with hihgmem page here. However, we - * are fine since dma_mark_clean() is null on POWERPC. We can - * make dma_mark_clean() take a physical address if necessary. - */ - dma_mark_clean(phys_to_virt(paddr), size); -} - -void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr, - size_t size, enum dma_data_direction dir, - struct dma_attrs *attrs) -{ - unmap_single(hwdev, dev_addr, size, dir); -} -EXPORT_SYMBOL_GPL(swiotlb_unmap_page); - -/* - * Make physical memory consistent for a single streaming mode DMA translation - * after a transfer. - * - * If you perform a swiotlb_map_page() but wish to interrogate the buffer - * using the cpu, yet do not wish to teardown the dma mapping, you must - * call this function before doing so. At the next point you give the dma - * address back to the card, you must first perform a - * swiotlb_dma_sync_for_device, and then the device again owns the buffer - */ -static void -swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr, - size_t size, int dir, int target) -{ - phys_addr_t paddr = dma_to_phys(hwdev, dev_addr); - - BUG_ON(dir == DMA_NONE); - - if (is_swiotlb_buffer(paddr)) { - do_sync_single(hwdev, phys_to_virt(paddr), size, dir, target); - return; - } - - if (dir != DMA_FROM_DEVICE) - return; - - dma_mark_clean(phys_to_virt(paddr), size); -} - -void -swiotlb_sync_single_for_cpu(struct device *hwdev, dma_addr_t dev_addr, - size_t size, enum dma_data_direction dir) -{ - swiotlb_sync_single(hwdev, dev_addr, size, dir, SYNC_FOR_CPU); -} -EXPORT_SYMBOL(swiotlb_sync_single_for_cpu); - -void -swiotlb_sync_single_for_device(struct device *hwdev, dma_addr_t dev_addr, - size_t size, enum dma_data_direction dir) -{ - swiotlb_sync_single(hwdev, dev_addr, size, dir, SYNC_FOR_DEVICE); -} -EXPORT_SYMBOL(swiotlb_sync_single_for_device); - -/* - * Same as above, but for a sub-range of the mapping. - */ -static void -swiotlb_sync_single_range(struct device *hwdev, dma_addr_t dev_addr, - unsigned long offset, size_t size, - int dir, int target) -{ - swiotlb_sync_single(hwdev, dev_addr + offset, size, dir, target); -} - -void -swiotlb_sync_single_range_for_cpu(struct device *hwdev, dma_addr_t dev_addr, - unsigned long offset, size_t size, - enum dma_data_direction dir) -{ - swiotlb_sync_single_range(hwdev, dev_addr, offset, size, dir, - SYNC_FOR_CPU); -} -EXPORT_SYMBOL_GPL(swiotlb_sync_single_range_for_cpu); - -void -swiotlb_sync_single_range_for_device(struct device *hwdev, dma_addr_t dev_addr, - unsigned long offset, size_t size, - enum dma_data_direction dir) -{ - swiotlb_sync_single_range(hwdev, dev_addr, offset, size, dir, - SYNC_FOR_DEVICE); -} -EXPORT_SYMBOL_GPL(swiotlb_sync_single_range_for_device); - -/* - * Map a set of buffers described by scatterlist in streaming mode for DMA. - * This is the scatter-gather version of the above swiotlb_map_page - * interface. Here the scatter gather list elements are each tagged with the - * appropriate dma address and length. 
They are obtained via - * sg_dma_{address,length}(SG). - * - * NOTE: An implementation may be able to use a smaller number of - * DMA address/length pairs than there are SG table elements. - * (for example via virtual mapping capabilities) - * The routine returns the number of addr/length pairs actually - * used, at most nents. - * - * Device ownership issues as mentioned above for swiotlb_map_page are the - * same here. - */ -int -swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, int nelems, - enum dma_data_direction dir, struct dma_attrs *attrs) -{ - unsigned long start_dma_addr; - struct scatterlist *sg; - int i; - - BUG_ON(dir == DMA_NONE); - - start_dma_addr = swiotlb_virt_to_bus(hwdev, io_tlb_start); - for_each_sg(sgl, sg, nelems, i) { - phys_addr_t paddr = sg_phys(sg); - dma_addr_t dev_addr = phys_to_dma(hwdev, paddr); - - if (swiotlb_force || - !dma_capable(hwdev, dev_addr, sg->length)) { - void *map = do_map_single(hwdev, sg_phys(sg), - start_dma_addr, - sg->length, dir); - if (!map) { - /* Don''t panic here, we expect map_sg users - to do proper error handling. */ - swiotlb_full(hwdev, sg->length, dir, 0); - swiotlb_unmap_sg_attrs(hwdev, sgl, i, dir, - attrs); - sgl[0].dma_length = 0; - return 0; - } - sg->dma_address = swiotlb_virt_to_bus(hwdev, map); - } else - sg->dma_address = dev_addr; - sg->dma_length = sg->length; - } - return nelems; -} -EXPORT_SYMBOL(swiotlb_map_sg_attrs); - -int -swiotlb_map_sg(struct device *hwdev, struct scatterlist *sgl, int nelems, - int dir) -{ - return swiotlb_map_sg_attrs(hwdev, sgl, nelems, dir, NULL); -} -EXPORT_SYMBOL(swiotlb_map_sg); - -/* - * Unmap a set of streaming mode DMA translations. Again, cpu read rules - * concerning calls here are the same as for swiotlb_unmap_page() above. - */ -void -swiotlb_unmap_sg_attrs(struct device *hwdev, struct scatterlist *sgl, - int nelems, enum dma_data_direction dir, - struct dma_attrs *attrs) -{ - struct scatterlist *sg; - int i; - - BUG_ON(dir == DMA_NONE); - - for_each_sg(sgl, sg, nelems, i) - unmap_single(hwdev, sg->dma_address, sg->dma_length, dir); - -} -EXPORT_SYMBOL(swiotlb_unmap_sg_attrs); - -void -swiotlb_unmap_sg(struct device *hwdev, struct scatterlist *sgl, int nelems, - int dir) -{ - return swiotlb_unmap_sg_attrs(hwdev, sgl, nelems, dir, NULL); -} -EXPORT_SYMBOL(swiotlb_unmap_sg); - -/* - * Make physical memory consistent for a set of streaming mode DMA translations - * after a transfer. - * - * The same as swiotlb_sync_single_* but for a scatter-gather list, same rules - * and usage. 
- */ -static void -swiotlb_sync_sg(struct device *hwdev, struct scatterlist *sgl, - int nelems, int dir, int target) -{ - struct scatterlist *sg; - int i; - - for_each_sg(sgl, sg, nelems, i) - swiotlb_sync_single(hwdev, sg->dma_address, - sg->dma_length, dir, target); -} - -void -swiotlb_sync_sg_for_cpu(struct device *hwdev, struct scatterlist *sg, - int nelems, enum dma_data_direction dir) -{ - swiotlb_sync_sg(hwdev, sg, nelems, dir, SYNC_FOR_CPU); -} -EXPORT_SYMBOL(swiotlb_sync_sg_for_cpu); - -void -swiotlb_sync_sg_for_device(struct device *hwdev, struct scatterlist *sg, - int nelems, enum dma_data_direction dir) -{ - swiotlb_sync_sg(hwdev, sg, nelems, dir, SYNC_FOR_DEVICE); -} -EXPORT_SYMBOL(swiotlb_sync_sg_for_device); - -int -swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t dma_addr) -{ - return (dma_addr == swiotlb_virt_to_bus(hwdev, io_tlb_overflow_buffer)); -} -EXPORT_SYMBOL(swiotlb_dma_mapping_error); - -/* - * Return whether the given device DMA address mask can be supported - * properly. For example, if your device can only drive the low 24-bits - * during bus mastering, then you would pass 0x00ffffff as the mask to - * this function. - */ -int -swiotlb_dma_supported(struct device *hwdev, u64 mask) -{ - return swiotlb_virt_to_bus(hwdev, io_tlb_end - 1) <= mask; -} -EXPORT_SYMBOL(swiotlb_dma_supported); -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
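For readers coming at this from the driver side, the ownership rules spelled out in the swiotlb_map_page and swiotlb_sync_single comments above boil down to the usual streaming-DMA pattern. Below is a minimal, hypothetical driver-side sketch (the device, buffer and length names are illustrative only and are not part of the patch); it assumes the generic dma_* wrappers of this kernel era, which end up in the swiotlb paths on machines without a hardware IOMMU.

#include <linux/dma-mapping.h>

/* Hypothetical driver round trip through the streaming DMA API that the
 * SWIOTLB backs: map, hand the address to the device, sync before the
 * CPU touches the data, unmap when done. */
static int example_dma_from_device(struct device *dev, void *buf, size_t len)
{
	dma_addr_t handle;

	/* The device owns the buffer once the mapping is handed out. */
	handle = dma_map_single(dev, buf, len, DMA_FROM_DEVICE);
	if (dma_mapping_error(dev, handle))
		return -ENOMEM;	/* e.g. the map fell back to the overflow buffer */

	/* ... program the device with 'handle' and wait for the transfer ... */

	/* Give ownership back to the CPU before reading the data. */
	dma_sync_single_for_cpu(dev, handle, len, DMA_FROM_DEVICE);

	/* ... inspect buf; use dma_sync_single_for_device() before reuse ... */

	dma_unmap_single(dev, handle, len, DMA_FROM_DEVICE);
	return 0;
}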
Konrad Rzeszutek Wilk
2010-Feb-03 17:08 UTC
[Xen-devel] [PATCH 11/11] [swiotlb] move dma_ops functions to swiotlb.c.
From: Konrad Rzeszutek <konrad@t42p-lan.dumpdata.com> In essence, leave in swiotlb-core.c functions dealing with the bookkeeping of the IOMMU. And functions which are declared in dma_ops structures are moved over to swiotlb.c. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- lib/swiotlb-core.c | 385 --------------------------------------------------- lib/swiotlb.c | 391 ++++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 391 insertions(+), 385 deletions(-) create mode 100644 lib/swiotlb.c diff --git a/lib/swiotlb-core.c b/lib/swiotlb-core.c index c982d33..2534d6d 100644 --- a/lib/swiotlb-core.c +++ b/lib/swiotlb-core.c @@ -138,13 +138,6 @@ setup_io_tlb_npages(char *str) } __setup("swiotlb=", setup_io_tlb_npages); -/* Note that this doesn''t work with highmem page */ -static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev, - volatile void *address) -{ - return phys_to_dma(hwdev, virt_to_phys(address)); -} - void swiotlb_print_info(void) { unsigned long bytes = io_tlb_nslabs << IO_TLB_SHIFT; @@ -555,76 +548,7 @@ do_sync_single(struct device *hwdev, char *dma_addr, size_t size, BUG(); } } - -void * -swiotlb_alloc_coherent(struct device *hwdev, size_t size, - dma_addr_t *dma_handle, gfp_t flags) -{ - dma_addr_t dev_addr; - void *ret; - int order = get_order(size); - u64 dma_mask = DMA_BIT_MASK(32); - unsigned long start_dma_addr; - - if (hwdev && hwdev->coherent_dma_mask) - dma_mask = hwdev->coherent_dma_mask; - - ret = (void *)__get_free_pages(flags, order); - if (ret && swiotlb_virt_to_bus(hwdev, ret) + size - 1 > dma_mask) { - /* - * The allocated memory isn''t reachable by the device. - */ - free_pages((unsigned long) ret, order); - ret = NULL; - } - if (!ret) { - /* - * We are either out of memory or the device can''t DMA - * to GFP_DMA memory; fall back on do_map_single(), which - * will grab memory from the lowest available address range. - */ - start_dma_addr = swiotlb_virt_to_bus(hwdev, io_tlb_start); - ret = do_map_single(hwdev, 0, start_dma_addr, size, - DMA_FROM_DEVICE); - if (!ret) - return NULL; - } - - memset(ret, 0, size); - dev_addr = swiotlb_virt_to_bus(hwdev, ret); - - /* Confirm address can be DMA''d by device */ - if (dev_addr + size - 1 > dma_mask) { - dev_err(hwdev, "DMA: hwdev DMA mask = 0x%016Lx, " \ - "dev_addr = 0x%016Lx\n", - (unsigned long long)dma_mask, - (unsigned long long)dev_addr); - - /* DMA_TO_DEVICE to avoid memcpy in do_unmap_single */ - do_unmap_single(hwdev, ret, size, DMA_TO_DEVICE); - return NULL; - } - *dma_handle = dev_addr; - return ret; -} -EXPORT_SYMBOL(swiotlb_alloc_coherent); - void -swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr, - dma_addr_t dev_addr) -{ - phys_addr_t paddr = dma_to_phys(hwdev, dev_addr); - - WARN_ON(irqs_disabled()); - if (!is_swiotlb_buffer(paddr)) - free_pages((unsigned long)vaddr, get_order(size)); - else - /* DMA_TO_DEVICE to avoid memcpy in do_unmap_single */ - do_unmap_single(hwdev, vaddr, size, DMA_TO_DEVICE); -} -EXPORT_SYMBOL(swiotlb_free_coherent); - -static void swiotlb_full(struct device *dev, size_t size, int dir, int do_panic) { /* @@ -646,312 +570,3 @@ swiotlb_full(struct device *dev, size_t size, int dir, int do_panic) if (dir == DMA_TO_DEVICE) panic("DMA: Random memory could be DMA read\n"); } - -/* - * Map a single buffer of the indicated size for DMA in streaming mode. The - * physical address to use is returned. 
- * - * Once the device is given the dma address, the device owns this memory until - * either swiotlb_unmap_page or swiotlb_dma_sync_single is performed. - */ -dma_addr_t swiotlb_map_page(struct device *dev, struct page *page, - unsigned long offset, size_t size, - enum dma_data_direction dir, - struct dma_attrs *attrs) -{ - unsigned long start_dma_addr; - phys_addr_t phys = page_to_phys(page) + offset; - dma_addr_t dev_addr = phys_to_dma(dev, phys); - void *map; - - BUG_ON(dir == DMA_NONE); - /* - * If the address happens to be in the device''s DMA window, - * we can safely return the device addr and not worry about bounce - * buffering it. - */ - if (dma_capable(dev, dev_addr, size) && !swiotlb_force) - return dev_addr; - - /* - * Oh well, have to allocate and map a bounce buffer. - */ - start_dma_addr = swiotlb_virt_to_bus(dev, io_tlb_start); - map = do_map_single(dev, phys, start_dma_addr, size, dir); - if (!map) { - swiotlb_full(dev, size, dir, 1); - map = io_tlb_overflow_buffer; - } - - dev_addr = swiotlb_virt_to_bus(dev, map); - - /* - * Ensure that the address returned is DMA''ble - */ - if (!dma_capable(dev, dev_addr, size)) - panic("DMA: swiotlb_map_single: bounce buffer is not DMA''ble"); - - return dev_addr; -} -EXPORT_SYMBOL_GPL(swiotlb_map_page); - -/* - * Unmap a single streaming mode DMA translation. The dma_addr and size must - * match what was provided for in a previous swiotlb_map_page call. All - * other usages are undefined. - * - * After this call, reads by the cpu to the buffer are guaranteed to see - * whatever the device wrote there. - */ -static void unmap_single(struct device *hwdev, dma_addr_t dev_addr, - size_t size, int dir) -{ - phys_addr_t paddr = dma_to_phys(hwdev, dev_addr); - - BUG_ON(dir == DMA_NONE); - - if (is_swiotlb_buffer(paddr)) { - do_unmap_single(hwdev, phys_to_virt(paddr), size, dir); - return; - } - - if (dir != DMA_FROM_DEVICE) - return; - - /* - * phys_to_virt doesn''t work with hihgmem page but we could - * call dma_mark_clean() with hihgmem page here. However, we - * are fine since dma_mark_clean() is null on POWERPC. We can - * make dma_mark_clean() take a physical address if necessary. - */ - dma_mark_clean(phys_to_virt(paddr), size); -} - -void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr, - size_t size, enum dma_data_direction dir, - struct dma_attrs *attrs) -{ - unmap_single(hwdev, dev_addr, size, dir); -} -EXPORT_SYMBOL_GPL(swiotlb_unmap_page); - -/* - * Make physical memory consistent for a single streaming mode DMA translation - * after a transfer. - * - * If you perform a swiotlb_map_page() but wish to interrogate the buffer - * using the cpu, yet do not wish to teardown the dma mapping, you must - * call this function before doing so. 
At the next point you give the dma - * address back to the card, you must first perform a - * swiotlb_dma_sync_for_device, and then the device again owns the buffer - */ -static void -swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr, - size_t size, int dir, int target) -{ - phys_addr_t paddr = dma_to_phys(hwdev, dev_addr); - - BUG_ON(dir == DMA_NONE); - - if (is_swiotlb_buffer(paddr)) { - do_sync_single(hwdev, phys_to_virt(paddr), size, dir, target); - return; - } - - if (dir != DMA_FROM_DEVICE) - return; - - dma_mark_clean(phys_to_virt(paddr), size); -} - -void -swiotlb_sync_single_for_cpu(struct device *hwdev, dma_addr_t dev_addr, - size_t size, enum dma_data_direction dir) -{ - swiotlb_sync_single(hwdev, dev_addr, size, dir, SYNC_FOR_CPU); -} -EXPORT_SYMBOL(swiotlb_sync_single_for_cpu); - -void -swiotlb_sync_single_for_device(struct device *hwdev, dma_addr_t dev_addr, - size_t size, enum dma_data_direction dir) -{ - swiotlb_sync_single(hwdev, dev_addr, size, dir, SYNC_FOR_DEVICE); -} -EXPORT_SYMBOL(swiotlb_sync_single_for_device); - -/* - * Same as above, but for a sub-range of the mapping. - */ -static void -swiotlb_sync_single_range(struct device *hwdev, dma_addr_t dev_addr, - unsigned long offset, size_t size, - int dir, int target) -{ - swiotlb_sync_single(hwdev, dev_addr + offset, size, dir, target); -} - -void -swiotlb_sync_single_range_for_cpu(struct device *hwdev, dma_addr_t dev_addr, - unsigned long offset, size_t size, - enum dma_data_direction dir) -{ - swiotlb_sync_single_range(hwdev, dev_addr, offset, size, dir, - SYNC_FOR_CPU); -} -EXPORT_SYMBOL_GPL(swiotlb_sync_single_range_for_cpu); - -void -swiotlb_sync_single_range_for_device(struct device *hwdev, dma_addr_t dev_addr, - unsigned long offset, size_t size, - enum dma_data_direction dir) -{ - swiotlb_sync_single_range(hwdev, dev_addr, offset, size, dir, - SYNC_FOR_DEVICE); -} -EXPORT_SYMBOL_GPL(swiotlb_sync_single_range_for_device); - -/* - * Map a set of buffers described by scatterlist in streaming mode for DMA. - * This is the scatter-gather version of the above swiotlb_map_page - * interface. Here the scatter gather list elements are each tagged with the - * appropriate dma address and length. They are obtained via - * sg_dma_{address,length}(SG). - * - * NOTE: An implementation may be able to use a smaller number of - * DMA address/length pairs than there are SG table elements. - * (for example via virtual mapping capabilities) - * The routine returns the number of addr/length pairs actually - * used, at most nents. - * - * Device ownership issues as mentioned above for swiotlb_map_page are the - * same here. - */ -int -swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, int nelems, - enum dma_data_direction dir, struct dma_attrs *attrs) -{ - unsigned long start_dma_addr; - struct scatterlist *sg; - int i; - - BUG_ON(dir == DMA_NONE); - - start_dma_addr = swiotlb_virt_to_bus(hwdev, io_tlb_start); - for_each_sg(sgl, sg, nelems, i) { - phys_addr_t paddr = sg_phys(sg); - dma_addr_t dev_addr = phys_to_dma(hwdev, paddr); - - if (swiotlb_force || - !dma_capable(hwdev, dev_addr, sg->length)) { - void *map = do_map_single(hwdev, sg_phys(sg), - start_dma_addr, - sg->length, dir); - if (!map) { - /* Don''t panic here, we expect map_sg users - to do proper error handling. 
*/ - swiotlb_full(hwdev, sg->length, dir, 0); - swiotlb_unmap_sg_attrs(hwdev, sgl, i, dir, - attrs); - sgl[0].dma_length = 0; - return 0; - } - sg->dma_address = swiotlb_virt_to_bus(hwdev, map); - } else - sg->dma_address = dev_addr; - sg->dma_length = sg->length; - } - return nelems; -} -EXPORT_SYMBOL(swiotlb_map_sg_attrs); - -int -swiotlb_map_sg(struct device *hwdev, struct scatterlist *sgl, int nelems, - int dir) -{ - return swiotlb_map_sg_attrs(hwdev, sgl, nelems, dir, NULL); -} -EXPORT_SYMBOL(swiotlb_map_sg); - -/* - * Unmap a set of streaming mode DMA translations. Again, cpu read rules - * concerning calls here are the same as for swiotlb_unmap_page() above. - */ -void -swiotlb_unmap_sg_attrs(struct device *hwdev, struct scatterlist *sgl, - int nelems, enum dma_data_direction dir, - struct dma_attrs *attrs) -{ - struct scatterlist *sg; - int i; - - BUG_ON(dir == DMA_NONE); - - for_each_sg(sgl, sg, nelems, i) - unmap_single(hwdev, sg->dma_address, sg->dma_length, dir); - -} -EXPORT_SYMBOL(swiotlb_unmap_sg_attrs); - -void -swiotlb_unmap_sg(struct device *hwdev, struct scatterlist *sgl, int nelems, - int dir) -{ - return swiotlb_unmap_sg_attrs(hwdev, sgl, nelems, dir, NULL); -} -EXPORT_SYMBOL(swiotlb_unmap_sg); - -/* - * Make physical memory consistent for a set of streaming mode DMA translations - * after a transfer. - * - * The same as swiotlb_sync_single_* but for a scatter-gather list, same rules - * and usage. - */ -static void -swiotlb_sync_sg(struct device *hwdev, struct scatterlist *sgl, - int nelems, int dir, int target) -{ - struct scatterlist *sg; - int i; - - for_each_sg(sgl, sg, nelems, i) - swiotlb_sync_single(hwdev, sg->dma_address, - sg->dma_length, dir, target); -} - -void -swiotlb_sync_sg_for_cpu(struct device *hwdev, struct scatterlist *sg, - int nelems, enum dma_data_direction dir) -{ - swiotlb_sync_sg(hwdev, sg, nelems, dir, SYNC_FOR_CPU); -} -EXPORT_SYMBOL(swiotlb_sync_sg_for_cpu); - -void -swiotlb_sync_sg_for_device(struct device *hwdev, struct scatterlist *sg, - int nelems, enum dma_data_direction dir) -{ - swiotlb_sync_sg(hwdev, sg, nelems, dir, SYNC_FOR_DEVICE); -} -EXPORT_SYMBOL(swiotlb_sync_sg_for_device); - -int -swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t dma_addr) -{ - return (dma_addr == swiotlb_virt_to_bus(hwdev, io_tlb_overflow_buffer)); -} -EXPORT_SYMBOL(swiotlb_dma_mapping_error); - -/* - * Return whether the given device DMA address mask can be supported - * properly. For example, if your device can only drive the low 24-bits - * during bus mastering, then you would pass 0x00ffffff as the mask to - * this function. 
- */ -int -swiotlb_dma_supported(struct device *hwdev, u64 mask) -{ - return swiotlb_virt_to_bus(hwdev, io_tlb_end - 1) <= mask; -} -EXPORT_SYMBOL(swiotlb_dma_supported); diff --git a/lib/swiotlb.c b/lib/swiotlb.c new file mode 100644 index 0000000..f6bbcd1 --- /dev/null +++ b/lib/swiotlb.c @@ -0,0 +1,391 @@ + +#include <linux/dma-mapping.h> +#include <linux/module.h> +#include <linux/swiotlb.h> + +#include <asm/scatterlist.h> +#include <linux/iommu-helper.h> + + +/* Note that this doesn''t work with highmem page */ +static dma_addr_t swiotlb_virt_to_bus(struct device *hwdev, + volatile void *address) +{ + return phys_to_dma(hwdev, virt_to_phys(address)); +} +void * +swiotlb_alloc_coherent(struct device *hwdev, size_t size, + dma_addr_t *dma_handle, gfp_t flags) +{ + dma_addr_t dev_addr; + void *ret; + int order = get_order(size); + u64 dma_mask = DMA_BIT_MASK(32); + unsigned long start_dma_addr; + + if (hwdev && hwdev->coherent_dma_mask) + dma_mask = hwdev->coherent_dma_mask; + + ret = (void *)__get_free_pages(flags, order); + if (ret && swiotlb_virt_to_bus(hwdev, ret) + size - 1 > dma_mask) { + /* + * The allocated memory isn''t reachable by the device. + */ + free_pages((unsigned long) ret, order); + ret = NULL; + } + if (!ret) { + /* + * We are either out of memory or the device can''t DMA + * to GFP_DMA memory; fall back on do_map_single(), which + * will grab memory from the lowest available address range. + */ + start_dma_addr = swiotlb_virt_to_bus(hwdev, io_tlb_start); + ret = do_map_single(hwdev, 0, start_dma_addr, size, + DMA_FROM_DEVICE); + if (!ret) + return NULL; + } + + memset(ret, 0, size); + dev_addr = swiotlb_virt_to_bus(hwdev, ret); + + /* Confirm address can be DMA''d by device */ + if (dev_addr + size - 1 > dma_mask) { + dev_err(hwdev, "DMA: hwdev DMA mask = 0x%016Lx, " \ + "dev_addr = 0x%016Lx\n", + (unsigned long long)dma_mask, + (unsigned long long)dev_addr); + + /* DMA_TO_DEVICE to avoid memcpy in do_unmap_single */ + do_unmap_single(hwdev, ret, size, DMA_TO_DEVICE); + return NULL; + } + *dma_handle = dev_addr; + return ret; +} +EXPORT_SYMBOL(swiotlb_alloc_coherent); + +void +swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr, + dma_addr_t dev_addr) +{ + phys_addr_t paddr = dma_to_phys(hwdev, dev_addr); + + WARN_ON(irqs_disabled()); + if (!is_swiotlb_buffer(paddr)) + free_pages((unsigned long)vaddr, get_order(size)); + else + /* DMA_TO_DEVICE to avoid memcpy in do_unmap_single */ + do_unmap_single(hwdev, vaddr, size, DMA_TO_DEVICE); +} +EXPORT_SYMBOL(swiotlb_free_coherent); + +/* + * Map a single buffer of the indicated size for DMA in streaming mode. The + * physical address to use is returned. + * + * Once the device is given the dma address, the device owns this memory until + * either swiotlb_unmap_page or swiotlb_dma_sync_single is performed. + */ +dma_addr_t swiotlb_map_page(struct device *dev, struct page *page, + unsigned long offset, size_t size, + enum dma_data_direction dir, + struct dma_attrs *attrs) +{ + unsigned long start_dma_addr; + phys_addr_t phys = page_to_phys(page) + offset; + dma_addr_t dev_addr = phys_to_dma(dev, phys); + void *map; + + BUG_ON(dir == DMA_NONE); + /* + * If the address happens to be in the device''s DMA window, + * we can safely return the device addr and not worry about bounce + * buffering it. + */ + if (dma_capable(dev, dev_addr, size) && !swiotlb_force) + return dev_addr; + + /* + * Oh well, have to allocate and map a bounce buffer. 
+ */ + start_dma_addr = swiotlb_virt_to_bus(dev, io_tlb_start); + map = do_map_single(dev, phys, start_dma_addr, size, dir); + if (!map) { + swiotlb_full(dev, size, dir, 1); + map = io_tlb_overflow_buffer; + } + + dev_addr = swiotlb_virt_to_bus(dev, map); + + /* + * Ensure that the address returned is DMA''ble + */ + if (!dma_capable(dev, dev_addr, size)) + panic("DMA: swiotlb_map_single: bounce buffer is not DMA''ble"); + + return dev_addr; +} +EXPORT_SYMBOL_GPL(swiotlb_map_page); + +/* + * Unmap a single streaming mode DMA translation. The dma_addr and size must + * match what was provided for in a previous swiotlb_map_page call. All + * other usages are undefined. + * + * After this call, reads by the cpu to the buffer are guaranteed to see + * whatever the device wrote there. + */ +static void unmap_single(struct device *hwdev, dma_addr_t dev_addr, + size_t size, int dir) +{ + phys_addr_t paddr = dma_to_phys(hwdev, dev_addr); + + BUG_ON(dir == DMA_NONE); + + if (is_swiotlb_buffer(paddr)) { + do_unmap_single(hwdev, phys_to_virt(paddr), size, dir); + return; + } + + if (dir != DMA_FROM_DEVICE) + return; + + /* + * phys_to_virt doesn''t work with hihgmem page but we could + * call dma_mark_clean() with hihgmem page here. However, we + * are fine since dma_mark_clean() is null on POWERPC. We can + * make dma_mark_clean() take a physical address if necessary. + */ + dma_mark_clean(phys_to_virt(paddr), size); +} + +void swiotlb_unmap_page(struct device *hwdev, dma_addr_t dev_addr, + size_t size, enum dma_data_direction dir, + struct dma_attrs *attrs) +{ + unmap_single(hwdev, dev_addr, size, dir); +} +EXPORT_SYMBOL_GPL(swiotlb_unmap_page); + +/* + * Make physical memory consistent for a single streaming mode DMA translation + * after a transfer. + * + * If you perform a swiotlb_map_page() but wish to interrogate the buffer + * using the cpu, yet do not wish to teardown the dma mapping, you must + * call this function before doing so. At the next point you give the dma + * address back to the card, you must first perform a + * swiotlb_dma_sync_for_device, and then the device again owns the buffer + */ +static void +swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr, + size_t size, int dir, int target) +{ + phys_addr_t paddr = dma_to_phys(hwdev, dev_addr); + + BUG_ON(dir == DMA_NONE); + + if (is_swiotlb_buffer(paddr)) { + do_sync_single(hwdev, phys_to_virt(paddr), size, dir, target); + return; + } + + if (dir != DMA_FROM_DEVICE) + return; + + dma_mark_clean(phys_to_virt(paddr), size); +} + +void +swiotlb_sync_single_for_cpu(struct device *hwdev, dma_addr_t dev_addr, + size_t size, enum dma_data_direction dir) +{ + swiotlb_sync_single(hwdev, dev_addr, size, dir, SYNC_FOR_CPU); +} +EXPORT_SYMBOL(swiotlb_sync_single_for_cpu); + +void +swiotlb_sync_single_for_device(struct device *hwdev, dma_addr_t dev_addr, + size_t size, enum dma_data_direction dir) +{ + swiotlb_sync_single(hwdev, dev_addr, size, dir, SYNC_FOR_DEVICE); +} +EXPORT_SYMBOL(swiotlb_sync_single_for_device); + +/* + * Same as above, but for a sub-range of the mapping. 
+ */ +static void +swiotlb_sync_single_range(struct device *hwdev, dma_addr_t dev_addr, + unsigned long offset, size_t size, + int dir, int target) +{ + swiotlb_sync_single(hwdev, dev_addr + offset, size, dir, target); +} + +void +swiotlb_sync_single_range_for_cpu(struct device *hwdev, dma_addr_t dev_addr, + unsigned long offset, size_t size, + enum dma_data_direction dir) +{ + swiotlb_sync_single_range(hwdev, dev_addr, offset, size, dir, + SYNC_FOR_CPU); +} +EXPORT_SYMBOL_GPL(swiotlb_sync_single_range_for_cpu); + +void +swiotlb_sync_single_range_for_device(struct device *hwdev, dma_addr_t dev_addr, + unsigned long offset, size_t size, + enum dma_data_direction dir) +{ + swiotlb_sync_single_range(hwdev, dev_addr, offset, size, dir, + SYNC_FOR_DEVICE); +} +EXPORT_SYMBOL_GPL(swiotlb_sync_single_range_for_device); + +/* + * Map a set of buffers described by scatterlist in streaming mode for DMA. + * This is the scatter-gather version of the above swiotlb_map_page + * interface. Here the scatter gather list elements are each tagged with the + * appropriate dma address and length. They are obtained via + * sg_dma_{address,length}(SG). + * + * NOTE: An implementation may be able to use a smaller number of + * DMA address/length pairs than there are SG table elements. + * (for example via virtual mapping capabilities) + * The routine returns the number of addr/length pairs actually + * used, at most nents. + * + * Device ownership issues as mentioned above for swiotlb_map_page are the + * same here. + */ +int +swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, int nelems, + enum dma_data_direction dir, struct dma_attrs *attrs) +{ + unsigned long start_dma_addr; + struct scatterlist *sg; + int i; + + BUG_ON(dir == DMA_NONE); + + start_dma_addr = swiotlb_virt_to_bus(hwdev, io_tlb_start); + for_each_sg(sgl, sg, nelems, i) { + phys_addr_t paddr = sg_phys(sg); + dma_addr_t dev_addr = phys_to_dma(hwdev, paddr); + + if (swiotlb_force || + !dma_capable(hwdev, dev_addr, sg->length)) { + void *map = do_map_single(hwdev, sg_phys(sg), + start_dma_addr, + sg->length, dir); + if (!map) { + /* Don''t panic here, we expect map_sg users + to do proper error handling. */ + swiotlb_full(hwdev, sg->length, dir, 0); + swiotlb_unmap_sg_attrs(hwdev, sgl, i, dir, + attrs); + sgl[0].dma_length = 0; + return 0; + } + sg->dma_address = swiotlb_virt_to_bus(hwdev, map); + } else + sg->dma_address = dev_addr; + sg->dma_length = sg->length; + } + return nelems; +} +EXPORT_SYMBOL(swiotlb_map_sg_attrs); + +int +swiotlb_map_sg(struct device *hwdev, struct scatterlist *sgl, int nelems, + int dir) +{ + return swiotlb_map_sg_attrs(hwdev, sgl, nelems, dir, NULL); +} +EXPORT_SYMBOL(swiotlb_map_sg); + +/* + * Unmap a set of streaming mode DMA translations. Again, cpu read rules + * concerning calls here are the same as for swiotlb_unmap_page() above. + */ +void +swiotlb_unmap_sg_attrs(struct device *hwdev, struct scatterlist *sgl, + int nelems, enum dma_data_direction dir, + struct dma_attrs *attrs) +{ + struct scatterlist *sg; + int i; + + BUG_ON(dir == DMA_NONE); + + for_each_sg(sgl, sg, nelems, i) + unmap_single(hwdev, sg->dma_address, sg->dma_length, dir); + +} +EXPORT_SYMBOL(swiotlb_unmap_sg_attrs); + +void +swiotlb_unmap_sg(struct device *hwdev, struct scatterlist *sgl, int nelems, + int dir) +{ + return swiotlb_unmap_sg_attrs(hwdev, sgl, nelems, dir, NULL); +} +EXPORT_SYMBOL(swiotlb_unmap_sg); + +/* + * Make physical memory consistent for a set of streaming mode DMA translations + * after a transfer. 
+ * + * The same as swiotlb_sync_single_* but for a scatter-gather list, same rules + * and usage. + */ +static void +swiotlb_sync_sg(struct device *hwdev, struct scatterlist *sgl, + int nelems, int dir, int target) +{ + struct scatterlist *sg; + int i; + + for_each_sg(sgl, sg, nelems, i) + swiotlb_sync_single(hwdev, sg->dma_address, + sg->dma_length, dir, target); +} + +void +swiotlb_sync_sg_for_cpu(struct device *hwdev, struct scatterlist *sg, + int nelems, enum dma_data_direction dir) +{ + swiotlb_sync_sg(hwdev, sg, nelems, dir, SYNC_FOR_CPU); +} +EXPORT_SYMBOL(swiotlb_sync_sg_for_cpu); + +void +swiotlb_sync_sg_for_device(struct device *hwdev, struct scatterlist *sg, + int nelems, enum dma_data_direction dir) +{ + swiotlb_sync_sg(hwdev, sg, nelems, dir, SYNC_FOR_DEVICE); +} +EXPORT_SYMBOL(swiotlb_sync_sg_for_device); + +int +swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t dma_addr) +{ + return (dma_addr == swiotlb_virt_to_bus(hwdev, io_tlb_overflow_buffer)); +} +EXPORT_SYMBOL(swiotlb_dma_mapping_error); + +/* + * Return whether the given device DMA address mask can be supported + * properly. For example, if your device can only drive the low 24-bits + * during bus mastering, then you would pass 0x00ffffff as the mask to + * this function. + */ +int +swiotlb_dma_supported(struct device *hwdev, u64 mask) +{ + return swiotlb_virt_to_bus(hwdev, io_tlb_end - 1) <= mask; +} +EXPORT_SYMBOL(swiotlb_dma_supported); -- 1.6.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
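The practical effect of this split is that an architecture's dma_map_ops table can point straight at the functions now living in lib/swiotlb.c. The following is only a rough sketch, not part of the series: the field names follow the struct dma_map_ops of this kernel era, and x86's actual table in arch/x86/kernel/pci-swiotlb.c differs slightly (it wraps alloc_coherent in its own helper).

#include <linux/dma-mapping.h>
#include <linux/swiotlb.h>

/* Sketch of a platform hooking the exported swiotlb_* entry points into
 * its DMA operations table. */
static struct dma_map_ops example_swiotlb_dma_ops = {
	.alloc_coherent		= swiotlb_alloc_coherent,
	.free_coherent		= swiotlb_free_coherent,
	.map_page		= swiotlb_map_page,
	.unmap_page		= swiotlb_unmap_page,
	.map_sg			= swiotlb_map_sg_attrs,
	.unmap_sg		= swiotlb_unmap_sg_attrs,
	.sync_single_for_cpu	= swiotlb_sync_single_for_cpu,
	.sync_single_for_device	= swiotlb_sync_single_for_device,
	.sync_sg_for_cpu	= swiotlb_sync_sg_for_cpu,
	.sync_sg_for_device	= swiotlb_sync_sg_for_device,
	.mapping_error		= swiotlb_dma_mapping_error,
	.dma_supported		= swiotlb_dma_supported,
};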
On Wed, 3 Feb 2010 12:08:01 -0500
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:

> Attached is a set of eleven RFC patches that split the SWIOTLB library in
> two layers: core, and dma_ops related functions.

What's the point of splitting swiotlb.c? Why can't you just export some of
the functions in swiotlb.c?

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
On Thu, Feb 04, 2010 at 09:17:31AM +0900, FUJITA Tomonori wrote:
> On Wed, 3 Feb 2010 12:08:01 -0500
> Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
>
> > Attached is a set of eleven RFC patches that split the SWIOTLB library in
> > two layers: core, and dma_ops related functions.
>
> What's the point of splitting swiotlb.c? Why can't you just export
> some of the functions in swiotlb.c?

I was emulating some of the other libraries that are in the kernel, where
the core functionality lives in a -core.c file and its users live in
separate ones (libata, for example).

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
On Wed, Feb 03, 2010 at 10:07:49PM -0500, Konrad Rzeszutek Wilk wrote:
> On Thu, Feb 04, 2010 at 09:17:31AM +0900, FUJITA Tomonori wrote:
> > On Wed, 3 Feb 2010 12:08:01 -0500
> > Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> >
> > > Attached is a set of eleven RFC patches that split the SWIOTLB library in
> > > two layers: core, and dma_ops related functions.
> >
> > What's the point of splitting swiotlb.c? Why can't you just export
> > some of the functions in swiotlb.c?
>
> I was emulating some of the other libraries that are in the kernel, where
> the core functionality lives in a -core.c file and its users live in
> separate ones (libata, for example).

Though there is one thing Jens Axboe mentioned that I hadn't thought of:
keep it as simple, and in as few files, as possible.

I've redone the patches, this time without the split, and have exported
the symbols instead.

The git tree is:

 git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb-2.6.git swiotlb-0.5

And the LKML posting is:

 https://lists.linux-foundation.org/pipermail/iommu/2010-February/002066.html

> _______________________________________________
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linux-foundation.org/mailman/listinfo/iommu

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
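In kernel terms, the "just export the functions" approach that the thread converges on means keeping everything in lib/swiotlb.c and marking the core bookkeeping routines so that a platform backend (such as the Xen variant mentioned at the top of the thread) can call them. The sketch below is only an illustration under the prototypes used in this series; the actual set of exports in the swiotlb-0.5 tree may differ.

/* In lib/swiotlb.c, placed next to the corresponding definitions: the
 * bounce-buffer entry points a platform-specific backend would reuse. */
#include <linux/module.h>

EXPORT_SYMBOL_GPL(do_map_single);
EXPORT_SYMBOL_GPL(do_unmap_single);
EXPORT_SYMBOL_GPL(do_sync_single);
EXPORT_SYMBOL_GPL(swiotlb_bounce);
EXPORT_SYMBOL_GPL(is_swiotlb_buffer);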