The original problem I described at the URL mentioned above: "if one boots a PV 64-bit guests with more than 4GB, the SWIOTLB [Xen] gets turned on - and 64MB of precious low-memory gets used." was totally bogus. The SWIOTLB that gets turned on is the *native* one, which does not exhaust any low memory of the host. It does, however, eat up 64MB of perfectly good guest memory that never gets used.

So this patchset has some things I have wanted to do for some time:

 [PATCH 1/5] xen/swiotlb: Simplify the logic.

Just so that next time I am not confused.

 [PATCH 2/5] xen/swiotlb: With more than 4GB on 64-bit, disable the native SWIOTLB.

Don't turn the *native* SWIOTLB on for PV guests and waste those 64MB.

Here are the exciting new patches. Basically I want to emulate what IA64 does, which is to turn on the SWIOTLB late in the bootup cycle. This means not using alloc_bootmem and having a "late" variant to initialize the SWIOTLB. There is some surgery in the SWIOTLB library:

 [PATCH 3/5] swiotlb: add the late swiotlb initialization function with iotlb memory

to allow it to use an io_tlb passed in. Note: I haven't tested this on IA64 and that is something I need to do.

And then the implementation in the Xen-SWIOTLB to use it:

 [PATCH 4/5] xen/swiotlb: Use the swiotlb_late_init_with_tbl to init Xen-SWIOTLB late when PV PCI is used.

along with the Xen PCI frontend to utilize it:

 [PATCH 5/5] xen/pcifront: Use Xen-SWIOTLB when initting if required.

The end result is that a PV guest can now dynamically(*) deal with PCI passthrough cards.

(*) I say "dynamically" because if one boots a PV guest with more than 3GB without using 'e820_hole' (or is it called 'e820_host' now?), the PCI subsystem won't be able to squeeze in the BARs, as that address range is occupied by RAM. The workaround is to boot with 'e820_hole', or some new work where we manipulate the E820 at boot time to leave a nice big 1GB hole under 4GB. With all the work on the P2M tree that should actually be fairly easy.

Note: If one uses 'iommu=soft' on the Linux command line, the Xen-SWIOTLB still gets turned on.
It's pretty easy:
 1). We only check to see if we need Xen SWIOTLB for PV guests.
 2). If swiotlb=force or iommu=soft is set, then Xen SWIOTLB will be enabled.
 3). If it is an initial domain, then Xen SWIOTLB will be enabled.
 4). Native SWIOTLB must be disabled for PV guests.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 arch/x86/xen/pci-swiotlb-xen.c |    9 +++++----
 1 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/x86/xen/pci-swiotlb-xen.c b/arch/x86/xen/pci-swiotlb-xen.c
index 967633a..b6a5340 100644
--- a/arch/x86/xen/pci-swiotlb-xen.c
+++ b/arch/x86/xen/pci-swiotlb-xen.c
@@ -34,19 +34,20 @@ static struct dma_map_ops xen_swiotlb_dma_ops = {
 int __init pci_xen_swiotlb_detect(void)
 {
+	if (!xen_pv_domain())
+		return 0;
+
 	/* If running as PV guest, either iommu=soft, or swiotlb=force will
 	 * activate this IOMMU. If running as PV privileged, activate it
 	 * irregardless.
 	 */
-	if ((xen_initial_domain() || swiotlb || swiotlb_force) &&
-	    (xen_pv_domain()))
+	if ((xen_initial_domain() || swiotlb || swiotlb_force))
 		xen_swiotlb = 1;
 	/* If we are running under Xen, we MUST disable the native SWIOTLB.
 	 * Don't worry about swiotlb_force flag activating the native, as
 	 * the 'swiotlb' flag is the only one turning it on. */
-	if (xen_pv_domain())
-		swiotlb = 0;
+	swiotlb = 0;
 	return xen_swiotlb;
 }
-- 
1.7.7.6
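The four rules above can be exercised outside the kernel. The following is a minimal user-space sketch, not the kernel code: `struct boot_state` and `model_pci_xen_swiotlb_detect()` are hypothetical names, and the booleans merely stand in for `xen_pv_domain()`, `xen_initial_domain()` and the `swiotlb` / `swiotlb_force` globals.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel globals and Xen predicates. */
struct boot_state {
	bool pv_domain;      /* models xen_pv_domain() */
	bool initial_domain; /* models xen_initial_domain() */
	int swiotlb;         /* models the flag set by iommu=soft */
	int swiotlb_force;   /* models the flag set by swiotlb=force */
	int xen_swiotlb;     /* result: use the Xen-SWIOTLB */
};

/* Mirrors the simplified detect logic of patch 1/5. */
int model_pci_xen_swiotlb_detect(struct boot_state *s)
{
	/* Rule 1: only PV guests are considered at all. */
	if (!s->pv_domain)
		return 0;

	/* Rules 2 and 3: command line request or initial domain. */
	if (s->initial_domain || s->swiotlb || s->swiotlb_force)
		s->xen_swiotlb = 1;

	/* Rule 4: the native SWIOTLB is always switched off for PV. */
	s->swiotlb = 0;

	return s->xen_swiotlb;
}
```

A plain PV guest without any command line option returns 0, a PV guest with `iommu=soft` returns 1, and in both cases the native flag ends up cleared. A non-PV guest is left untouched.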
Konrad Rzeszutek Wilk
2012-Jul-31 14:00 UTC
[PATCH 2/5] xen/swiotlb: With more than 4GB on 64-bit, disable the native SWIOTLB.
If a PV guest is booted the native SWIOTLB should not be
turned on. It does not help us (we don't have any PCI devices)
and it eats 64MB of good memory. In the case of PV guests
with PCI devices we need the Xen-SWIOTLB one.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 arch/x86/xen/pci-swiotlb-xen.c |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/arch/x86/xen/pci-swiotlb-xen.c b/arch/x86/xen/pci-swiotlb-xen.c
index b6a5340..0d5a214 100644
--- a/arch/x86/xen/pci-swiotlb-xen.c
+++ b/arch/x86/xen/pci-swiotlb-xen.c
@@ -8,6 +8,11 @@
 #include <xen/xen.h>
 #include <asm/iommu_table.h>
 
+#ifdef CONFIG_X86_64
+#include <asm/iommu.h>
+#include <asm/dma.h>
+#endif
+
 int xen_swiotlb __read_mostly;
 
 static struct dma_map_ops xen_swiotlb_dma_ops = {
@@ -49,6 +54,13 @@ int __init pci_xen_swiotlb_detect(void)
 	 * the 'swiotlb' flag is the only one turning it on. */
 	swiotlb = 0;
 
+#ifdef CONFIG_X86_64
+	/* pci_swiotlb_detect_4gb turns native SWIOTLB if no_iommu == 0
+	 * (so no iommu=X command line over-writes). So disable the native
+	 * SWIOTLB. */
+	if (max_pfn > MAX_DMA32_PFN)
+		no_iommu = 1;
+#endif
 	return xen_swiotlb;
 }
 
-- 
1.7.7.6
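To make the `max_pfn > MAX_DMA32_PFN` condition concrete, here is a small user-space sketch of the arithmetic. It assumes 4 KiB pages (`PAGE_SHIFT` of 12) and a 4GB DMA32 boundary; the `MODEL_*` macros and both helper functions are hypothetical names for illustration, not kernel symbols.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12 /* assumption: 4 KiB pages, as on x86 */
/* First page frame number at or above the 4GB boundary. */
#define MODEL_MAX_DMA32_PFN ((UINT64_C(4) << 30) >> MODEL_PAGE_SHIFT)

/* Convert an amount of RAM in bytes to the highest page frame number. */
uint64_t model_bytes_to_pfn(uint64_t bytes)
{
	return bytes >> MODEL_PAGE_SHIFT;
}

/*
 * Models the condition described in the patch comment: the native
 * SWIOTLB is wanted only when no_iommu is 0 and memory extends
 * beyond what a 32-bit DMA address can reach.
 */
bool model_native_swiotlb_wanted(uint64_t max_pfn, bool no_iommu)
{
	return !no_iommu && max_pfn > MODEL_MAX_DMA32_PFN;
}
```

With these assumptions an 8GB guest trips the check unless `no_iommu` is set, which is what the patch relies on, while a 2GB guest never trips it.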
Konrad Rzeszutek Wilk
2012-Jul-31 14:00 UTC
[PATCH 3/5] swiotlb: add the late swiotlb initialization function with iotlb memory
This enables the caller to initialize swiotlb with its own iotlb
memory late in the bootup.

See git commit eb605a5754d050a25a9f00d718fb173f24c486ef
"swiotlb: add swiotlb_tbl_map_single library function" which will
explain the full details of what it can be used for.

CC: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 include/linux/swiotlb.h |    1 +
 lib/swiotlb.c           |   31 +++++++++++++++++++++++--------
 2 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index e872526..8d08b3e 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -25,6 +25,7 @@ extern int swiotlb_force;
 extern void swiotlb_init(int verbose);
 extern void swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose);
 extern unsigned long swiotlb_nr_tbl(void);
+extern int swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs);
 /*
  * Enumeration for sync targets
diff --git a/lib/swiotlb.c b/lib/swiotlb.c
index 45bc1f8..5d33651 100644
--- a/lib/swiotlb.c
+++ b/lib/swiotlb.c
@@ -206,8 +206,9 @@ swiotlb_init(int verbose)
 int
 swiotlb_late_init_with_default_size(size_t default_size)
 {
-	unsigned long i, bytes, req_nslabs = io_tlb_nslabs;
+	unsigned long bytes, req_nslabs = io_tlb_nslabs;
 	unsigned int order;
+	int rc = 0;
 	if (!io_tlb_nslabs) {
 		io_tlb_nslabs = (default_size >> IO_TLB_SHIFT);
@@ -229,16 +230,32 @@ swiotlb_late_init_with_default_size(size_t default_size)
 		order--;
 	}
-	if (!io_tlb_start)
-		goto cleanup1;
-
+	if (!io_tlb_start) {
+		io_tlb_nslabs = req_nslabs;
+		return -ENOMEM;
+	}
 	if (order != get_order(bytes)) {
 		printk(KERN_WARNING "Warning: only able to allocate %ld MB "
 		       "for software IO TLB\n", (PAGE_SIZE << order) >> 20);
 		io_tlb_nslabs = SLABS_PER_PAGE << order;
-		bytes = io_tlb_nslabs << IO_TLB_SHIFT;
 	}
+	rc = swiotlb_late_init_with_tbl(io_tlb_start, io_tlb_nslabs);
+	if (rc)
+		free_pages((unsigned long)io_tlb_start, order);
+	return rc;
+}
+
+int
+swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
+{
+	unsigned long i, bytes;
+
+	bytes = nslabs << IO_TLB_SHIFT;
+
+	io_tlb_nslabs = nslabs;
+	io_tlb_start = tlb;
 	io_tlb_end = io_tlb_start + bytes;
+
 	memset(io_tlb_start, 0, bytes);
 	/*
@@ -288,10 +305,8 @@ cleanup3:
 	io_tlb_list = NULL;
 cleanup2:
 	io_tlb_end = NULL;
-	free_pages((unsigned long)io_tlb_start, order);
 	io_tlb_start = NULL;
-cleanup1:
-	io_tlb_nslabs = req_nslabs;
+	io_tlb_nslabs = 0;
 	return -ENOMEM;
 }
-- 
1.7.7.6
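The core of the new entry point is that the caller owns the buffer and the library only derives its bookkeeping from `nslabs`. A rough user-space sketch of that contract follows. It assumes 2 KiB slabs (`IO_TLB_SHIFT` of 11); `struct model_tlb`, `model_slabs_for_bytes()` and `model_late_init_with_tbl()` are hypothetical names, and the NULL/zero check is an addition of this sketch that the patch itself does not have.

```c
#include <stddef.h>
#include <string.h>

#define MODEL_IO_TLB_SHIFT 11 /* assumption: 2 KiB per slab */

/* Hypothetical stand-in for the io_tlb_start/io_tlb_end/io_tlb_nslabs globals. */
struct model_tlb {
	char *start;
	char *end;
	unsigned long nslabs;
};

/* Number of slabs needed to cover a buffer of the given size. */
unsigned long model_slabs_for_bytes(unsigned long bytes)
{
	return bytes >> MODEL_IO_TLB_SHIFT;
}

/*
 * Take over a caller-supplied buffer: record its bounds and zero it.
 * Returns 0 on success, -1 when the arguments are unusable (this
 * check is not part of the patch, it only keeps the sketch safe).
 */
int model_late_init_with_tbl(struct model_tlb *t, char *tlb, unsigned long nslabs)
{
	unsigned long bytes = nslabs << MODEL_IO_TLB_SHIFT;

	if (!tlb || !nslabs)
		return -1;

	t->nslabs = nslabs;
	t->start = tlb;
	t->end = tlb + bytes;
	memset(tlb, 0, bytes);
	return 0;
}
```

Under the 2 KiB assumption, the default 64MB buffer corresponds to 32768 slabs. Because allocation stays with the caller, the error path in the sketch, like the one in the patch, does not free anything.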
Konrad Rzeszutek Wilk
2012-Jul-31 14:00 UTC
[PATCH 4/5] xen/swiotlb: Use the swiotlb_late_init_with_tbl to init Xen-SWIOTLB late when PV PCI is used.
With this patch we provide the functionality to initialize the
Xen-SWIOTLB late in the bootup cycle - specifically for
Xen PCI-frontend. We will still work if the user has
supplied 'iommu=soft' on the Linux command line.

CC: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 arch/x86/include/asm/xen/swiotlb-xen.h |    2 +
 arch/x86/xen/pci-swiotlb-xen.c         |   17 ++++++++++-
 drivers/xen/swiotlb-xen.c              |   50 +++++++++++++++++++++++++++----
 include/xen/swiotlb-xen.h              |    1 +
 4 files changed, 62 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/xen/swiotlb-xen.h b/arch/x86/include/asm/xen/swiotlb-xen.h
index 1be1ab7..ee52fca 100644
--- a/arch/x86/include/asm/xen/swiotlb-xen.h
+++ b/arch/x86/include/asm/xen/swiotlb-xen.h
@@ -5,10 +5,12 @@
 extern int xen_swiotlb;
 extern int __init pci_xen_swiotlb_detect(void);
 extern void __init pci_xen_swiotlb_init(void);
+extern int pci_xen_swiotlb_init_late(void);
 #else
 #define xen_swiotlb (0)
 static inline int __init pci_xen_swiotlb_detect(void) { return 0; }
 static inline void __init pci_xen_swiotlb_init(void) { }
+static inline int pci_xen_swiotlb_init_late(void) { return -ENXIO; }
 #endif
 #endif /* _ASM_X86_SWIOTLB_XEN_H */
diff --git a/arch/x86/xen/pci-swiotlb-xen.c b/arch/x86/xen/pci-swiotlb-xen.c
index 0d5a214..bc2daf6 100644
--- a/arch/x86/xen/pci-swiotlb-xen.c
+++ b/arch/x86/xen/pci-swiotlb-xen.c
@@ -12,7 +12,7 @@
 #include <asm/iommu.h>
 #include <asm/dma.h>
 #endif
-
+#include <linux/export.h>
 int xen_swiotlb __read_mostly;
 static struct dma_map_ops xen_swiotlb_dma_ops = {
@@ -74,6 +74,21 @@ void __init pci_xen_swiotlb_init(void)
 		pci_request_acs();
 	}
 }
+
+int pci_xen_swiotlb_init_late(void)
+{
+	int rc = xen_swiotlb_late_init(1);
+	if (rc)
+		return rc;
+
+	dma_ops = &xen_swiotlb_dma_ops;
+	/* Make sure ACS will be enabled */
+	pci_request_acs();
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(pci_xen_swiotlb_init_late);
+
 IOMMU_INIT_FINISH(pci_xen_swiotlb_detect,
 		  0,
 		  pci_xen_swiotlb_init,
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 1afb4fb..8805e94 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -145,13 +145,14 @@ xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
 	return 0;
 }
-void __init xen_swiotlb_init(int verbose)
+int __xen_swiotlb_init(int verbose, bool early)
 {
 	unsigned long bytes;
 	int rc = -ENOMEM;
 	unsigned long nr_tbl;
 	char *m = NULL;
 	unsigned int repeat = 3;
+	unsigned long order;
 	nr_tbl = swiotlb_nr_tbl();
 	if (nr_tbl)
@@ -161,12 +162,31 @@ void __init xen_swiotlb_init(int verbose)
 		xen_io_tlb_nslabs = ALIGN(xen_io_tlb_nslabs, IO_TLB_SEGSIZE);
 	}
 retry:
+	order = get_order(xen_io_tlb_nslabs << IO_TLB_SHIFT);
 	bytes = xen_io_tlb_nslabs << IO_TLB_SHIFT;
 	/*
 	 * Get IO TLB memory from any location.
 	 */
-	xen_io_tlb_start = alloc_bootmem_pages(PAGE_ALIGN(bytes));
+	if (early)
+		xen_io_tlb_start = alloc_bootmem_pages(PAGE_ALIGN(bytes));
+	else {
+#define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT))
+#define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)
+
+		while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
+			xen_io_tlb_start = (void *)__get_free_pages(__GFP_NOWARN, order);
+			if (xen_io_tlb_start)
+				break;
+			order--;
+		}
+		if (order != get_order(bytes)) {
+			pr_warn("Warning: only able to allocate %ld MB "
+				"for software IO TLB\n", (PAGE_SIZE << order) >> 20);
+			xen_io_tlb_nslabs = SLABS_PER_PAGE << order;
+			bytes = xen_io_tlb_nslabs << IO_TLB_SHIFT;
+		}
+	}
 	if (!xen_io_tlb_start) {
 		m = "Cannot allocate Xen-SWIOTLB buffer!\n";
 		goto error;
@@ -179,7 +199,8 @@ retry:
 			       bytes,
 			       xen_io_tlb_nslabs);
 	if (rc) {
-		free_bootmem(__pa(xen_io_tlb_start), PAGE_ALIGN(bytes));
+		if (early)
+			free_bootmem(__pa(xen_io_tlb_start), PAGE_ALIGN(bytes));
 		m = "Failed to get contiguous memory for DMA from Xen!\n"\
 		    "You either: don't have the permissions, do not have"\
 		    " enough free memory under 4GB, or the hypervisor memory"\
@@ -187,9 +208,13 @@ retry:
 		goto error;
 	}
 	start_dma_addr = xen_virt_to_bus(xen_io_tlb_start);
-	swiotlb_init_with_tbl(xen_io_tlb_start, xen_io_tlb_nslabs, verbose);
-	return;
+	rc = 0;
+	if (early)
+		swiotlb_init_with_tbl(xen_io_tlb_start, xen_io_tlb_nslabs, verbose);
+	else
+		rc = swiotlb_late_init_with_tbl(xen_io_tlb_start, xen_io_tlb_nslabs);
+	return rc;
 error:
 	if (repeat--) {
 		xen_io_tlb_nslabs = max(1024UL, /* Min is 2MB */
@@ -199,9 +224,20 @@ error:
 		goto retry;
 	}
 	xen_raw_printk("%s (rc:%d)", m, rc);
-	panic("%s (rc:%d)", m, rc);
+	if (early)
+		panic("%s (rc:%d)", m, rc);
+	else
+		free_pages((unsigned long)xen_io_tlb_start, order);
+	return rc;
+}
+void __init xen_swiotlb_init(int verbose)
+{
+	__xen_swiotlb_init(verbose, true /* early */);
+}
+int xen_swiotlb_late_init(int verbose)
+{
+	return __xen_swiotlb_init(verbose, false /* late */);
 }
-
 void *
 xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
 			   dma_addr_t *dma_handle, gfp_t flags,
diff --git a/include/xen/swiotlb-xen.h b/include/xen/swiotlb-xen.h
index 4f4d449..d38d984 100644
--- a/include/xen/swiotlb-xen.h
+++ b/include/xen/swiotlb-xen.h
@@ -4,6 +4,7 @@
 #include <linux/swiotlb.h>
 extern void xen_swiotlb_init(int verbose);
+extern int xen_swiotlb_late_init(int verbose);
 extern void *xen_swiotlb_alloc_coherent(struct device *hwdev, size_t size,
-- 
1.7.7.6
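The late path cannot use bootmem, so it asks the page allocator for a large power-of-two block and steps down one order at a time until something is available. The loop can be simulated in user space as sketched below. This assumes 4 KiB pages and 2 KiB slabs; the `MODEL_*` macros and both functions are hypothetical names, and the "allocator" is a stand-in that succeeds for any order up to `largest_free_order`.

```c
#define MODEL_PAGE_SHIFT 12   /* assumption: 4 KiB pages */
#define MODEL_IO_TLB_SHIFT 11 /* assumption: 2 KiB slabs */
#define MODEL_SLABS_PER_PAGE (1UL << (MODEL_PAGE_SHIFT - MODEL_IO_TLB_SHIFT))
#define MODEL_IO_TLB_MIN_SLABS ((1UL << 20) >> MODEL_IO_TLB_SHIFT)

/*
 * Try want_order first, then smaller orders, as long as the block
 * would still hold more than MODEL_IO_TLB_MIN_SLABS slabs.
 * Returns the order that was "allocated", or -1 if none fits.
 */
int model_alloc_with_fallback(unsigned int want_order,
			      unsigned int largest_free_order)
{
	int order = (int)want_order;

	while ((MODEL_SLABS_PER_PAGE << order) > MODEL_IO_TLB_MIN_SLABS) {
		/* Stand-in for __get_free_pages(): succeeds if small enough. */
		if ((unsigned int)order <= largest_free_order)
			return order;
		order--;
	}
	return -1;
}

/* Size in MB of a block of the given order. */
unsigned long model_order_to_mb(int order)
{
	return ((1UL << MODEL_PAGE_SHIFT) << order) >> 20;
}
```

Under these assumptions a 64MB request is order 14. If the largest free block is order 10 the loop settles on 4MB, and the smallest order it will ever try is 9 (2MB), because an order 8 block holds exactly the minimum number of slabs and the comparison is strict.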
Konrad Rzeszutek Wilk
2012-Jul-31 14:00 UTC
[PATCH 5/5] xen/pcifront: Use Xen-SWIOTLB when initting if required.
We piggyback on "xen/swiotlb: Use the swiotlb_late_init_with_tbl to init
Xen-SWIOTLB late when PV PCI is used." functionality to start up
the Xen-SWIOTLB if we are hot-plugged. This allows us to bypass
the need to supply 'iommu=soft' on the Linux command line (mostly).

With this patch, if a user forgot 'iommu=soft' on the command line
and hotplugs a PCI device, they will get:

pcifront pci-0: Installing PCI frontend
Warning: only able to allocate 4 MB for software IO TLB
software IO TLB [mem 0x2a000000-0x2a3fffff] (4MB) mapped at [ffff88002a000000-ffff88002a3fffff]
pcifront pci-0: Creating PCI Frontend Bus 0000:00
pcifront pci-0: PCI host bridge to bus 0000:00
pci_bus 0000:00: root bus resource [io 0x0000-0xffff]
pci_bus 0000:00: root bus resource [mem 0x00000000-0xfffffffff]
pci 0000:00:00.0: [8086:10d3] type 00 class 0x020000
pci 0000:00:00.0: reg 10: [mem 0xfe5c0000-0xfe5dffff]
pci 0000:00:00.0: reg 14: [mem 0xfe500000-0xfe57ffff]
pci 0000:00:00.0: reg 18: [io 0xe000-0xe01f]
pci 0000:00:00.0: reg 1c: [mem 0xfe5e0000-0xfe5e3fff]
pcifront pci-0: claiming resource 0000:00:00.0/0
pcifront pci-0: claiming resource 0000:00:00.0/1
pcifront pci-0: claiming resource 0000:00:00.0/2
pcifront pci-0: claiming resource 0000:00:00.0/3
e1000e: Intel(R) PRO/1000 Network Driver - 2.0.0-k
e1000e: Copyright(c) 1999 - 2012 Intel Corporation.
e1000e 0000:00:00.0: Disabling ASPM L0s L1
e1000e 0000:00:00.0: enabling device (0000 -> 0002)
e1000e 0000:00:00.0: Xen PCI mapped GSI16 to IRQ34
e1000e 0000:00:00.0: (unregistered net_device): Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
e1000e 0000:00:00.0: eth0: (PCI Express:2.5GT/s:Width x1) 00:1b:21:ab:c6:13
e1000e 0000:00:00.0: eth0: Intel(R) PRO/1000 Network Connection
e1000e 0000:00:00.0: eth0: MAC: 3, PHY: 8, PBA No: E46981-005

The "Warning: only able to allocate" message will go away if one supplies
'iommu=soft' instead, as we then have a higher chance of being able to
allocate large swaths of memory.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/pci/xen-pcifront.c |   14 ++++++++++----
 1 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/xen-pcifront.c b/drivers/pci/xen-pcifront.c
index d6cc62c..ca92801 100644
--- a/drivers/pci/xen-pcifront.c
+++ b/drivers/pci/xen-pcifront.c
@@ -21,6 +21,7 @@
 #include <linux/bitops.h>
 #include <linux/time.h>
+#include <asm/xen/swiotlb-xen.h>
 #define INVALID_GRANT_REF (0)
 #define INVALID_EVTCHN (-1)
@@ -668,7 +669,7 @@ static irqreturn_t pcifront_handler_aer(int irq, void *dev)
 	schedule_pcifront_aer_op(pdev);
 	return IRQ_HANDLED;
 }
-static int pcifront_connect(struct pcifront_device *pdev)
+static int pcifront_connect_and_init_dma(struct pcifront_device *pdev)
 {
 	int err = 0;
@@ -681,9 +682,13 @@ static int pcifront_connect(struct pcifront_device *pdev)
 		dev_err(&pdev->xdev->dev, "PCI frontend already installed!\n");
 		err = -EEXIST;
 	}
-
 	spin_unlock(&pcifront_dev_lock);
+	if (!err && !swiotlb_nr_tbl()) {
+		err = pci_xen_swiotlb_init_late();
+		if (err)
+			dev_err(&pdev->xdev->dev, "Could not setup SWIOTLB!\n");
+	}
 	return err;
 }
@@ -699,6 +704,7 @@ static void pcifront_disconnect(struct pcifront_device *pdev)
 	spin_unlock(&pcifront_dev_lock);
 }
+
 static struct pcifront_device *alloc_pdev(struct xenbus_device *xdev)
 {
 	struct pcifront_device *pdev;
@@ -842,10 +848,10 @@ static int __devinit pcifront_try_connect(struct pcifront_device *pdev)
 	    XenbusStateInitialised)
 		goto out;
-	err = pcifront_connect(pdev);
+	err = pcifront_connect_and_init_dma(pdev);
 	if (err) {
 		xenbus_dev_fatal(pdev->xdev, err,
-				 "Error connecting PCI Frontend");
+				 "Error setting up PCI Frontend");
 		goto out;
 	}
-- 
1.7.7.6
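The connect path only brings up the Xen-SWIOTLB when no bounce buffer exists yet, which is why a guest booted with 'iommu=soft' skips the late allocation. A minimal user-space sketch of that guard is below. `struct model_dma_state` and `model_connect_and_init_dma()` are hypothetical names; `nr_tbl` stands in for the value of `swiotlb_nr_tbl()`, `late_init_rc` for the result of `pci_xen_swiotlb_init_late()`, and the slab count written on success is an arbitrary illustrative value.

```c
/* Hypothetical model of the state consulted by the connect path. */
struct model_dma_state {
	unsigned long nr_tbl; /* models swiotlb_nr_tbl(): 0 means no table */
	int late_init_calls;  /* how often the late init was attempted */
	int late_init_rc;     /* result the modelled late init will return */
};

/*
 * Models the tail of pcifront_connect_and_init_dma(): if connecting
 * worked and no SWIOTLB table exists, try the late initialization.
 */
int model_connect_and_init_dma(struct model_dma_state *s, int connect_err)
{
	int err = connect_err;

	if (!err && !s->nr_tbl) {
		s->late_init_calls++;
		err = s->late_init_rc;
		if (!err)
			s->nr_tbl = 2048; /* arbitrary: a table now exists */
	}
	return err;
}
```

Calling the function twice on the same state attempts the late initialization only once, and a failed connect never touches the SWIOTLB at all.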
Stefano Stabellini
2012-Jul-31 14:46 UTC
Re: [PATCH 2/5] xen/swiotlb: With more than 4GB on 64-bit, disable the native SWIOTLB.
On Tue, 31 Jul 2012, Konrad Rzeszutek Wilk wrote:
> If a PV guest is booted the native SWIOTLB should not be
> turned on. It does not help us (we don't have any PCI devices)
> and it eats 64MB of good memory. In the case of PV guests
> with PCI devices we need the Xen-SWIOTLB one.
>
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
>  arch/x86/xen/pci-swiotlb-xen.c |   12 ++++++++++++
>  1 files changed, 12 insertions(+), 0 deletions(-)
>
> diff --git a/arch/x86/xen/pci-swiotlb-xen.c b/arch/x86/xen/pci-swiotlb-xen.c
> index b6a5340..0d5a214 100644
> --- a/arch/x86/xen/pci-swiotlb-xen.c
> +++ b/arch/x86/xen/pci-swiotlb-xen.c
> @@ -8,6 +8,11 @@
>  #include <xen/xen.h>
>  #include <asm/iommu_table.h>
>
> +#ifdef CONFIG_X86_64
> +#include <asm/iommu.h>
> +#include <asm/dma.h>
> +#endif
> +
>  int xen_swiotlb __read_mostly;
>
>  static struct dma_map_ops xen_swiotlb_dma_ops = {
> @@ -49,6 +54,13 @@ int __init pci_xen_swiotlb_detect(void)
>  	 * the 'swiotlb' flag is the only one turning it on. */
>  	swiotlb = 0;
>
> +#ifdef CONFIG_X86_64
> +	/* pci_swiotlb_detect_4gb turns native SWIOTLB if no_iommu == 0
> +	 * (so no iommu=X command line over-writes). So disable the native
> +	 * SWIOTLB. */

Maybe rewording it would be a good idea:

/* pci_swiotlb_detect_4gb turns on native SWIOTLB if no_iommu == 0
 * (so no iommu=X command line over-writes).
 * Considering that PV guests don't normally have PCI devices it is not
 * useful to us so we set no_iommu to 1 here */

> +	if (max_pfn > MAX_DMA32_PFN)
> +		no_iommu = 1;
> +#endif
>  	return xen_swiotlb;
>  }
>
> --
> 1.7.7.6
>
Konrad Rzeszutek Wilk
2012-Jul-31 18:25 UTC
Re: [PATCH 2/5] xen/swiotlb: With more than 4GB on 64-bit, disable the native SWIOTLB.
On Tue, Jul 31, 2012 at 03:46:04PM +0100, Stefano Stabellini wrote:
> On Tue, 31 Jul 2012, Konrad Rzeszutek Wilk wrote:
> > If a PV guest is booted the native SWIOTLB should not be
> > turned on. It does not help us (we don't have any PCI devices)
> > and it eats 64MB of good memory. In the case of PV guests
> > with PCI devices we need the Xen-SWIOTLB one.
> >
> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > ---
> >  arch/x86/xen/pci-swiotlb-xen.c |   12 ++++++++++++
> >  1 files changed, 12 insertions(+), 0 deletions(-)
> >
> > diff --git a/arch/x86/xen/pci-swiotlb-xen.c b/arch/x86/xen/pci-swiotlb-xen.c
> > index b6a5340..0d5a214 100644
> > --- a/arch/x86/xen/pci-swiotlb-xen.c
> > +++ b/arch/x86/xen/pci-swiotlb-xen.c
> > @@ -8,6 +8,11 @@
> >  #include <xen/xen.h>
> >  #include <asm/iommu_table.h>
> >
> > +#ifdef CONFIG_X86_64
> > +#include <asm/iommu.h>
> > +#include <asm/dma.h>
> > +#endif
> > +
> >  int xen_swiotlb __read_mostly;
> >
> >  static struct dma_map_ops xen_swiotlb_dma_ops = {
> > @@ -49,6 +54,13 @@ int __init pci_xen_swiotlb_detect(void)
> >  	 * the 'swiotlb' flag is the only one turning it on. */
> >  	swiotlb = 0;
> >
> > +#ifdef CONFIG_X86_64
> > +	/* pci_swiotlb_detect_4gb turns native SWIOTLB if no_iommu == 0
> > +	 * (so no iommu=X command line over-writes). So disable the native
> > +	 * SWIOTLB. */
>
> Maybe rewording it would be a good idea:
>
> /* pci_swiotlb_detect_4gb turns on native SWIOTLB if no_iommu == 0
>  * (so no iommu=X command line over-writes).
>  * Considering that PV guests don't normally have PCI devices it is not
>  * useful to us so we set no_iommu to 1 here */
>

commit 21ef55f4ab2b6d63eb0ed86abbc959d31377853b
Author: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Date:   Fri Jul 27 20:16:00 2012 -0400

    xen/swiotlb: With more than 4GB on 64-bit, disable the native SWIOTLB.

    If a PV guest is booted the native SWIOTLB should not be
    turned on. It does not help us (we don't have any PCI devices)
    and it eats 64MB of good memory. In the case of PV guests
    with PCI devices we need the Xen-SWIOTLB one.

    [v1: Rewrite comment per Stefano's suggestion]
    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

diff --git a/arch/x86/xen/pci-swiotlb-xen.c b/arch/x86/xen/pci-swiotlb-xen.c
index b6a5340..1c17227 100644
--- a/arch/x86/xen/pci-swiotlb-xen.c
+++ b/arch/x86/xen/pci-swiotlb-xen.c
@@ -8,6 +8,11 @@
 #include <xen/xen.h>
 #include <asm/iommu_table.h>
 
+#ifdef CONFIG_X86_64
+#include <asm/iommu.h>
+#include <asm/dma.h>
+#endif
+
 int xen_swiotlb __read_mostly;
 
 static struct dma_map_ops xen_swiotlb_dma_ops = {
@@ -49,6 +54,15 @@ int __init pci_xen_swiotlb_detect(void)
 	 * the 'swiotlb' flag is the only one turning it on. */
 	swiotlb = 0;
 
+#ifdef CONFIG_X86_64
+	/* pci_swiotlb_detect_4gb turns on native SWIOTLB if no_iommu == 0
+	 * (so no iommu=X command line over-writes).
+	 * Considering that PV guests do not want the *native SWIOTLB* but
+	 * only Xen SWIOTLB it is not useful to us so set no_iommu=1 here.
+	 */
+	if (max_pfn > MAX_DMA32_PFN)
+		no_iommu = 1;
+#endif
 	return xen_swiotlb;
 }
Stefano Stabellini
2012-Aug-01 10:09 UTC
Re: [Xen-devel] [PATCH 2/5] xen/swiotlb: With more than 4GB on 64-bit, disable the native SWIOTLB.
On Tue, 31 Jul 2012, Konrad Rzeszutek Wilk wrote:
> commit 21ef55f4ab2b6d63eb0ed86abbc959d31377853b
> Author: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Date:   Fri Jul 27 20:16:00 2012 -0400
>
>     xen/swiotlb: With more than 4GB on 64-bit, disable the native SWIOTLB.
>
>     If a PV guest is booted the native SWIOTLB should not be
>     turned on. It does not help us (we don't have any PCI devices)
>     and it eats 64MB of good memory. In the case of PV guests
>     with PCI devices we need the Xen-SWIOTLB one.
>
>     [v1: Rewrite comment per Stefano's suggestion]
>     Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

> diff --git a/arch/x86/xen/pci-swiotlb-xen.c b/arch/x86/xen/pci-swiotlb-xen.c
> index b6a5340..1c17227 100644
> --- a/arch/x86/xen/pci-swiotlb-xen.c
> +++ b/arch/x86/xen/pci-swiotlb-xen.c
> @@ -8,6 +8,11 @@
>  #include <xen/xen.h>
>  #include <asm/iommu_table.h>
>
> +#ifdef CONFIG_X86_64
> +#include <asm/iommu.h>
> +#include <asm/dma.h>
> +#endif
> +
>  int xen_swiotlb __read_mostly;
>
>  static struct dma_map_ops xen_swiotlb_dma_ops = {
> @@ -49,6 +54,15 @@ int __init pci_xen_swiotlb_detect(void)
>  	 * the 'swiotlb' flag is the only one turning it on. */
>  	swiotlb = 0;
>
> +#ifdef CONFIG_X86_64
> +	/* pci_swiotlb_detect_4gb turns on native SWIOTLB if no_iommu == 0
> +	 * (so no iommu=X command line over-writes).
> +	 * Considering that PV guests do not want the *native SWIOTLB* but
> +	 * only Xen SWIOTLB it is not useful to us so set no_iommu=1 here.
> +	 */
> +	if (max_pfn > MAX_DMA32_PFN)
> +		no_iommu = 1;
> +#endif
>  	return xen_swiotlb;
>  }
>
>