AMD Opteron and Athlon 64 processors have an AGP aperture and GART built into the processor. Linux has used the AGP GART as an IOMMU to improve the performance of 32-bit only PCI devices when DMA''ing into physical memory above 0xffffffff. This patch provides a similar capability for Xen dom0. Most of the code simply migrates the native Linux aperture.c and pci-gart.c to dom0. A new memory op is added to clear the aperture mapping from the hypervisor''s page tables, which is necessary to prevent cache alias issues resulting from processor speculation. My thanks to Ulrich Meis [um@amd64.org], who was instrumental in developing and testing this patch. Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com> _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
>--- a/linux-2.6-xen-sparse/arch/i386/mm/hypervisor.c Fri Feb 09 10:48:41 2007 +0000 >+++ b/linux-2.6-xen-sparse/arch/i386/mm/hypervisor.c Fri Feb 09 16:32:04 2007 -0600 >@@ -252,7 +252,7 @@ static void contiguous_bitmap_clear( > } > > /* Protected by balloon_lock. */ >-#define MAX_CONTIG_ORDER 9 /* 2MB */ >+#define MAX_CONTIG_ORDER 16 /* 256MB */ > static unsigned long discontig_frames[1<<MAX_CONTIG_ORDER]; > static multicall_entry_t cr_mcl[1<<MAX_CONTIG_ORDER];This seems dangerous to me.>--- a/linux-2.6-xen-sparse/include/asm-x86_64/mach-xen/asm/dma-mapping.h Fri Feb 09 10:48:41 2007 +0000 >+++ b/linux-2.6-xen-sparse/include/asm-x86_64/mach-xen/asm/dma-mapping.h Fri Feb 09 16:32:04 2007 -0600 >@@ -62,7 +62,12 @@ static inline int valid_dma_direction(in > (dma_direction == DMA_FROM_DEVICE)); > } > >-#if 0 >+#ifdef CONFIG_XEN >+#define global_need_iommu() 1 >+#else >+#define global_need_iommu() (HYPERVISOR_memory_op(XENMEM_maximum_ram_page, NULL)>MAX_DMA32_PFN) >+#endif >+ > static inline int dma_mapping_error(dma_addr_t dma_addr) > { > if (dma_ops->mapping_error)HYPERVISOR_memory_op() if CONFIG_XEN is undefined?>--- a/linux-2.6-xen-sparse/lib/Makefile Fri Feb 09 10:48:41 2007 +0000 >+++ b/linux-2.6-xen-sparse/lib/Makefile Fri Feb 09 16:32:04 2007 -0600 >@@ -51,8 +51,7 @@ obj-$(CONFIG_SMP) += percpu_counter.o > obj-$(CONFIG_SMP) += percpu_counter.o > obj-$(CONFIG_AUDIT_GENERIC) += audit.o > >-obj-$(CONFIG_SWIOTLB) += swiotlb.o >-swiotlb-$(CONFIG_XEN) := ../arch/i386/kernel/swiotlb.o >+obj-$(CONFIG_SWIOTLB) += swiotlb-xen.o > > hostprogs-y := gen_crc32table > clean-files := crc32table.hThis seems very unlikely to have been tested in a native build. You should use cherry-pick-xen in the file. I generally welcome moving arch/i386/kernel/swiotlb.c to lib/swiotlb-xen.c, but would appreciate if you then also removed the original file (and perhaps this should be done as a separate patch, so that it''d be clear that the move itself doesn''t change the file in any way (and if you need changes to it for the IOMMU patch, have only those changes in the patch here). Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On Mon, Feb 12, 2007 at 11:47:09AM -0600, Langsdorf, Mark wrote:> AMD Opteron and Athlon 64 processors have an AGP aperture and GART > built into the processor. Linux has used the AGP GART as an IOMMU > to improve the performance of 32-bit only PCI devices when DMA''ing > into physical memory above 0xffffffff. This patch provides a > similar capability for Xen dom0. > > Most of the code simply migrates the native Linux aperture.c and > pci-gart.c to dom0. A new memory op is added to clear the aperture > mapping from the hypervisor''s page tables, which is necessary to > prevent cache alias issues resulting from processor speculation.Could you please split it up into two patches, one of which does the bulk movement around and the other which adds GART support, to make reviewing easier? Thanks, Muli _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
> >--- a/linux-2.6-xen-sparse/arch/i386/mm/hypervisor.c Fri Feb > 09 10:48:41 2007 +0000 > >+++ b/linux-2.6-xen-sparse/arch/i386/mm/hypervisor.c Fri Feb > 09 16:32:04 2007 -0600 > >@@ -252,7 +252,7 @@ static void contiguous_bitmap_clear( > > } > > > > /* Protected by balloon_lock. */ > >-#define MAX_CONTIG_ORDER 9 /* 2MB */ > >+#define MAX_CONTIG_ORDER 16 /* 256MB */ > > static unsigned long discontig_frames[1<<MAX_CONTIG_ORDER]; > > static multicall_entry_t cr_mcl[1<<MAX_CONTIG_ORDER]; > > This seems dangerous to me.We need at least 64MB of contiguous memory for the aperture.> a/linux-2.6-xen-sparse/include/asm-x86_64/mach-xen/asm/dma-ma > pping.h Fri Feb 09 10:48:41 2007 +0000 > >+++ > b/linux-2.6-xen-sparse/include/asm-x86_64/mach-xen/asm/dma-ma > pping.h Fri Feb 09 16:32:04 2007 -0600 > >@@ -62,7 +62,12 @@ static inline int valid_dma_direction(in > > (dma_direction == DMA_FROM_DEVICE)); > > } > > > >-#if 0 > >+#ifdef CONFIG_XEN > >+#define global_need_iommu() 1 > >+#else > >+#define global_need_iommu() > (HYPERVISOR_memory_op(XENMEM_maximum_ram_page, NULL)>MAX_DMA32_PFN) > >+#endif > >+ > > static inline int dma_mapping_error(dma_addr_t dma_addr) > > { > > if (dma_ops->mapping_error) > > HYPERVISOR_memory_op() if CONFIG_XEN is undefined?Will fix.> >--- a/linux-2.6-xen-sparse/lib/Makefile Fri Feb 09 > 10:48:41 2007 +0000 > >+++ b/linux-2.6-xen-sparse/lib/Makefile Fri Feb 09 > 16:32:04 2007 -0600 > >@@ -51,8 +51,7 @@ obj-$(CONFIG_SMP) += percpu_counter.o > > obj-$(CONFIG_SMP) += percpu_counter.o > > obj-$(CONFIG_AUDIT_GENERIC) += audit.o > > > >-obj-$(CONFIG_SWIOTLB) += swiotlb.o > >-swiotlb-$(CONFIG_XEN) := ../arch/i386/kernel/swiotlb.o > >+obj-$(CONFIG_SWIOTLB) += swiotlb-xen.o > > > > hostprogs-y := gen_crc32table > > clean-files := crc32table.h > > This seems very unlikely to have been tested in a native > build. You should use cherry-pick-xen in the file.Okay.> I generally welcome moving arch/i386/kernel/swiotlb.c to > lib/swiotlb-xen.c, but would appreciate if you then also > removed the original file (and perhaps this should be done > as a separate patch, so that it''d be clear that the move > itself doesn''t change the file in any way (and if you need > changes to it for the IOMMU patch, have only those changes > in the patch here).I''ll seperate the reorg a bit and make it clearer. Thanks for looking at this. -Mark Langsdorf AMD, Inc. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
>>> "Langsdorf, Mark" <mark.langsdorf@amd.com> 14.02.07 00:19 >>> >> >--- a/linux-2.6-xen-sparse/arch/i386/mm/hypervisor.c Fri Feb >> 09 10:48:41 2007 +0000 >> >+++ b/linux-2.6-xen-sparse/arch/i386/mm/hypervisor.c Fri Feb >> 09 16:32:04 2007 -0600 >> >@@ -252,7 +252,7 @@ static void contiguous_bitmap_clear( >> > } >> > >> > /* Protected by balloon_lock. */ >> >-#define MAX_CONTIG_ORDER 9 /* 2MB */ >> >+#define MAX_CONTIG_ORDER 16 /* 256MB */ >> > static unsigned long discontig_frames[1<<MAX_CONTIG_ORDER]; >> > static multicall_entry_t cr_mcl[1<<MAX_CONTIG_ORDER]; >> >> This seems dangerous to me. > >We need at least 64MB of contiguous memory for the aperture.I think this should then be established by some other means. 256Mb (and even the minimum of 64Mb that you say you need) is, for contiguous memory, almost as good as no limit at all. Among other reservations I have - how do you intend to make sure the hypervisor even has a chance of fulfilling this big a request? But I would certainly like to know Keir''s opinion here, too. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 13/2/07 23:19, "Langsdorf, Mark" <mark.langsdorf@amd.com> wrote:>>> /* Protected by balloon_lock. */ >>> -#define MAX_CONTIG_ORDER 9 /* 2MB */ >>> +#define MAX_CONTIG_ORDER 16 /* 256MB */ >>> static unsigned long discontig_frames[1<<MAX_CONTIG_ORDER]; >>> static multicall_entry_t cr_mcl[1<<MAX_CONTIG_ORDER]; >> >> This seems dangerous to me. > > We need at least 64MB of contiguous memory for the aperture.Not that I know anything much about the K8 GART, but I assume the aperture is an address range that the GART takes control of and dynamically aliases other RAM pages into? Is it necessary to burn 64MB of RAM (which is presumably inaccessible when the GART is turned on)? Will the BIOS not already have conveniently piointed the aperture into a RAM hole (e.g., just below 4GB)? -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On Wed 14.02.07 10:04, Keir Fraser wrote:> On 13/2/07 23:19, "Langsdorf, Mark" <mark.langsdorf@amd.com> wrote: > > >>> /* Protected by balloon_lock. */ > >>> -#define MAX_CONTIG_ORDER 9 /* 2MB */ > >>> +#define MAX_CONTIG_ORDER 16 /* 256MB */ > >>> static unsigned long discontig_frames[1<<MAX_CONTIG_ORDER]; > >>> static multicall_entry_t cr_mcl[1<<MAX_CONTIG_ORDER]; > >> > >> This seems dangerous to me. > > > > We need at least 64MB of contiguous memory for the aperture. > > Not that I know anything much about the K8 GART, but I assume the aperture > is an address range that the GART takes control of and dynamically aliases > other RAM pages into? Is it necessary to burn 64MB of RAM (which is > presumably inaccessible when the GART is turned on)? Will the BIOS not > already have conveniently piointed the aperture into a RAM hole (e.g., just > below 4GB)?Usually, yes, the BIOS should allocate the aperture. However, on all systems I''ve tested on the BIOS allocated only 32MB (probably because they had no AGP). Sometimes it would even keep the memory location that was set on the last boot---potentially pointing to usable RAM. The patch will only allocate the aperture from RAM if the BIOS reserved less than 64MB. In that case, it will also make the call to xen_create_contiguous_region. The memory is lost. There''s also a boot mesage telling people that. Concerning the availability of 64MB contiguous RAM: The hypervisor keeps if dom0_mem is unspecified 1/16th of memory free, which is for >4GB systems (where one needs a GART) at least 256MB. Therefore, the allocation should always succeed unless someone''s tweaking that parameter. Uli _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 14/2/07 14:33, "Uli Meis" <um@amd64.org> wrote:> Usually, yes, the BIOS should allocate the aperture. However, on all > systems I''ve tested on the BIOS allocated only 32MB (probably because > they had no AGP). Sometimes it would even keep the memory location that > was set on the last boot---potentially pointing to usable RAM. > > The patch will only allocate the aperture from RAM if the BIOS reserved > less than 64MB. In that case, it will also make the call to > xen_create_contiguous_region. The memory is lost. There''s also a boot > mesage telling people that.Might it typically be possible to find some free space in the hole just below 4GB? I don''t believe most BIOSes make effort to keep the RAM hole small. Looking back to the particular approach you have taken, expanding the static argument arrays for this one particular user is not really desirable. It would be better to do a dynamic GFP_ATOMIC allocation in the case that order> MAX_CONTIG_ORDER. We want to avoid that in the general case becauseGFP_ATOMIC allocations can fail, but this will be happening at start of day when there should be no memory pressure. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel