Stefano Stabellini
2011-Jun-07 18:13 UTC
[Xen-devel] [PATCH 0/3] x86: remove x86_init.mapping.pagetable_reserve
Currently find_early_table_space calculates an overestimate of how much
memory the pagetable for the 1:1 mapping is going to need. After
kernel_physical_mapping_init completes we know exactly how much memory we
used, so we memblock reserve only the used memory and "free" the rest.

This patch series modifies find_early_table_space to calculate the exact
amount of memory we need for the 1:1 mapping, so that we can memblock
reserve it right away and we don't need to free the unused memory after
kernel_physical_mapping_init.

At this point we can also safely revert "x86,xen: introduce
x86_init.mapping.pagetable_reserve".

The list of patches with diffstat follows:

Stefano Stabellini (3):
      x86: calculate precisely the memory needed by init_memory_mapping
      Revert "x86,xen: introduce x86_init.mapping.pagetable_reserve"
      x86: move memblock_x86_reserve_range PGTABLE to find_early_table_space

 arch/x86/include/asm/pgtable_types.h |    1 -
 arch/x86/include/asm/x86_init.h      |   12 -----
 arch/x86/kernel/x86_init.c           |    4 --
 arch/x86/mm/init.c                   |   87 +++++++++++++++++++---------------
 arch/x86/xen/mmu.c                   |   15 ------
 5 files changed, 49 insertions(+), 70 deletions(-)

Many thanks to Konrad, who helped me review the patch series and performed
an impressive number of tests:

*Configurations
baremetal Linux 64-bit
baremetal Linux 32-bit NOHIGHMEM, HIGHMEM4G and HIGHMEM64G
32-bit and 64-bit Linux on Xen
32-bit and 64-bit Linux HVM on Xen

*Hardware
AMD development box (Tilapia) - 8GB
AMD BIOSTAR Grp N61PB-M2S/N61PB-M2S (Sempron) - 4GB
Intel DX58SO (Core i7) - 8GB
Supermicro X7DB8/X7DB8 (Harpertown) - 4GB
IBM x3850 (Cranford) - 8GB
MSI MS-7680/H61M-P23 (MS-7680) (SandyBridge, i2500) - 8GB

A git branch based on 3.0-rc1 is available here:

git://xenbits.xen.org/people/sstabellini/linux-pvhvm.git 3.0-rc1-rem_pg_reserve-4

- Stefano

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
<stefano.stabellini@eu.citrix.com>
2011-Jun-07 18:13 UTC
[Xen-devel] [PATCH 1/3] x86: calculate precisely the memory needed by init_memory_mapping
From: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

- take into account the size of the initial pagetable;

- remove the extra PMD_SIZE added when use_pse, because the previously
allocated PMDs are always 2M aligned;

- remove the extra page added on x86_32 for the fixmap because is not
needed: the PMD entry is already allocated and contiguous for the whole
range (a PMD page covers 4G of virtual addresses) and the pte entry is
already allocated by early_ioremap_init.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 arch/x86/mm/init.c |   62 ++++++++++++++++++++++++++++++++++++++-------------
 1 files changed, 46 insertions(+), 16 deletions(-)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 3032644..0cfe8d4 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -28,22 +28,52 @@ int direct_gbpages
 #endif
 ;
 
-static void __init find_early_table_space(unsigned long end, int use_pse,
-					  int use_gbpages)
+static void __init find_early_table_space(unsigned long start,
+		unsigned long end, int use_pse, int use_gbpages)
 {
-	unsigned long puds, pmds, ptes, tables, start = 0, good_end = end;
+	unsigned long pmds = 0, ptes = 0, tables = 0, good_end = end,
+		      pud_mapped = 0, pmd_mapped = 0, size = end - start;
 	phys_addr_t base;
 
-	puds = (end + PUD_SIZE - 1) >> PUD_SHIFT;
-	tables = roundup(puds * sizeof(pud_t), PAGE_SIZE);
+	pud_mapped = DIV_ROUND_UP(PFN_PHYS(max_pfn_mapped),
+			(PUD_SIZE * PTRS_PER_PUD));
+	pud_mapped *= (PUD_SIZE * PTRS_PER_PUD);
+	pmd_mapped = DIV_ROUND_UP(PFN_PHYS(max_pfn_mapped),
+			(PMD_SIZE * PTRS_PER_PMD));
+	pmd_mapped *= (PMD_SIZE * PTRS_PER_PMD);
+
+	if (start < PFN_PHYS(max_pfn_mapped)) {
+		if (PFN_PHYS(max_pfn_mapped) < end)
+			size -= PFN_PHYS(max_pfn_mapped) - start;
+		else
+			size = 0;
+	}
+
+#ifndef __PAGETABLE_PUD_FOLDED
+	if (end > pud_mapped) {
+		unsigned long puds;
+		if (start < pud_mapped)
+			puds = (end - pud_mapped + PUD_SIZE - 1) >> PUD_SHIFT;
+		else
+			puds = (end - start + PUD_SIZE - 1) >> PUD_SHIFT;
+		tables += roundup(puds * sizeof(pud_t), PAGE_SIZE);
+	}
+#endif
 
 	if (use_gbpages) {
 		unsigned long extra;
 
 		extra = end - ((end>>PUD_SHIFT) << PUD_SHIFT);
 		pmds = (extra + PMD_SIZE - 1) >> PMD_SHIFT;
-	} else
-		pmds = (end + PMD_SIZE - 1) >> PMD_SHIFT;
+	}
+#ifndef __PAGETABLE_PMD_FOLDED
+	else if (end > pmd_mapped) {
+		if (start < pmd_mapped)
+			pmds = (end - pmd_mapped + PMD_SIZE - 1) >> PMD_SHIFT;
+		else
+			pmds = (end - start + PMD_SIZE - 1) >> PMD_SHIFT;
+	}
+#endif
 
 	tables += roundup(pmds * sizeof(pmd_t), PAGE_SIZE);
 
@@ -51,23 +81,20 @@ static void __init find_early_table_space(unsigned long end, int use_pse,
 		unsigned long extra;
 
 		extra = end - ((end>>PMD_SHIFT) << PMD_SHIFT);
-#ifdef CONFIG_X86_32
-		extra += PMD_SIZE;
-#endif
 		ptes = (extra + PAGE_SIZE - 1) >> PAGE_SHIFT;
 	} else
-		ptes = (end + PAGE_SIZE - 1) >> PAGE_SHIFT;
+		ptes = (size + PAGE_SIZE - 1) >> PAGE_SHIFT;
 
 	tables += roundup(ptes * sizeof(pte_t), PAGE_SIZE);
 
-#ifdef CONFIG_X86_32
-	/* for fixmap */
-	tables += roundup(__end_of_fixed_addresses * sizeof(pte_t), PAGE_SIZE);
+	if (!tables)
+		return;
 
+#ifdef CONFIG_X86_32
 	good_end = max_pfn_mapped << PAGE_SHIFT;
 #endif
 
-	base = memblock_find_in_range(start, good_end, tables, PAGE_SIZE);
+	base = memblock_find_in_range(0x00, good_end, tables, PAGE_SIZE);
 	if (base == MEMBLOCK_ERROR)
 		panic("Cannot find space for the kernel page tables");
 
@@ -261,7 +288,7 @@ unsigned long __init_refok init_memory_mapping(unsigned long start,
 	 * nodes are discovered.
 	 */
 	if (!after_bootmem)
-		find_early_table_space(end, use_pse, use_gbpages);
+		find_early_table_space(start, end, use_pse, use_gbpages);
 
 	for (i = 0; i < nr_range; i++)
 		ret = kernel_physical_mapping_init(mr[i].start, mr[i].end,
@@ -275,6 +302,9 @@ unsigned long __init_refok init_memory_mapping(unsigned long start,
 
 	__flush_tlb_all();
 
+	if (pgt_buf_end != pgt_buf_top)
+		printk(KERN_DEBUG "initial kernel pagetable allocation wasted %lx"
+				" pages\n", pgt_buf_top - pgt_buf_end);
 	/*
 	 * Reserve the kernel pagetable pages we used (pgt_buf_start -
 	 * pgt_buf_end) and free the other ones (pgt_buf_end - pgt_buf_top)
-- 
1.7.2.3
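The "take into account the size of the initial pagetable" step for PMDs can be tried outside the kernel. What follows is only a user-space sketch under stated assumptions, not the kernel code: the `SK_*` constants are assumed x86_64 values (2MB per PMD entry, 512 entries per PMD page), and `round_up_multiple` and `pmd_entries_needed` are hypothetical helper names. It mirrors the `pmd_mapped` rounding and the `else if (end > pmd_mapped)` branch of the patch; the `use_gbpages` branch is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86_64 values; the kernel takes these from its own headers. */
#define SK_PMD_SHIFT     21                          /* 2MB per PMD entry */
#define SK_PMD_SIZE      (UINT64_C(1) << SK_PMD_SHIFT)
#define SK_PTRS_PER_PMD  UINT64_C(512)               /* entries per PMD page */

/* Round x up to the next multiple of step (step must be non-zero). */
static uint64_t round_up_multiple(uint64_t x, uint64_t step)
{
	assert(step > 0);
	return ((x + step - 1) / step) * step;
}

/*
 * Number of new PMD entries needed to map [start, end), given that the
 * initial page table already maps physical memory up to mapped_end.
 * Hypothetical helper modelled on the non-gbpages branch of the patch.
 */
static uint64_t pmd_entries_needed(uint64_t start, uint64_t end,
				   uint64_t mapped_end)
{
	/* Span covered by the PMD pages that already exist (1GB each). */
	uint64_t pmd_mapped = round_up_multiple(mapped_end,
						SK_PMD_SIZE * SK_PTRS_PER_PMD);

	if (end <= pmd_mapped)
		return 0;	/* existing PMD pages already cover the range */
	if (start < pmd_mapped)
		return (end - pmd_mapped + SK_PMD_SIZE - 1) >> SK_PMD_SHIFT;
	return (end - start + SK_PMD_SIZE - 1) >> SK_PMD_SHIFT;
}
```

With 512MB initially mapped, `pmd_mapped` rounds up to 1GB, so a mapping that ends below 1GB asks for no new PMD entries, while one that ends at 2GB asks for 512.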
<stefano.stabellini@eu.citrix.com>
2011-Jun-07 18:13 UTC
[Xen-devel] [PATCH 2/3] Revert "x86, xen: introduce x86_init.mapping.pagetable_reserve"
From: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

This reverts commit 279b706bf800b5967037f492dbe4fc5081ad5d0f.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 arch/x86/include/asm/pgtable_types.h |    1 -
 arch/x86/include/asm/x86_init.h      |   12 ------------
 arch/x86/kernel/x86_init.c           |    4 ----
 arch/x86/mm/init.c                   |   25 +++----------------------
 arch/x86/xen/mmu.c                   |   15 ---------------
 5 files changed, 3 insertions(+), 54 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index d56187c..7db7723 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -299,7 +299,6 @@ int phys_mem_access_prot_allowed(struct file *file, unsigned long pfn,
 /* Install a pte for a particular vaddr in kernel space. */
 void set_pte_vaddr(unsigned long vaddr, pte_t pte);
 
-extern void native_pagetable_reserve(u64 start, u64 end);
 #ifdef CONFIG_X86_32
 extern void native_pagetable_setup_start(pgd_t *base);
 extern void native_pagetable_setup_done(pgd_t *base);
diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index d3d8590..643ebf2 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -68,17 +68,6 @@ struct x86_init_oem {
 };
 
 /**
- * struct x86_init_mapping - platform specific initial kernel pagetable setup
- * @pagetable_reserve:	reserve a range of addresses for kernel pagetable usage
- *
- * For more details on the purpose of this hook, look in
- * init_memory_mapping and the commit that added it.
- */
-struct x86_init_mapping {
-	void (*pagetable_reserve)(u64 start, u64 end);
-};
-
-/**
  * struct x86_init_paging - platform specific paging functions
  * @pagetable_setup_start:	platform specific pre paging_init() call
  * @pagetable_setup_done:	platform specific post paging_init() call
@@ -134,7 +123,6 @@ struct x86_init_ops {
 	struct x86_init_mpparse		mpparse;
 	struct x86_init_irqs		irqs;
 	struct x86_init_oem		oem;
-	struct x86_init_mapping		mapping;
 	struct x86_init_paging		paging;
 	struct x86_init_timers		timers;
 	struct x86_init_iommu		iommu;
diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
index 6f164bd..6eee082 100644
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -61,10 +61,6 @@ struct x86_init_ops x86_init __initdata = {
 		.banner			= default_banner,
 	},
 
-	.mapping = {
-		.pagetable_reserve	= native_pagetable_reserve,
-	},
-
 	.paging = {
 		.pagetable_setup_start	= native_pagetable_setup_start,
 		.pagetable_setup_done	= native_pagetable_setup_done,
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 0cfe8d4..15590fd 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -106,11 +106,6 @@ static void __init find_early_table_space(unsigned long start,
 		end, pgt_buf_start << PAGE_SHIFT, pgt_buf_top << PAGE_SHIFT);
 }
 
-void __init native_pagetable_reserve(u64 start, u64 end)
-{
-	memblock_x86_reserve_range(start, end, "PGTABLE");
-}
-
 struct map_range {
 	unsigned long start;
 	unsigned long end;
@@ -305,24 +300,10 @@ unsigned long __init_refok init_memory_mapping(unsigned long start,
 	if (pgt_buf_end != pgt_buf_top)
 		printk(KERN_DEBUG "initial kernel pagetable allocation wasted %lx"
 				" pages\n", pgt_buf_top - pgt_buf_end);
-	/*
-	 * Reserve the kernel pagetable pages we used (pgt_buf_start -
-	 * pgt_buf_end) and free the other ones (pgt_buf_end - pgt_buf_top)
-	 * so that they can be reused for other purposes.
-	 *
-	 * On native it just means calling memblock_x86_reserve_range, on Xen it
-	 * also means marking RW the pagetable pages that we allocated before
-	 * but that haven't been used.
-	 *
-	 * In fact on xen we mark RO the whole range pgt_buf_start -
-	 * pgt_buf_top, because we have to make sure that when
-	 * init_memory_mapping reaches the pagetable pages area, it maps
-	 * RO all the pagetable pages, including the ones that are beyond
-	 * pgt_buf_end at that time.
-	 */
+
 	if (!after_bootmem && pgt_buf_end > pgt_buf_start)
-		x86_init.mapping.pagetable_reserve(PFN_PHYS(pgt_buf_start),
-				PFN_PHYS(pgt_buf_end));
+		memblock_x86_reserve_range(pgt_buf_start << PAGE_SHIFT,
+				 pgt_buf_end << PAGE_SHIFT, "PGTABLE");
 
 	if (!after_bootmem)
 		early_memtest(start, end);
diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index dc708dc..2004f1e 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -1153,20 +1153,6 @@ static void __init xen_pagetable_setup_start(pgd_t *base)
 {
 }
 
-static __init void xen_mapping_pagetable_reserve(u64 start, u64 end)
-{
-	/* reserve the range used */
-	native_pagetable_reserve(start, end);
-
-	/* set as RW the rest */
-	printk(KERN_DEBUG "xen: setting RW the range %llx - %llx\n", end,
-			PFN_PHYS(pgt_buf_top));
-	while (end < PFN_PHYS(pgt_buf_top)) {
-		make_lowmem_page_readwrite(__va(end));
-		end += PAGE_SIZE;
-	}
-}
-
 static void xen_post_allocator_init(void);
 
 static void __init xen_pagetable_setup_done(pgd_t *base)
@@ -1997,7 +1983,6 @@ static const struct pv_mmu_ops xen_mmu_ops __initconst = {
 
 void __init xen_init_mmu_ops(void)
 {
-	x86_init.mapping.pagetable_reserve = xen_mapping_pagetable_reserve;
 	x86_init.paging.pagetable_setup_start = xen_pagetable_setup_start;
 	x86_init.paging.pagetable_setup_done = xen_pagetable_setup_done;
 	pv_mmu_ops = xen_mmu_ops;
-- 
1.7.2.3
<stefano.stabellini@eu.citrix.com>
2011-Jun-07 18:13 UTC
[Xen-devel] [PATCH 3/3] x86: move memblock_x86_reserve_range PGTABLE to find_early_table_space
From: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

Now that find_early_table_space knows how to calculate the exact amount
of memory needed by the kernel pagetable, we can reserve the range
directly in find_early_table_space.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 arch/x86/mm/init.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 15590fd..36bacfe 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -104,6 +104,10 @@ static void __init find_early_table_space(unsigned long start,
 
 	printk(KERN_DEBUG "kernel direct mapping tables up to %lx @ %lx-%lx\n",
 		end, pgt_buf_start << PAGE_SHIFT, pgt_buf_top << PAGE_SHIFT);
+
+	if (pgt_buf_top > pgt_buf_start)
+		memblock_x86_reserve_range(pgt_buf_start << PAGE_SHIFT,
+				 pgt_buf_top << PAGE_SHIFT, "PGTABLE");
 }
 
 struct map_range {
@@ -301,10 +305,6 @@ unsigned long __init_refok init_memory_mapping(unsigned long start,
 		printk(KERN_DEBUG "initial kernel pagetable allocation wasted %lx"
 				" pages\n", pgt_buf_top - pgt_buf_end);
 
-	if (!after_bootmem && pgt_buf_end > pgt_buf_start)
-		memblock_x86_reserve_range(pgt_buf_start << PAGE_SHIFT,
-				 pgt_buf_end << PAGE_SHIFT, "PGTABLE");
-
 	if (!after_bootmem)
 		early_memtest(start, end);
 
-- 
1.7.2.3
Ian Jackson
2011-Jun-17 17:11 UTC
Re: [Xen-devel] [PATCH 1/3] x86: calculate precisely the memory needed by init_memory_mapping
stefano.stabellini@eu.citrix.com writes ("[Xen-devel] [PATCH 1/3] x86: calculate precisely the memory needed by init_memory_mapping"):
> +	if (pgt_buf_end != pgt_buf_top)
> +		printk(KERN_DEBUG "initial kernel pagetable allocation wasted %lx"
> +				" pages\n", pgt_buf_top - pgt_buf_end);

If (due to a bug) pgt_buf_end > pgt_buf_top, this will printk a
message about wasting a negative number of pages, rather than
crashing.  Is there something else that will catch this case ?

Ian.
Stefano Stabellini
2011-Jun-20 19:16 UTC
Re: [Xen-devel] [PATCH 1/3] x86: calculate precisely the memory needed by init_memory_mapping
On Fri, 17 Jun 2011, Ian Jackson wrote:
> stefano.stabellini@eu.citrix.com writes ("[Xen-devel] [PATCH 1/3] x86: calculate precisely the memory needed by init_memory_mapping"):
> > +	if (pgt_buf_end != pgt_buf_top)
> > +		printk(KERN_DEBUG "initial kernel pagetable allocation wasted %lx"
> > +				" pages\n", pgt_buf_top - pgt_buf_end);
>
> If (due to a bug) pgt_buf_end > pgt_buf_top, this will printk a
> message about wasting a negative number of pages, rather than
> crashing.  Is there something else that will catch this case ?

Thanks for reviewing this patch!

Yes, there is something else that catches this case: both the 32 bit and
the 64 bit versions of alloc_low_page contain this code:

	unsigned long pfn = pgt_buf_end++;

	if (pfn >= pgt_buf_top)
		panic("alloc_low_page: ran out of memory");

so we are safe from that point of view.
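The invariant behind this answer can be modelled as a small bump allocator. The following is a user-space sketch, not the kernel's alloc_low_page: `struct pgt_buf`, `buf_alloc` and `buf_wasted` are hypothetical names, and a `false` return stands in for the kernel's panic. It shows why the unsigned subtraction in the "wasted" message cannot wrap around as long as every allocation goes through the bounds check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for pgt_buf_start / pgt_buf_end / pgt_buf_top. */
struct pgt_buf {
	unsigned long start;	/* first pfn of the reserved window */
	unsigned long end;	/* next pfn to hand out */
	unsigned long top;	/* one past the last pfn of the window */
};

/*
 * Hand out the next page frame. Returns false where the kernel would
 * panic. Unlike the quoted kernel code, the cursor is only advanced on
 * success, so end can never move past top.
 */
static bool buf_alloc(struct pgt_buf *buf, unsigned long *pfn)
{
	if (buf->end >= buf->top)
		return false;
	*pfn = buf->end++;
	return true;
}

/* Pages left unused; well defined because end <= top always holds. */
static unsigned long buf_wasted(const struct pgt_buf *buf)
{
	assert(buf->end <= buf->top);
	return buf->top - buf->end;
}
```

In the kernel the cursor is incremented before the check, but the panic stops execution, so the "wasted" printk is never reached with pgt_buf_end beyond pgt_buf_top.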
H. Peter Anvin
2011-Jun-20 22:37 UTC
[Xen-devel] Re: [PATCH 1/3] x86: calculate precisely the memory needed by init_memory_mapping
On 06/07/2011 11:13 AM, stefano.stabellini@eu.citrix.com wrote:
>
> - remove the extra page added on x86_32 for the fixmap because is not
> needed: the PMD entry is already allocated and contiguous for the whole
> range (a PMD page covers 4G of virtual addresses) and the pte entry is
> already allocated by early_ioremap_init.
>

Hi Stefano,

I think this is wrong.  A PMD page covers *1G* of virtual addresses, and
in the 2+2 and 1+3 memory configurations, we may or may not need a
separate PMD for the fixmap.

Am I missing something?

	-hpa
Stefano Stabellini
2011-Jun-21 17:57 UTC
[Xen-devel] Re: [PATCH 1/3] x86: calculate precisely the memory needed by init_memory_mapping
On Mon, 20 Jun 2011, H. Peter Anvin wrote:
> On 06/07/2011 11:13 AM, stefano.stabellini@eu.citrix.com wrote:
> >
> > - remove the extra page added on x86_32 for the fixmap because is not
> > needed: the PMD entry is already allocated and contiguous for the whole
> > range (a PMD page covers 4G of virtual addresses) and the pte entry is
> > already allocated by early_ioremap_init.
> >
>
> Hi Stefano,
>
> I think this is wrong.  A PMD page covers *1G* of virtual addresses, and
> in the 2+2 and 1+3 memory configurations, we may or may not need a
> separate PMD for the fixmap.
>
> Am I missing something?

You are right, a PMD page covers 1G of virtual addresses so that part of
the explanation in the comment is wrong.

The reason why we don't need a separate PMD for the fixmap is that in
both PAE and non-PAE cases the last gigabyte of virtual addresses is
always covered by the initial allocation in head_32.S (swapper_pg_dir or
initial_pg_pmd).
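The arithmetic behind the 1G figure is short enough to check directly. This is a sketch with assumed 32-bit x86 layout values (PAE: 512 entries per PMD page, 2MB per entry; non-PAE: PMD folded into a 1024-entry top level with 4MB per entry); the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes of virtual address space covered by one page-table page. */
static uint64_t table_page_coverage(uint64_t entries, unsigned entry_shift)
{
	assert(entry_shift < 64);
	return entries << entry_shift;
}

/* 32-bit PAE (assumed layout): a PMD page has 512 entries of 2MB each. */
static uint64_t pae_pmd_page_coverage(void)
{
	return table_page_coverage(512, 21);
}

/* 32-bit non-PAE (assumed layout): PMD folded, 1024 top-level entries of 4MB. */
static uint64_t nonpae_pgd_coverage(void)
{
	return table_page_coverage(1024, 22);
}
```

Under these assumptions one PAE PMD page covers 1G, so four of them are needed for the full 4G address space, whereas the single non-PAE top-level table covers all 4G by itself.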
H. Peter Anvin
2011-Jun-21 18:02 UTC
[Xen-devel] Re: [PATCH 1/3] x86: calculate precisely the memory needed by init_memory_mapping
Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:

>On Mon, 20 Jun 2011, H. Peter Anvin wrote:
>> On 06/07/2011 11:13 AM, stefano.stabellini@eu.citrix.com wrote:
>> >
>> > - remove the extra page added on x86_32 for the fixmap because is not
>> > needed: the PMD entry is already allocated and contiguous for the whole
>> > range (a PMD page covers 4G of virtual addresses) and the pte entry is
>> > already allocated by early_ioremap_init.
>> >
>>
>> Hi Stefano,
>>
>> I think this is wrong.  A PMD page covers *1G* of virtual addresses, and
>> in the 2+2 and 1+3 memory configurations, we may or may not need a
>> separate PMD for the fixmap.
>>
>> Am I missing something?
>
>You are right, a PMD page covers 1G of virtual addresses so that part of
>the explanation in the comment is wrong.
>
>The reason why we don't need a separate PMD for the fixmap is that in
>both PAE and non-PAE cases the last gigabyte of virtual addresses is
>always covered by the initial allocation in head_32.S (swapper_pg_dir or
>initial_pg_pmd).

Ok, wasn't sure if Xen used the static allocation or not.
-- 
Sent from my mobile phone. Please excuse my brevity and lack of formatting.
Ingo Molnar
2011-Jun-21 20:58 UTC
[Xen-devel] Re: [PATCH 0/3] x86: remove x86_init.mapping.pagetable_reserve
-tip testing found that these patches cause the following boot crash
on native:

[    0.000000] Base memory trampoline at [ffff88000009d000] 9d000 size 8192
[    0.000000] init_memory_mapping: 0000000000000000-000000003fff0000
[    0.000000]  0000000000 - 003fff0000 page 4k
[    0.000000] kernel direct mapping tables up to 3fff0000 @ 3fef0000-3fff0000
[    0.000000] Kernel panic - not syncing: alloc_low_page: ran out of memory

Config attached, full bootlog below. I've excluded the commits for
now.

	Ingo

Decompressing Linux... Parsing ELF... done.
Booting the kernel.
[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Linux version 3.0.0-rc4-tip+ (mingo@sirius) (gcc version 4.6.0 20110509 (Red Hat 4.6.0-7) (GCC) ) #139263 SMP PREEMPT Tue Jun 21 22:47:13 CEST 2011
[    0.000000] Command line: root=/dev/sda6 earlyprintk=ttyS0,115200 console=ttyS0,115200 debug initcall_debug sysrq_always_enabled ignore_loglevel selinux=0 nmi_watchdog=1 panic=1 3
[    0.000000] BIOS-provided physical RAM map:
[    0.000000]  BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
[    0.000000]  BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
[    0.000000]  BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
[    0.000000]  BIOS-e820: 0000000000100000 - 000000003fff0000 (usable)
[    0.000000]  BIOS-e820: 000000003fff0000 - 000000003fff3000 (ACPI NVS)
[    0.000000]  BIOS-e820: 000000003fff3000 - 0000000040000000 (ACPI data)
[    0.000000]  BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
[    0.000000]  BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
[    0.000000] bootconsole [earlyser0] enabled
[    0.000000] debug: ignoring loglevel setting.
[    0.000000] NX (Execute Disable) protection: active
[    0.000000] DMI 2.3 present.
[    0.000000] DMI: System manufacturer System Product Name/A8N-E, BIOS ASUS A8N-E ACPI BIOS Revision 1008 08/22/2005
[    0.000000] e820 update range: 0000000000000000 - 0000000000010000 (usable) ==> (reserved)
[    0.000000] e820 remove range: 00000000000a0000 - 0000000000100000 (usable)
[    0.000000] No AGP bridge found
[    0.000000] last_pfn = 0x3fff0 max_arch_pfn = 0x400000000
[    0.000000] MTRR default type: uncachable
[    0.000000] MTRR fixed ranges enabled:
[    0.000000]   00000-9FFFF write-back
[    0.000000]   A0000-BFFFF uncachable
[    0.000000]   C0000-C7FFF write-protect
[    0.000000]   C8000-FFFFF uncachable
[    0.000000] MTRR variable ranges enabled:
[    0.000000]   0 base 0000000000 mask FFC0000000 write-back
[    0.000000]   1 disabled
[    0.000000]   2 disabled
[    0.000000]   3 disabled
[    0.000000]   4 disabled
[    0.000000]   5 disabled
[    0.000000]   6 disabled
[    0.000000]   7 disabled
[    0.000000] x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
[    0.000000] found SMP MP-table at [ffff8800000f5680] f5680
[    0.000000] initial memory mapped : 0 - 20000000
[    0.000000] Base memory trampoline at [ffff88000009d000] 9d000 size 8192
[    0.000000] init_memory_mapping: 0000000000000000-000000003fff0000
[    0.000000]  0000000000 - 003fff0000 page 4k
[    0.000000] kernel direct mapping tables up to 3fff0000 @ 3fef0000-3fff0000
[    0.000000] Kernel panic - not syncing: alloc_low_page: ran out of memory
Stefano Stabellini
2011-Jun-22 17:16 UTC
[Xen-devel] Re: [PATCH 0/3] x86: remove x86_init.mapping.pagetable_reserve
On Tue, 21 Jun 2011, Ingo Molnar wrote:
>
> -tip testing found that these patches cause the following boot crash
> on native:
>
> [    0.000000] Base memory trampoline at [ffff88000009d000] 9d000 size 8192
> [    0.000000] init_memory_mapping: 0000000000000000-000000003fff0000
> [    0.000000]  0000000000 - 003fff0000 page 4k
> [    0.000000] kernel direct mapping tables up to 3fff0000 @ 3fef0000-3fff0000
> [    0.000000] Kernel panic - not syncing: alloc_low_page: ran out of memory
>
> Config attached, full bootlog below. I've excluded the commits for
> now.
>

Thanks for the logs; I was able to reproduce the problem and I know what
the issue is: CONFIG_DEBUG_PAGEALLOC forces use_pse to 0 while on x86_64
cpu_has_pse is 1. As a consequence the initial pagetable allocator in
head_64.S didn't allocate any pte pages but find_early_table_space
assumes it did.

The issue doesn't happen on x86_32 (PAE and non-PAE) because head_32.S
always uses 4KB pages.

The patch below fixes the problem: on x86_64 we should not limit the
memory size we need to cover with 4KB ptes depending on the initial
allocation, because head_64.S always uses 2MB pages.

Ingo, if you know any other debug config options that might affect page
table allocations, please let me know.

---

commit 2b66a94cf8dbbf4cf2148456381b8674ed8191f0
Author: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Date:   Wed Jun 22 11:46:23 2011 +0000

    x86_64: do not assume head_64.S used 4KB pages when !use_pse

    head_64.S, which sets up the initial page table on x86_64, is not aware
    of PSE being enabled or disabled and it always allocates the initial
    mapping using 2MB pages.

    Therefore on x86_64 find_early_table_space shouldn't update the amount
    of pages needed for pte pages depending on the size of the initial
    mapping, because we know for sure that no pte pages have been allocated
    yet.

    Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
    Reported-by: Ingo Molnar <mingo@elte.hu>

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 36bacfe..1e3098b 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -42,12 +42,19 @@ static void __init find_early_table_space(unsigned long start,
 			(PMD_SIZE * PTRS_PER_PMD));
 	pmd_mapped *= (PMD_SIZE * PTRS_PER_PMD);
 
+	/*
+	 * On x86_64 do not limit the size we need to cover with 4KB pages
+	 * depending on the initial allocation because head_64.S always uses
+	 * 2MB pages.
+	 */
+#ifdef CONFIG_X86_32
 	if (start < PFN_PHYS(max_pfn_mapped)) {
 		if (PFN_PHYS(max_pfn_mapped) < end)
 			size -= PFN_PHYS(max_pfn_mapped) - start;
 		else
 			size = 0;
 	}
+#endif
 
 #ifndef __PAGETABLE_PUD_FOLDED
 	if (end > pud_mapped) {
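The shortfall can be reproduced from the figures in Ingo's boot log. The sketch below is user-space C, not kernel code, and rests on a few assumptions: that max_pfn_mapped corresponds to the 0x20000000 reported by the "initial memory mapped" line, that sizeof(pte_t) is 8 on x86_64, and that only the PTE term contributes for this range. The `EX_*` constants and all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define EX_PAGE_SHIFT 12
#define EX_PAGE_SIZE  (UINT64_C(1) << EX_PAGE_SHIFT)
#define EX_PTE_BYTES  UINT64_C(8)	/* assumed sizeof(pte_t) on x86_64 */

/* Pages of PTE tables needed to map `size` bytes with 4KB pages. */
static uint64_t pte_table_pages(uint64_t size)
{
	uint64_t ptes = (size + EX_PAGE_SIZE - 1) >> EX_PAGE_SHIFT;
	uint64_t bytes = ptes * EX_PTE_BYTES;

	return (bytes + EX_PAGE_SIZE - 1) >> EX_PAGE_SHIFT;
}

/* Values read off the boot log quoted in this thread. */
static const uint64_t ex_end       = UINT64_C(0x3fff0000); /* init_memory_mapping end */
static const uint64_t ex_mapped    = UINT64_C(0x20000000); /* "initial memory mapped" */
static const uint64_t ex_buf_start = UINT64_C(0x3fef0000); /* table window start */
static const uint64_t ex_buf_top   = UINT64_C(0x3fff0000); /* table window end */

/* Before the fix: the already-mapped 512MB is subtracted from the size. */
static uint64_t pages_before_fix(void)
{
	return pte_table_pages(ex_end - ex_mapped);
}

/* After the fix on x86_64: the whole range needs new PTE pages. */
static uint64_t pages_after_fix(void)
{
	return pte_table_pages(ex_end);
}

/* Size of the window the log reports, in pages. */
static uint64_t pages_in_logged_window(void)
{
	return (ex_buf_top - ex_buf_start) >> EX_PAGE_SHIFT;
}
```

Under these assumptions the pre-fix estimate is 256 pages, which matches the 3fef0000-3fff0000 window in the log, while mapping the full range with 4KB pages needs 512 pages; that gap is consistent with alloc_low_page running out of memory.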