Pekka Enberg
2009-Aug-25 16:29 UTC
[Xen-devel] Re: [bisected] 2.6.31 regression: fails to boot as xen guest
Hi Arnd,

On Tue, Aug 25, 2009 at 6:48 PM, Arnd Hannemann <hannemann@nets.rwth-aachen.de> wrote:
> current 2.6.31 fails to boot on our xen host (32bit pae).
> Unfortunately it fails in a way that there is absolutely no output
> on the console. Config is as 32bit guest.
>
> Git bisect gave the following:
>
> 83b519e8b9572c319c8e0c615ee5dd7272856090 is first bad commit
> commit 83b519e8b9572c319c8e0c615ee5dd7272856090
> Author: Pekka Enberg <penberg@cs.helsinki.fi>
> Date: Wed Jun 10 19:40:04 2009 +0300
>
> slab: setup allocators earlier in the boot sequence
>
> This patch makes kmalloc() available earlier in the boot sequence so we can get
> rid of some bootmem allocations. The bulk of the changes are due to
> kmem_cache_init() being called with interrupts disabled which requires some
> changes to allocator boostrap code.
>
> Note: 32-bit x86 does WP protect test in mem_init() so we must setup traps
> before we call mem_init() during boot as reported by Ingo Molnar:
>
> We have a hard crash in the WP-protect code:
>
> [ 0.000000] Checking if this processor honours the WP bit even in supervisor mode...BUG: Int 14: CR2 ffcff000
> [ 0.000000] EDI 00000188 ESI 00000ac7 EBP c17eaf9c ESP c17eaf8c
> [ 0.000000] EBX 000014e0 EDX 0000000e ECX 01856067 EAX 00000001
> [ 0.000000] err 00000003 EIP c10135b1 CS 00000060 flg 00010002
> [ 0.000000] Stack: c17eafa8 c17fd410 c16747bc c17eafc4 c17fd7e5 000011fd f8616000 c18237cc
> [ 0.000000] 00099800 c17bb000 c17eafec c17f1668 000001c5 c17f1322 c166e039 c1822bf0
> [ 0.000000] c166e033 c153a014 c18237cc 00020800 c17eaff8 c17f106a 00020800 01ba5003
> [ 0.000000] Pid: 0, comm: swapper Not tainted 2.6.30-tip-02161-g7a74539-dirty #52203
> [ 0.000000] Call Trace:
> [ 0.000000] [<c15357c2>] ? printk+0x14/0x16
> [ 0.000000] [<c10135b1>] ? do_test_wp_bit+0x19/0x23
> [ 0.000000] [<c17fd410>] ? test_wp_bit+0x26/0x64
> [ 0.000000] [<c17fd7e5>] ? mem_init+0x1ba/0x1d8
> [ 0.000000] [<c17f1668>] ? start_kernel+0x164/0x2f7
> [ 0.000000] [<c17f1322>] ? unknown_bootoption+0x0/0x19c
> [ 0.000000] [<c17f106a>] ? __init_begin+0x6a/0x6f
>
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> Acked-by Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Christoph Lameter <cl@linux-foundation.org>
> Cc: Ingo Molnar <mingo@elte.hu>
> Cc: Matt Mackall <mpm@selenic.com>
> Cc: Nick Piggin <npiggin@suse.de>
> Cc: Yinghai Lu <yinghai@kernel.org>
> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
>
> However a
> git revert 83b519e8b9572c319c8e0c615ee5dd7272856090
> to verify that current git without that commit would work,
> didn't succeed right away, so I was not able to test that.

Thanks for doing the bisect! Can we also see your .config also?

I doubt this is a slab allocator initialization issue so I'm CC'ing
some Xen folks. Jeremy, I don't know Xen well but on quick read, the
only thing that I can see is that trap_init() is called before
sched_init() now. I see Xen doing preempt_enable()/preempt_disable so
maybe that's a problem now?

Pekka

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Arnd Hannemann
2009-Aug-25 16:49 UTC
[Xen-devel] Re: [bisected] 2.6.31 regression: fails to boot as xen guest
Hi Pekka,

Pekka Enberg wrote:
> On Tue, Aug 25, 2009 at 6:48 PM, Arnd Hannemann <hannemann@nets.rwth-aachen.de> wrote:
>> current 2.6.31 fails to boot on our xen host (32bit pae).
>> Unfortunately it fails in a way that there is absolutely no output
>> on the console. Config is as 32bit guest.
>>
>> Git bisect gave the following:
>>
>> 83b519e8b9572c319c8e0c615ee5dd7272856090 is first bad commit
>> commit 83b519e8b9572c319c8e0c615ee5dd7272856090
>> Author: Pekka Enberg <penberg@cs.helsinki.fi>
>> Date: Wed Jun 10 19:40:04 2009 +0300
>>
>> slab: setup allocators earlier in the boot sequence
>>
>> This patch makes kmalloc() available earlier in the boot sequence
>> so we can get rid of some bootmem allocations. The bulk of the
>> changes are due to kmem_cache_init() being called with interrupts
>> disabled which requires some changes to allocator boostrap code.
>>
>> Note: 32-bit x86 does WP protect test in mem_init() so we must
>> setup traps before we call mem_init() during boot as reported by
>> Ingo Molnar:
>>
>> We have a hard crash in the WP-protect code:
>>
>> [ 0.000000] Checking if this processor honours the WP bit even in supervisor mode...BUG: Int 14: CR2 ffcff000
>> [ 0.000000] EDI 00000188 ESI 00000ac7 EBP c17eaf9c ESP c17eaf8c
>> [ 0.000000] EBX 000014e0 EDX 0000000e ECX 01856067 EAX 00000001
>> [ 0.000000] err 00000003 EIP c10135b1 CS 00000060 flg 00010002
>> [ 0.000000] Stack: c17eafa8 c17fd410 c16747bc c17eafc4 c17fd7e5 000011fd f8616000 c18237cc
>> [ 0.000000] 00099800 c17bb000 c17eafec c17f1668 000001c5 c17f1322 c166e039 c1822bf0
>> [ 0.000000] c166e033 c153a014 c18237cc 00020800 c17eaff8 c17f106a 00020800 01ba5003
>> [ 0.000000] Pid: 0, comm: swapper Not tainted 2.6.30-tip-02161-g7a74539-dirty #52203
>> [ 0.000000] Call Trace:
>> [ 0.000000] [<c15357c2>] ? printk+0x14/0x16
>> [ 0.000000] [<c10135b1>] ? do_test_wp_bit+0x19/0x23
>> [ 0.000000] [<c17fd410>] ? test_wp_bit+0x26/0x64
>> [ 0.000000] [<c17fd7e5>] ? mem_init+0x1ba/0x1d8
>> [ 0.000000] [<c17f1668>] ? start_kernel+0x164/0x2f7
>> [ 0.000000] [<c17f1322>] ? unknown_bootoption+0x0/0x19c
>> [ 0.000000] [<c17f106a>] ? __init_begin+0x6a/0x6f
>>
>> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
>> Acked-by Linus Torvalds <torvalds@linux-foundation.org>
>> Cc: Christoph Lameter <cl@linux-foundation.org>
>> Cc: Ingo Molnar <mingo@elte.hu>
>> Cc: Matt Mackall <mpm@selenic.com>
>> Cc: Nick Piggin <npiggin@suse.de>
>> Cc: Yinghai Lu <yinghai@kernel.org>
>> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
>>
>> However a git revert 83b519e8b9572c319c8e0c615ee5dd7272856090 to
>> verify that current git without that commit would work, didn't
>> succeed right away, so I was not able to test that.
>
> Thanks for doing the bisect! Can we also see your .config also?

Config for -rc7 is attached. My bisect configs were based on that.

>
> I doubt this is a slab allocator initialization issue so I'm CC'ing
> some Xen folks. Jeremy, I don't know Xen well but on quick read, the
> only thing that I can see is that trap_init() is called before
> sched_init() now. I see Xen doing preempt_enable()/preempt_disable so
> maybe that's a problem now?
>
> Pekka

Best regards,
Arnd
Pekka Enberg
2009-Aug-25 16:52 UTC
[Xen-devel] Re: [bisected] 2.6.31 regression: fails to boot as xen guest
Hi Arnd,

On Tue, 2009-08-25 at 18:49 +0200, Arnd Hannemann wrote:
> > Thanks for doing the bisect! Can we also see your .config also?
>
> Config for -rc7 is attached. My bisect configs were based on that

Thanks! While we wait for the Xen people, you can try the following
patch to see if we can narrow the bug down to trap_init().

Pekka

diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
index 3cd7711..7e8e4e4 100644
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -956,8 +956,10 @@ void __init mem_init(void)
 	BUG_ON(VMALLOC_START >= VMALLOC_END);
 	BUG_ON((unsigned long)high_memory > VMALLOC_START);

+#if 0
 	if (boot_cpu_data.wp_works_ok < 0)
 		test_wp_bit();
+#endif

 	save_pg_dir();
 	zap_low_mappings(true);
diff --git a/init/main.c b/init/main.c
index 2d9d6bd..5c4dacb 100644
--- a/init/main.c
+++ b/init/main.c
@@ -603,7 +603,6 @@ asmlinkage void __init start_kernel(void)
 	pidhash_init();
 	vfs_caches_init_early();
 	sort_main_extable();
-	trap_init();
 	mm_init();
 	/*
 	 * Set up the scheduler prior starting any interrupts (such as the
@@ -621,6 +620,7 @@ asmlinkage void __init start_kernel(void)
 				"enabled *very* early, fixing it\n");
 		local_irq_disable();
 	}
+	trap_init();
 	rcu_init();
 	/* init some links before init_ISA_irqs() */
 	early_irq_init();
Arnd Hannemann
2009-Aug-25 17:49 UTC
[Xen-devel] Re: [bisected] 2.6.31 regression: fails to boot as xen guest
Hi Pekka,

Pekka Enberg wrote:
> On Tue, 2009-08-25 at 18:49 +0200, Arnd Hannemann wrote:
>>> Thanks for doing the bisect! Can we also see your .config also?
>> Config for -rc7 is attached. My bisect configs were based on that
>
> Thanks! While we wait for the Xen people, you can try the following
> patch to see if we can narrow the bug down to trap_init().

Yes seems to be trap_init().
-rc7 with this patch applied boots up to the prompt.

Best regards,
Arnd

>
> diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
> index 3cd7711..7e8e4e4 100644
> --- a/arch/x86/mm/init_32.c
> +++ b/arch/x86/mm/init_32.c
> @@ -956,8 +956,10 @@ void __init mem_init(void)
>  	BUG_ON(VMALLOC_START >= VMALLOC_END);
>  	BUG_ON((unsigned long)high_memory > VMALLOC_START);
>
> +#if 0
>  	if (boot_cpu_data.wp_works_ok < 0)
>  		test_wp_bit();
> +#endif
>
>  	save_pg_dir();
>  	zap_low_mappings(true);
> diff --git a/init/main.c b/init/main.c
> index 2d9d6bd..5c4dacb 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -603,7 +603,6 @@ asmlinkage void __init start_kernel(void)
>  	pidhash_init();
>  	vfs_caches_init_early();
>  	sort_main_extable();
> -	trap_init();
>  	mm_init();
>  	/*
>  	 * Set up the scheduler prior starting any interrupts (such as the
> @@ -621,6 +620,7 @@ asmlinkage void __init start_kernel(void)
>  				"enabled *very* early, fixing it\n");
>  		local_irq_disable();
>  	}
> +	trap_init();
>  	rcu_init();
>  	/* init some links before init_ISA_irqs() */
>  	early_irq_init();
>
Pekka Enberg
2009-Aug-25 18:03 UTC
[Xen-devel] Re: [bisected] 2.6.31 regression: fails to boot as xen guest
On Tue, 2009-08-25 at 19:49 +0200, Arnd Hannemann wrote:
> Hi Pekka,
>
> Pekka Enberg wrote:
> > On Tue, 2009-08-25 at 18:49 +0200, Arnd Hannemann wrote:
> >>> Thanks for doing the bisect! Can we also see your .config also?
> >> Config for -rc7 is attached. My bisect configs were based on that
> >
> > Thanks! While we wait for the Xen people, you can try the following
> > patch to see if we can narrow the bug down to trap_init().
>
> Yes seems to be trap_init().
> -rc7 with this patch applied boots up to the prompt.

Thanks for testing! Ingo, what do you think of the following patch?
AFAICT, x86-32 is the only architecture playing with traps in mem_init()
so this should be the safest fix for 2.6.31.

Pekka

From b739e3c3baa6312664b4ea676bdf73df27fcecbc Mon Sep 17 00:00:00 2001
From: Pekka Enberg <penberg@cs.helsinki.fi>
Date: Tue, 25 Aug 2009 20:55:25 +0300
Subject: [PATCH] x86: Move WP bit testing to trap_init()

Commit 83b519e8b9572c319c8e0c615ee5dd7272856090 ("slab: setup allocators
earlier in the boot sequence") moved trap_init() earlier in the boot
sequence to avoid a hard crash with 32-bit x86 in mem_init().

Unfortunately the change broke Xen so make trap_init() later and move
the WP bit test from mem_init() to trap_init().

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
---
 arch/x86/kernel/traps.c |   57 +++++++++++++++++++++++++++++++++++++++++++++++
 arch/x86/mm/init_32.c   |   56 ----------------------------------------------
 init/main.c             |    2 +-
 3 files changed, 58 insertions(+), 57 deletions(-)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 5204332..2084408 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -906,6 +906,60 @@ dotraplinkage void do_iret_error(struct pt_regs *regs, long error_code)
 		return;
 	do_trap(32, SIGILL, "iret exception", regs, error_code, &info);
 }
+
+static noinline int do_test_wp_bit(void);
+
+/*
+ * Test if the WP bit works in supervisor mode. It isn't supported on 386's
+ * and also on some strange 486's. All 586+'s are OK. This used to involve
+ * black magic jumps to work around some nasty CPU bugs, but fortunately the
+ * switch to using exceptions got rid of all that.
+ */
+static void __init test_wp_bit(void)
+{
+	printk(KERN_INFO
+	"Checking if this processor honours the WP bit even in supervisor mode...");
+
+	/* Any page-aligned address will do, the test is non-destructive */
+	__set_fixmap(FIX_WP_TEST, __pa(&swapper_pg_dir), PAGE_READONLY);
+	boot_cpu_data.wp_works_ok = do_test_wp_bit();
+	clear_fixmap(FIX_WP_TEST);
+
+	if (!boot_cpu_data.wp_works_ok) {
+		printk(KERN_CONT "No.\n");
+#ifdef CONFIG_X86_WP_WORKS_OK
+		panic(
+		"This kernel doesn't support CPU's with broken WP. Recompile it for a 386!");
+#endif
+	} else {
+		printk(KERN_CONT "Ok.\n");
+	}
+}
+
+/*
+ * This function cannot be __init, since exceptions don't work in that
+ * section. Put this after the callers, so that it cannot be inlined.
+ */
+static noinline int do_test_wp_bit(void)
+{
+	char tmp_reg;
+	int flag;
+
+	__asm__ __volatile__(
+		" movb %0, %1 \n"
+		"1: movb %1, %0 \n"
+		" xorl %2, %2 \n"
+		"2: \n"
+		_ASM_EXTABLE(1b,2b)
+		:"=m" (*(char *)fix_to_virt(FIX_WP_TEST)),
+		 "=q" (tmp_reg),
+		 "=r" (flag)
+		:"2" (1)
+		:"memory");
+
+	return flag;
+}
+
 #endif

 void __init trap_init(void)
@@ -982,5 +1036,8 @@ void __init trap_init(void)

 #ifdef CONFIG_X86_32
 	x86_quirk_trap_init();
+
+	if (boot_cpu_data.wp_works_ok < 0)
+		test_wp_bit();
 #endif
 }
diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
index 3cd7711..e22bb8a 100644
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -54,8 +54,6 @@

 unsigned long highstart_pfn, highend_pfn;

-static noinline int do_test_wp_bit(void);
-
 bool __read_mostly __vmalloc_start_set = false;

 static __init void *alloc_low_page(void)
@@ -830,33 +828,6 @@ void __init paging_init(void)
 	zone_sizes_init();
 }

-/*
- * Test if the WP bit works in supervisor mode. It isn't supported on 386's
- * and also on some strange 486's. All 586+'s are OK. This used to involve
- * black magic jumps to work around some nasty CPU bugs, but fortunately the
- * switch to using exceptions got rid of all that.
- */
-static void __init test_wp_bit(void)
-{
-	printk(KERN_INFO
-	"Checking if this processor honours the WP bit even in supervisor mode...");
-
-	/* Any page-aligned address will do, the test is non-destructive */
-	__set_fixmap(FIX_WP_TEST, __pa(&swapper_pg_dir), PAGE_READONLY);
-	boot_cpu_data.wp_works_ok = do_test_wp_bit();
-	clear_fixmap(FIX_WP_TEST);
-
-	if (!boot_cpu_data.wp_works_ok) {
-		printk(KERN_CONT "No.\n");
-#ifdef CONFIG_X86_WP_WORKS_OK
-		panic(
-		"This kernel doesn't support CPU's with broken WP. Recompile it for a 386!");
-#endif
-	} else {
-		printk(KERN_CONT "Ok.\n");
-	}
-}
-
 static struct kcore_list kcore_mem, kcore_vmalloc;

 void __init mem_init(void)
@@ -956,9 +927,6 @@ void __init mem_init(void)
 	BUG_ON(VMALLOC_START >= VMALLOC_END);
 	BUG_ON((unsigned long)high_memory > VMALLOC_START);

-	if (boot_cpu_data.wp_works_ok < 0)
-		test_wp_bit();
-
 	save_pg_dir();
 	zap_low_mappings(true);
 }
@@ -975,30 +943,6 @@ int arch_add_memory(int nid, u64 start, u64 size)
 }
 #endif

-/*
- * This function cannot be __init, since exceptions don't work in that
- * section. Put this after the callers, so that it cannot be inlined.
- */
-static noinline int do_test_wp_bit(void)
-{
-	char tmp_reg;
-	int flag;
-
-	__asm__ __volatile__(
-		" movb %0, %1 \n"
-		"1: movb %1, %0 \n"
-		" xorl %2, %2 \n"
-		"2: \n"
-		_ASM_EXTABLE(1b,2b)
-		:"=m" (*(char *)fix_to_virt(FIX_WP_TEST)),
-		 "=q" (tmp_reg),
-		 "=r" (flag)
-		:"2" (1)
-		:"memory");
-
-	return flag;
-}
-
 #ifdef CONFIG_DEBUG_RODATA
 const int rodata_test_data = 0xC3;
 EXPORT_SYMBOL_GPL(rodata_test_data);
diff --git a/init/main.c b/init/main.c
index 2d9d6bd..5c4dacb 100644
--- a/init/main.c
+++ b/init/main.c
@@ -603,7 +603,6 @@ asmlinkage void __init start_kernel(void)
 	pidhash_init();
 	vfs_caches_init_early();
 	sort_main_extable();
-	trap_init();
 	mm_init();
 	/*
 	 * Set up the scheduler prior starting any interrupts (such as the
@@ -621,6 +620,7 @@ asmlinkage void __init start_kernel(void)
 				"enabled *very* early, fixing it\n");
 		local_irq_disable();
 	}
+	trap_init();
 	rcu_init();
 	/* init some links before init_ISA_irqs() */
 	early_irq_init();
--
1.5.6.3
Jeremy Fitzhardinge
2009-Aug-25 18:06 UTC
[Xen-devel] Re: [bisected] 2.6.31 regression: fails to boot as xen guest
On 08/25/09 09:52, Pekka Enberg wrote:
> Hi Arnd,
>
> On Tue, 2009-08-25 at 18:49 +0200, Arnd Hannemann wrote:
>
>>> Thanks for doing the bisect! Can we also see your .config also?
>>>
>> Config for -rc7 is attached. My bisect configs were based on that
>>
> Thanks! While we wait for the Xen people, you can try the following
> patch to see if we can narrow the bug down to trap_init().
>

I think there's a problem that the side-effect of this change is that
interrupt initialization comes later, and so the dynamically allocated
arrays are not set up when the first interrupt comes in.

However, this particular change shouldn't have any effect on interrupts
being enabled early, right?

I have a local workaround which simply reverts the arrays back to
statically allocated, but it isn't very satisfactory (large memory hit,
esp if you're not running Xen).

J
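The failure mode Jeremy describes above (a table that is allocated during boot and is still NULL when the first event arrives) can be sketched in plain user-space C. This is only an illustration under stated assumptions, not the Xen event-channel code: `event_to_irq`, `events_init()`, `bind_event()`, `handle_upcall()` and `NR_EVENTS` are all hypothetical names invented for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_EVENTS 8             /* hypothetical table size, illustration only */

/* Hypothetical stand-in for a lookup table that is allocated during boot. */
static int *event_to_irq;       /* stays NULL until events_init() has run */

/* Allocate the table; models interrupt setup that happens late in boot. */
static int events_init(void)
{
	assert(event_to_irq == NULL);           /* must only be called once */
	event_to_irq = malloc(NR_EVENTS * sizeof(*event_to_irq));
	if (!event_to_irq)
		return -1;
	for (size_t i = 0; i < NR_EVENTS; i++)
		event_to_irq[i] = -1;           /* -1 means "not bound" */
	return 0;
}

/* Bind an event port to an IRQ number; fails if the table is not ready. */
static int bind_event(unsigned int port, int irq)
{
	if (!event_to_irq || port >= NR_EVENTS)
		return -1;
	event_to_irq[port] = irq;
	return 0;
}

/*
 * Models the upcall path. Without the NULL check, an event delivered
 * before events_init() would read through a NULL table pointer.
 */
static int handle_upcall(unsigned int port)
{
	if (!event_to_irq || port >= NR_EVENTS)
		return -1;                      /* too early, or bad port */
	return event_to_irq[port];
}
```

Calling `handle_upcall()` before `events_init()` returns -1 here instead of faulting; a statically allocated table, as in the workaround mentioned above, avoids the window entirely at the cost of always reserving the memory.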
Jeremy Fitzhardinge
2009-Aug-25 18:08 UTC
[Xen-devel] Re: [bisected] 2.6.31 regression: fails to boot as xen guest
On 08/25/09 11:03, Pekka Enberg wrote:
> On Tue, 2009-08-25 at 19:49 +0200, Arnd Hannemann wrote:
>
>> Hi Pekka,
>>
>> Pekka Enberg wrote:
>>
>>> On Tue, 2009-08-25 at 18:49 +0200, Arnd Hannemann wrote:
>>>
>>>>> Thanks for doing the bisect! Can we also see your .config also?
>>>>>
>>>> Config for -rc7 is attached. My bisect configs were based on that
>>>>
>>> Thanks! While we wait for the Xen people, you can try the following
>>> patch to see if we can narrow the bug down to trap_init().
>>>
>> Yes seems to be trap_init().
>> -rc7 with this patch applied boots up to the prompt.
>>
> Thanks for testing! Ingo, what do you think of the following patch?
> AFAICT, x86-32 is the only architecture playing with traps in mem_init()
> so this should be the safest fix for 2.6.31.
>

Huh, interesting. I wonder if this is the same as the problem I'd been
chasing or separate...

J

> Pekka
>
> From b739e3c3baa6312664b4ea676bdf73df27fcecbc Mon Sep 17 00:00:00 2001
> From: Pekka Enberg <penberg@cs.helsinki.fi>
> Date: Tue, 25 Aug 2009 20:55:25 +0300
> Subject: [PATCH] x86: Move WP bit testing to trap_init()
>
> Commit 83b519e8b9572c319c8e0c615ee5dd7272856090 ("slab: setup allocators
> earlier in the boot sequence") moved trap_init() earlier in the boot
> sequence to avoid a hard crash with 32-bit x86 in mem_init().
>
> Unfortunately the change broke Xen so make trap_init() later and move
> the WP bit test from mem_init() to trap_init().
>
> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
> ---
>  arch/x86/kernel/traps.c |   57 +++++++++++++++++++++++++++++++++++++++++++++++
>  arch/x86/mm/init_32.c   |   56 ----------------------------------------------
>  init/main.c             |    2 +-
>  3 files changed, 58 insertions(+), 57 deletions(-)
>
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index 5204332..2084408 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -906,6 +906,60 @@ dotraplinkage void do_iret_error(struct pt_regs *regs, long error_code)
>  		return;
>  	do_trap(32, SIGILL, "iret exception", regs, error_code, &info);
>  }
> +
> +static noinline int do_test_wp_bit(void);
> +
> +/*
> + * Test if the WP bit works in supervisor mode. It isn't supported on 386's
> + * and also on some strange 486's. All 586+'s are OK. This used to involve
> + * black magic jumps to work around some nasty CPU bugs, but fortunately the
> + * switch to using exceptions got rid of all that.
> + */
> +static void __init test_wp_bit(void)
> +{
> +	printk(KERN_INFO
> +	"Checking if this processor honours the WP bit even in supervisor mode...");
> +
> +	/* Any page-aligned address will do, the test is non-destructive */
> +	__set_fixmap(FIX_WP_TEST, __pa(&swapper_pg_dir), PAGE_READONLY);
> +	boot_cpu_data.wp_works_ok = do_test_wp_bit();
> +	clear_fixmap(FIX_WP_TEST);
> +
> +	if (!boot_cpu_data.wp_works_ok) {
> +		printk(KERN_CONT "No.\n");
> +#ifdef CONFIG_X86_WP_WORKS_OK
> +		panic(
> +		"This kernel doesn't support CPU's with broken WP. Recompile it for a 386!");
> +#endif
> +	} else {
> +		printk(KERN_CONT "Ok.\n");
> +	}
> +}
> +
> +/*
> + * This function cannot be __init, since exceptions don't work in that
> + * section. Put this after the callers, so that it cannot be inlined.
> + */
> +static noinline int do_test_wp_bit(void)
> +{
> +	char tmp_reg;
> +	int flag;
> +
> +	__asm__ __volatile__(
> +		" movb %0, %1 \n"
> +		"1: movb %1, %0 \n"
> +		" xorl %2, %2 \n"
> +		"2: \n"
> +		_ASM_EXTABLE(1b,2b)
> +		:"=m" (*(char *)fix_to_virt(FIX_WP_TEST)),
> +		 "=q" (tmp_reg),
> +		 "=r" (flag)
> +		:"2" (1)
> +		:"memory");
> +
> +	return flag;
> +}
> +
>  #endif
>
>  void __init trap_init(void)
> @@ -982,5 +1036,8 @@ void __init trap_init(void)
>
>  #ifdef CONFIG_X86_32
>  	x86_quirk_trap_init();
> +
> +	if (boot_cpu_data.wp_works_ok < 0)
> +		test_wp_bit();
>  #endif
>  }
> diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
> index 3cd7711..e22bb8a 100644
> --- a/arch/x86/mm/init_32.c
> +++ b/arch/x86/mm/init_32.c
> @@ -54,8 +54,6 @@
>
>  unsigned long highstart_pfn, highend_pfn;
>
> -static noinline int do_test_wp_bit(void);
> -
>  bool __read_mostly __vmalloc_start_set = false;
>
>  static __init void *alloc_low_page(void)
> @@ -830,33 +828,6 @@ void __init paging_init(void)
>  	zone_sizes_init();
>  }
>
> -/*
> - * Test if the WP bit works in supervisor mode. It isn't supported on 386's
> - * and also on some strange 486's. All 586+'s are OK. This used to involve
> - * black magic jumps to work around some nasty CPU bugs, but fortunately the
> - * switch to using exceptions got rid of all that.
> - */
> -static void __init test_wp_bit(void)
> -{
> -	printk(KERN_INFO
> -	"Checking if this processor honours the WP bit even in supervisor mode...");
> -
> -	/* Any page-aligned address will do, the test is non-destructive */
> -	__set_fixmap(FIX_WP_TEST, __pa(&swapper_pg_dir), PAGE_READONLY);
> -	boot_cpu_data.wp_works_ok = do_test_wp_bit();
> -	clear_fixmap(FIX_WP_TEST);
> -
> -	if (!boot_cpu_data.wp_works_ok) {
> -		printk(KERN_CONT "No.\n");
> -#ifdef CONFIG_X86_WP_WORKS_OK
> -		panic(
> -		"This kernel doesn't support CPU's with broken WP. Recompile it for a 386!");
> -#endif
> -	} else {
> -		printk(KERN_CONT "Ok.\n");
> -	}
> -}
> -
>  static struct kcore_list kcore_mem, kcore_vmalloc;
>
>  void __init mem_init(void)
> @@ -956,9 +927,6 @@ void __init mem_init(void)
>  	BUG_ON(VMALLOC_START >= VMALLOC_END);
>  	BUG_ON((unsigned long)high_memory > VMALLOC_START);
>
> -	if (boot_cpu_data.wp_works_ok < 0)
> -		test_wp_bit();
> -
>  	save_pg_dir();
>  	zap_low_mappings(true);
>  }
> @@ -975,30 +943,6 @@ int arch_add_memory(int nid, u64 start, u64 size)
>  }
>  #endif
>
> -/*
> - * This function cannot be __init, since exceptions don't work in that
> - * section. Put this after the callers, so that it cannot be inlined.
> - */
> -static noinline int do_test_wp_bit(void)
> -{
> -	char tmp_reg;
> -	int flag;
> -
> -	__asm__ __volatile__(
> -		" movb %0, %1 \n"
> -		"1: movb %1, %0 \n"
> -		" xorl %2, %2 \n"
> -		"2: \n"
> -		_ASM_EXTABLE(1b,2b)
> -		:"=m" (*(char *)fix_to_virt(FIX_WP_TEST)),
> -		 "=q" (tmp_reg),
> -		 "=r" (flag)
> -		:"2" (1)
> -		:"memory");
> -
> -	return flag;
> -}
> -
>  #ifdef CONFIG_DEBUG_RODATA
>  const int rodata_test_data = 0xC3;
>  EXPORT_SYMBOL_GPL(rodata_test_data);
> diff --git a/init/main.c b/init/main.c
> index 2d9d6bd..5c4dacb 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -603,7 +603,6 @@ asmlinkage void __init start_kernel(void)
>  	pidhash_init();
>  	vfs_caches_init_early();
>  	sort_main_extable();
> -	trap_init();
>  	mm_init();
>  	/*
>  	 * Set up the scheduler prior starting any interrupts (such as the
> @@ -621,6 +620,7 @@ asmlinkage void __init start_kernel(void)
>  				"enabled *very* early, fixing it\n");
>  		local_irq_disable();
>  	}
> +	trap_init();
>  	rcu_init();
>  	/* init some links before init_ISA_irqs() */
>  	early_irq_init();
>
Pekka Enberg
2009-Aug-25 18:14 UTC
[Xen-devel] Re: [bisected] 2.6.31 regression: fails to boot as xen guest
On Tue, 2009-08-25 at 11:06 -0700, Jeremy Fitzhardinge wrote:
> On 08/25/09 09:52, Pekka Enberg wrote:
> > Hi Arnd,
> >
> > On Tue, 2009-08-25 at 18:49 +0200, Arnd Hannemann wrote:
> >
> >>> Thanks for doing the bisect! Can we also see your .config also?
> >>>
> >> Config for -rc7 is attached. My bisect configs were based on that
> >>
> > Thanks! While we wait for the Xen people, you can try the following
> > patch to see if we can narrow the bug down to trap_init().
> >
>
> I think there's a problem that the side-effect of this change is that
> interrupt initialization comes later, and so the dynamically allocated
> arrays are not set up when the first interrupt comes in.

Which interrupt initialization is that? We call trap_init() _earlier_
now.

> However, this particular change shouldn't have any effect on interrupts
> being enabled early, right?

Yeah, interrupt enabling should not be affected.

Pekka
Arnd Hannemann
2009-Aug-25 18:22 UTC
[Xen-devel] Re: [bisected] 2.6.31 regression: fails to boot as xen guest
Pekka Enberg wrote:
> On Tue, 2009-08-25 at 19:49 +0200, Arnd Hannemann wrote:
>> Hi Pekka,
>>
>> Pekka Enberg wrote:
>>> On Tue, 2009-08-25 at 18:49 +0200, Arnd Hannemann wrote:
>>>>> Thanks for doing the bisect! Can we also see your .config
>>>>> also?
>>>> Config for -rc7 is attached. My bisect configs were based on
>>>> that
>>> Thanks! While we wait for the Xen people, you can try the
>>> following patch to see if we can narrow the bug down to
>>> trap_init().
>> Yes seems to be trap_init(). -rc7 with this patch applied boots up
>> to the prompt.
>
> Thanks for testing! Ingo, what do you think of the following patch?
> AFAICT, x86-32 is the only architecture playing with traps in
> mem_init() so this should be the safest fix for 2.6.31.

Hmm, -rc7 + this fix does not work for me :-/
Still hangs before any output...

Best regards,
Arnd
Ingo Molnar
2009-Aug-25 18:25 UTC
[Xen-devel] Re: [bisected] 2.6.31 regression: fails to boot as xen guest
* Arnd Hannemann <hannemann@nets.rwth-aachen.de> wrote:

> Pekka Enberg wrote:
> > On Tue, 2009-08-25 at 19:49 +0200, Arnd Hannemann wrote:
> >> Hi Pekka,
> >>
> >> Pekka Enberg wrote:
> >>> On Tue, 2009-08-25 at 18:49 +0200, Arnd Hannemann wrote:
> >>>>> Thanks for doing the bisect! Can we also see your .config
> >>>>> also?
> >>>> Config for -rc7 is attached. My bisect configs were based on
> >>>> that
> >>> Thanks! While we wait for the Xen people, you can try the
> >>> following patch to see if we can narrow the bug down to
> >>> trap_init().
> >> Yes seems to be trap_init(). -rc7 with this patch applied boots up
> >> to the prompt.
> >
> > Thanks for testing! Ingo, what do you think of the following patch?
> > AFAICT, x86-32 is the only architecture playing with traps in
> > mem_init() so this should be the safest fix for 2.6.31.
>
> Hmm, -rc7 + this fix does not work for me :-/
> Still hangs before any output...

does earlyprintk=vga tell you anything about precisely where it
hangs?

Ingo
Jeremy Fitzhardinge
2009-Aug-25 18:30 UTC
[Xen-devel] Re: [bisected] 2.6.31 regression: fails to boot as xen guest
On 08/25/09 11:25, Ingo Molnar wrote:
> * Arnd Hannemann <hannemann@nets.rwth-aachen.de> wrote:
>
>
>> Pekka Enberg wrote:
>>
>>> On Tue, 2009-08-25 at 19:49 +0200, Arnd Hannemann wrote:
>>>
>>>> Hi Pekka,
>>>>
>>>> Pekka Enberg wrote:
>>>>
>>>>> On Tue, 2009-08-25 at 18:49 +0200, Arnd Hannemann wrote:
>>>>>
>>>>>>> Thanks for doing the bisect! Can we also see your .config
>>>>>>> also?
>>>>>>>
>>>>>> Config for -rc7 is attached. My bisect configs were based on
>>>>>> that
>>>>>>
>>>>> Thanks! While we wait for the Xen people, you can try the
>>>>> following patch to see if we can narrow the bug down to
>>>>> trap_init().
>>>>>
>>>> Yes seems to be trap_init(). -rc7 with this patch applied boots up
>>>> to the prompt.
>>>>
>>> Thanks for testing! Ingo, what do you think of the following patch?
>>> AFAICT, x86-32 is the only architecture playing with traps in
>>> mem_init() so this should be the safest fix for 2.6.31.
>>>
>> Hmm, -rc7 + this fix does not work for me :-/
>> Still hangs before any output...
>>
> does earlyprintk=vga tell you anything about precisely where it
> hangs?
>

It's a Xen domain, so it should be earlyprintk=xen

J
Arnd Hannemann
2009-Aug-25 18:38 UTC
[Xen-devel] Re: [bisected] 2.6.31 regression: fails to boot as xen guest
Jeremy Fitzhardinge wrote:
> On 08/25/09 11:25, Ingo Molnar wrote:
>> * Arnd Hannemann <hannemann@nets.rwth-aachen.de> wrote:
>>
>>
>>> Pekka Enberg wrote:
>>>
>>>> On Tue, 2009-08-25 at 19:49 +0200, Arnd Hannemann wrote:
>>>>
>>>>> Hi Pekka,
>>>>>
>>>>> Pekka Enberg wrote:
>>>>>
>>>>>> On Tue, 2009-08-25 at 18:49 +0200, Arnd Hannemann wrote:
>>>>>>
>>>>>>>> Thanks for doing the bisect! Can we also see your
>>>>>>>> .config also?
>>>>>>>>
>>>>>>> Config for -rc7 is attached. My bisect configs were based
>>>>>>> on that
>>>>>>>
>>>>>> Thanks! While we wait for the Xen people, you can try the
>>>>>> following patch to see if we can narrow the bug down to
>>>>>> trap_init().
>>>>>>
>>>>> Yes seems to be trap_init(). -rc7 with this patch applied
>>>>> boots up to the prompt.
>>>>>
>>>> Thanks for testing! Ingo, what do you think of the following
>>>> patch? AFAICT, x86-32 is the only architecture playing with
>>>> traps in mem_init() so this should be the safest fix for
>>>> 2.6.31.
>>>>
>>> Hmm, -rc7 + this fix does not work for me :-/ Still hangs before
>>> any output...
>>>
>> does earlyprintk=vga tell you anything about precisely where it
>> hangs?
>>
>
> It's a Xen domain, so it should be earlyprintk=xen
>
> J
>

Here is the output with earlyprintk=xen and the second patch from pekka
applied:

(early) [ 0.000000] Initializing CPU#0
(early) [ 0.000000] Checking if this processor honours the WP bit even in supervisor mode...(early)
(early) [ 0.000000] BUG: unable to handle kernel (early) NULL pointer dereference(early) at (null)
(early) [ 0.000000] IP:(early) [<c1192993>] xen_evtchn_do_upcall+0xd3/0x160
(early) [ 0.000000] *pdpt = 0000000008386001 (early)
(early) [ 0.000000] Thread overran stack, or stack corrupted
(early) [ 0.000000] Oops: 0000 [#1] (early) SMP (early)
(early) [ 0.000000] last sysfs file:
(early) [ 0.000000] Modules linked in:(early)
(early) [ 0.000000]
(early) [ 0.000000] Pid: 0, comm: swapper Not tainted (2.6.31-rc7-pae-um #10)
(early) [ 0.000000] EIP: 0061:[<c1192993>] EFLAGS: 00010046 CPU: 0
(early) [ 0.000000] EIP is at xen_evtchn_do_upcall+0xd3/0x160
(early) [ 0.000000] EAX: 00000004 EBX: 00000000 ECX: 00000004 EDX: ffffffff
(early) [ 0.000000] ESI: fffffffe EDI: 00000000 EBP: 00000000 ESP: c1413e64
(early) [ 0.000000] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: e021
(early) [ 0.000000] Process swapper (pid: 0, ti=c1412000 task=c13d11a0 task.ti=c1412000)
(early) [ 0.000000] Stack:
(early) [ 0.000000] f5793000(early) c146d9f0(early) c146d9f0(early) 00000000(early) c1413e9c(early) 00000000(early) 00000000(early) c3a01020(early)
(early) [ 0.000000] <0>(early) 00000000(early) eec06067(early) c000cff8(early) 00000000(early) 00000000(early) c10086d7(early) eec06067(early) c000cff8(early)
(early) [ 0.000000] <0>(early) f55ff000(early) c000cff8(early) 00000000(early) 00000000(early) c13d7d60(early) c101e021(early) c141e021(early) c10100d8(early)
(early) [ 0.000000] Call Trace:
(early) [ 0.000000] [<c10086d7>] ? xen_do_upcall+0x7/0xc
(early) [ 0.000000] [<c101e021>] ? ptep_set_access_flags+0x1/0x80
(early) [ 0.000000] [<c141e021>] ? find_e820_area_size+0x51/0x330
(early)

Best regards,
Arnd
Pekka Enberg
2009-Aug-25 18:43 UTC
[Xen-devel] Re: [bisected] 2.6.31 regression: fails to boot as xen guest
On Tue, 2009-08-25 at 20:38 +0200, Arnd Hannemann wrote:
> Here is the output with earlyprintk=xen and the second patch from pekka
> applied:
>
> (early) [ 0.000000] Initializing CPU#0
> (early) [ 0.000000] Checking if this processor honours the WP bit
> even in supervisor mode...(early)
> (early) [ 0.000000] BUG: unable to handle kernel (early) NULL pointer
> dereference(early) at (null)
> (early) [ 0.000000] IP:(early) [<c1192993>]
> xen_evtchn_do_upcall+0xd3/0x160
> [...]

Aha, the previous patch worked because I #ifdef'd out the WP test
completely. Jeremy, the root cause here is that we do the WP test much
earlier than before. Even with the test moved to trap_init(), we do it
early compared to what we did before.

I guess Xen is not prepared to handle traps this early in the boot
sequence? Can we fix that?

Pekka
Pekka Enberg
2009-Aug-25 18:58 UTC
[Xen-devel] Re: [bisected] 2.6.31 regression: fails to boot as xen guest
On Tue, 2009-08-25 at 20:22 +0200, Arnd Hannemann wrote:
> Pekka Enberg wrote:
> > [...]
> > Thanks for testing! Ingo, what do you think of the following patch?
> > AFAICT, x86-32 is the only architecture playing with traps in
> > mem_init() so this should be the safest fix for 2.6.31.
>
> Hmm, -rc7 + this fix does not work for me :-/
> Still hangs before any output...

Arnd, does this work for you?

Pekka

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 13ffa5d..b86472d 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -50,6 +50,7 @@ config X86
 	select HAVE_KERNEL_BZIP2
 	select HAVE_KERNEL_LZMA
 	select HAVE_ARCH_KMEMCHECK
+	select HAVE_ARCH_MEM_INIT_LATE if X86_32
 
 config OUTPUT_FORMAT
 	string
diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
index 3cd7711..488ed4b 100644
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -956,13 +956,17 @@ void __init mem_init(void)
 	BUG_ON(VMALLOC_START >= VMALLOC_END);
 	BUG_ON((unsigned long)high_memory > VMALLOC_START);
 
-	if (boot_cpu_data.wp_works_ok < 0)
-		test_wp_bit();
-
 	save_pg_dir();
 	zap_low_mappings(true);
 }
 
+void __init mem_init_late(void)
+{
+	/* Interrupts are enabled at this point. */
+	if (boot_cpu_data.wp_works_ok < 0)
+		test_wp_bit();
+}
+
 #ifdef CONFIG_MEMORY_HOTPLUG
 int arch_add_memory(int nid, u64 start, u64 size)
 {
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 9a72cc7..eefcfbe 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1052,6 +1052,14 @@ extern void si_meminfo(struct sysinfo * val);
 extern void si_meminfo_node(struct sysinfo *val, int nid);
 extern int after_bootmem;
 
+#ifdef CONFIG_HAVE_ARCH_MEM_INIT_LATE
+extern void mem_init_late(void);
+#else
+static inline void mem_init_late(void)
+{
+}
+#endif
+
 #ifdef CONFIG_NUMA
 extern void setup_per_cpu_pageset(void);
 #else
diff --git a/init/main.c b/init/main.c
index 2d9d6bd..45d8dbd 100644
--- a/init/main.c
+++ b/init/main.c
@@ -643,6 +643,7 @@ asmlinkage void __init start_kernel(void)
 	set_gfp_allowed_mask(__GFP_BITS_MASK);
 
 	kmem_cache_init_late();
+	mem_init_late();
 
 	/*
 	 * HACK ALERT! This is early. We're enabling the console before
Pekka Enberg
2009-Aug-25 19:07 UTC
[Xen-devel] Re: [bisected] 2.6.31 regression: fails to boot as xen guest
On Tue, 2009-08-25 at 21:58 +0300, Pekka Enberg wrote:
> On Tue, 2009-08-25 at 20:22 +0200, Arnd Hannemann wrote:
> > [...]
> > Hmm, -rc7 + this fix does not work for me :-/
> > Still hangs before any output...
>
> Arnd, does this work for you?

Here's a version of the patch that actually compiles. :-)

Pekka

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 13ffa5d..b6ff185 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -50,6 +50,7 @@ config X86
 	select HAVE_KERNEL_BZIP2
 	select HAVE_KERNEL_LZMA
 	select HAVE_ARCH_KMEMCHECK
+	select HAVE_ARCH_MEM_INIT_LATE if X86_32
 
 config OUTPUT_FORMAT
 	string
@@ -86,6 +87,10 @@ config STACKTRACE_SUPPORT
 config HAVE_LATENCYTOP_SUPPORT
 	def_bool y
 
+config HAVE_ARCH_MEM_INIT_LATE
+	def_bool y
+	depends on X86_32
+
 config FAST_CMPXCHG_LOCAL
 	bool
 	default y
diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
index 3cd7711..488ed4b 100644
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -956,13 +956,17 @@ void __init mem_init(void)
 	BUG_ON(VMALLOC_START >= VMALLOC_END);
 	BUG_ON((unsigned long)high_memory > VMALLOC_START);
 
-	if (boot_cpu_data.wp_works_ok < 0)
-		test_wp_bit();
-
 	save_pg_dir();
 	zap_low_mappings(true);
 }
 
+void __init mem_init_late(void)
+{
+	/* Interrupts are enabled at this point. */
+	if (boot_cpu_data.wp_works_ok < 0)
+		test_wp_bit();
+}
+
 #ifdef CONFIG_MEMORY_HOTPLUG
 int arch_add_memory(int nid, u64 start, u64 size)
 {
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 9a72cc7..eefcfbe 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1052,6 +1052,14 @@ extern void si_meminfo(struct sysinfo * val);
 extern void si_meminfo_node(struct sysinfo *val, int nid);
 extern int after_bootmem;
 
+#ifdef CONFIG_HAVE_ARCH_MEM_INIT_LATE
+extern void mem_init_late(void);
+#else
+static inline void mem_init_late(void)
+{
+}
+#endif
+
 #ifdef CONFIG_NUMA
 extern void setup_per_cpu_pageset(void);
 #else
diff --git a/init/main.c b/init/main.c
index 2d9d6bd..45d8dbd 100644
--- a/init/main.c
+++ b/init/main.c
@@ -643,6 +643,7 @@ asmlinkage void __init start_kernel(void)
 	set_gfp_allowed_mask(__GFP_BITS_MASK);
 
 	kmem_cache_init_late();
+	mem_init_late();
 
 	/*
 	 * HACK ALERT! This is early. We're enabling the console before
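The shape of the patch (an optional per-architecture hook with an empty inline fallback, called once interrupts are on) can be sketched in plain user-space C. This is only a simplified model, not kernel code: every `*_model` name, the `boot_log` array, and the reduced boot sequence are assumptions made for illustration, and the `#define` merely stands in for the Kconfig symbol.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumption: stands in for the Kconfig symbol; remove it to get the empty stub. */
#define CONFIG_HAVE_ARCH_MEM_INIT_LATE 1

#define MAX_STEPS 8

static const char *boot_log[MAX_STEPS];	/* step names, in call order */
static size_t boot_steps;
static bool irqs_enabled;		/* models the interrupt state */
static bool wp_tested_with_irqs_on;

static void record(const char *step)
{
	if (boot_steps < MAX_STEPS)
		boot_log[boot_steps++] = step;
}

/* Returns the position of a step in the log, or -1 if it never ran. */
static int step_index(const char *step)
{
	for (size_t i = 0; i < boot_steps; i++)
		if (strcmp(boot_log[i], step) == 0)
			return (int)i;
	return -1;
}

static void test_wp_bit_model(void)
{
	/* Remember whether interrupts were already on when the test ran. */
	wp_tested_with_irqs_on = irqs_enabled;
	record("test_wp_bit");
}

static void mem_init_model(void)
{
	/* With the patch applied, mem_init() no longer runs the WP test. */
	record("mem_init");
}

#ifdef CONFIG_HAVE_ARCH_MEM_INIT_LATE
static void mem_init_late_model(void)
{
	test_wp_bit_model();
}
#else
static inline void mem_init_late_model(void)
{
}
#endif

/* Heavily reduced model of the boot order discussed in the thread. */
static void start_kernel_model(void)
{
	boot_steps = 0;
	irqs_enabled = false;
	wp_tested_with_irqs_on = false;

	mem_init_model();
	record("init_IRQ");
	irqs_enabled = true;
	record("local_irq_enable");
	mem_init_late_model();
}
```

Calling `start_kernel_model()` and then comparing `step_index("init_IRQ")` with `step_index("test_wp_bit")` shows the WP test landing after interrupt setup. Removing the `#define` leaves the hook as a no-op, which is what architectures other than 32-bit x86 would get.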
Arnd Hannemann
2009-Aug-25 19:13 UTC
[Xen-devel] Re: [bisected] 2.6.31 regression: fails to boot as xen guest
Pekka Enberg wrote:
> On Tue, 2009-08-25 at 21:58 +0300, Pekka Enberg wrote:
>> [...]
>> Arnd, does this work for you?
>
> Here's a version of the patch that actually compiles. :-)

Great, this actually works, too :-)
Boots up to the prompt.

Best regards,
Arnd
Jeremy Fitzhardinge
2009-Aug-25 19:30 UTC
[Xen-devel] Re: [bisected] 2.6.31 regression: fails to boot as xen guest
On 08/25/09 11:38, Arnd Hannemann wrote:
> Here is the output with earlyprintk=xen and the second patch from pekka
> applied:
>
> (early) [ 0.000000] Initializing CPU#0
> (early) [ 0.000000] Checking if this processor honours the WP bit
> even in supervisor mode...(early)
> (early) [ 0.000000] BUG: unable to handle kernel (early) NULL pointer
> dereference(early) at (null)
> (early) [ 0.000000] IP:(early) [<c1192993>]
> xen_evtchn_do_upcall+0xd3/0x160

OK, that's the same problem I've seen; it's trying to enable and deliver
interrupts before init_IRQ has been called, so the various allocated
arrays aren't set up.

Ingo, I'm assuming that interrupts aren't supposed to be enabled this
early?

Thanks,
    J

> [...]
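The failure mode described here (an event arriving before the tables that init_IRQ allocates exist) can be modelled in a few lines of user-space C. This is a sketch under stated assumptions, not the Xen event-channel code: `port_to_irq_model`, `init_irq_model()` and `upcall_model()` are hypothetical names, and the NULL check only makes the ordering problem visible. It is not a fix anyone in the thread proposed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_PORTS_MODEL 8

/* Hypothetical table; stands in for the arrays that init_IRQ sets up. */
static int *port_to_irq_model;
static int last_irq = -1;

/* Models init_IRQ(): allocate and fill the lookup table. */
static bool init_irq_model(void)
{
	int *table = malloc(NR_PORTS_MODEL * sizeof(*table));

	if (table == NULL)
		return false;
	for (size_t i = 0; i < NR_PORTS_MODEL; i++)
		table[i] = 100 + (int)i;
	port_to_irq_model = table;
	return true;
}

/*
 * Models an event upcall. Returns false when the event cannot be
 * delivered because the table is not there yet (or the port is out of
 * range). Without the NULL check, an early upcall would dereference a
 * NULL pointer, which is the kind of fault the oops reports.
 */
static bool upcall_model(size_t port)
{
	if (port_to_irq_model == NULL || port >= NR_PORTS_MODEL)
		return false;
	last_irq = port_to_irq_model[port];
	return true;
}
```

In this model, `upcall_model(4)` fails until `init_irq_model()` has run and succeeds afterwards, which matches the ordering constraint Jeremy describes: interrupts must not be delivered before interrupt setup has completed.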
Jeremy Fitzhardinge
2009-Aug-25 19:31 UTC
[Xen-devel] Re: [bisected] 2.6.31 regression: fails to boot as xen guest
On 08/25/09 11:43, Pekka Enberg wrote:
> Aha, the previous patch worked because I #ifdef the WP test completely.
> Jeremy, the root cause here is that we do the WP test much earlier than
> before. Even with the test moved to trap_init(), we do it early compared
> to what we did before.
>
> I guess Xen is not prepared to handle traps this early in the boot
> sequence? Can we fix that?

The crash is in the event (interrupt) path, which shouldn't be involved
in trap handling at all. I'm wondering if interrupts are getting enabled
as a side-effect of handling the trap.

    J
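The suspicion in this message, that a trap handler might turn interrupts on as a side effect, can be illustrated with a small model. It is purely hypothetical: `trap_handler_model()` is not the kernel's fault handler, and the thread does not confirm that this is what happens. The only hard fact used is that bit 9 (0x200) of the x86 EFLAGS register is the interrupt-enable flag. The sample values are treated as plain bit patterns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EFLAGS_IF_MODEL 0x00000200u	/* bit 9: x86 interrupt-enable flag */

static bool irqs_on;	/* models the current interrupt state */

/* Did the interrupted context appear to have interrupts enabled? */
static bool saved_context_had_irqs_on(uint32_t saved_eflags)
{
	return (saved_eflags & EFLAGS_IF_MODEL) != 0;
}

/*
 * Hypothetical trap handler: if the saved flags claim interrupts were
 * enabled, it re-enables them while handling the trap. If the saved
 * flags are wrong this early in boot, interrupts get switched on before
 * the interrupt machinery is ready.
 */
static void trap_handler_model(uint32_t saved_eflags)
{
	if (saved_context_had_irqs_on(saved_eflags))
		irqs_on = true;
	/* ... the actual trap handling would go here ... */
}
```

With `irqs_on` initially false, `trap_handler_model(0x00010046u)` leaves interrupts off because bit 9 is clear, while `trap_handler_model(0x00010246u)` turns them on.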