Konrad Rzeszutek Wilk
2013-Feb-21 13:05 UTC
Linux v3.9-rc0, three different bootup issues.
Heya,

I am hitting _three_ different regressions on v3.9-rc0. I was wondering if anybody is willing to help me out in narrowing down the faulty git commits.

The first one is when booting PV guests:

[ 0.831091] BUG: unable to handle kernel NULL pointer dereference at 0000000000000024
[ 0.831107] IP: [<ffffffff8138978a>] apei_hest_parse+0x2a/0x140
[ 0.831122] PGD 0
[ 0.831130] Oops: 0000 [#1] SMP
[ 0.831142] Modules linked in:
[ 0.831150] CPU 0
[ 0.831156] Pid: 1, comm: swapper/0 Not tainted 3.8.0upstream-03229-ge1f5dd0 #1
[ 0.831166] RIP: e030:[<ffffffff8138978a>] [<ffffffff8138978a>] apei_hest_parse+0x2a/0x140
[ 0.831178] RSP: e02b:ffff88003d369e88 EFLAGS: 00010246
[ 0.831185] RAX: 00000000ffffffea RBX: 0000000000000030 RCX: 0000000000000000
[ 0.831193] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff81329ae0
[ 0.831200] RBP: ffff88003d369ea8 R08: 0000000000000000 R09: 0000000000000000
[ 0.831208] R10: 0000000000007ff0 R11: 0000000000000002 R12: 0000000000000000
[ 0.831215] R13: ffffffff81329ae0 R14: 0000000000000000 R15: 0000000000000000
[ 0.831227] FS: 0000000000000000(0000) GS:ffff88003f800000(0000) knlGS:0000000000000000
[ 0.831236] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 0.831242] CR2: 0000000000000024 CR3: 0000000001a0c000 CR4: 0000000000000660
[ 0.831251] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 0.831258] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 0.831266] Process swapper/0 (pid: 1, threadinfo ffff88003d368000, task ffff88003d362800)
[ 0.831274] Stack:
[ 0.831280] 0000000000000030 0000000000000000 ffffffff81ae960e ffffffff81b402e8
[ 0.831298] ffff88003d369eb8 ffffffff81329b3b ffff88003d369ec8 ffffffff81ae9620
[ 0.831316] ffff88003d369ef8 ffffffff8100203d 0000000000000030 0000000000000007
[ 0.831336] Call Trace:
[ 0.831347] [<ffffffff81ae960e>] ? pcie_portdrv_init+0x7a/0x7a
[ 0.831359] [<ffffffff81329b3b>] aer_acpi_firmware_first+0x1b/0x30
[ 0.831370] [<ffffffff81ae9620>] aer_service_init+0x12/0x2b
[ 0.831380] [<ffffffff8100203d>] do_one_initcall+0x3d/0x170
[ 0.831391] [<ffffffff81abd82c>] kernel_init_freeable+0x157/0x1e6
[ 0.831401] [<ffffffff81abd8bb>] ? kernel_init_freeable+0x1e6/0x1e6
[ 0.831412] [<ffffffff81647e90>] ? rest_init+0xa0/0xa0
[ 0.831422] [<ffffffff81647e99>] kernel_init+0x9/0xf0
[ 0.831432] [<ffffffff816669fc>] ret_from_fork+0x7c/0xb0
[ 0.831442] [<ffffffff81647e90>] ? rest_init+0xa0/0xa0
[ 0.831448] Code: 90 55 80 3d 00 3f 8d 00 00 b8 ea ff ff ff 48 89 e5 41 56 49 89 f6 4

The second when booting PVHVM guests:

[ 28.249000] BUG: soft lockup - CPU#0 stuck for 23s! [migration/0:8]
[ 28.249000] Modules linked in:
[ 28.249000] CPU 0
[ 28.249000] Pid: 8, comm: migration/0 Not tainted 3.8.0upstream-03229-ge1f5dd0 #1 Xen HVM domU
[ 28.249000] RIP: 0010:[<ffffffff81105bdb>] [<ffffffff81105bdb>] stop_machine_cpu_stop+0x7b/0xf0
[ 28.249000] RSP: 0000:ffff88003aa49d38 EFLAGS: 00000293
[ 28.249000] RAX: 0000000000000001 RBX: ffffffff810bff77 RCX: 0000000000000000
[ 28.249000] RDX: 0000000000000000 RSI: 0000000000000286 RDI: ffff88003aa33de8
[ 28.249000] RBP: ffff88003aa49d68 R08: 0000000000000000 R09: 0000000000000001
[ 28.249000] R10: 0000000000000000 R11: 0000000000000003 R12: ffff88003aa48000
[ 28.249000] R13: 0000000000000000 R14: ffffffff810c96d5 R15: ffff88003aa49cb8
[ 28.249000] FS: 0000000000000000(0000) GS:ffff88003ae00000(0000) knlGS:0000000000000000
[ 28.249000] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 28.249000] CR2: 0000000000000000 CR3: 0000000001a0c000 CR4: 00000000000006f0
[ 28.249000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 28.249000] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 28.249000] Process migration/0 (pid: 8, threadinfo ffff88003aa48000, task ffff88003aa2c000)
[ 28.249000] Stack:
[ 28.249000] ffff88003aa49d98 ffff88003aa33d58 ffff88003aa48000 ffffffff81105b60
[ 28.249000] ffff88003aa33de8 ffff88003ae0e5e0 ffff88003aa49e48 ffffffff811058b3
[ 28.249000] ffff88003aa49da8 ffff88003aa49fd8 ffff88003ae0e5e8 ffff88003aa48000
[ 28.249000] Call Trace:
[ 28.249000] [<ffffffff81105b60>] ? stop_one_cpu_nowait+0x30/0x30
[ 28.249000] [<ffffffff811058b3>] cpu_stopper_thread+0xb3/0x160
[ 28.249000] [<ffffffff8165d65e>] ? __schedule+0x3be/0x7d0
[ 28.249000] [<ffffffff8107f079>] ? default_spin_lock_flags+0x9/0x10
[ 28.249000] [<ffffffff810bebd7>] smpboot_thread_fn+0x157/0x1e0
[ 28.249000] [<ffffffff810bea80>] ? smpboot_create_threads+0x80/0x80
[ 28.249000] [<ffffffff810b5f06>] kthread+0xc6/0xd0
[ 28.249000] [<ffffffff810b5e40>] ? kthread_freezable_should_stop+0x80/0x80
[ 28.249000] [<ffffffff816669fc>] ret_from_fork+0x7c/0xb0
[ 28.249000] [<ffffffff810b5e40>] ? kthread_freezable_should_stop+0x80/0x80
[ 28.249000] Code: 41 83 fd 03 74 42 f0 41 ff 0e 0f 94 c0 84 c0 74 0f 8b 43 20 8b 4b 10 83 c0 01 89 4b 24 89 43 20 41 83 fd 04 74 3a 44 89 e8 f3 90 <44> 8b 6b 20 41 39 c5 74 ec 41 83 fd 02 75 c6 fa 66 66 90 66 66

And the third when booting dom0 and with xen-acpi-processor it ends up with -22 right after:

[ 14.712473] xen-acpi-processor: Max ACPI ID: 8
>>> On 21.02.13 at 14:05, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> The first one is when booting PV guests:
>
> [ 0.831091] BUG: unable to handle kernel NULL pointer dereference at 0000000000000024
> [ 0.831107] IP: [<ffffffff8138978a>] apei_hest_parse+0x2a/0x140
> [ 0.831122] PGD 0
> [ 0.831130] Oops: 0000 [#1] SMP
> [ 0.831142] Modules linked in:
> [ 0.831150] CPU 0
> [ 0.831156] Pid: 1, comm: swapper/0 Not tainted 3.8.0upstream-03229-ge1f5dd0 #1
> [ 0.831166] RIP: e030:[<ffffffff8138978a>] [<ffffffff8138978a>] apei_hest_parse+0x2a/0x140
> [ 0.831178] RSP: e02b:ffff88003d369e88 EFLAGS: 00010246
> [ 0.831185] RAX: 00000000ffffffea RBX: 0000000000000030 RCX: 0000000000000000
> [ 0.831193] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff81329ae0
> [ 0.831200] RBP: ffff88003d369ea8 R08: 0000000000000000 R09: 0000000000000000
> [ 0.831208] R10: 0000000000007ff0 R11: 0000000000000002 R12: 0000000000000000
> [ 0.831215] R13: ffffffff81329ae0 R14: 0000000000000000 R15: 0000000000000000
> [ 0.831227] FS: 0000000000000000(0000) GS:ffff88003f800000(0000) knlGS:0000000000000000
> [ 0.831236] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 0.831242] CR2: 0000000000000024 CR3: 0000000001a0c000 CR4: 0000000000000660
> [ 0.831251] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 0.831258] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 0.831266] Process swapper/0 (pid: 1, threadinfo ffff88003d368000, task ffff88003d362800)
> [ 0.831274] Stack:
> [ 0.831280] 0000000000000030 0000000000000000 ffffffff81ae960e ffffffff81b402e8
> [ 0.831298] ffff88003d369eb8 ffffffff81329b3b ffff88003d369ec8 ffffffff81ae9620
> [ 0.831316] ffff88003d369ef8 ffffffff8100203d 0000000000000030 0000000000000007
> [ 0.831336] Call Trace:
> [ 0.831347] [<ffffffff81ae960e>] ? pcie_portdrv_init+0x7a/0x7a
> [ 0.831359] [<ffffffff81329b3b>] aer_acpi_firmware_first+0x1b/0x30
> [ 0.831370] [<ffffffff81ae9620>] aer_service_init+0x12/0x2b

The bug here quite obviously is that with no ACPI you shouldn't even get to that point. Yet apei_hest_parse() blindly uses hest_tab - that ought to regress on real hardware too.

Jan
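
For illustration only, here is a minimal userspace sketch of the failure mode Jan describes. It is not the kernel source and not the upstream fix. The names mock_acpi_header, mock_hest and mock_hest_parse are hypothetical, and the layout assumes the standard 36-byte ACPI table header followed by a 32-bit error source count. Under that assumption the count sits at offset 0x24, which would match the faulting address in CR2 if hest_tab were NULL. The guard shows one way a caller with no table could get -EINVAL back instead of an oops.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the common ACPI table header (36 bytes). */
struct mock_acpi_header {
    uint8_t bytes[36];
};

/* Hypothetical stand-in for the HEST table: header, then a 32-bit count. */
struct mock_hest {
    struct mock_acpi_header header;
    uint32_t error_source_count;
};

/* With a NULL table pointer, reading the count touches address 0x24. */
static_assert(offsetof(struct mock_hest, error_source_count) == 0x24,
              "count is expected right after the 36-byte header");

/*
 * Sketch of a guarded parse: bail out when parsing is disabled or when no
 * table was ever mapped, instead of dereferencing the pointer blindly.
 * Returns the number of error sources, or -EINVAL.
 */
int mock_hest_parse(const struct mock_hest *tab, int disabled)
{
    if (disabled || tab == NULL)
        return -EINVAL;

    return (int)tab->error_source_count;
}
```

Calling mock_hest_parse(NULL, 0) returns -EINVAL rather than faulting. Whether the real check belongs in apei_hest_parse() itself or earlier in the ACPI-disabled path is a design question this sketch does not settle.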