Ian Pratt
2006-Apr-19  09:08 UTC
RE: [Xen-devel] Another dom0 crash (Unable to handle kernel NULL pointer dereference)
> Subject: [Xen-devel] Another dom0 crash (Unable to handle
> kernel NULL pointer dereference)
>
> This time running Xen 3.0.2-2, PAE

Please can you supply more information, e.g. how many VMs were running, do you know what they were doing, can you repro, details about the h/w, amount of memory, etc. This is a separate issue from the previous one.

I seem to recall a similar crash has been reported before, but there was no repro information in the report either.

Thanks,
Ian

> Previous post:
> http://lists.xensource.com/archives/html/xen-devel/2006-04/msg00446.html
>
> Unable to handle kernel NULL pointer dereference at virtual address 00000004
>  printing eip:
> c0161905
> 07786000 -> *pde = 00000001:058d6001
> 06e54000 -> *pme = 00000000:00000000
> Oops: 0002 [#1]
> SMP
> Modules linked in:
> CPU:    1
> EIP:    0061:[<c0161905>]    Not tainted VLI
> EFLAGS: 00010046   (2.6.16-xen0 #1)
> EIP is at cache_alloc_refill+0x185/0x560
> eax: c0670f00   ebx: ffffffff   ecx: 00000010   edx: 00000000
> esi: c8bd1000   edi: 00000002   ebp: c00dfe00   esp: cfa7be3c
> ds: 007b   es: 007b   ss: 0069
> Process kjournald (pid: 1187, threadinfo=cfa7a000 task=cfa0ca70)
> Stack: <0>000016a7 00000000 c0525e40 00000050 00000050 00000050 c0673b40 c066f800
>        c0670f00 c01408db c0664958 cfa7a000 c10f06e0 c8bd101c 00000001 c10f06e0
>        00000000 00000000 00001000 c0161777 c10f06e0 00000000 c01679e9 c0673b40
> Call Trace:
>  [<c01408db>] add_to_page_cache+0x6b/0x100
>  [<c0161777>] kmem_cache_alloc+0x77/0x80
>  [<c01679e9>] alloc_buffer_head+0x19/0x50
>  [<c01695fc>] alloc_page_buffers+0x3c/0xe0
>  [<c016b16f>] __getblk+0x15f/0x2b0
>  [<c01ea978>] journal_get_descriptor_buffer+0x68/0xd0
>  [<c01e68d1>] journal_commit_transaction+0xd01/0x11f0
>  [<c0128300>] lock_timer_base+0x20/0x50
>  [<c0129259>] try_to_del_timer_sync+0x49/0x60
>  [<c01e9680>] kjournald+0xc0/0x230
>  [<c01e8e80>] commit_timeout+0x0/0x10
>  [<c0134130>] autoremove_wake_function+0x0/0x60
>  [<c01e95c0>] kjournald+0x0/0x230
>  [<c0102cc5>] kernel_thread_helper+0x5/0x10
> Code: 56 0c 8b 3c b8 89 7e 14 89 54 8d 14 41 89 4d 00 8b 44 24 18 8b 7e 10 3b 78 38 0f 83 3a ff ff ff 4b 83 fb ff 75 c0 8b 16 8b 46 04 <89> 42 04 89 10 83 7e 14 ff c7 06 00 01 10 00 c7 46 04 00 02 20
>
> -Chris

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
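A note for readers triaging the quoted trace: the value after `Oops:` is the x86 page-fault error code, printed in hex. The following is a minimal Python sketch of how its low three bits are conventionally read on i386. The helper name `decode_oops_error_code` is hypothetical and is not part of any kernel or Xen tool; higher bits are ignored here.

```python
def decode_oops_error_code(code_hex):
    """Decode the low three bits of an x86 page-fault error code.

    code_hex is the hex string printed after "Oops:" (for example "0002").
    This helper is illustrative only; it is not an existing kernel utility.
    """
    code = int(code_hex, 16)
    return {
        # bit 0: 0 = page not present, 1 = protection violation
        "cause": "protection violation" if code & 0x1 else "page not present",
        # bit 1: 0 = read access, 1 = write access
        "access": "write" if code & 0x2 else "read",
        # bit 2: 0 = fault taken in kernel mode, 1 = user mode
        "mode": "user" if code & 0x4 else "kernel",
    }


if __name__ == "__main__":
    # "Oops: 0002" from the quoted trace
    print(decode_oops_error_code("0002"))
```

Applied to `Oops: 0002`, this reads as a kernel-mode write to a page that is not present, which is consistent with a store through a NULL-based pointer at address 00000004.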
Christopher S. Aker
2006-Apr-19  13:18 UTC
Re: [Xen-devel] Another dom0 crash (Unable to handle kernel NULL pointer dereference)
Ian Pratt wrote:
>> Subject: [Xen-devel] Another dom0 crash (Unable to handle
>> kernel NULL pointer dereference)
>>
>> This time running Xen 3.0.2-2, PAE
>
> Please can you supply more information e.g. how many VMs were running,
> do you know what they were doing, can you repro, details about the h/w,
> amount of memory etc. This is a separate issue from the previous one.
>
> I seem to recall a similar crash has been reported before, but there was
> no repro information in the report either.
>
> Thanks,
> Ian

There were roughly 40 VMs running on the machine. Host server is a SuperMicro AS-1020A-T, 16G RAM (only 14 shows in Xen :( ), 3ware 9550SX-4LP h/w raid card.

I don't have an easy way to reproduce this. It seems to have happened this time and last when I asked a few of the VM runners to exec hdparm -t.

# lspci
0000:00:01.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8132 PCI-X Bridge (rev 11)
0000:00:01.1 PIC: Advanced Micro Devices [AMD] AMD-8132 PCI-X IOAPIC (rev 11)
0000:00:02.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8132 PCI-X Bridge (rev 11)
0000:00:02.1 PIC: Advanced Micro Devices [AMD] AMD-8132 PCI-X IOAPIC (rev 11)
0000:00:06.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8111 PCI (rev 07)
0000:00:07.0 ISA bridge: Advanced Micro Devices [AMD] AMD-8111 LPC (rev 05)
0000:00:07.1 IDE interface: Advanced Micro Devices [AMD] AMD-8111 IDE (rev 03)
0000:00:07.2 SMBus: Advanced Micro Devices [AMD] AMD-8111 SMBus 2.0 (rev 02)
0000:00:07.3 Bridge: Advanced Micro Devices [AMD] AMD-8111 ACPI (rev 05)
0000:00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
0000:00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
0000:00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
0000:00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
0000:00:19.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
0000:00:19.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
0000:00:19.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
0000:00:19.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
0000:01:00.0 USB Controller: Advanced Micro Devices [AMD] AMD-8111 USB (rev 0b)
0000:01:00.1 USB Controller: Advanced Micro Devices [AMD] AMD-8111 USB (rev 0b)
0000:01:04.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
0000:02:01.0 RAID bus controller: 3ware Inc: Unknown device 1003
0000:02:03.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet (rev 10)
0000:02:03.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet (rev 10)
0000:03:03.0 RAID bus controller: Marvell Technology Group Ltd. MV88SX6041 4-port SATA II PCI-X Controller (rev 03)

Thanks,
-Chris
Christopher S. Aker
2006-Apr-19  23:12 UTC
Re: [Xen-devel] Another dom0 crash (Unable to handle kernel NULL pointer dereference)
Two more today... This happened immediately after hdparm -t /dev/sda 
completed running in dom0.  It appeared as though the rest of the 
domains remained up, at least for a few moments.
Unable to handle kernel NULL pointer dereference at virtual address 00000028
  printing eip:
c042e60d
0e3f0000 -> *pde = 00000001:f4676001
0b076000 -> *pme = 00000000:00000000
Oops: 0000 [#1]
SMP
Modules linked in:
CPU:    0
EIP:    0061:[<c042e60d>]    Not tainted VLI
EFLAGS: 00210292   (2.6.16-xen0 #1)
EIP is at sock_poll+0x1d/0x30
eax: 00000000   ebx: 00000008   ecx: c75bb600   edx: c6afe3c0
esi: 00400000   edi: 00000036   ebp: 0000000a   esp: cd82bea4
ds: 007b   es: 007b   ss: 0069
Process xenstored (pid: 5099, threadinfo=cd82a000 task=c073a030)
Stack: <0>c75bb600 c6afe3c0 00000000 c75bb600 c017a79a c75bb600 00000000 c85b88c0
        00000000 c85b88dc c85b88e4 c85b88ec c85b88c8 c85b88d0 c85b88d8 3fffffff
        00000000 00000000 3fffffff 00000000 00000000 00000000 00000000 00000304
Call Trace:
  [<c017a79a>] do_select+0x23a/0x4a0
  [<c017a490>] __pollwait+0x0/0xd0
  [<c017b2a6>] core_sys_select+0x1f6/0x2f0
  [<c017b71f>] sys_select+0x4f/0x1d0
  [<c0166bab>] sys_write+0x4b/0x80
  [<c0105141>] syscall_call+0x7/0xb
Code: 00 c3 8d b6 00 00 00 00 8d bf 00 00 00 00 53 83 ec 0c 8b 4c 24 14 8b 44 24 18 8b 51 78 8b 5a 08 89 44 24 08 89 54 24 04 89 0c 24 <ff> 53 20 83 c4 0c 5b c3 8d 74 26 00 8d bc 27 00 00 00 00 53 83
<1>Unable to handle kernel NULL pointer dereference at virtual address 00000014
  printing eip:
c0160dce
0e7bd000 -> *pde = 00000001:f11be001
0e7be000 -> *pme = 00000000:00000000
Oops: 0002 [#2]
SMP
Modules linked in:
CPU:    1
EIP:    0061:[<c0160dce>]    Not tainted VLI
EFLAGS: 00010016   (2.6.16-xen0 #1)
EIP is at free_block+0x6e/0xe0
eax: 00000001   ebx: c6afe040   ecx: c6afe240   edx: 00000010
esi: c0670440   edi: c00e0d40   ebp: 00000000   esp: c00d9ed8
ds: 007b   es: 007b   ss: 0069
Process events/1 (pid: 7, threadinfo=c00d8000 task=cfb6a550)
Stack: <0>0000000b c06161d4 c06161d4 c06161c0 0000000b c00e0d40 c0160ea8 00000000
        c00f5418 00000000 cfb68a50 c0670440 c00e0d40 c00e0da8 c0160f92 00000000
        c00d8000 cfb68a50 c067046c 00000004 c00d8000 c121de80 c121de84 cfb68cc0
Call Trace:
  [<c0160ea8>] drain_array_locked+0x68/0xc0
  [<c0160f92>] cache_reap+0x92/0x1e0
  [<c0130390>] run_workqueue+0x70/0xf0
  [<c0160f00>] cache_reap+0x0/0x1e0
  [<c0130640>] worker_thread+0x120/0x160
  [<c0119e70>] default_wake_function+0x0/0x20
  [<c0133f0f>] kthread+0xff/0x110
  [<c0130520>] worker_thread+0x0/0x160
  [<c0133e10>] kthread+0x0/0x110
  [<c0102cc5>] kernel_thread_helper+0x5/0x10
Code: 00 8b 44 24 04 8b 15 68 64 5e c0 8b 0c a8 8d 81 00 00 00 40 c1 e8 0c c1 e0 05 8b 5c 10 1c 8b 44 24 1c 8b 13 8b 74 87 30 8b 43 04 <89> 42 04 89 10 31 d2 2b 4b 0c c7 03 00 01 10 00 c7 43 04 00 02
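
One way to sanity-check the two traces above is to recompute each faulting address from the register dump and the instruction at EIP, which the `Code:` line marks with angle brackets. Here `<ff> 53 20` (sock_poll) has ModRM byte 0x53 with an 8-bit displacement of 0x20, and `<89> 42 04` (free_block) has ModRM byte 0x42 with a displacement of 0x04. The Python sketch below is a rough aid under stated assumptions: it handles only ModRM bytes with mod=01 (register base plus disp8) and no SIB byte, and the function names are hypothetical, not taken from any existing disassembler.

```python
# 32-bit general-purpose registers in ModRM r/m encoding order.
REGS32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_modrm_disp8(modrm, disp8):
    """Return (base_register, signed_displacement) for a mod=01 ModRM byte.

    Only the "register base + 8-bit displacement" form is supported;
    r/m == 4 would require a SIB byte and is rejected.
    """
    mod = modrm >> 6
    rm = modrm & 0x7
    if mod != 1 or rm == 4:
        raise ValueError("only mod=01 without SIB is handled in this sketch")
    # disp8 is sign-extended by the CPU.
    disp = disp8 - 0x100 if disp8 >= 0x80 else disp8
    return REGS32[rm], disp


def fault_address(modrm, disp8, regs):
    """Compute the 32-bit effective address the instruction would touch."""
    base, disp = decode_modrm_disp8(modrm, disp8)
    return (regs[base] + disp) & 0xFFFFFFFF


if __name__ == "__main__":
    # sock_poll: <ff> 53 20 with ebx = 00000008
    print(hex(fault_address(0x53, 0x20, {"ebx": 0x00000008})))
    # free_block: <89> 42 04 with edx = 00000010
    print(hex(fault_address(0x42, 0x04, {"edx": 0x00000010})))
```

With the register values from the dumps, the results are 0x28 and 0x14, matching the reported virtual addresses 00000028 and 00000014. In both cases the base register holds a small value rather than a valid kernel pointer.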