Peter Sandin
2011-Apr-12 15:58 UTC
[Xen-devel] 2.6.38 x86_64 domU null pointer in xennet_alloc_rx_buffers
We've got some 64-bit guests that have been trying to dereference a null pointer in xennet_alloc_rx_buffers. We have only been receiving reports of this issue since introducing 2.6.38 guest kernels. The only reports we have received are from guests running 64-bit kernels. These reports have come from multiple separate physical machines. One of the instances that ran into this issue was repeatedly restarting the nginx web server and failing because port 80 was already in use; however, we were unable to replicate the issue using this method in a controlled environment. Any suggestions on replicating or resolving this issue would be appreciated.

More traces, the .config and kernel binary can be found at:

http://thesandins.net/xen/2.6.38-x86_64/

--

BUG: Bad page state in process swapper pfn:5bb31
page:ffffea000140f2b8 count:-1 mapcount:0 mapping: (null) index:0xffff88005b8bdf80
page flags: 0x100000000000000()
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff81370b27>] xennet_alloc_rx_buffers+0xe1/0x2d9
PGD 7bacb067 PUD 7b930067 PMD 0
Oops: 0002 [#1] SMP
last sysfs file: /sys/kernel/uevent_seqnum
CPU 0
Modules linked in:

Pid: 0, comm: swapper Not tainted 2.6.38-x86_64-linode17 #1
RIP: e030:[<ffffffff81370b27>] [<ffffffff81370b27>] xennet_alloc_rx_buffers+0xe1/0x2d9
RSP: e02b:ffff88007ff7fcf0 EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff88007bfa85c0 RCX: 0000000000000000
RDX: ffff88007d36bf00 RSI: ffff88007b309400 RDI: ffff88007b309400
RBP: ffff88007ff7fd50 R08: 0000000000000000 R09: 000000000007195a
R10: 0000000000000001 R11: 00000000000006fa R12: ffff88007bfa92b0
R13: ffff88007bfa8000 R14: 0000000000000001 R15: 00000000000002cd
FS: 00007f4de5d42760(0000) GS:ffff88007ff7c000(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 000000007bb74000 CR4: 0000000000002660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a9b020)
Stack:
ffff88007d36bf00 ffff88007bfa8000 ffff88007d36bf00 ffff88007bfa85c0
ffff88007ff7fd50 00000017813f46c5 ffff88007d36bf00 ffff88007bfa85c0
ffff88007ff7fe10 ffff88007bfa8000 0000000000000001 ffff88007bfa85c0
Call Trace:
<IRQ>
[<ffffffff81372822>] xennet_poll+0xbef/0xc85
[<ffffffff815272aa>] ? _raw_spin_unlock_irqrestore+0x19/0x1c
[<ffffffff813f4d51>] net_rx_action+0xb6/0x1dc
[<ffffffff812ef6e7>] ? unmask_evtchn+0x1f/0xa3
[<ffffffff810431a4>] __do_softirq+0xc7/0x1a3
[<ffffffff81085ca9>] ? handle_fasteoi_irq+0xd2/0xe1
[<ffffffff810069b2>] ? check_events+0x12/0x20
[<ffffffff8100a85c>] call_softirq+0x1c/0x30
[<ffffffff8100bebd>] do_softirq+0x41/0x7e
[<ffffffff8104303b>] irq_exit+0x36/0x78
[<ffffffff812f022c>] xen_evtchn_do_upcall+0x2f/0x3c
[<ffffffff8100a8ae>] xen_do_hypervisor_callback+0x1e/0x30
<EOI>
[<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1006
[<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1006
[<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1006
[<ffffffff810063a3>] ? xen_safe_halt+0x10/0x1a
[<ffffffff81010998>] ? default_idle+0x4b/0x85
[<ffffffff81008d53>] ? cpu_idle+0x60/0x97
[<ffffffff8151b349>] ? rest_init+0x6d/0x6f
[<ffffffff81b2ad34>] ? start_kernel+0x37f/0x38a
[<ffffffff81b2a2cd>] ? x86_64_start_reservations+0xb8/0xbc
[<ffffffff81b2de71>] ? xen_start_kernel+0x528/0x52f
Code: c8 00 00 00 41 ff c6 48 89 44 37 38 8b 82 c4 00 00 00 48 8b b2 c8 00 00 00 66 c7 04 06 01 00 49 8b 44 24 08 4c 89 22 48 89 42 08 <48> 89 10 49 89 54 24 08 ff 83 00 0d 00 00 44 3b 75 cc 0f 8c 5a
RIP [<ffffffff81370b27>] xennet_alloc_rx_buffers+0xe1/0x2d9
RSP <ffff88007ff7fcf0>
CR2: 0000000000000000
---[ end trace e0e245c8a8426fde ]---
Kernel panic - not syncing: Fatal exception in interrupt
Pid: 0, comm: swapper Tainted: G D 2.6.38-x86_64-linode17 #1
Call Trace:
<IRQ> [<ffffffff8152550d>] ? panic+0x8c/0x195
[<ffffffff8152856b>] ? oops_end+0xb7/0xc7
[<ffffffff8102709f>] ? no_context+0x1f7/0x206
[<ffffffff810ad088>] ? get_page_from_freelist+0x445/0x715
[<ffffffff81027236>] ? __bad_area_nosemaphore+0x188/0x1ab
[<ffffffff8144f390>] ? tcp_v4_rcv+0x521/0x681
[<ffffffff81027267>] ? bad_area_nosemaphore+0xe/0x10
[<ffffffff8152a4e7>] ? do_page_fault+0x1ef/0x3ee
[<ffffffff8144f390>] ? tcp_v4_rcv+0x521/0x681
[<ffffffff810ad55c>] ? __alloc_pages_nodemask+0x14d/0x6ab
[<ffffffff813eb0bb>] ? __netdev_alloc_skb+0x1d/0x3a
[<ffffffff81527a55>] ? page_fault+0x25/0x30
[<ffffffff81370b27>] ? xennet_alloc_rx_buffers+0xe1/0x2d9
[<ffffffff81372822>] ? xennet_poll+0xbef/0xc85
[<ffffffff815272aa>] ? _raw_spin_unlock_irqrestore+0x19/0x1c
[<ffffffff813f4d51>] ? net_rx_action+0xb6/0x1dc
[<ffffffff812ef6e7>] ? unmask_evtchn+0x1f/0xa3
[<ffffffff810431a4>] ? __do_softirq+0xc7/0x1a3
[<ffffffff81085ca9>] ? handle_fasteoi_irq+0xd2/0xe1
[<ffffffff810069b2>] ? check_events+0x12/0x20
[<ffffffff8100a85c>] ? call_softirq+0x1c/0x30
[<ffffffff8100bebd>] ? do_softirq+0x41/0x7e
[<ffffffff8104303b>] ? irq_exit+0x36/0x78
[<ffffffff812f022c>] ? xen_evtchn_do_upcall+0x2f/0x3c
[<ffffffff8100a8ae>] ? xen_do_hypervisor_callback+0x1e/0x30
<EOI> [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1006
[<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1006
[<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1006
[<ffffffff810063a3>] ? xen_safe_halt+0x10/0x1a
[<ffffffff81010998>] ? default_idle+0x4b/0x85
[<ffffffff81008d53>] ? cpu_idle+0x60/0x97
[<ffffffff8151b349>] ? rest_init+0x6d/0x6f
[<ffffffff81b2ad34>] ? start_kernel+0x37f/0x38a
[<ffffffff81b2a2cd>] ? x86_64_start_reservations+0xb8/0xbc
[<ffffffff81b2de71>] ? xen_start_kernel+0x528/0x52f

--Peter
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
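Since the reports come from several physical machines, one way to confirm that they all fault at the same instruction is to pull the symbol and offset out of the "IP:" line of each oops. The following is a minimal Python sketch and not part of the original report; the helper names fault_location and tally_faults and the regular expression are assumptions made for illustration, and the sample text is copied from the trace above.

```python
import re
from collections import Counter

# Matches the fault report line of an oops, e.g.
# "IP: [<ffffffff81370b27>] xennet_alloc_rx_buffers+0xe1/0x2d9"
FAULT_RE = re.compile(
    r"IP: \[<[0-9a-f]+>\] (?P<sym>[\w.]+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def fault_location(oops_text):
    """Return (symbol, offset, function_size) from the first 'IP:' line, or None."""
    match = FAULT_RE.search(oops_text)
    if match is None:
        return None
    return (
        match.group("sym"),
        int(match.group("off"), 16),
        int(match.group("size"), 16),
    )


def tally_faults(oops_texts):
    """Count how many reports fault at each (symbol, offset, size) location."""
    locations = (fault_location(text) for text in oops_texts)
    return Counter(loc for loc in locations if loc is not None)


# Two lines copied from the 64-bit report in this thread.
sample = (
    "BUG: unable to handle kernel NULL pointer dereference at (null)\n"
    "IP: [<ffffffff81370b27>] xennet_alloc_rx_buffers+0xe1/0x2d9\n"
)
print(fault_location(sample))
```

Feeding every collected oops through tally_faults would show at a glance whether the crashes cluster at xennet_alloc_rx_buffers+0xe1 or are spread over several locations.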
Konrad Rzeszutek Wilk
2011-Apr-12 21:06 UTC
Re: [Xen-devel] 2.6.38 x86_64 domU null pointer in xennet_alloc_rx_buffers
On Tue, Apr 12, 2011 at 11:58:35AM -0400, Peter Sandin wrote:
> We've got some 64 bit guests that have been trying to dereference a null pointer in xennet_alloc_rx_buffers. We have only been receiving reports of this issue since introducing 2.6.38 guest kernels. The only reports that we have received of this are on guests that are running 64 bit kernels. These reports have come from multiple separate physical machines. One of the instances that ran in to this issue was repeatedly restarting the nginx web server, and failing because port 80 was already in use, however we were unable to replicate the issue using this method in a controlled environment. Any suggestions on replicating or resolving this issue are would be appreciated.
>
> More traces, the .config and kernel binary can be found at:
>
> http://thesandins.net/xen/2.6.38-x86_64/

Nothing in the Xen hypervisor console?

> --
>
> BUG: Bad page state in process swapper pfn:5bb31
> page:ffffea000140f2b8 count:-1 mapcount:0 mapping: (null) index:0xffff88005b8bdf80
> page flags: 0x100000000000000()
> BUG: unable to handle kernel NULL pointer dereference at (null)
> IP: [<ffffffff81370b27>] xennet_alloc_rx_buffers+0xe1/0x2d9

So it looks as if it just does an alloc_page, and alloc_page does a check_new_page(), which checks the values mentioned above. The one that is odd is page->_count (it should have been zero, but it is -1), which sadly does not get us any closer to reproducing this. But it looks familiar..
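The reading of the "Bad page state" line in the reply above can be sketched as a small check. The Python below is a simplified model and not the kernel's check_new_page(); the expected values (count 0, mapcount 0, mapping null) are taken from the reply, and the names EXPECTED and odd_fields are made up for this illustration.

```python
import re

# Values the reply treats as normal for a freshly allocated page, as they
# appear in the printed report. This is an assumption-based model, not kernel code.
EXPECTED = {"count": 0, "mapcount": 0, "mapping": None}

PAGE_RE = re.compile(
    r"count:(?P<count>-?\d+) mapcount:(?P<mapcount>-?\d+) mapping:\s*(?P<mapping>\S+)"
)


def odd_fields(page_line):
    """Return the fields of a 'page:...' report line that differ from EXPECTED."""
    match = PAGE_RE.search(page_line)
    if match is None:
        raise ValueError("not a page state line")
    mapping = match.group("mapping")
    seen = {
        "count": int(match.group("count")),
        "mapcount": int(match.group("mapcount")),
        "mapping": None if mapping == "(null)" else mapping,
    }
    return {name: value for name, value in seen.items() if value != EXPECTED[name]}


# Line copied from the report quoted in this thread.
line = "page:ffffea000140f2b8 count:-1 mapcount:0 mapping: (null) index:0xffff88005b8bdf80"
print(odd_fields(line))
```

For the quoted line only the count field stands out, which matches the observation that page->_count is -1 where zero was expected.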
retsyx
2011-May-23 16:19 UTC
[Xen-devel] Re: 2.6.38 x86_64 domU null pointer in xennet_alloc_rx_buffers
Here is a similar kernel panic on a 32-bit system. I can get this to happen consistently when the panicked machine is being used as a PPTP server. The panic is triggered instantly when a particular PPTP client attempts to establish a tunnel (encryption is off, BTW):

BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<c0193144>] page_address+0x14/0xe0
*pdpt = 000000001eaf3007 *pde = 0000000000000000
Oops: 0000 [#1] SMP
last sysfs file: /sys/kernel/uevent_seqnum
Modules linked in:

Pid: 0, comm: swapper Not tainted 2.6.38.3-linode32 #1
EIP: 0061:[<c0193144>] EFLAGS: 00010286 CPU: 0
EIP is at page_address+0x14/0xe0
EAX: 00000000 EBX: 00000000 ECX: 00000251 EDX: 00000250
ESI: 00000250 EDI: de6b7980 EBP: 20411000 ESP: df40feb4
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
Process swapper (pid: 0, ti=df40e000 task=c079df20 task.ti=c0788000)
Stack:
deb58340 00000250 de6b7980 20411000 c049e742 00000000 000000d0 dee8a000
00d02280 deb58be4 00000010 deb58000 deb59010 000000c0 deb58340 00000001
deb58340 de6af8d8 c049fcff 00000000 00000001 df40ff9c dfbba380 df40ff78
Call Trace:
[<c049e742>] ? xennet_alloc_rx_buffers+0x1c2/0x300
[<c049fcff>] ? xennet_poll+0x4df/0xc20
[<c04fc43a>] ? net_rx_action+0x9a/0x130
[<c01380ec>] ? __do_softirq+0x7c/0x130
[<c0138070>] ? __do_softirq+0x0/0x130
<IRQ>
[<c0137fe5>] ? irq_exit+0x65/0x70
[<c043a03d>] ? xen_evtchn_do_upcall+0x1d/0x30
[<c0109487>] ? xen_do_upcall+0x7/0xc
[<c01013a7>] ? hypercall_page+0x3a7/0x1010
[<c0105b8f>] ? xen_safe_halt+0xf/0x20
[<c010f66f>] ? default_idle+0x2f/0x60
[<c0107ed2>] ? cpu_idle+0x42/0x70
[<c07ca8ac>] ? start_kernel+0x2da/0x2df
[<c07ca410>] ? unknown_bootoption+0x0/0x190
[<c07cdaa5>] ? xen_start_kernel+0x530/0x538
Code: 89 c2 b8 2c 1e 85 c0 e9 3f ff ff ff 0f 0b eb fe 8d b4 26 00 00 00 00 83 ec 10 89 1c 24 89 c3 89 74 24 04 89 7c 24 08 89 6c 24 0c <8b> 00 c1 e8 1e 69 c0 80 03 00 00 05 40 05 7c c0 2b 80 4c 03 00
EIP: [<c0193144>] page_address+0x14/0xe0 SS:ESP 0069:df40feb4
CR2: 0000000000000000
---[ end trace 19ddaabd0d19ad12 ]---
Kernel panic - not syncing: Fatal exception in interrupt
Pid: 0, comm: swapper Tainted: G D 2.6.38.3-linode32 #1
Call Trace:
[<c063cfbf>] ? panic+0x57/0x13e
[<c010bec6>] ? oops_end+0x96/0xa0
[<c011e362>] ? no_context+0xc2/0x190
[<c011e58f>] ? bad_area_nosemaphore+0xf/0x20
[<c011e943>] ? do_page_fault+0x223/0x3e0
[<c0184325>] ? __alloc_pages_nodemask+0xf5/0x670
[<c011e720>] ? do_page_fault+0x0/0x3e0
[<c063fea6>] ? error_code+0x5a/0x60
[<c011e720>] ? do_page_fault+0x0/0x3e0
[<c0193144>] ? page_address+0x14/0xe0
[<c049e742>] ? xennet_alloc_rx_buffers+0x1c2/0x300
[<c049fcff>] ? xennet_poll+0x4df/0xc20
[<c04fc43a>] ? net_rx_action+0x9a/0x130
[<c01380ec>] ? __do_softirq+0x7c/0x130
[<c0138070>] ? __do_softirq+0x0/0x130
<IRQ> [<c0137fe5>] ? irq_exit+0x65/0x70
[<c043a03d>] ? xen_evtchn_do_upcall+0x1d/0x30
[<c0109487>] ? xen_do_upcall+0x7/0xc
[<c01013a7>] ? hypercall_page+0x3a7/0x1010
[<c0105b8f>] ? xen_safe_halt+0xf/0x20
[<c010f66f>] ? default_idle+0x2f/0x60
[<c0107ed2>] ? cpu_idle+0x42/0x70
[<c07ca8ac>] ? start_kernel+0x2da/0x2df
[<c07ca410>] ? unknown_bootoption+0x0/0x190
[<c07cdaa5>] ? xen_start_kernel+0x530/0x538
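To see how similar the 32-bit panic is to the 64-bit one, the function names in the two call traces can be compared. A rough Python sketch follows; frames and shared_frames are hypothetical helper names, and the two short excerpts are copied from the traces in this thread.

```python
import re

# One stack frame, e.g. "[<c049fcff>] ? xennet_poll+0x4df/0xc20".
# The "? " marker for unreliable frames is optional.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\] (?:\? )?(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def frames(trace_text):
    """Return the function names in the order they appear in the trace."""
    return [match.group("sym") for match in FRAME_RE.finditer(trace_text)]


def shared_frames(trace_a, trace_b):
    """Names present in both traces, in trace_a order, without repeats."""
    in_b = set(frames(trace_b))
    seen = set()
    shared = []
    for name in frames(trace_a):
        if name in in_b and name not in seen:
            seen.add(name)
            shared.append(name)
    return shared


# Excerpt of the 64-bit trace from the first message.
trace64 = (
    "[<ffffffff81372822>] xennet_poll+0xbef/0xc85\n"
    "[<ffffffff815272aa>] ? _raw_spin_unlock_irqrestore+0x19/0x1c\n"
    "[<ffffffff813f4d51>] net_rx_action+0xb6/0x1dc\n"
)
# Excerpt of the 32-bit trace from this message.
trace32 = (
    "[<c049e742>] ? xennet_alloc_rx_buffers+0x1c2/0x300\n"
    "[<c049fcff>] ? xennet_poll+0x4df/0xc20\n"
    "[<c04fc43a>] ? net_rx_action+0x9a/0x130\n"
)
print(shared_frames(trace64, trace32))
```

Both excerpts pass through xennet_poll and net_rx_action, so the two panics are at least reached through the same receive path, even though the faulting function differs.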
Konrad Rzeszutek Wilk
2011-May-24 15:43 UTC
Re: [Xen-devel] Re: 2.6.38 x86_64 domU null pointer in xennet_alloc_rx_buffers
On Mon, May 23, 2011 at 09:19:54AM -0700, retsyx wrote:
> Here is a similar kernel panic on a 32-bit system. I can get this to
> consistently happen when the panicked machine is being used a PPTP server.
> The panic is triggered instantly when a particular PPTP client attempts to
> establish a tunnel (encryption is off BTW):

Great.

Do you have step-by-step instructions on how to reproduce this failure?
The more details the better.
retsyx
2011-May-24 21:13 UTC
Re: [Xen-devel] Re: 2.6.38 x86_64 domU null pointer in xennet_alloc_rx_buffers
Sorry about that. Yes, for me it happens very consistently when the machine is configured as a PPTP server and a particular OS X client (over a transcontinental link) attempts to establish a VPN connection.

The PPTP configuration is very simple: PPTP with the standard options plus the following changes, which I don't think are important but am including for completeness.

In pptpd-options, relative to the default config file:

require-mppe-128 is commented out.

Add options:
noipx
mtu 1490
mru 1490

chap-secrets is configured with the equivalent of:
username * password *

The OS X client is configured to connect with PPTP, the username/password and no encryption.

I have two OS X clients: one is across the Peninsula from the machine in question, the second is in the Middle East. The local client has never caused a panic when connecting, while the remote client (in the ME) will almost always cause a panic.

One item which appears to have some effect on the frequency of panics and their nature is the vm.min_free_kbytes sysctl. When left to its own devices on this particular machine, it defaults to around 2900 and the panic detailed earlier happens every time. When vm.min_free_kbytes is manually set to 16384, it can take a couple of tries before the machine panics, and the trace of the panic seems to vary.

Thanks.

On May 24, 2011, at 8:43 AM, Konrad Rzeszutek Wilk wrote:

> On Mon, May 23, 2011 at 09:19:54AM -0700, retsyx wrote:
>> Here is a similar kernel panic on a 32-bit system. I can get this to
>> consistently happen when the panicked machine is being used a PPTP server.
>> The panic is triggered instantly when a particular PPTP client attempts to
>> establish a tunnel (encryption is off BTW):
>
> Great.
>
> Do you have a step-by-step instruction on how to reproduce this failure?
> The more details the better.
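Because the vm.min_free_kbytes setting appears to change how often the panic occurs, it may be worth recording its value alongside each reproduction attempt. The Python below is only a sketch under that assumption; /proc/sys/vm/min_free_kbytes is the usual procfs location of the sysctl on Linux, while the function names and the log format are invented for this example.

```python
from pathlib import Path

# Usual procfs location of the vm.min_free_kbytes sysctl on Linux.
MIN_FREE_KBYTES = Path("/proc/sys/vm/min_free_kbytes")


def parse_min_free_kbytes(text):
    """Parse the sysctl file contents (a single decimal number) into an int."""
    return int(text.strip())


def current_min_free_kbytes(path=MIN_FREE_KBYTES):
    """Read the live value; only works on a Linux system with procfs mounted."""
    return parse_min_free_kbytes(Path(path).read_text())


def log_attempt(min_free_kbytes, attempt, panicked):
    """Format one line of a reproduction log (hypothetical format)."""
    outcome = "panic" if panicked else "no panic"
    return f"min_free_kbytes={min_free_kbytes} attempt={attempt} result={outcome}"


# Example with the manually set value mentioned in this message.
print(log_attempt(parse_min_free_kbytes("16384\n"), attempt=1, panicked=False))
```

On the guest itself, current_min_free_kbytes() would supply the first argument, so each log line ties an attempt to the setting that was active at the time.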