Christopher S. Aker
2011-Apr-26 15:32 UTC
[Xen-devel] 2.6.38 x86_64 domU - BUG: Bad page state
We've been experiencing this behavior since switching to 2.6.38 64 bit. Lots of reports across our fleet, so not an isolated problem...

DomU: 2.6.38 x86_64
Xen: 3.4.1

BUG: Bad page state in process swapper pfn:1a399
page:ffffea00005bc978 count:-1 mapcount:0 mapping: (null)
index:0xffff88001a399700
page flags: 0x100000000000000()
Pid: 0, comm: swapper Not tainted 2.6.38-x86_64-linode17 #1
Call Trace:
 <IRQ> [<ffffffff810aa910>] ? dump_page+0xb1/0xb6
 [<ffffffff810ab86a>] ? bad_page+0xd8/0xf0
 [<ffffffff810ad0ca>] ? get_page_from_freelist+0x487/0x715
 [<ffffffff8100699f>] ? xen_restore_fl_direct_end+0x0/0x1
 [<ffffffff810dc49a>] ? kmem_cache_free+0x71/0xad
 [<ffffffff810ad55c>] ? __alloc_pages_nodemask+0x14d/0x6ab
 [<ffffffff81403a73>] ? __netdev_alloc_skb+0x1d/0x3a
 [<ffffffff8144b392>] ? ip_rcv_finish+0x319/0x343
 [<ffffffff81403a73>] ? __netdev_alloc_skb+0x1d/0x3a
 [<ffffffff810d6f35>] ? alloc_pages_current+0xaa/0xcd
 [<ffffffff81372fd0>] ? xennet_alloc_rx_buffers+0x7a/0x2d9
 [<ffffffff81374d32>] ? xennet_poll+0xbef/0xc85
 [<ffffffff8100699f>] ? xen_restore_fl_direct_end+0x0/0x1
 [<ffffffff8140d709>] ? net_rx_action+0xb6/0x1dc
 [<ffffffff812f1bf7>] ? unmask_evtchn+0x1f/0xa3
 [<ffffffff810431a4>] ? __do_softirq+0xc7/0x1a3
 [<ffffffff81085ca9>] ? handle_fasteoi_irq+0xd2/0xe1
 [<ffffffff810069b2>] ? check_events+0x12/0x20
 [<ffffffff8100a85c>] ? call_softirq+0x1c/0x30
 [<ffffffff8100bebd>] ? do_softirq+0x41/0x7e
 [<ffffffff8104303b>] ? irq_exit+0x36/0x78
 [<ffffffff812f273c>] ? xen_evtchn_do_upcall+0x2f/0x3c
 [<ffffffff8100a8ae>] ? xen_do_hypervisor_callback+0x1e/0x30
 <EOI> [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1006
 [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1006
 [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1006
 [<ffffffff810063a3>] ? xen_safe_halt+0x10/0x1a
 [<ffffffff81010998>] ? default_idle+0x4b/0x85
 [<ffffffff81008d53>] ? cpu_idle+0x60/0x97
 [<ffffffff81533d09>] ? rest_init+0x6d/0x6f
 [<ffffffff81b2bd34>] ? start_kernel+0x37f/0x38a
 [<ffffffff81b2b2cd>] ? x86_64_start_reservations+0xb8/0xbc
 [<ffffffff81b2ee71>] ? xen_start_kernel+0x528/0x52f

... it continues with more BUGs. Full log here:

http://www.theshore.net/~caker/xen/BUGS/2.6.38/

Thanks,
-Chris

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Apr-26 17:43 UTC
Re: [Xen-devel] 2.6.38 x86_64 domU - BUG: Bad page state
On Tue, Apr 26, 2011 at 11:32:29AM -0400, Christopher S. Aker wrote:
> We've been experiencing this behavior since switching to 2.6.38 64
> bit. Lots of reports across our fleet, so not an isolated
> problem...
>
> DomU: 2.6.38 x86_64
> Xen: 3.4.1
>
> BUG: Bad page state in process swapper pfn:1a399
> page:ffffea00005bc978 count:-1 mapcount:0 mapping: (null)

And the same issue as somebody else reported. Where the page count is negative.

Ian, any thoughts on this?

Could the grant freeing have a race? Or double-freeing?
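For anyone collecting these reports across many guests, the fields of the "Bad page state" lines can be pulled out mechanically. The following is only a rough sketch: the helper names `parse_bad_page` and `has_negative_count` are made up for illustration, and the reading that a negative count suggests one more release than acquire (for example a double free) is the speculation raised in this thread, not a confirmed diagnosis.

```python
import re

# The first lines of the report from this thread, joined into one string.
REPORT = (
    "BUG: Bad page state in process swapper pfn:1a399 "
    "page:ffffea00005bc978 count:-1 mapcount:0 mapping: (null) "
    "index:0xffff88001a399700"
)


def parse_bad_page(report):
    """Extract pfn, count and mapcount from a 'Bad page state' report."""
    fields = {}
    m = re.search(r"\bpfn:([0-9a-f]+)", report)
    if m:
        fields["pfn"] = int(m.group(1), 16)  # pfn is printed in hex
    for name in ("count", "mapcount"):
        # \b keeps 'count' from matching inside 'mapcount'
        m = re.search(r"\b" + name + r":(-?\d+)", report)
        if m:
            fields[name] = int(m.group(1))
    return fields


def has_negative_count(fields):
    """True when the reported reference count is below zero."""
    return fields.get("count", 0) < 0


if __name__ == "__main__":
    fields = parse_bad_page(REPORT)
    print(fields, "negative count:", has_negative_count(fields))
```

Run over a directory of console logs, something like this would show whether every report has count:-1 or whether other fields vary too.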
Ian Campbell
2011-May-03 11:50 UTC
Re: [Xen-devel] 2.6.38 x86_64 domU - BUG: Bad page state
On Tue, 2011-04-26 at 18:43 +0100, Konrad Rzeszutek Wilk wrote:
> On Tue, Apr 26, 2011 at 11:32:29AM -0400, Christopher S. Aker wrote:
> > We've been experiencing this behavior since switching to 2.6.38 64
> > bit. Lots of reports across our fleet, so not an isolated
> > problem...
> >
> > DomU: 2.6.38 x86_64
> > Xen: 3.4.1
> >
> > BUG: Bad page state in process swapper pfn:1a399
> > page:ffffea00005bc978 count:-1 mapcount:0 mapping: (null)
>
> And the same issue as somebody else reported. Where the page
> count is negative.
>
> Ian, any thoughts on this?

Nothing in particular. Is it reproducible enough to be bisectable?

You mention switching to 2.6.38 64 bit, what were you running before? Do you have any feeling for (or data suggesting) whether it is related to the switch to 64 bit or the switch to 2.6.38?

> Could the grant freeing have a race?
> or double-freeing?

netfront is relatively unchanged in 2.6.38 but the m2p override stuff went in during the 2.6.38 merge window, perhaps this relates to that?

The full log shows:

Disabling lock debugging due to kernel taint
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff81373037>] xennet_alloc_rx_buffers+0xe1/0x2d9
PGD 1d990067 PUD 1d9df067 PMD 0
Oops: 0002 [#1] SMP
last sysfs file:
CPU 0
Modules linked in:

Pid: 0, comm: swapper Tainted: G B 2.6.38-x86_64-linode17 #1
RIP: e030:[<ffffffff81373037>] [<ffffffff81373037>] xennet_alloc_rx_buffers+0xe1/0x2d9

It'd be useful to know what ffffffff81373037 and/or xennet_alloc_rx_buffers+0xe1 corresponds to in this particular kernel image.
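As a first step toward answering that, the `symbol+offset/size` notation in the trace can be decoded with simple arithmetic: the printed address minus the offset gives the start of the function. The sketch below uses only the two `xennet_alloc_rx_buffers` frames quoted in this thread; the helper name `parse_frame` is hypothetical. It does not map the address to a source line, which would need the exact kernel image.

```python
import re

# Frames copied from the call trace and the oops quoted in this thread.
FRAMES = [
    "[<ffffffff81372fd0>] ? xennet_alloc_rx_buffers+0x7a/0x2d9",
    "[<ffffffff81373037>] xennet_alloc_rx_buffers+0xe1/0x2d9",
]

# [<address>] optional '?' symbol+0xoffset/0xsize
FRAME_RE = re.compile(
    r"\[<([0-9a-f]+)>\]\s+(?:\?\s+)?(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)


def parse_frame(line):
    """Split one trace frame into address, symbol, offset, size and start."""
    m = FRAME_RE.search(line)
    if m is None:
        raise ValueError("not a stack frame: %r" % line)
    addr = int(m.group(1), 16)
    offset = int(m.group(3), 16)
    return {
        "symbol": m.group(2),
        "addr": addr,
        "offset": offset,
        "size": int(m.group(4), 16),
        "start": addr - offset,  # address of the first byte of the function
    }


if __name__ == "__main__":
    for line in FRAMES:
        frame = parse_frame(line)
        print("%s starts at %#x, frame is at +%#x of %#x bytes"
              % (frame["symbol"], frame["start"], frame["offset"], frame["size"]))
```

Both frames resolve to the same function start, so the faulting instruction lies 0xe1 bytes into the 0x2d9-byte function. With the matching vmlinux and its debug info, a tool such as addr2line or gdb could then map that address to a source line.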