thr3ads.net - Xen devel - [Xen-devel] kernel BUG at mm/swapfile.c:2527! [Sep 2011]

Shaun Reitan

2011-Sep-15 18:56 UTC

[Xen-devel] kernel BUG at mm/swapfile.c:2527!

We've been seeing the following bugs hit.  This is happening with kernel
versions 2.6.39 and 3.0.1.

So far we've only seen this problem happen on Ubuntu servers, and it
always seems to be the Apache process that triggers it.  This time we
were also running a PCI compliance scan on the server, and we think
that may have triggered it.


2.6.39 Dump
------------[ cut here ]------------
kernel BUG at mm/swapfile.c:2527!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/vbd-51712/block/xvda/uevent
Modules linked in:

Pid: 30706, comm: apache2 Not tainted 2.6.39-2 #3
EIP: 0061:[<c01ab016>] EFLAGS: 00210246 CPU: 0
EIP is at swap_count_continued+0x176/0x190
EAX: 00000000 EBX: ebba0800 ECX: 80000001 EDX: f57ba95f
ESI: 00000080 EDI: ebbd7d40 EBP: 0000095f ESP: df4dbe38
  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
Process apache2 (pid: 30706, ti=df4da000 task=e9259bd0 task.ti=df4da000)
Stack:
  ea298d40 0000495f ee11a000 00000000 c01ab157 0000495f 00092be0 ea298d40
  b8f33000 c01ac277 00000000 00092be0 e91ed998 c019dba7 6afaa065 80000001
  00000000 00000000 c01065b3 c01036cd b9531fff 00000000 e8fdb348 df4dbf0c
Call Trace:
  [<c01ab157>] ? swap_entry_free+0x127/0x150
  [<c01ac277>] ? free_swap_and_cache+0x27/0xd0
  [<c019dba7>] ? unmap_vmas+0x587/0x7f0
  [<c01065b3>] ? xen_restore_fl_direct_reloc+0x4/0x4
  [<c01036cd>] ? xen_mc_flush+0xdd/0x190
  [<c01a1e0a>] ? exit_mmap+0x8a/0x140
  [<c0132aa1>] ? mmput+0x41/0xd0
  [<c0136afd>] ? exit_mm+0xed/0x110
  [<c0652710>] ? _raw_spin_lock_irq+0x10/0x20
  [<c01380d7>] ? do_exit+0x197/0x760
  [<c04417a7>] ? __xen_evtchn_do_upcall+0x1e7/0x240
  [<c0105d97>] ? xen_force_evtchn_callback+0x17/0x30
  [<c01386cf>] ? do_group_exit+0x2f/0x90
  [<c013873d>] ? sys_exit_group+0xd/0x10
  [<c0652a41>] ? syscall_call+0x7/0xb
  [<c0650000>] ? cpuup_callback+0x100/0x260
Code: d7 fe ff ff 89 d8 e8 7a 9f f7 ff 8d 54 05 00 c6 02 00 eb b0 0f 0b 
eb fe 0f 0b eb fe 89 f2 31 c0 80 fa 80 0f 94 c0 e9 b2 fe ff ff <0f> 0b 
eb fe 0f 0b eb fe 0f 0b eb fe 8d b4 26 00 00 00 00 8d bc
EIP: [<c01ab016>] swap_count_continued+0x176/0x190 SS:ESP 0069:df4dbe38
---[ end trace 9fa17c616c267728 ]---
Fixing recursive fault but reboot is needed!
BUG: scheduling while atomic: apache2/30706/0x00000001
Modules linked in:
Pid: 30706, comm: apache2 Tainted: G      D     2.6.39-2 #3
Call Trace:
  [<c065104f>] ? schedule+0x76f/0x840
  [<c01358ff>] ? vprintk+0x19f/0x3a0
  [<c01065bc>] ? check_events+0x8/0xc
  [<c0652731>] ? _raw_spin_unlock_irqrestore+0x11/0x20
  [<c01358ff>] ? vprintk+0x19f/0x3a0
  [<c01385ea>] ? do_exit+0x6aa/0x760
  [<c06526e7>] ? _raw_spin_lock_irqsave+0x27/0x40
  [<c0652731>] ? _raw_spin_unlock_irqrestore+0x11/0x20
  [<c0135016>] ? kmsg_dump+0x36/0xd0
  [<c0109b90>] ? do_bounds+0x80/0x80
  [<c0135b1b>] ? printk+0x1b/0x20
  [<c0109b90>] ? do_bounds+0x80/0x80
  [<c010b98f>] ? oops_end+0x9f/0xa0
  [<c0109c0f>] ? do_invalid_op+0x7f/0x90
  [<c01ab016>] ? swap_count_continued+0x176/0x190
  [<c018a939>] ? free_pcppages_bulk+0x2c9/0x2f0
  [<c0105d97>] ? xen_force_evtchn_callback+0x17/0x30
  [<c01065bc>] ? check_events+0x8/0xc
  [<c01065b3>] ? xen_restore_fl_direct_reloc+0x4/0x4
  [<c018b4f6>] ? free_hot_cold_page+0xd6/0x160
  [<c0103ff5>] ? pte_pfn_to_mfn+0xb5/0xd0
  [<c0104071>] ? xen_make_pte+0x41/0x110
  [<c0652fb6>] ? error_code+0x5a/0x60
  [<c0109b90>] ? do_bounds+0x80/0x80
  [<c01ab016>] ? swap_count_continued+0x176/0x190
  [<c01ab157>] ? swap_entry_free+0x127/0x150
  [<c01ac277>] ? free_swap_and_cache+0x27/0xd0
  [<c019dba7>] ? unmap_vmas+0x587/0x7f0
  [<c01065b3>] ? xen_restore_fl_direct_reloc+0x4/0x4
  [<c01036cd>] ? xen_mc_flush+0xdd/0x190
  [<c01a1e0a>] ? exit_mmap+0x8a/0x140
  [<c0132aa1>] ? mmput+0x41/0xd0
  [<c0136afd>] ? exit_mm+0xed/0x110
  [<c0652710>] ? _raw_spin_lock_irq+0x10/0x20
  [<c01380d7>] ? do_exit+0x197/0x760
  [<c04417a7>] ? __xen_evtchn_do_upcall+0x1e7/0x240
  [<c0105d97>] ? xen_force_evtchn_callback+0x17/0x30
  [<c01386cf>] ? do_group_exit+0x2f/0x90
  [<c013873d>] ? sys_exit_group+0xd/0x10
  [<c0652a41>] ? syscall_call+0x7/0xb
  [<c0650000>] ? cpuup_callback+0x100/0x260




Here's the 3.0.1 dump. Unfortunately, I didn't catch a full dump.

  BUG: unable to handle kernel paging request at f57ba13c
IP: [<c01ae845>] swap_count_continued+0x85/0x190
*pdpt = 0000000000959027 *pde = 00000000008f5067 *pte = 0000000000000000
Oops: 0000 [#1] SMP
Modules linked in:

Pid: 3666, comm: apache2 Not tainted 3.0.1-1 #1
EIP: 0061:[<c01ae845>] EFLAGS: 00010246 CPU: 0
EIP is at swap_count_continued+0x85/0x190
EAX: 00000080 EBX: ed302400 ECX: ecb870a0 EDX: f57ba13c
ESI: 00000080 EDI: ed3d7760 EBP: 0000013c ESP: ea479dec
  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
Process apache2 (pid: 3666, ti=ea478000 task=ebe91bd0 task.ti=ea478000)
Stack:
  ec6915c0 0001913c ee129000 00000040 c01aea77 0001913c 00322780 ec6915c0
  b9275000 c01b0927 00000000 00322780 ea6533a8 c01a2d41 6e484067 80000001
  c01059ef 80000000 00000000 ebad13c0 eae13e48 ec6ebb1c ea479ee8 00000000
Call Trace:
  [<c01aea77>] ? swap_entry_free+0x127/0x150
  [<c01b0927>] ? free_swap_and_cache+0x27/0xd0
  [<c01a2d41>] ? zap_pte_range+0x321/0x420
  [<c01059ef>] ? xen_make_pte+0x3f/0xc0
  [<c01a2f98>] ? unmap_page_range+0x158/0x1a0
  [<c01a3058>] ? unmap_vmas+0x78/0xb0
  [<c01a524e>] ? exit_mmap+0x6e/0xf0
  [<c0136421>] ? mmput+0x41/0xd0
  [<c0139fcd>] ? exit_mm+0xed/0x110
  [<c06c76e0>] ? _raw_spin_lock_irq+0x10/0x20
  [<c013b7e7>] ? do_exit+0x197/0x340
  [<c01a5309>] ? remove_vma_list+0x39/0x50
  [<c013b9bf>] ? do_group_exit+0x2f/0x90
  [<c013ba2d>] ? sys_exit_group+0xd/0x10
  [<c06c7a11>] ? syscall_call+0x7/0xb
Code: 2a 90 8d 74 26 00 e9 15 01 00 00 89 d0 e8 c4 7e f7 ff 8b 5b 18 83 
eb 18 39 df 0f 84 e5 00 00 00 89 d8 e8 3f 81 f7 ff 8d 54 05 00 <0f> b6 
02 3c 80 74 d9 84 c0 0f 84 e2 00 00 00 83 e8 01 84 c0 88
EIP: [<c01ae845>] swap_count_continued+0x85/0x190 SS:ESP 0069:ea479dec
CR2: 00000000f57ba13c
---[ end trace 36a533bb83dd2812 ]---
Fixing recursive fault but reboot is needed!
BUG: scheduling while atomic: apache2/3666/0x00000001
Modules linked in:
Pid: 3666, comm: apache2 Tainted: G      D     3.0.1-1 #1
Call Trace:
  [<c06c60ed>] ? schedule+0x50d/0x520
  [<c0106a23>] ? xen_restore_fl_direct_reloc+0x4/0x4
  [<c01061d7>] ? xen_force_evtchn_callback+0x17/0x30
  [<c013b92f>] ? do_exit+0x2df/0x340
  [<c0138c3b>] ? printk+0x1b/0x20
  [<c010bf6f>] ? oops_end+0x9f/0xa0
  [<c0120f4f>] ? bad_area_nosemaphore+0xf/0x20
  [<c012149b>] ? do_page_fault+0x1bb/0x420
  [<c0177e85>] ? irq_get_irq_data+0x5/0x10
  [<c047da45>] ? info_for_irq+0x5/0x20
  [<c047e270>] ? evtchn_from_irq+0x10/0x40
  [<c01061d7>] ? xen_force_evtchn_callback+0x17/0x30
  [<c0106a2c>] ? check_events+0x8/0xc
  [<c0106a23>] ? xen_restore_fl_direct_reloc+0x4/0x4
  [<c0104bab>] ? xen_batched_set_pte+0xab/0xf0
  [<c01212e0>] ? vmalloc_fault+0x2c0/0x2c0
  [<c06c7f86>] ? error_code+0x5a/0x60
  [<c01212e0>] ? vmalloc_fault+0x2c0/0x2c0
  [<c01ae845>] ? swap_count_continued+0x85/0x190
  [<c01aea77>] ? swap_entry_free+0x127/0x150
  [<c01b0927>] ? free_swap_and_cache+0x27/0xd0
  [<c01a2d41>] ? zap_pte_range+0x321/0x420
  [<c01059ef>] ? xen_make_pte+0x3f/0xc0
  [<c01a2f98>] ? unmap_page_range+0x158/0x1a0
  [<c01a3058>] ? unmap_vmas+0x78/0xb0
  [<c01a524e>] ? exit_mmap+0x6e/0xf0
  [<c0136421>] ? mmput+0x41/0xd0
  [<c0139fcd>] ? exit_mm+0xed/0x110
  [<c06c76e0>] ? _raw_spin_lock_irq+0x10/0x20
  [<c013b7e7>] ? do_exit+0x197/0x340
  [<c01a5309>] ? remove_vma_list+0x39/0x50
  [<c013b9bf>] ? do_group_exit+0x2f/0x90
  [<c013ba2d>] ? sys_exit_group+0xd/0x10
  [<c06c7a11>] ? syscall_call+0x7/0xb
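
The two dumps above can be compared mechanically. The following is a minimal Python sketch, not something from the original report: it assumes the traces are pasted as text in the `[<addr>] ? symbol+0xoff/0xsize` form shown above, and the names `parse_frames` and `common_symbols` are hypothetical helpers introduced only for illustration.

```python
import re

# Matches entries such as "  [<c01ab157>] ? swap_entry_free+0x127/0x150".
# The "? " marker is optional, so "EIP: [<...>] symbol+0x.../0x..." lines match too.
FRAME_RE = re.compile(
    r"\[<([0-9a-fA-F]+)>\]\s+(\?\s+)?([A-Za-z0-9_.]+)\+0x([0-9a-fA-F]+)/0x([0-9a-fA-F]+)"
)


def parse_frames(oops_text):
    """Return (symbol, offset, size) for every matching entry, in order."""
    frames = []
    for line in oops_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            symbol = match.group(3)
            offset = int(match.group(4), 16)
            size = int(match.group(5), 16)
            frames.append((symbol, offset, size))
    return frames


def common_symbols(trace_a, trace_b):
    """Symbols that occur in both traces, in the order they appear in trace_a."""
    in_b = {symbol for symbol, _, _ in parse_frames(trace_b)}
    seen = set()
    shared = []
    for symbol, _, _ in parse_frames(trace_a):
        if symbol in in_b and symbol not in seen:
            seen.add(symbol)
            shared.append(symbol)
    return shared
```

Calling `common_symbols(dump_2_6_39, dump_3_0_1)` on the two pasted dumps would list the frames both kernels pass through, which makes it easier to see whether the crashes share the same exit path.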




-- 
Shaun Reitan
Chief Technical Officer
Network Data Center Host, Inc.
http://www.ndchost.com


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Shaun Reitan

2011-Sep-15 19:52 UTC

[Xen-devel] Re: kernel BUG at mm/swapfile.c:2527!

I can just about reproduce this bug on the fly; a PCI compliance scan
seems to be triggering it every time.  Let me know what you guys need!

~Shaun



Konrad Rzeszutek Wilk

2011-Sep-16 08:24 UTC

Re: [Xen-devel] Re: kernel BUG at mm/swapfile.c:2527!

On Thu, Sep 15, 2011 at 12:52:42PM -0700, Shaun Reitan wrote:
> I can just about reproduce this bug on the fly, a PCI compliance
> scan seems to be triggering it every time.  Let me know what you
> guys need!
How do I reproduce it? Is the PCI compliance easily available? Is
there any chance we can get access to the physical box to figure
out what is happening?
> 
> ~Shaun
> 
> 

Shaun Reitan

2011-Sep-16 16:52 UTC

[Xen-devel] Re: kernel BUG at mm/swapfile.c:2527!

> How do I reproduce it? Is the PCI compliance easily available? Is
> there any chance we can get access to the physical box to figure
> out what is happening?
At this point I'm not able to reproduce the problem on the fly.  We had
thought it was a PCI compliance scan that was triggering the error, but
now this customer is seeing the error constantly and the scans are not
running.  I'm thrashing a test server that I attempted to set up exactly
like this customer's server, and so far no crash.

The customer's server is crashing like crazy. I'm attempting to figure
out the trigger, but it's proving difficult.  What do you need to see to
figure out why it's crashing?  I'm willing to do whatever it takes. I
cannot give you access to the host, but the customer is willing to give
you access to their virtual instance as a last resort.

~Shaun



Kent Hoxsey

2011-Sep-19 16:30 UTC

Re: [Xen-devel] Re: kernel BUG at mm/swapfile.c:2527!

Joining this thread late as a follow-on from a similar problem that is
happening in Amazon AWS instances. There is a thread on the AWS forums where an
instance owner has figured out how to cause this bug on demand using Apache and
PHP:

https://forums.aws.amazon.com/thread.jspa?messageID=269851

In case those forums require a login, the PHP script to hit is:

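<?php
// Reproducer as posted on the AWS forum: the nested loops run
// 10,000 x 1,000 times, so $data ends up with 10,000,000 entries.
$data = array();
for ($x = 0; $x < 10000; $x++) {
    for ($y = 0; $y < 1000; $y++) {
        // Each "$data[][] = ..." appends a new one-element sub-array.
        $data[][] = rand(1, 100000);
    }
}
echo count($data);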


I am not a PHP programmer, so I am unsure whether that php tag needs to be
closed, but that is what is posted on the forum. Run apache bench against your
test URL with 200 concurrent connections.
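
Where apache bench is not available, the same kind of load could be approximated with a short Python sketch. This is only an illustration under stated assumptions and is not from the forum post: `fetch_status` and `run_load` are hypothetical names, the default of 2000 total requests is arbitrary, and only the figure of 200 concurrent connections comes from the description above.

```python
import urllib.request
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def fetch_status(url, timeout=30):
    """Fetch one URL; return the HTTP status, or the exception class name.

    urlopen raises HTTPError for 4xx/5xx responses, so those are reported
    as "HTTPError" rather than as a numeric status.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except Exception as exc:
        return type(exc).__name__


def run_load(url, total=2000, concurrency=200, fetch=fetch_status):
    """Issue `total` requests with up to `concurrency` in flight at once.

    Returns a Counter mapping each outcome (status or error name) to its count.
    The `fetch` parameter is injectable so the function can be exercised
    without a network.
    """
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        outcomes = pool.map(fetch, [url] * total)
        return Counter(outcomes)
```

A run such as `run_load("http://your-test-host/test.php")` (a placeholder URL) would return a tally of outcomes. A sudden run of connection errors could be one sign that the guest has gone down, but the kernel log remains the place to confirm the BUG.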

My Amazon instance isn't running PHP but encounters a similar problem
once a day (1:48pm Pacific). I cannot allow people onto the instance but am
willing to run diagnostics and post them here.

Kent


Shaun Reitan

2011-Sep-20 04:33 UTC

[Xen-devel] Re: kernel BUG at mm/swapfile.c:2527!

On 9/16/2011 1:24 AM, Konrad Rzeszutek Wilk wrote:
> How do I reproduce it? Is the PCI compliance easily available? Is
> there any chance we can get access to the physical box to figure
> out what is happening?
Konrad,

Did you get my email with the server I set up for you and the logins?

-- 
Shaun


Konrad Rzeszutek Wilk

2011-Sep-22 11:06 UTC

Re: [Xen-devel] Re: kernel BUG at mm/swapfile.c:2527!

On Mon, Sep 19, 2011 at 09:33:40PM -0700, Shaun Reitan wrote:
> On 9/16/2011 1:24 AM, Konrad Rzeszutek Wilk wrote:
> >How do I reproduce it? Is the PCI compliance easily available? Is
> >there any chance we can get access to the physical box to figure
> >out what is happening?
> 
> Konrad,
> 
> did you get my email with the server I setup for you and logins?
Yup. Just came back from a conference, so getting back into the
groove.
>
> -- 
> Shaun
> 
> 
