Tim Nelson
2012-Oct-18 20:44 UTC
[CentOS] swapper: page allocation failure. order:1, mode:0x20
I see this occasionally on one of my CentOS 6.3 x64 systems:

Oct 18 03:10:52 backup kernel: swapper: page allocation failure. order:1, mode:0x20
Oct 18 03:10:52 backup kernel: Pid: 0, comm: swapper Not tainted 2.6.32-279.9.1.el6.x86_64 #1
Oct 18 03:10:52 backup kernel: Call Trace:
Oct 18 03:10:52 backup kernel: <IRQ> [<ffffffff8112789f>] ? __alloc_pages_nodemask+0x77f/0x940
Oct 18 03:10:52 backup kernel: [<ffffffff811620a2>] ? kmem_getpages+0x62/0x170
Oct 18 03:10:52 backup kernel: [<ffffffff81162cba>] ? fallback_alloc+0x1ba/0x270
Oct 18 03:10:52 backup kernel: [<ffffffff8116270f>] ? cache_grow+0x2cf/0x320
Oct 18 03:10:52 backup kernel: [<ffffffff81162a39>] ? ____cache_alloc_node+0x99/0x160
Oct 18 03:10:52 backup kernel: [<ffffffff8116381b>] ? kmem_cache_alloc+0x11b/0x190
Oct 18 03:10:52 backup kernel: [<ffffffff8142dfa8>] ? sk_prot_alloc+0x48/0x1c0
Oct 18 03:10:52 backup kernel: [<ffffffff8142e272>] ? sk_clone+0x22/0x2e0
Oct 18 03:10:52 backup kernel: [<ffffffff8147bfa6>] ? inet_csk_clone+0x16/0xd0
Oct 18 03:10:52 backup kernel: [<ffffffff81494f83>] ? tcp_create_openreq_child+0x23/0x450
Oct 18 03:10:52 backup kernel: [<ffffffff814927ed>] ? tcp_v4_syn_recv_sock+0x4d/0x310
Oct 18 03:10:52 backup kernel: [<ffffffff81494d26>] ? tcp_check_req+0x226/0x460
Oct 18 03:10:52 backup kernel: [<ffffffff8148a6d6>] ? tcp_rcv_state_process+0x126/0xa10
Oct 18 03:10:52 backup kernel: [<ffffffff8149220b>] ? tcp_v4_do_rcv+0x35b/0x430
Oct 18 03:10:52 backup kernel: [<ffffffff8142fd97>] ? __kfree_skb+0x47/0xa0
Oct 18 03:10:52 backup kernel: [<ffffffff81493a4e>] ? tcp_v4_rcv+0x4fe/0x8d0
Oct 18 03:10:52 backup kernel: [<ffffffff81492193>] ? tcp_v4_do_rcv+0x2e3/0x430
Oct 18 03:10:52 backup kernel: [<ffffffff814716dd>] ? ip_local_deliver_finish+0xdd/0x2d0
Oct 18 03:10:52 backup kernel: [<ffffffff81471968>] ? ip_local_deliver+0x98/0xa0
Oct 18 03:10:52 backup kernel: [<ffffffff81470e2d>] ? ip_rcv_finish+0x12d/0x440
Oct 18 03:10:52 backup kernel: [<ffffffff814713b5>] ? ip_rcv+0x275/0x350
Oct 18 03:10:52 backup kernel: [<ffffffff8104efd4>] ? scale_rt_power+0x24/0x80
Oct 18 03:10:52 backup kernel: [<ffffffff8143aafb>] ? __netif_receive_skb+0x49b/0x6f0
Oct 18 03:10:52 backup kernel: [<ffffffff81490d3a>] ? tcp4_gro_receive+0x5a/0xd0
Oct 18 03:10:52 backup kernel: [<ffffffff8143cd78>] ? netif_receive_skb+0x58/0x60
Oct 18 03:10:52 backup kernel: [<ffffffff8143ce80>] ? napi_skb_finish+0x50/0x70
Oct 18 03:10:52 backup kernel: [<ffffffff8143f3b9>] ? napi_gro_receive+0x39/0x50
Oct 18 03:10:52 backup kernel: [<ffffffffa025af2f>] ? bnx2_poll_work+0xd4f/0x1270 [bnx2]
Oct 18 03:10:52 backup kernel: [<ffffffff810632bb>] ? enqueue_task_fair+0xb/0x100
Oct 18 03:10:52 backup kernel: [<ffffffffa025b579>] ? bnx2_poll+0x69/0x2d8 [bnx2]
Oct 18 03:10:52 backup kernel: [<ffffffff8106010c>] ? try_to_wake_up+0x24c/0x3e0
Oct 18 03:10:52 backup kernel: [<ffffffff8143f4d3>] ? net_rx_action+0x103/0x2f0
Oct 18 03:10:52 backup kernel: [<ffffffff81073f41>] ? __do_softirq+0xc1/0x1e0
Oct 18 03:10:52 backup kernel: [<ffffffff810dbb00>] ? handle_IRQ_event+0x60/0x170
Oct 18 03:10:52 backup kernel: [<ffffffff8100c24c>] ? call_softirq+0x1c/0x30
Oct 18 03:10:52 backup kernel: [<ffffffff8100de85>] ? do_softirq+0x65/0xa0
Oct 18 03:10:52 backup kernel: [<ffffffff81073d25>] ? irq_exit+0x85/0x90
Oct 18 03:10:52 backup kernel: [<ffffffff81506095>] ? do_IRQ+0x75/0xf0
Oct 18 03:10:52 backup kernel: [<ffffffff8100ba53>] ? ret_from_intr+0x0/0x11
Oct 18 03:10:52 backup kernel: <EOI> [<ffffffff81014877>] ? mwait_idle+0x77/0xd0
Oct 18 03:10:52 backup kernel: [<ffffffff8150392a>] ? atomic_notifier_call_chain+0x1a/0x20
Oct 18 03:10:52 backup kernel: [<ffffffff81009e06>] ? cpu_idle+0xb6/0x110
Oct 18 03:10:52 backup kernel: [<ffffffff814e48da>] ? rest_init+0x7a/0x80
Oct 18 03:10:52 backup kernel: [<ffffffff81c21f7b>] ? start_kernel+0x424/0x430
Oct 18 03:10:52 backup kernel: [<ffffffff81c2133a>] ? x86_64_start_reservations+0x125/0x129
Oct 18 03:10:52 backup kernel: [<ffffffff81c21438>] ? x86_64_start_kernel+0xfa/0x109

Any thoughts on the cause? The system has 16GB of RAM, and whenever checked, there is no swap usage. Is this a memory error (bad RAM)?

--Tim
Stephen Harris
2012-Oct-19 00:02 UTC
[CentOS] swapper: page allocation failure. order:1, mode:0x20
On Thu, Oct 18, 2012 at 03:44:30PM -0500, Tim Nelson wrote:
> Oct 18 03:10:52 backup kernel: Pid: 0, comm: swapper Not tainted 2.6.32-279.9.1.el6.x86_64 #1
...
> Oct 18 03:10:52 backup kernel: [<ffffffff814927ed>] ? tcp_v4_syn_recv_sock+0x4d/0x310
> Oct 18 03:10:52 backup kernel: [<ffffffff81494d26>] ? tcp_check_req+0x226/0x460
> Oct 18 03:10:52 backup kernel: [<ffffffff8148a6d6>] ? tcp_rcv_state_process+0x126/0xa10

> Any thoughts on the cause? The system has 16GB of RAM, and whenever
> checked, there is no swap usage. Is this a memory error (bad RAM)?

It's not a normal "out of memory" error.

I see this sometimes on my Linode using their custom kernel when doing lots of network communication (rsync'ing files). It looks like an issue with the TCP/IPv4 stack. When I switched to IPv6 the messages went away.

--
rgds
Stephen
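
For anyone decoding the first line of the report: `order:N` means the kernel asked for 2^N physically contiguous pages, so `order:1` is two pages. The sketch below pulls those fields out of the syslog line. It assumes 4 KiB pages (the usual x86_64 size), and `describe_failure` is just an illustrative helper name, not anything from the kernel or the thread. As far as I can tell from the 2.6.32 headers, `mode:0x20` is the `__GFP_HIGH` bit, i.e. a `GFP_ATOMIC` request that cannot sleep, which would fit the `<IRQ>` context in the trace, but treat that reading as an assumption.

```python
import re

PAGE_SIZE = 4096  # assumption: 4 KiB pages, the usual size on x86_64

# Matches the summary line the kernel prints just before the call trace.
FAILURE_RE = re.compile(
    r"(?P<comm>\S+): page allocation failure\. "
    r"order:(?P<order>\d+), mode:(?P<mode>0x[0-9a-fA-F]+)"
)


def describe_failure(line):
    """Return the fields of a 'page allocation failure' line, or None."""
    m = FAILURE_RE.search(line)
    if m is None:
        return None
    order = int(m.group("order"))
    pages = 1 << order  # order N means 2**N contiguous pages
    return {
        "comm": m.group("comm"),
        "order": order,
        "pages": pages,
        "bytes": pages * PAGE_SIZE,
        "mode": int(m.group("mode"), 16),
    }
```

With the line from the original report this gives two pages (8192 bytes under the 4 KiB assumption), which is a small request; the failure is about finding contiguous free pages in atomic context rather than about the box running out of RAM.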
Tim Nelson
2012-Oct-19 02:26 UTC
[CentOS] swapper: page allocation failure. order:1, mode:0x20
----- Original Message -----
> On Thu, Oct 18, 2012 at 03:44:30PM -0500, Tim Nelson wrote:
> > Oct 18 03:10:52 backup kernel: Pid: 0, comm: swapper Not tainted
> > 2.6.32-279.9.1.el6.x86_64 #1
> > ...
> > Oct 18 03:10:52 backup kernel: [<ffffffff814927ed>] ?
> > tcp_v4_syn_recv_sock+0x4d/0x310
> > Oct 18 03:10:52 backup kernel: [<ffffffff81494d26>] ?
> > tcp_check_req+0x226/0x460
> > Oct 18 03:10:52 backup kernel: [<ffffffff8148a6d6>] ?
> > tcp_rcv_state_process+0x126/0xa10
>
> > Any thoughts on the cause? The system has 16GB of RAM, and whenever
> > checked, there is no swap usage. Is this a memory error (bad RAM)?
>
> It's not a normal "out of memory" error.
>
> I see this sometimes on my linode using their custom kernel when
> doing lots of network communication (rsync'ing files). It looks like
> an issue with the TCP/IP4 stack. When I switched to IP6 then the
> messages went away.

Interesting! I too am running heavy rsync operations on this host, and the messages appear to come once an evening when the rsync jobs are running. Of note though, the kernel is stock from the CentOS repos, nothing special.

Is this possibly a known 'issue' or 'bug' with CentOS (or upstream)?

--Tim
Tony Molloy
2012-Oct-19 09:53 UTC
[CentOS] swapper: page allocation failure. order:1, mode:0x20
On Thursday 18 October 2012 21:44:30 Tim Nelson wrote:
> I see this occasionally on one of my CentOS 6.3 x64 systems:
>
> Oct 18 03:10:52 backup kernel: swapper: page allocation failure. order:1, mode:0x20
> Oct 18 03:10:52 backup kernel: Pid: 0, comm: swapper Not tainted 2.6.32-279.9.1.el6.x86_64 #1
> Oct 18 03:10:52 backup kernel: Call Trace:
> Oct 18 03:10:52 backup kernel: <IRQ> [<ffffffff8112789f>] ? __alloc_pages_nodemask+0x77f/0x940

<snip>

> Any thoughts on the cause? The system has 16GB of RAM, and whenever
> checked, there is no swap usage. Is this a memory error (bad RAM)?
>
> --Tim

I have the same problem on a Dell PE R720 with 16GB of RAM doing lots of networking. It's a file server. It was discussed on the dell-poweredge mailing list last week <linux-poweredge at dell.com>.

The conclusion was that it was harmless, but for a discussion and possible workaround see <https://bugzilla.redhat.com/show_bug.cgi?id=770545#c16>

Hope this helps,

Tony
Tim Nelson
2012-Oct-19 14:20 UTC
[CentOS] swapper: page allocation failure. order:1, mode:0x20
----- Original Message -----
> On Thursday 18 October 2012 21:44:30 Tim Nelson wrote:
> > I see this occasionally on one of my CentOS 6.3 x64 systems:
> >
> > Oct 18 03:10:52 backup kernel: swapper: page allocation failure. order:1, mode:0x20
> > Oct 18 03:10:52 backup kernel: Pid: 0, comm: swapper Not tainted 2.6.32-279.9.1.el6.x86_64 #1
> > Oct 18 03:10:52 backup kernel: Call Trace:
> > Oct 18 03:10:52 backup kernel: <IRQ> [<ffffffff8112789f>] ? __alloc_pages_nodemask+0x77f/0x940
>
> <snip>
>
> > Any thoughts on the cause? The system has 16GB of RAM, and whenever
> > checked, there is no swap usage. Is this a memory error (bad RAM)?
> >
> > --Tim
>
> I have the same problem on a Dell PE R720 with 16GB of RAM doing lots
> of networking. It's a file server. It was discussed on the
> dell-poweredge mailing list last week
> <linux-poweredge at dell.com>
>
> The conclusion was that it was harmless but for a discussion and
> possible workaround see
> <https://bugzilla.redhat.com/show_bug.cgi?id=770545#c16>
>
> Hope this helps,

*VERY* helpful, thanks!

--Tim
m.roth at 5-cent.us
2012-Oct-19 14:27 UTC
[CentOS] swapper: page allocation failure. order:1, mode:0x20
Tim Nelson wrote:
> ----- Original Message -----
>> On Thursday 18 October 2012 21:44:30 Tim Nelson wrote:
>> > I see this occasionally on one of my CentOS 6.3 x64 systems:
>> >
>> > Oct 18 03:10:52 backup kernel: swapper: page allocation failure. order:1, mode:0x20
>> > Oct 18 03:10:52 backup kernel: Pid: 0, comm: swapper Not tainted 2.6.32-279.9.1.el6.x86_64 #1
>> > Oct 18 03:10:52 backup kernel: Call Trace:
>> > Oct 18 03:10:52 backup kernel: <IRQ> [<ffffffff8112789f>] ? __alloc_pages_nodemask+0x77f/0x940
>> <snip>
>> > Any thoughts on the cause? The system has 16GB of RAM, and whenever
>> > checked, there is no swap usage. Is this a memory error (bad RAM)?
>>
>> I have the same problem on a Dell PE R720 with 16GB of RAM doing lots
>> of networking. It's a file server. It was discussed on the
>> dell-poweredge mailing list last week
>> <linux-poweredge at dell.com>
>>
>> The conclusion was that it was harmless but for a discussion and
>> possible workaround see
>> <https://bugzilla.redhat.com/show_bug.cgi?id=770545#c16>
>>
>> Hope this helps,

Thanks, but I agree with the person in the bugzilla thread: this is not "just harmless". When I see one in the logs, I usually see several within a single hour. I *think* it happens more when someone's copying or downloading large datasets, and it makes me extremely worried about the consistency of the data.

mark
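
To put a number on "several within a single hour", something like the following could tally the failures per hour from syslog-style lines. This is only a rough sketch: `failures_per_hour` is a made-up helper name, and it assumes the classic "Mon DD HH:MM:SS host ..." timestamp layout seen in the log excerpt at the top of the thread.

```python
from collections import Counter


def failures_per_hour(lines):
    """Count 'page allocation failure' lines per (month, day, hour)."""
    counts = Counter()
    for line in lines:
        if "page allocation failure" not in line:
            continue
        parts = line.split()
        if len(parts) < 3:
            continue  # not a syslog-style line, skip it
        month, day, clock = parts[:3]
        hour = clock.split(":")[0]
        counts[(month, day, hour)] += 1
    return counts
```

It accepts any iterable of lines, so an open file works, e.g. `failures_per_hour(open("/var/log/messages"))`, assuming kernel messages go to the default `/var/log/messages` on the box in question. Comparing the busy hours against the rsync or backup schedule would show whether the bursts line up with heavy transfers.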
Tony Molloy
2012-Oct-19 16:16 UTC
[CentOS] swapper: page allocation failure. order:1, mode:0x20
On Friday 19 October 2012 15:27:52 m.roth at 5-cent.us wrote:
> Tim Nelson wrote:
> > ----- Original Message -----
> >
> >> On Thursday 18 October 2012 21:44:30 Tim Nelson wrote:
> >> > I see this occasionally on one of my CentOS 6.3 x64 systems:
> >> >
> >> > Oct 18 03:10:52 backup kernel: swapper: page allocation failure. order:1, mode:0x20
> >> > Oct 18 03:10:52 backup kernel: Pid: 0, comm: swapper Not tainted 2.6.32-279.9.1.el6.x86_64 #1
> >> > Oct 18 03:10:52 backup kernel: Call Trace:
> >> > Oct 18 03:10:52 backup kernel: <IRQ> [<ffffffff8112789f>] ? __alloc_pages_nodemask+0x77f/0x940
> >>
> >> <snip>
> >>
> >> > Any thoughts on the cause? The system has 16GB of RAM, and
> >> > whenever checked, there is no swap usage. Is this a memory
> >> > error (bad RAM)?
> >>
> >> I have the same problem on a Dell PE R720 with 16GB of RAM doing
> >> lots of networking. It's a file server. It was discussed on the
> >> dell-poweredge mailing list last week
> >> <linux-poweredge at dell.com>
> >>
> >> The conclusion was that it was harmless but for a discussion and
> >> possible workaround see
> >> <https://bugzilla.redhat.com/show_bug.cgi?id=770545#c16>
> >>
> >> Hope this helps,
>
> Thanks, but I agree with the person in the bugzilla thread, this is
> not "just harmless" - when I see one in the logs, I usually see
> several within a single hour. I *think* that it seems to happen
> more when someone's copying or d/l large datasets, and it makes me
> extremely worried about the consistency of the data.
>
> mark

Agreed, it happens when there is a lot of network activity. During the day my box is a student fileserver, and at night it does backups using BackupPC, so there is a lot of network activity. I haven't seen any ill effects but would obviously be happy to get it sorted. I tried the workaround suggested in the bugzilla thread, so I'll see if it has any effect.

Tony
Tony Molloy
2012-Oct-19 16:42 UTC
[CentOS] swapper: page allocation failure. order:1, mode:0x20
On Friday 19 October 2012 15:20:15 Tim Nelson wrote:
> ----- Original Message -----
>
> > On Thursday 18 October 2012 21:44:30 Tim Nelson wrote:
> > > I see this occasionally on one of my CentOS 6.3 x64 systems:
> > >
> > > Oct 18 03:10:52 backup kernel: swapper: page allocation failure. order:1, mode:0x20
> > > Oct 18 03:10:52 backup kernel: Pid: 0, comm: swapper Not tainted 2.6.32-279.9.1.el6.x86_64 #1
> > > Oct 18 03:10:52 backup kernel: Call Trace:
> > > Oct 18 03:10:52 backup kernel: <IRQ> [<ffffffff8112789f>] ? __alloc_pages_nodemask+0x77f/0x940
> >
> > <snip>
> >
> > > Any thoughts on the cause? The system has 16GB of RAM, and
> > > whenever checked, there is no swap usage. Is this a memory
> > > error (bad RAM)?
> > >
> > > --Tim
> >
> > I have the same problem on a Dell PE R720 with 16GB of RAM doing
> > lots of networking. It's a file server. It was discussed on the
> > dell-poweredge mailing list last week
> > <linux-poweredge at dell.com>
> >
> > The conclusion was that it was harmless but for a discussion and
> > possible workaround see
> > <https://bugzilla.redhat.com/show_bug.cgi?id=770545#c16>
> >
> > Hope this helps,
>
> *VERY* helpful, thanks!
>
> --Tim

Tim, Mark,

For another discussion of this bug see:

https://bugzilla.redhat.com/show_bug.cgi?id=713546

Again the conclusion seems to be that it's harmless, just some lost network packets which are then re-transmitted.

Should be fixed for 6.4 ;-)

Tony
Tony Molloy
2012-Oct-22 15:30 UTC
[CentOS] swapper: page allocation failure. order:1, mode:0x20
On Friday 19 October 2012 17:16:10 Tony Molloy wrote:
> On Friday 19 October 2012 15:27:52 m.roth at 5-cent.us wrote:
> > Tim Nelson wrote:
> > > ----- Original Message -----
> > >
> > >> On Thursday 18 October 2012 21:44:30 Tim Nelson wrote:
> > >> > I see this occasionally on one of my CentOS 6.3 x64 systems:
> > >> >
> > >> > Oct 18 03:10:52 backup kernel: swapper: page allocation failure. order:1, mode:0x20
> > >> > Oct 18 03:10:52 backup kernel: Pid: 0, comm: swapper Not tainted 2.6.32-279.9.1.el6.x86_64 #1
> > >> > Oct 18 03:10:52 backup kernel: Call Trace:
> > >> > Oct 18 03:10:52 backup kernel: <IRQ> [<ffffffff8112789f>] ? __alloc_pages_nodemask+0x77f/0x940
> > >>
> > >> <snip>
> > >>
> > >> > Any thoughts on the cause? The system has 16GB of RAM, and
> > >> > whenever checked, there is no swap usage. Is this a memory
> > >> > error (bad RAM)?
> > >>
> > >> I have the same problem on a Dell PE R720 with 16GB of RAM
> > >> doing lots of networking. It's a file server. It was discussed
> > >> on the dell-poweredge mailing list last week
> > >> <linux-poweredge at dell.com>
> > >>
> > >> The conclusion was that it was harmless but for a discussion
> > >> and possible workaround see
> > >> <https://bugzilla.redhat.com/show_bug.cgi?id=770545#c16>
> > >>
> > >> Hope this helps,
> >
> > Thanks, but I agree with the person in the bugzilla thread, this
> > is not "just harmless" - when I see one in the logs, I usually
> > see several within a single hour. I *think* that it seems to
> > happen more when someone's copying or d/l large datasets, and it
> > makes me extremely worried about the consistency of the data.
>
> Agree it happens when there is a lot of network activity. My box
> during the day is a student fileserver and at night it does backups
> using BackupPC so a lot of network activity. I haven't seen any
> ill-effects but would obviously be happy to get it sorted. I tried
> the workaround suggested in the bugzilla thread so I'll see if it
> has any effect.
>
> Tony

OK, I tried that workaround: I set vm.zone_reclaim_mode = 1 in /etc/sysctl.conf and the message has gone. It even survived booting into the latest kernel, 2.6.32-279.11.1.el6.x86_64.

Tony
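
For anyone wanting to confirm the setting really is in the file, here is a small sketch that looks a key up in sysctl.conf-style text. `sysctl_value` is a hypothetical helper name, and the parsing is deliberately simple (one `key = value` per line, `#` or `;` comments), which should cover an ordinary /etc/sysctl.conf but is not a full reimplementation of what sysctl itself accepts.

```python
def sysctl_value(conf_text, key):
    """Return the value assigned to key in sysctl.conf-style text, or None."""
    value = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # blank line or comment
        name, sep, val = line.partition("=")
        if sep and name.strip() == key:
            value = val.strip()  # keep going: a later entry overrides an earlier one
    return value
```

Usage would be along the lines of `sysctl_value(open("/etc/sysctl.conf").read(), "vm.zone_reclaim_mode")`, which should come back as `"1"` once the workaround is in place. The file is only read at boot or when reloaded with `sysctl -p`; the value the running kernel is actually using can be read from /proc/sys/vm/zone_reclaim_mode.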