Hi,

Got a very painful problem with my Xen 3 setup... I can run 6-7 guests fairly stably, but as soon as traffic to the file server guest picks up, it dies (guest console message below). I've seen this happen with NFS and SCP traffic and it's very consistent. After that, networking to/from all guests is gone until the host is rebooted.

The setup is:

AMD Athlon64, nForce 4 chipset, Marvell Yukon gigabit ethernet, xen-unstable from 7/4/06, Debian sarge amd64 on host and guests, shoe size 8.5.

Any ideas?

TIA, Itai

Unable to handle kernel NULL pointer dereference at 000000000000017e RIP:
<ffffffff80257bfd>{network_tx_buf_gc+253}
PGD aeb2067 PUD aeb3067 PMD 0
Oops: 0002 [1]
CPU 0
Modules linked in: ipv6 nfsd xfs exportfs
Pid: 0, comm: swapper Not tainted 2.6.16.13-xenU #2
RIP: e030:[<ffffffff80257bfd>] <ffffffff80257bfd>{network_tx_buf_gc+253}
RSP: e02b:ffffffff80399718 EFLAGS: 00010092
RAX: 00000000000000c1 RBX: 00000000000000b6 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880000451550
RBP: 00000000000000ba R08: 0000000000000000 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000246 R12: ffff8800004503e0
R13: 00000000000164ec R14: 0000000000016502 R15: ffff880000450540
FS: 00002ac5d8e746d0(0000) GS:ffffffff803cb000(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000
Process swapper (pid: 0, threadinfo ffffffff803d6000, task ffffffff803337a0)
Stack: ffff880000450000 0000000000000000 ffff880000450000 ffff8800004503e0
       0000000000000106 0000000000000000 ffffffff803d7ec8 ffffffff802588b4
       ffff880009b2d740 0000000000000000
Call Trace: <IRQ>
 <ffffffff802588b4>{netif_int+52}
 <ffffffff8014776d>{handle_IRQ_event+61}
 <ffffffff80147820>{__do_IRQ+112}
 <ffffffff8010d461>{do_IRQ+65}
 <ffffffff8024cab8>{evtchn_do_upcall+136}
 <ffffffff8010b416>{do_hypervisor_callback+30} <EOI>
 <ffffffff801073aa>{hypercall_page+938}
 <ffffffff801073aa>{hypercall_page+938}
 <ffffffff8010f4fd>{safe_halt+29}
 <ffffffff80108b45>{xen_idle+85}
 <ffffffff80108b95>{cpu_idle+53}
 <ffffffff803d983a>{start_kernel+426}
 <ffffffff803d91ff>{x86_64_start_kernel+351}

Code: ff 8d c4 00 00 00 0f 94 c0 84 c0 74 46 48 8b 05 af cb 0d 00
RIP <ffffffff80257bfd>{network_tx_buf_gc+253} RSP <ffffffff80399718>
CR2: 000000000000017e
 <0>Kernel panic - not syncing: Aiee, killing interrupt handler!

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
In-Tuition Xen
2006-Aug-04 09:05 UTC
[Xen-users] File server in domU dies on heavy traffic
Itai Tavor scribbled on 14 July 2006 07:20:
> Hi,
>
> Got a very painful problem with my Xen 3 setup... I can run 6-7
> guests fairly stably, but as soon as traffic to the file server guest
> picks up, it dies (guest console message below). I've seen this
> happen with NFS and SCP traffic and it's very consistent. After that,
> networking to/from all guests is gone, until the host is rebooted.
>
> The setup is:
>
> AMD Athlon64, nForce 4 chipset, Marvell Yukon gigabit ethernet,
> xen-unstable from 7/4/06, Debian sarge amd64 on host and guests, shoe
> size 8.5.

Hi all,

I've got exactly the same problem happening on some recent xen0's (xen-3.0.2-3.FC5). They're running on exactly the same hardware as some rock solid xen0's (6 blades) running an older version (xen-3.0.1-4). I've rolled back the problem xen0's to this version in the hope that will fix it.

Regards,
Matt.

BUG: unable to handle kernel NULL pointer dereference at virtual address 000000b0
 printing eip:
c90c21ac
*pde = ma 2209b067 pa 03053067
*pte = ma 00000000 pa fffff000
Oops: 0002 [#1]
SMP
Modules linked in: ipv6 autofs4 sunrpc xennet ip_conntrack_ftp ip_conntrack_netbios_ns ipt_REJECT xt_state ip_conntrack nfnetlink xt_tcpudp iptable_filter ip_tables x_tables dm_snapshot dm_zero dm_mirror dm_mod
CPU: 0
EIP: 0061:[<c90c21ac>] Not tainted VLI
EFLAGS: 00210046 (2.6.17-1.2157_FC5xenU #1)
EIP is at network_tx_buf_gc+0xc3/0x1b7 [xennet]
eax: 0000001c ebx: 0000002e ecx: c5820cfc edx: 00000000
esi: 00000001 edi: c5820400 ebp: 00000020 esp: c0651edc
ds: 007b es: 007b ss: 0069
Process swapper (pid: 0, threadinfo=c0650000 task=c05f1800)
Stack: <0>c5820cfc 00000000 00000000 00000004 c5820000 000136fe 00013703 000136dd
       00000000 c5820488 c5820400 c5820000 c90c3150 c5d8b300 00000000 00000000
       00000109 c043a57d 00000109 c5820000 c0651f88 c0651f88 00000109 c0643880
Call Trace:
 <c90c3150> netif_int+0x24/0x66 [xennet]
 <c043a57d> handle_IRQ_event+0x42/0x85
 <c043a64d> __do_IRQ+0x8d/0xdc
 <c040665a> do_IRQ+0x1a/0x25
 <c0519efd> evtchn_do_upcall+0x66/0x9f
 <c0404d79> hypervisor_callback+0x3d/0x48
 <c042007b> ptrace_request+0x6d/0x207
 <c0407a6a> safe_halt+0x84/0xa7
 <c0402bde> xen_idle+0x46/0x4e
 <c0402cfd> cpu_idle+0x94/0xad
 <c0655772> start_kernel+0x346/0x34c
Code: ff b4 9f 00 09 00 00 50 e8 9d 85 45 f7 c7 84 9f 00 09 00 00 00 00 00 00 8b 87 f4 00 00 00 89 84 9f f4 00 00 00 89 9f f4 00 00 00 <f0> ff 8d 90 00 00 00 0f 94 c0 83 c4 10 84 c0 74 62 bb 00 e0 ff
EIP: [<c90c21ac>] network_tx_buf_gc+0xc3/0x1b7 [xennet] SS:ESP 0069:c0651edc
 <0>Kernel panic - not syncing: Fatal exception in interrupt
BUG: warning at arch/i386/kernel/smp-xen.c:519/smp_call_function() (Not tainted)
 <c040c5a0> smp_call_function+0x69/0x110
 <c040eee6> bust_spinlocks+0x3d/0x46
 <c040c6a0> smp_send_stop+0x10/0x46
 <c040c655> stop_this_cpu+0x0/0x3b
 <c0418a2a> panic+0x46/0x184
 <c04057d0> die+0x246/0x27b
 <c040f139> do_page_fault+0x0/0x8c1
 <c040f863> do_page_fault+0x72a/0x8c1
 <c040f139> do_page_fault+0x0/0x8c1
 <c0404d37> error_code+0x2b/0x30
 <c90c21ac> network_tx_buf_gc+0xc3/0x1b7 [xennet]
 <c90c3150> netif_int+0x24/0x66 [xennet]
 <c043a57d> handle_IRQ_event+0x42/0x85
 <c043a64d> __do_IRQ+0x8d/0xdc
 <c040665a> do_IRQ+0x1a/0x25
 <c0519efd> evtchn_do_upcall+0x66/0x9f
 <c0404d79> hypervisor_callback+0x3d/0x48
 <c042007b> ptrace_request+0x6d/0x207
 <c0407a6a> safe_halt+0x84/0xa7
 <c0402bde> xen_idle+0x46/0x4e
 <c0402cfd> cpu_idle+0x94/0xad
 <c0655772> start_kernel+0x346/0x34c
Adrian Chadd
2006-Aug-04 14:35 UTC
Re: [Xen-users] File server in domU dies on heavy traffic
On Fri, Aug 04, 2006, In-Tuition Xen wrote:
> I've got exactly the same problem happening on some recent xen0's
> (xen-3.0.2-3.FC5). They're running on exactly the same hardware as some
> rock solid xen0's (6 blades) running an older version (xen-3.0.1-4).
> I've rolled back the problem xen0's to this version in the hope that
> will fix it.

I've had the same problem but I haven't caught a stack trace. I'm trying to reproduce it on my local xen server to grab an actual panic message, and I've rolled back the domU kernels to circa-3.0.1, which has alleviated the problem.

Does anyone know if this has been fixed in the later 3.0.x builds? (I hope the Fedora project releases some updated FC5 Xen 3 builds, but that's a different story.)

Adrian
In-Tuition Xen
2006-Aug-04 16:55 UTC
Re: [Xen-users] File server in domU dies on heavy traffic
Adrian Chadd scribbled on 04 August 2006 15:36:
> On Fri, Aug 04, 2006, In-Tuition Xen wrote:
>> I've got exactly the same problem happening on some recent xen0's
>> (xen-3.0.2-3.FC5). They're running on exactly the same hardware as
>> some rock solid xen0's (6 blades) running an older version
>> (xen-3.0.1-4). I've rolled back the problem xen0's to this version
>> in the hope that will fix it.
>
> I've had the same problem but I haven't caught a stack trace.
> I'm trying to reproduce it on my local xen server to grab an actual
> panic message and I've rolled back the domU kernels to circa-3.0.1
> which has alleviated the problem.
>
> Does anyone know if this has been fixed in the later 3.0.x builds?
> (I hope the Fedora project releases some updated FC5 Xen 3 builds, but
> that's a different story.)

Thanks for the information Adrian. Just to be clear, are you still running xen-3.0.2-3.FC5 with the latest kernel-xen0-2.6.17-1.2157_FC5 on the dom0's, but have rolled back just the domU kernels? Since my xen-3.0.1-4 dom0's and domU's have been running for months now, I think I'll just stick with this version for the time being.

Adding the following line to your domU config makes it easy to get the panic messages:

on_crash = 'preserve'

Regards,
Matt.
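For anyone following along, here is a rough sketch of where Matt's on_crash line would sit in a domU config file. xm config files are written in Python syntax; every name, size and path below is a made-up placeholder rather than anything taken from this thread, so adjust it to your own guest.

```python
# Hypothetical domU config for illustration only (xm configs use Python syntax).
# The guest name, memory size, kernel path and disk device are placeholders.
name = 'fileserver'
memory = 256
kernel = '/boot/vmlinuz-2.6-xenU'
disk = ['phy:/dev/vg0/fileserver,sda1,w']
vif = ['']
root = '/dev/sda1 ro'

# Matt's suggestion: keep the crashed domain around instead of destroying
# or restarting it, so the panic output on its console is not lost.
on_crash = 'preserve'
```

With that set, the crashed guest should stay in the domain list after the panic, and its console can normally still be read from dom0 (for example with xm console and the guest name) until the domain is destroyed by hand.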
Adrian Chadd
2006-Aug-05 01:32 UTC
Re: [Xen-users] File server in domU dies on heavy traffic
On Fri, Aug 04, 2006, In-Tuition Xen wrote:
> > Does anyone know if this has been fixed in the later 3.0.x builds?
> > (I hope the Fedora project releases some updated FC5 Xen 3 builds, but
> > that's a different story.)
>
> Thanks for the information Adrian. Just to be clear, are you still
> running xen-3.0.2-3.FC5 with the latest kernel-xen0-2.6.17-1.2157_FC5 on
> the dom0's, but have rolled back just the domU kernels? Since my
> xen-3.0.1-4 dom0's and domU's have been running for months now I think I'll
> just stick with this version for the time being.

I've been running:

kernel-xenU-2.6.17-1.2157_FC5
xen-3.0.2-3.FC5

The kernel is from 3.0.1 if I remember - it's kernel 2.6.16.13, but the server it was compiled on has been installed over (by another FC5 setup I'm trying to replicate the crash on!)

I need to make a choice on whether to stick with the FC5 version of Xen or roll my own older version of Xen 3.0.1. Or, since it seems to work, try dom0+Xen 3.0.2 with a kernel from 3.0.1 (patched with the relevant security fixes in the Linux kernel, of course.)

> Adding the following line to your domU config makes it easy to get the
> panic messages:
>
> on_crash = 'preserve'

I'll do that, thanks!

Adrian
Adrian Chadd
2006-Aug-05 01:43 UTC
Re: [Xen-users] File server in domU dies on heavy traffic
On Sat, Aug 05, 2006, Adrian Chadd wrote:
> I've been running:
>
> kernel-xenU-2.6.17-1.2157_FC5
> xen-3.0.2-3.FC5

.. I mean xen0 there, not xenU.

> The kernel is from 3.0.1 if I remember - it's kernel 2.6.16.13, but the server
> it was compiled on has been installed over (by another FC5 setup I'm trying
> to replicate the crash on!)

This is xenU.

Sorry, I had an ENOEMAILBEFORECOFFEE error again.

adrian