Charles Duffy
2005-Oct-31 00:53 UTC
[Xen-devel] DomU panic in net_rx_action; changeset 7503:20d1a79ebe31
I posted this to -users a while back and got no response. I've now observed it a second time (with fewer modules loaded), so I'm posting to -devel.

Per the subject, this is changeset 7503:20d1a79ebe31. It's a little old, granted, but I've been keeping an eye on -devel and haven't seen anything that looks explicitly like a fix being applied.

Unable to handle kernel paging request at ffff88000bc85700 RIP:
<ffffffff8024026a>{netif_poll+1354}
PGD 5e3067 PUD 5e4067 PMD 643067 PTE 0
Oops: 0002 [1]
CPU 0
Modules linked in: ipv6
Pid: 0, comm: swapper Tainted: GF 2.6.12.6-xenU
RIP: e030:[<ffffffff8024026a>] <ffffffff8024026a>{netif_poll+1354}
RSP: e02b:ffffffff803bbd98 EFLAGS: 00010216
RAX: ffff88000bc85700 RBX: ffff880039f8e200 RCX: ffff88000bc85120
RDX: ffff88000bc85700 RSI: 0000000000000002 RDI: ffff880039f8e200
RBP: ffff880039f8e200 R08: ffff8800142d24c0 R09: 0000000101d4b358
R10: 0000000000000000 R11: 0000000000000212 R12: ffff88003fc2e360
R13: ffff8800004dd6a8 R14: 0000000000000080 R15: 0000000000000000
FS: 00002aaaab00e6e0(0000) GS:ffffffff803a7900(0000) knlGS:ffffffff803a7900
CS: e033 DS: 0000 ES: 0000
Process swapper (pid: 0, threadinfo ffffffff803ba000, task ffffffff80307380)
Stack: 0000000180117376 0000000100000040 0000000000000001 0009f9d60009f9d6
       ffffffff803bbe2c ffff88003fc2e000 ffffffff803bbdc8 ffffffff803bbdc8
       ffff880000000000 ffff88003fc2e000
Call Trace:<ffffffff80255859>{net_rx_action+169} <ffffffff8013380b>{__do_softirq+107}
       <ffffffff801338ad>{do_softirq+61} <ffffffff80114e69>{do_IRQ+57}
       <ffffffff8010d948>{evtchn_do_upcall+136} <ffffffff80111fb9>{do_hypervisor_callback+17}
       <ffffffff8010f9f3>{xen_idle+83} <ffffffff8010f9f3>{xen_idle+83}
       <ffffffff8010fa2f>{cpu_idle+31} <ffffffff803bc6ea>{start_kernel+490}
       <ffffffff803bc169>{_sinittext+361}

Code: c7 00 01 00 00 00 48 8b 83 10 01 00 00 c7 40 04 00 00 00 00
RIP <ffffffff8024026a>{netif_poll+1354} RSP <ffffffff803bbd98>
CR2: ffff88000bc85700
<0>Kernel panic - not syncing: Aiee, killing interrupt handler!
_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Charles Duffy
2005-Oct-31 01:25 UTC
[Xen-devel] DomU panic in net_rx_action *initiated by another DomU*; 7503:20d1a79ebe31
Pardon me for replying to my own post; it was only after hitting "send" on the first one that I took a closer look and found a much more interesting (and worrisome) aspect of the issue I've been seeing: another DomU crashed with the same error at the same time.

That one was testing an experimental Linux kernel patch (affecting procfs handling of symlinks) and was particularly unstable for that reason. It's interesting, though, that the other DomU (without any such patch applied) appears to have been hit by the same issue. This appears to be reproducible.

From the DomU initiating the issue [running an experimental kernel patch and exercising a bug in that patch]:

Bad rx buffer (memory squeeze?).
Bad rx buffer (memory squeeze?).
Unable to handle kernel paging request at ffff880000b3c700 RIP:
<ffffffff8024026a>{netif_poll+1354}
PGD c55067 PUD c56067 PMD c5c067 PTE 0
Oops: 0002 [1]
CPU 0
Modules linked in: ext3 jbd unionfs
Pid: 0, comm: swapper Tainted: GF 2.6.12.6-xenU
RIP: e030:[<ffffffff8024026a>] <ffffffff8024026a>{netif_poll+1354}
RSP: e02b:ffffffff803bbd98 EFLAGS: 00010212
RAX: ffff880000b3c700 RBX: ffff880000b97900 RCX: ffff880000b3c064
RDX: ffff880000b3c700 RSI: 0000000000000002 RDI: ffff880000b97900
RBP: ffff880000b97900 R08: 0000000000000000 R09: 0000000000000022
R10: 000000000003f998 R11: 0000000000000212 R12: ffff88003faba360
R13: ffff88003f41a138 R14: 0000000000000080 R15: 0000000000000000
FS: 00002aaaab2890a0(0000) GS:ffffffff803a7900(0000) knlGS:ffffffff80440600
CS: e033 DS: 0000 ES: 0000
Process swapper (pid: 0, threadinfo ffffffff803ba000, task ffffffff80307380)
Stack: 0000000100000000 0000000100000040 0000000000000001 0000002800000028
       ffffffff803bbe2c ffff88003faba000 ffffffff803bbdc8 ffffffff803bbdc8
       0000000000000000 ffff88003f879c60
Call Trace:<ffffffff80255859>{net_rx_action+169} <ffffffff8013380b>{__do_softirq+107}
       <ffffffff801338ad>{do_softirq+61} <ffffffff80114e69>{do_IRQ+57}
       <ffffffff8010d948>{evtchn_do_upcall+136} <ffffffff80111fb9>{do_hypervisor_callback+17}
       <ffffffff8010f9f3>{xen_idle+83} <ffffffff8010f9f3>{xen_idle+83}
       <ffffffff8010fa2f>{cpu_idle+31} <ffffffff803bc6ea>{start_kernel+490}
       <ffffffff803bc169>{_sinittext+361}

Code: c7 00 01 00 00 00 48 8b 83 10 01 00 00 c7 40 04 00 00 00 00
RIP <ffffffff8024026a>{netif_poll+1354} RSP <ffffffff803bbd98>
CR2: ffff880000b3c700
<0>Kernel panic - not syncing: Aiee, killing interrupt handler!

From the DomU being impacted by the issue [running no unusual patches or modules, and being stable *except* when the initiating DomU is running]:

Unable to handle kernel paging request at ffff88003df8d700 RIP:
<ffffffff8024026a>{netif_poll+1354}
PGD 5e3067 PUD 5e4067 PMD 7d4067 PTE 0
Oops: 0002 [1]
CPU 0
Modules linked in: ipv6
Pid: 0, comm: swapper Tainted: GF 2.6.12.6-xenU
RIP: e030:[<ffffffff8024026a>] <ffffffff8024026a>{netif_poll+1354}
RSP: e02b:ffffffff803bbd98 EFLAGS: 00010212
RAX: ffff88003df8d700 RBX: ffff88003d99cbc0 RCX: ffff88003df8d05e
RDX: ffff88003df8d700 RSI: 0000000000000002 RDI: ffff88003d99cbc0
RBP: ffff88003d99cbc0 R08: 0000000000000000 R09: 0000000000000000
R10: ffffffff80381c20 R11: 0000000000000212 R12: ffff88003fc2e360
R13: ffff8800004dd248 R14: 0000000000000080 R15: 0000000000000000
FS: 00002aaaaade3b00(0000) GS:ffffffff803a7900(0000) knlGS:ffffffff803a7900
CS: e033 DS: 0000 ES: 0000
Process swapper (pid: 0, threadinfo ffffffff803ba000, task ffffffff80307380)
Stack: 0000000100000000 0000000100000040 0000000000000001 0000174a0000174a
       ffffffff803bbe2c ffff88003fc2e000 ffffffff803bbdc8 ffffffff803bbdc8
       0000000000000000 ffff8800000cae60
Call Trace:<ffffffff80255859>{net_rx_action+169} <ffffffff8013380b>{__do_softirq+107}
       <ffffffff801338ad>{do_softirq+61} <ffffffff80114e69>{do_IRQ+57}
       <ffffffff8010d948>{evtchn_do_upcall+136} <ffffffff80111fb9>{do_hypervisor_callback+17}
       <ffffffff8010f9f3>{xen_idle+83} <ffffffff8010f9f3>{xen_idle+83}
       <ffffffff8010fa2f>{cpu_idle+31} <ffffffff803bc6ea>{start_kernel+490}
       <ffffffff803bc169>{_sinittext+361}

Code: c7 00 01 00 00 00 48 8b 83 10 01 00 00 c7 40 04 00 00 00 00
RIP <ffffffff8024026a>{netif_poll+1354} RSP <ffffffff803bbd98>
CR2: ffff88003df8d700
<0>Kernel panic - not syncing: Aiee, killing interrupt handler!

The "experimental kernel patch" in question is a unionfs patch found at http://permalink.gmane.org/gmane.comp.file-systems.unionfs.general/638, applied to UnionFS 1.1.1 (a different release from the one it was initially developed against, though the patch applies cleanly). I can repeatedly trigger the bug by playing with ifup on a system running said patch with its root filesystem on a unionfs mount.

If anyone is interested in reproducing it and is unable to do so from the information I've provided so far, let me know and I'd be glad to try to offer additional details.
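For anyone reading the three oops reports in this thread, the following is a rough sketch of how the "Oops: 0002" value and the CR2/RAX pair can be read. It assumes the standard x86 page-fault error-code bit layout (bit 0: protection violation vs. not-present page, bit 1: write vs. read, bit 2: user vs. kernel mode). The helper name decode_page_fault_error is made up for illustration and is not part of Xen or the kernel; the addresses are copied from the reports.

```python
# Hypothetical helper (not a Xen or kernel API): decode the hex value printed
# after "Oops:" using the x86 page-fault error-code bit layout.
def decode_page_fault_error(code: int) -> dict:
    return {
        "cause": "protection violation" if code & 0x1 else "page not present",
        "access": "write" if code & 0x2 else "read",
        "mode": "user" if code & 0x4 else "kernel",
    }


# Faulting address (CR2) and RAX as reported in the three oopses in this thread.
reports = [
    {"cr2": 0xFFFF88000BC85700, "rax": 0xFFFF88000BC85700},
    {"cr2": 0xFFFF880000B3C700, "rax": 0xFFFF880000B3C700},
    {"cr2": 0xFFFF88003DF8D700, "rax": 0xFFFF88003DF8D700},
]

# All three reports print "Oops: 0002".
decoded = decode_page_fault_error(int("0002", 16))
print(decoded)

# Check whether the faulting address matches RAX in every report.
cr2_matches_rax = all(r["cr2"] == r["rax"] for r in reports)
print(cr2_matches_rax)
```

Read this way, each crash is a kernel-mode write to a page that is not mapped, which agrees with the "PTE 0" in the page-table walk lines, and in each report CR2 equals RAX. If the Code: bytes begin at the faulting instruction (an assumption on my part), the leading c7 00 01 00 00 00 would be a 32-bit store of 1 through RAX, which fits the same picture. None of this identifies the root cause; it only narrows what the faulting access looked like.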