I'm running Xen 3.4.3 final and observing the following kernel panic
when I fire up the 9th domU. dom0 is CentOS 5.5 running a XenLinux
kernel. grub.conf entry is below, followed by the dump. The kernel
crash is immediately after the 9th domU is unpaused.

title Xen 3.4.3 CentOS (2.6.18-194.11.1.el5xen)
        root (hd0,0)
        kernel /xen-3.4.3.gz dom0_mem=768M loglvl=all guest_loglvl=all com1=38400 0,8n1 console=com1 sync_console noreboot
        module /vmlinuz-2.6.18-194.11.1.el5xen ro root=/dev/vg_internal/lv_base console=hvc0,38400 norhgb netloop.nloopbacks=100
        module /mpp-2.6.18-194.11.1.el5xen.img
title CentOS (2.6.18-194.11.1.el5) with MPP
        root (hd0,0)

Unable to handle kernel paging request at ffff88002e864000 RIP:
 [<ffffffff8041a95f>] skb_copy_bits+0x114/0x1d3
PGD 11a5067 PUD 11a6067 PMD 131b067 PTE 0
Oops: 0000 [1] SMP
last sysfs file: /devices/xen-backend/vbd-2-51712/statistics/wr_sect
CPU 2
Modules linked in: xt_tcpudp xt_state ip_conntrack nfnetlink
xt_physdev bridge nfs lockd fscache nfs_acl sunrpc iptable_filter
ip_tables x_tables be2iscsi ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad
ib_core ib_addr iscsi_tcp bnx2i cnic ipv6 xfrm_nalgo crypto_api uio
cxgb3i cxgb3 8021q libiscsi_tcp libiscsi2 scsi_transport_iscsi2
scsi_transport_iscsi dm_round_robin dm_multipath scsi_dh video
backlight sbs power_meter hwmon i2c_ec dell_wmi wmi button battery
asus_acpi ac blkbk netbk blktap pciback parport_pc lp parport joydev
cdc_ether serial_core usbnet pcspkr bnx2 ide_cd i2c_i801 e1000e
i2c_core cdrom dm_raid45 dm_message dm_region_hash dm_mem_cache
dm_snapshot dm_zero dm_mirror dm_log dm_mod mppVhba(U) ata_piix libata
shpchp megaraid_sas mppUpper(U) sg sd_mod scsi_mod ext3 jbd uhci_hcd
ohci_hcd ehci_hcd
Pid: 0, comm: swapper Tainted: G 2.6.18-194.11.1.el5xen #1
RIP: e030:[<ffffffff8041a95f>] [<ffffffff8041a95f>] skb_copy_bits+0x114/0x1d3
RSP: e02b:ffff88000112fdd0 EFLAGS: 00010246
RAX: 0000000000000042 RBX: ffff88001ef04480 RCX: 0000000000000b50
RDX: ffff8800238f0d00 RSI: ffff88002e864000 RDI: ffff880027bc0000
RBP: 0000000000000000 R08: ffff8800238f0d10 R09: 0000000000000b92
R10: 0000000000000b50 R11: 0000000000000000 R12: 0000000000000042
R13: 0000000000000042 R14: 0000000000000000 R15: ffff880027bc0000
FS: 00002b2e6ab313f0(0000) GS:ffffffff805d2100(0000) knlGS:0000000000000000
CS: e033 DS: 002b ES: 002b
Process swapper (pid: 0, threadinfo ffff880006186000, task ffff8800000657e0)
Stack: ffff88002c8b27c0 ffff88002b7c8b80 0000000000000b50 ffff88001e5abe80
 ffff8800289cf500 ffff88001ef04480 ffff880001be3200 ffffffff8836b732
 ffff8800289cf000 0000000000000042
Call Trace:
 <IRQ> [<ffffffff8836b732>] :netbk:netif_be_start_xmit+0x241/0x471
 [<ffffffff8041fbde>] dev_hard_start_xmit+0x1b7/0x28a
 [<ffffffff8042fff5>] __qdisc_run+0x136/0x1f9
 [<ffffffff803b348c>] unmask_evtchn+0x2d/0xd7
 [<ffffffff80420f33>] net_tx_action+0xc9/0xf1
 [<ffffffff80212cd3>] __do_softirq+0x8d/0x13b
 [<ffffffff80260da4>] call_softirq+0x1c/0x278
 [<ffffffff8026e0c1>] do_softirq+0x31/0x98
 [<ffffffff8026df4d>] do_IRQ+0xec/0xf5
 [<ffffffff803b3e14>] evtchn_do_upcall+0x13b/0x1fb
 [<ffffffff802608d6>] do_hypervisor_callback+0x1e/0x2c
 <EOI> [<ffffffff802063aa>] hypercall_page+0x3aa/0x1000
 [<ffffffff802063aa>] hypercall_page+0x3aa/0x1000
 [<ffffffff8026f4eb>] raw_safe_halt+0x84/0xa8
 [<ffffffff8026ca80>] xen_idle+0x38/0x4a
 [<ffffffff8024add7>] cpu_idle+0x97/0xba

Code: f3 a4 0f 84 a2 00 00 00 45 01 d4 49 89 ff 41 ff c6 45 89 cd
RIP [<ffffffff8041a95f>] skb_copy_bits+0x114/0x1d3
 RSP <ffff88000112fdd0>
CR2: ffff88002e864000
test6: no IPv6 routers present
 <0>Kernel panic - not syncing: Fatal exception
 BUG: warning at arch/x86_64/kernel/genapic_xen.c:92/xen_send_IPI_mask() (Tainted: G )

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

On Fri, Aug 20, 2010 at 02:15:25PM -0400, Cris Daniluk wrote:
> I'm running Xen 3.4.3 final and observing the following kernel panic
> when I fire up the 9th domU. dom0 is CentOS 5.5 running a XenLinux
> kernel. grub.conf entry is below, followed by the dump. The kernel
> crash is immediately after the 9th domU is unpaused.
>
> title Xen 3.4.3 CentOS (2.6.18-194.11.1.el5xen)
>         root (hd0,0)
>         kernel /xen-3.4.3.gz dom0_mem=768M loglvl=all guest_loglvl=all com1=38400 0,8n1 console=com1 sync_console noreboot
>         module /vmlinuz-2.6.18-194.11.1.el5xen ro root=/dev/vg_internal/lv_base console=hvc0,38400 norhgb netloop.nloopbacks=100
>         module /mpp-2.6.18-194.11.1.el5xen.img
> title CentOS (2.6.18-194.11.1.el5) with MPP
>         root (hd0,0)
>

so I assume this crash doesn't happen if you use the el5 default xen hypervisor?

Have you tried using latest linux-2.6.18-xen (from xen.org) ?

-- Pasi

On Fri, Aug 20, 2010 at 7:59 PM, Pasi Kärkkäinen <pasik@iki.fi> wrote:
> so I assume this crash doesn't happen if you use the el5 default xen hypervisor?
>
> Have you tried using latest linux-2.6.18-xen (from xen.org) ?
>
> -- Pasi

The dom0 kernel is the el5 default, but I have not tried running their
hypervisor since it is so dated. I'll give it a shot with the latest
xen.org kernel.

Thanks,

Cris

On Fri, Aug 20, 2010 at 08:20:54PM -0400, Cris Daniluk wrote:
> The dom0 kernel is the el5 default, but I have not tried running their
> hypervisor since it is so dated. I'll give a shot with the latest
> xen.org kernel.

Well the version number looks old (3.1.2), but it has a lot of fixes and backports
from newer xen versions.

-- Pasi

On Sat, Aug 21, 2010 at 6:12 AM, Pasi Kärkkäinen <pasik@iki.fi> wrote:
> Well the version number looks old (3.1.2), but it has a lot of fixes and backports
> from newer xen versions.
>
> -- Pasi

Having the same issue with the EL5 RPM as well. Latest 2.6.18 on
xen.org seems to be a little broken. The LSI controller isn't getting
detected after the megaraid_sas module loads. The initrds look
identical between the two, so I'm assuming there is some obscure bug
that was fixed and backported into the EL5 kernel but not Jeremy's. I
suppose I can try a newer pvops kernel if you think that might be
relevant.

Seems to be fairly consistently crashing regardless of Xen version,
though. Also, interestingly, it seems it may be more about the VMs
starting at once than the specific number of VMs.

Cris

On 08/22/2010 12:27 AM, Cris Daniluk wrote:
> Having the same issue with the EL5 RPM as well. Latest 2.6.18 on
> xen.org seems to be a little broken. The LSI controller isn't getting
> detected after the megaraid_sas module loads. The initrds look
> identical between the two, so I'm assuming there is some obscure bug
> that was fixed and backported into the EL5 kernel but not Jeremy's. I
> suppose I can try a newer pvops kernel if you think that might be
> relevant.
>
> Seems to be fairly consistently crashing regardless of Xen version,
> though. Also interestingly it seems it may be more about the VMs
> starting at once than the specific number of VMs.

Hi Cris,

I am a virtualization engineer at Red Hat. We encountered this bug
recently but we are not able to reproduce it consistently. Can you do
so? If so, I could try giving you a test EL5 kernel (so that you can
boot) with upstream's netback driver to test it.

Thanks,

Paolo

We also experience this same problem with the dom0 crashing when multiple
domUs are starting up at the same time. Our setup consists of a dual pair of
CentOS 5.5 servers running LVM/DRBD/Xen, and we have tried the default EL Xen
version 3.0.3-105 as well as newer versions (3.4.0 and 3.4.3) from the GITCO
repo [ http://www.gitco.de/repo/ ], with the crashes still occurring
independent of Xen version.

From the kernel crash output we found a related post here [
http://lists.linbit.com/pipermail/drbd-user/2009-March/011652.html ] that
discusses a problem with I/O and the network interface affecting DRBD. After
turning off scatter/gather on the dom0's network interface we no longer
experience crashes in the dom0.

We came across this crashing issue when migrating our Xen VMs to newer
servers; when running on our older servers the dom0 did not crash. As this
issue appears to be related to the network driver/chipset, here for the
record are the NIC differences between these servers; hopefully our results
will be useful to others:

----

Old servers (CentOS 5.3) without dom0 crashing, uses tg3 network driver:

$ uname -a
Linux ldx12020 2.6.18-128.4.1.el5xen #1 SMP Tue Aug 4 20:51:12 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux

$ lspci
03:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5722 Gigabit Ethernet PCI Express
04:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5722 Gigabit Ethernet PCI Express

$ ethtool -i eth0
driver: tg3
version: 3.93
firmware-version: 5722-v3.07, ASFIPMI v6.02
bus-info: 0000:03:00.0

----

New servers with dom0 crashing, uses bnx2 network driver:

$ uname -a
Linux tnx176 2.6.18-194.26.1.el5xen #1 SMP Tue Nov 9 13:35:30 EST 2010 x86_64 x86_64 x86_64 GNU/Linux

$ lspci
01:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
01:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
02:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
02:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)

$ ethtool -i eth0
Cannot get driver information: Operation not supported

dmesg output:
Broadcom NetXtreme II Gigabit Ethernet Driver bnx2 v2.0.2 (Aug 21, 2009)

----

Hopefully we can see this bug fixed.

Cheers,
Paul

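For readers who want to try the scatter/gather workaround mentioned in the message above, here is a minimal sketch. It assumes the affected dom0 interface is eth0 (the name used in the ethtool output above; on a bridged dom0 the physical NIC may carry a different name) and that the installed ethtool supports the -k/-K offload options. The names DRY_RUN, IFACE, run and disable_sg are hypothetical helpers made up for this example; only the two ethtool invocations are real commands. By default the script only prints what it would run.

```shell
#!/bin/sh
# Sketch of the scatter/gather workaround. DRY_RUN, IFACE, run and disable_sg
# are hypothetical names for this example, not part of ethtool or Xen.
# With DRY_RUN=1 (the default) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"      # dry run: show the command only
    else
        "$@"           # real run: needs root and a NIC that supports it
    fi
}

disable_sg() {
    iface="$1"
    run ethtool -k "$iface"          # show the current offload settings
    run ethtool -K "$iface" sg off   # turn scatter/gather off
}

# Interface name is an assumption; override with IFACE=<name>.
disable_sg "${IFACE:-eth0}"
```

To apply the change for real, run it as root with DRY_RUN=0. A setting made with ethtool -K normally does not persist across a reboot, so it would have to be reapplied at boot, and on many drivers switching off scatter/gather also switches off offloads that depend on it, such as TSO. This only sidesteps the crash; it does not fix the underlying bug.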
On Mon, Dec 27, 2010 at 04:50:04PM -0800, prickett233 wrote:
> We also experience this same problem with the dom0 crashing when multiple
> domU's are starting up at the same time. Our setup consists of a dual pair
> CentOS 5.5 running LVM/DRBD/Xen and have tried the default EL Xen version
> 3.0.3-105 including other newer versions (3.4.0 and 3.4.3) from the GITCO
> repo [ http://www.gitco.de/repo/ ] with the crashing still occurring
> independent of Xen version.
>
> From the kernel crash output we found a related post here [
> http://lists.linbit.com/pipermail/drbd-user/2009-March/011652.html ] that
> discusses a problem with I/O and network interface affecting DRBD. By
> turning off scatter/gather on the dom0's network interface we no longer
> experience crashes in the dom0.
>
> Hopefully we can see this bug fixed.
>

Did you open a bug about this to Redhat bugzilla?

Do you have a serial console set up and logging both the hypervisor and dom0 linux kernel,
so you can see what errors you get when it crashes?

-- Pasi