Fischer Udo Attila
2009-Nov-23 13:26 UTC
[Xen-users] OpenSuSE 11.2 bug, dom0-cpus limit causes xenwatch_cb running 100% and xm command freeze and xend dead
Hi all,

I have upgraded a test machine from OpenSuSE 11.1 to 11.2 and found the following bug. The server is a 2x quad-core Intel box, i.e. 2x4 = 8 CPUs.

If you limit the dom0 CPUs with dom0-cpus = [1-7]:
- [xenwatch_cb] runs at 100% CPU and writes a log entry every 65 seconds:
  BUG: soft lockup - CPU#X stuck for 61s!
- xm commands do not work
- xend is dead

If dom0-cpus is set to 0 or 8, everything looks fine.

Can somebody else confirm this bug?

Best regards

Udo Attila Fischer
------------------------------

For example, with dom0-cpus = 7:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 4532 root      15  -5     0    0    0 R  100  0.0  11:14.84 xenwatch_cb

# ps aux | grep xen
root        39  0.0  0.0      0     0 ?  S<   13:03   0:00 [xenwatch]
root        40  0.0  0.0      0     0 ?  S<   13:03   0:00 [xenbus]
root      3791  0.0  0.0  11300  1560 ?  S    13:04   0:00 /bin/bash /etc/init.d/xend start
root      4209  0.0  0.1 107504 13864 ?  S    13:04   0:00 /usr/bin/python2.6 /usr/sbin/xend start
root      4446  0.0  0.0   8488  1000 ?  S    13:04   0:00 xenstored --pid-file /var/run/xenstore.pid
root      4448  0.0  0.0      0     0 ?  Z    13:04   0:00 [xenconsoled] <defunct>
root      4450  0.0  0.0      0     0 ?  Zs   13:04   0:00 [xend] <defunct>
root      4451  0.0  0.1 107500 11500 ?  S    13:04   0:00 /usr/bin/python2.6 /usr/sbin/xend start
root      4453  0.0  0.0  22724   560 ?  Sl   13:04   0:00 xenconsoled
root      4455  0.0  0.2 148304 16652 ?  Sl   13:04   0:00 /usr/bin/python2.6 /usr/sbin/xend start
root      4532  100  0.0      0     0 ?  R<   13:04  40:35 [xenwatch_cb]
root      4533  0.0  0.0      0     0 ?  D<   13:04   0:00 [xenwatch_cb]
root      4534  0.0  0.0      0     0 ?  D<   13:04   0:00 [xenwatch_cb]
root      4535  0.0  0.0      0     0 ?  D<   13:04   0:00 [xenwatch_cb]
root      4536  0.0  0.0      0     0 ?  D<   13:04   0:00 [xenwatch_cb]

From /var/log/messages, every 65 seconds:

Nov 23 13:55:14 dom0-u2 kernel: [ 3112.781517] BUG: soft lockup - CPU#4 stuck for 61s! [xenwatch_cb:4532]
Nov 23 13:55:14 dom0-u2 kernel: [ 3112.781517] Modules linked in: sha1_generic hmac cryptomgr aead pcompress crypto_blkcipher crypto_hash crypto_algapi drbd netbk blkbk blkback_pagemap blktap xenbus_be binfmt_misc xt_tcpudp ip6t_REJECT nf_conntrack_ipv6 ip6table_raw xt_NOTRACK ipt_REJECT xt_physdev xt_state iptable_raw iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables ip6table_filter ip6_tables x_tables ipv6 bridge stp llc dummy fuse loop dm_mod mptctl iTCO_wdt iTCO_vendor_support i5k_amb sg i5000_edac ppdev 8250_pnp pcspkr sr_mod edac_core parport_pc shpchp e1000e dcdbas 8250 pci_hotplug tg3 parport serio_raw serial_core button usbhid hid uhci_hcd ehci_hcd xenblk cdrom xennet edd fan ide_pci_generic piix ide_core ata_generic ata_piix mptsas mptscsih mptbase scsi_transport_sas thermal processor thermal_sys hwmon
Nov 23 13:55:14 dom0-u2 kernel: [ 3112.781517] CPU 4:
Nov 23 13:55:14 dom0-u2 kernel: [ 3112.781517] Modules linked in: sha1_generic hmac cryptomgr aead pcompress crypto_blkcipher crypto_hash crypto_algapi drbd netbk blkbk blkback_pagemap blktap xenbus_be binfmt_misc xt_tcpudp ip6t_REJECT nf_conntrack_ipv6 ip6table_raw xt_NOTRACK ipt_REJECT xt_physdev xt_state iptable_raw iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables ip6table_filter ip6_tables x_tables ipv6 bridge stp llc dummy fuse loop dm_mod mptctl iTCO_wdt iTCO_vendor_support i5k_amb sg i5000_edac ppdev 8250_pnp pcspkr sr_mod edac_core parport_pc shpchp e1000e dcdbas 8250 pci_hotplug tg3 parport serio_raw serial_core button usbhid hid uhci_hcd ehci_hcd xenblk cdrom xennet edd fan ide_pci_generic piix ide_core ata_generic ata_piix mptsas mptscsih mptbase scsi_transport_sas thermal processor thermal_sys hwmon
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] RIP: e030:[<ffffffff8005f07f>]  [<ffffffff8005f07f>] lock_timer_base+0x7f/0x90
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] RSP: e02b:ffff8801e8d0bc10  EFLAGS: 00000246
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff80778370
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] RDX: 0000000000000007 RSI: ffff8801e8d0bc50 RDI: ffffc90000075280
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] RBP: ffff8801e8d0bc40 R08: ffffffff807813b0 R09: 0000000000000000
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] R10: ffff8801e8d0bcf0 R11: 00000000e15cfb6d R12: ffffc90000075280
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] R13: ffff8801e8d0bc50 R14: 0000000000000000 R15: ffffffff80778600
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] FS:  00007f53d0abf6f0(0000) GS:ffffc90000040000(0000) knlGS:0000000000000000
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] CR2: 00007f53d0691260 CR3: 0000000000003000 CR4: 0000000000002660
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855] Call Trace:
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855]  [<ffffffff8005f0bc>] try_to_del_timer_sync+0x2c/0x90
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855]  [<ffffffff8005f14a>] del_timer_sync+0x2a/0x50
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855]  [<ffffffff8046758f>] mce_cpu_callback+0x122/0x1aa
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855]  [<ffffffff80471de7>] notifier_call_chain+0x57/0xb0
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855]  [<ffffffff80075a1c>] __raw_notifier_call_chain+0x1c/0x40
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855]  [<ffffffff8045b90f>] _cpu_down+0xaf/0x310
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855]  [<ffffffff8045bbf7>] cpu_down+0x87/0xb0
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855]  [<ffffffff8046a42c>] vcpu_hotplug+0xce/0x102
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855]  [<ffffffff8046a4ab>] handle_vcpu_hotplug_event+0x4b/0x61
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855]  [<ffffffff80306c4c>] xenwatch_handle_callback+0x2c/0x80
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855]  [<ffffffff8006fb96>] kthread+0xb6/0xc0
Nov 23 13:54:09 dom0-u2 kernel: [ 3047.280855]  [<ffffffff8000d38a>] child_rip+0xa/0x20
Vladislav Karpenko
2009-Nov-24 15:36 UTC
Re: [Xen-users] OpenSuSE 11.2 bug, dom0-cpus limit causes xenwatch_cb running 100% and xm command freeze and xend dead
Yes, I have that too. You could try to fix it by specifying the vCPU count as a kernel boot option. Mine is:

dom0_mem=512M dom0_vcpus_pin dom0_max_vcpus=1

But for now I don't use SuSE 11.2; it is not stable with Xen 3.4.1.

On 23 Nov 2009, at 15:26, Fischer Udo Attila wrote:

> Hi all,
>
> I have upgraded a test machine from OpenSuSE 11.1 to 11.2 and found the following bug. The server is a 2x quad-core Intel box, i.e. 2x4 = 8 CPUs.
>
> If you limit the dom0 CPUs with dom0-cpus = [1-7]:
> - [xenwatch_cb] runs at 100% CPU and writes a log entry every 65 seconds:
>   BUG: soft lockup - CPU#X stuck for 61s!
> - xm commands do not work
> - xend is dead
>
> If dom0-cpus is set to 0 or 8, everything looks fine.
>
> Can somebody else confirm this bug?
> [...]
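For anyone trying this: dom0_mem, dom0_vcpus_pin and dom0_max_vcpus are hypervisor boot parameters, so they go on the xen.gz line of the bootloader entry rather than into xend-config.sxp (where dom0-cpus lives). A rough GRUB legacy sketch; the title, device, paths and kernel version below are only placeholders and will differ on your box:

    title Xen -- openSUSE 11.2
        root (hd0,0)
        kernel /boot/xen.gz dom0_mem=512M dom0_vcpus_pin dom0_max_vcpus=1
        module /boot/vmlinuz-2.6.31.5-0.1-xen root=/dev/sda2 showopts
        module /boot/initrd-2.6.31.5-0.1-xen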
Moi meme
2009-Nov-24 15:52 UTC
Re: [Xen-users] OpenSuSE 11.2 bug, dom0-cpus limit causes xenwatch_cb running 100% and xm command freeze and xend dead
Hello,

I got a problem while upgrading to OpenSuSE 11.2; cf.:

https://bugzilla.novell.com/show_bug.cgi?id=552492#status_changes

I got a kernel patch and all is OK now; the VMs are running flawlessly, even if my server is a much smaller one.

You didn't say how much RAM is in your system.

Regards
JPP

On Tuesday, 24 November 2009 at 17:36 +0200, Vladislav Karpenko wrote:

> Yes, I have that too. You could try to fix it by specifying the vCPU count as a kernel boot option. Mine is:
>
> dom0_mem=512M dom0_vcpus_pin dom0_max_vcpus=1
>
> But for now I don't use SuSE 11.2; it is not stable with Xen 3.4.1.
> [...]
Boris Derzhavets
2009-Nov-24 18:05 UTC
Re: [Xen-users] OpenSuSE 11.2 bug, dom0-cpus limit causes xenwatch_cb running 100% and xm command freeze and xend dead
> I got a kernel patch and all is OK now; the VMs are running flawlessly,
> even if my server is a much smaller one.

Could you try installing an F12 PV DomU (minimal set of packages)?

Boris.

--- On Tue, 11/24/09, Moi meme <storm66@club-internet.fr> wrote:

From: Moi meme <storm66@club-internet.fr>
Subject: Re: [Xen-users] OpenSuSE 11.2 bug, dom0-cpus limit causes xenwatch_cb running 100% and xm command freeze and xend dead
To: "Vladislav Karpenko" <vladislav@karpenko.od.ua>
Cc: xen-users@lists.xensource.com
Date: Tuesday, November 24, 2009, 10:52 AM

[...]
Fischer Udo Attila
2009-Nov-30 17:20 UTC
Re: [Xen-users] OpenSuSE 11.2 bug, dom0-cpus limit causes xenwatch_cb running 100% and xm command freeze and xend dead
Hi all,

I only now had time to test the settings. I tried both methods:

- Limiting to 4 GB via mem=4G: same result, Xen dead. Memory is limited to 4 GB, so it is not the same issue as the X server one.
- Second method, dom0_mem=512M dom0_vcpus_pin dom0_max_vcpus=1: does not limit the CPUs; still 8 CPUs used, tested :(

I have opened a bug at opensuse.org:

http://bugzilla.novell.com/show_bug.cgi?id=558663

Today they reassigned the bug, so I hope there will be some progress in the near future.

Best regards,
Udo Attila Fischer
Boris Derzhavets
2009-Nov-30 18:15 UTC
Re: [Xen-users] OpenSuSE 11.2 bug, dom0-cpus limit causes xenwatch_cb running 100% and xm command freeze and xend dead
My experience is a bit different: the original issue with the X server crash under the xenified kernel 2.6.31.5 was patched pretty quickly, and the Xen host on top of SuSE 11.2 seems to be alive. However,

https://bugzilla.novell.com/show_bug.cgi?id=553690
https://bugzilla.novell.com/show_bug.cgi?id=555181

are still open.

Boris.

________________________________
From: Fischer Udo Attila <udo@udo.hu>
Cc: xen-users@lists.xensource.com
Sent: Mon, November 30, 2009 8:20:03 PM
Subject: Re: [Xen-users] OpenSuSE 11.2 bug, dom0-cpus limit causes xenwatch_cb running 100% and xm command freeze and xend dead

[...]
Dana Rawding
2009-Nov-30 21:09 UTC
[Xen-users] Need advice regarding distro, Remus HA requirements

Hi All,

I need some advice. I have been running Xen under Ubuntu LTS 8.04 for about 18 months. The original install gave me the 2.6.24-19-xen kernel. There were a couple of updates and all my machines are currently running 2.6.24-23-xen. Since Ubuntu decided to go with KVM, I imagine there will be no further updates.

Recently I read that Remus is going to be added to the kernel in the future. HA is very desirable for me, as I have a number of HA web servers that I handle with load balancing in front of virtualized images spread over multiple servers. The addition of synchronized failover images would be huge! So I started looking around at other distros that are going to continue to support Xen. I downloaded the latest OpenSuSE and installed it on a spare HP DL380. I was able to get it up and running on 2.6.31.5-0.1-xen. After testing for a few days I decided to start upgrading my production machines.

To take a step back for a moment, my server farm consists mostly of HP DL380 G3's with 8 GB of RAM and dual 3.06 GHz Xeons. I also have a couple of HP DL580s with 32 GB RAM and quad Xeons. While this gear is not the latest and greatest, they are real workhorses and I have a lot of them. They run mostly mail, web, DNS, etc. No desktops or Windows, so PV is fine.

So I decided to upgrade one of the DL580s. The install seems to go OK, but upon restart it continuously hangs when bringing up the br1 interface. I installed the standard kernel and the interfaces come up fine. Furthermore, I added an additional gigabit Broadcom-based HP Ethernet card. I got it configured as eth2 and again it works fine with the non-Xen kernel. Even if I disable the onboard eth1 and assign br1 to eth2, it continues to hang on boot. Frustrated after two days of messing around, I installed SuSE on a brand new IBM x3500 M2 with dual Xeon 2.66 GHz X5550 4Cs. Again the install went fine. However, almost daily the box just completely hangs. I can't seem to find any log entries to figure out why this is happening. I can create a hang just by untarring a large file in either the DomU or the Dom0. Plenty of memory and free processors, so I don't know why it continues to die.

As you can guess, my current experience with SuSE has left a bad taste in my mouth on both old and new hardware. I was about to download Fedora and try again, but before spending the time I thought I'd ask the opinion of the list as to what they would do/recommend. My gut tells me to just bite the bullet, start compiling my own kernels and stay with Ubuntu. Ideas, comments, suggestions?

Thanks,
Dana
On Mon, Nov 30, 2009 at 9:09 PM, Dana Rawding <dana@twc-inc.net> wrote:
> Hi All,
>
> I need some advice. I have been running Xen under Ubuntu LTS 8.04 for about 18 months. [...]
> As you can guess, my current experience with SuSE has left a bad taste in my mouth on both old and new hardware. [...] Ideas, comments, suggestions?

Dana,

I have had problems with dom0 locking up or crashing when running an openSUSE dom0 kernel >= 2.6.30; 2.6.29 and earlier are very stable, but anything after that seems to suffer from this problem. From what I've read it doesn't seem to affect all systems, but all three of my Xen servers suffer from it, so I'm currently running 2.6.29 and regularly hitting 100+ days of uptime.

Can you set up a serial console on these servers? You can use it to gather any crash messages that are logged, and if dom0 does lock up completely you can dump its state using '0' while the serial console is connected to Xen.
I wouldn't let it put you off; IMHO openSUSE is the best choice if you want to run Xen with an up-to-date kernel. Jan Beulich is working on this problem (I exchanged several emails with him about it today, and other people have also provided diagnostic data), so hopefully it will be fixed soon. If I were you I would wait until this bug is fixed and try again; in my experience the openSUSE Xen kernels are usually very stable.

Fedora is switching to KVM, as has Ubuntu, and the Debian dom0 kernels are pretty old, so I think openSUSE really is the best-supported distro for Xen.

Andy
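For reference, setting up the serial console mostly means pointing the hypervisor at a serial port at boot time. A rough GRUB legacy sketch; the port, baud rate and paths below are only examples and may need adjusting for your hardware and kernel:

    # /boot/grub/menu.lst (illustrative only)
    kernel /boot/xen.gz com1=115200,8n1 console=com1,vga sync_console <your existing dom0_* options>
    module /boot/vmlinuz-2.6.31.5-0.1-xen <your existing dom0 kernel options>
    module /boot/initrd-2.6.31.5-0.1-xen
    # the dom0 kernel line may also need a console= argument, depending on the kernel

Once connected from another box (minicom, screen, etc.), pressing Ctrl-A three times switches the serial input to the hypervisor; 'h' lists the available debug keys and '0' dumps Dom0 state.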
I had a lot of such problems. I am not an expert user, but by reading threads on this list and asking questions I learnt a lot. I would suggest you use CentOS. I have tested two Dell laptops and three separate machines here with OpenSuSE, Debian, Ubuntu, Fedora and now CentOS. My experience with CentOS has been good for Xen, since with the others I got some sort of error on my machines one way or another; right now I am still compiling a Dom0 kernel along with Xen on the Debian machine. On my latest machine, in spite of the errors, CentOS was able to work.

On Tue, Dec 1, 2009 at 2:39 AM, Dana Rawding <dana@twc-inc.net> wrote:
> Hi All,
>
> I need some advice. I have been running Xen under Ubuntu LTS 8.04 for about 18 months. [...]

--
http://www.abhitech.com
On Mon, Nov 30, 2009 at 04:09:50PM -0500, Dana Rawding wrote:
> Hi All,
>
> I need some advice. I have been running Xen under Ubuntu LTS 8.04 for about 18 months. [...]
> As you can guess, my current experience with SuSE has left a bad taste in my mouth on both old and new hardware. [...] Ideas, comments, suggestions?

Remember it's not only the dom0 Linux kernel; you also need the actual Xen hypervisor. Compiling from source is always an option: Xen hypervisor 3.4.2 was released two weeks ago or so.

Xen 3.4.x releases still use linux-2.6.18-xen.hg as the default and officially supported dom0 kernel, so if 2.6.18 has all the drivers for your hardware, that's what you should use. There are other options as well; check this wiki page for the other available dom0 kernels:

http://wiki.xensource.com/xenwiki/XenDom0Kernels

Personally I've used Fedora on my testing boxes, and it has been OK. Fedora 12 includes Xen hypervisor/tools 3.4.1 (with some fixes/patches), and all the libvirt/virsh/virt-install/virt-manager stuff. But F12 lacks a dom0 kernel; it's not included out of the box. You can grab the xendom0 testing RPMs, or compile your own dom0 kernel.
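(For reference, the classic from-source build is roughly the following; this is only a sketch from memory, so check the README in the release tarball for the authoritative targets.)

    cd xen-3.4.2        # unpacked release tarball
    make world          # builds the hypervisor, the tools and the linux-2.6.18-xen dom0 kernel
    make install
    # then add the new xen.gz and dom0 kernel/initrd to the bootloader entry and reboot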
--
Pasi
On Mon, Nov 30, 2009 at 04:09:50PM -0500, Dana Rawding wrote:
> Recently I read that Remus is going to be added to the kernel in the future. HA is very desirable for me [...]. The addition of synchronized failover images would be huge!

Remus was committed to xen-unstable already, so it'll be part of the Xen 4.0 release in 2010.

> So I started looking around at other distros that are going to continue to support Xen. [...]

Red Hat's RHEL5 (and thus CentOS5) will support Xen until 2014, which is when RHEL5 goes EOL. So that's one option: it doesn't have the latest Xen versions, but it works pretty OK and is actively maintained. No idea if Remus will make it to RHEL5, though; probably not.

--
Pasi
On Tue, Dec 1, 2009 at 8:08 PM, Pasi Kärkkäinen <pasik@iki.fi> wrote:

> No idea if Remus will make it to RHEL5, though; probably not.

Does Remus change anything in the dom0/domU kernel?
On Tue, Dec 01, 2009 at 08:17:41PM +0700, Fajar A. Nugraha wrote:
> On Tue, Dec 1, 2009 at 8:08 PM, Pasi Kärkkäinen <pasik@iki.fi> wrote:
>
> > No idea if Remus will make it to RHEL5, though; probably not.
>
> Does Remus change anything in the dom0/domU kernel?

I'm not sure; you might want to read the xen-devel archives for the Remus-related discussions and patches.

--
Pasi
Fischer Udo Attila
2009-Dec-01 14:02 UTC
Re: [Xen-users] OpenSuSE 11.2 bug, dom0-cpus limit causes xenwatch_cb running 100% and xm command freeze and xend dead
Dear all,

FYI: http://bugzilla.novell.com/show_bug.cgi?id=558663

--- Comment #1 from Jan Beulich xxxxx@xxxxxxx 2009-12-01 11:44:56 UTC ---
This appears to also be a problem in native code (introduced in 2.6.30): if a CPU gets hot-plugged while check_interval is zero (modifiable to zero via /sys, defaulting to zero on Xen), mce_timer will never get set up, and a subsequent del_timer() can't lock the timer as its base is NULL. Hence I'll get a patch submitted upstream first.

Best regards,
Udo Attila Fischer
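To make the failure mode concrete, here is a small stand-alone C sketch of the pattern Jan describes. The names and structure are simplified stand-ins for illustration only, not the actual kernel code:

#include <stdio.h>
#include <stddef.h>

/* Simplified, hypothetical illustration of the reported bug: a per-CPU
 * timer is only armed when check_interval is non-zero, but the CPU
 * hot-unplug path tears it down unconditionally. */

struct fake_timer {
    void *base;                     /* set when the timer is initialised */
};

static int check_interval = 0;      /* defaults to 0 under Xen */
static struct fake_timer mce_timer; /* zero-initialised: base == NULL */

static void timer_setup_if_needed(struct fake_timer *t)
{
    if (check_interval == 0)
        return;                     /* timer is never set up */
    t->base = &check_interval;      /* stand-in for a real timer base */
}

/* The real lock_timer_base() loops until it sees a non-NULL base, so a
 * timer that was never initialised makes del_timer_sync() spin forever --
 * the soft lockup in the log above.  Here we only detect the condition. */
static void timer_teardown(struct fake_timer *t)
{
    if (t->base == NULL) {
        printf("timer was never set up -> the real code would spin here\n");
        return;
    }
    t->base = NULL;
}

int main(void)
{
    timer_setup_if_needed(&mce_timer);  /* skipped: check_interval == 0 */
    timer_teardown(&mce_timer);         /* hits the NULL base */
    return 0;
}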
On Tue, 2009-12-01 at 20:17 +0700, Fajar A. Nugraha wrote:
> On Tue, Dec 1, 2009 at 8:08 PM, Pasi Kärkkäinen <pasik@iki.fi> wrote:
>
> > No idea if Remus will make it to RHEL5, though; probably not.
>
> Does Remus change anything in the dom0/domU kernel?

Yes. It has some kernel patches for the suspend event channel. I believe it is required for both dom0 and domU, as I never got pygrub working properly while testing with Remus. I am also doubtful it will make it to RHEL5: even though Xen may have the patches, the code for the RHEL Xen kernels is missing quite a few things that the linux-2.6.18 code from xen.org has at this point.
Pasi Kärkkäinen
2009-Dec-01 15:25 UTC
Re: [Xen-users] Need advice regarding distro, Remus HA requirements
On Tue, Dec 01, 2009 at 09:55:01AM -0500, Tait Clarridge wrote:
> Yes. It has some kernel patches for the suspend event channel. I believe it is required for both dom0 and domU, as I never got pygrub working properly while testing with Remus. I am also doubtful it will make it to RHEL5 [...]

This email (from a Remus developer) sums it up:

http://lists.xensource.com/archives/html/xen-devel/2009-11/msg00727.html

Short answer: the http://xenbits.xen.org/linux-2.6.18-xen.hg dom0 kernel is required (at the moment) for Remus.

Kernel requirements for Remus:
- dom0: IMQ support/patch
- dom0: blktap2 support
- domU: support for suspend over a dedicated event channel. This is not an absolute requirement, but it makes Remus perform better/faster.

All of that currently exists only in linux-2.6.18-xen.hg. pv_ops dom0 (2.6.3x) will/should catch up in the near future.

--
Pasi
Pasi Kärkkäinen
2009-Dec-01 17:42 UTC
Re: [Xen-users] Need advice regarding distro, Remus HA requirements
On Tue, Dec 01, 2009 at 05:25:55PM +0200, Pasi Kärkkäinen wrote:
> Kernel requirements for Remus:
> - dom0: IMQ support/patch
> - dom0: blktap2 support
> - domU: support for suspend over a dedicated event channel. [...]
>
> All of that currently exists only in linux-2.6.18-xen.hg. pv_ops dom0 (2.6.3x) will/should catch up in the near future.

Replying to myself: it seems the current pv_ops dom0 git tree already has the IMQ support/patch in:

http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=9ffd27c570b50c00101c9c34e5f6a3850e206ada

So the pv_ops dom0 kernel is only missing blktap2 support for running Remus HA.

--
Pasi
Fischer Udo Attila
2009-Dec-02 10:42 UTC
Re: [Xen-users] OpenSuSE 11.2 bug, dom0-cpus limit causes xenwatch_cb running 100% and xm command freeze and xend dead
Hi all,

I have tested Vladislav's parameters again, and now they work for limiting the VCPUs of dom0:

dom0_mem=512M dom0_vcpus_pin dom0_max_vcpus=1

Thanks Vladislav!

Best regards,
Udo Attila Fischer
Vladislav Karpenko
2009-Dec-02 11:56 UTC
Re: [Xen-users] OpenSuSE 11.2 bug, dom0-cpus limit causes xenwatch_cb running 100% and xm command freeze and xend dead
You are welcome :) Glad to help.

On 2 Dec 2009, at 12:42, Fischer Udo Attila wrote:

> Hi all,
>
> I have tested Vladislav's parameters again, and now they work for limiting the VCPUs of dom0:
>
> dom0_mem=512M dom0_vcpus_pin dom0_max_vcpus=1
>
> Thanks Vladislav!
>
> Best regards,
> Udo Attila Fischer