Hello,

with xen 3.4.0-rc2-pre and linux-2.6-xen (push2/xen/dom0/master tree) I am experiencing an instability which leads to crashing domUs. At the moment I'm not sure what causes the processes to crash. One of the two domUs crashes every 24h; some kind of cron-initiated remote access seems to bring this domU down. The other domU crashes after 24-48 hours of uptime. My first idea was that this has something to do with RAM consumption, because I noticed that the crash never happens as long as the swap space is unused.

The hardware consists of an Intel Core 2 Quad (2.8GHz) with 8GB RAM, which is allotted as follows:

  dom0:   2 vcpus, 512MB
  domU 1: 4 vcpus, 2GB
  domU 2: 4 vcpus, 2GB

If a domU crash happens, all I can do is destroy the guest, because it is inaccessible in this state.

My questions are: is there another kernel tree I can try? What can I do to figure out what the problem is or to fix this?

This is a part of what the console spits out if a domU crashes:

  Pid: 27918, comm: sh Tainted: G D (2.6.29-tip #1)
  EIP: 0061:[<c01013a7>] EFLAGS: 00000202 CPU: 2
  EIP is at _stext+0x3a7/0x1001
  EAX: 00000000 EBX: 00000003 ECX: e190dee4 EDX: 00020010
  ESI: 0000001d EDI: e200c3aa EBP: 00000000 ESP: e190dee0
   DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
  CR0: 8005003b CR2: b7f39890 CR3: 0f4c1000 CR4: 00002660
  DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
  DR6: ffff0ff0 DR7: 00000400
  Call Trace:
   [<c0216835>] ? xen_poll_irq+0x4a/0x59
   [<c0105c91>] ? xen_spin_lock_slow+0x7f/0xd5
   [<c0105d57>] ? xen_spin_lock+0x2f/0x58
   [<c02c0197>] ? _spin_lock+0x5/0x7
   [<c0163b95>] ? vma_link+0x22/0x72
   [<c01642a4>] ? mmap_region+0x22f/0x338
   [<c010a47d>] ? sys_mmap2+0x59/0x77
   [<c010720d>] ? syscall_call+0x7/0xb
  BUG: soft lockup - CPU#3 stuck for 61s! [nagios:27916]
  Modules linked in: ipv6 authenc xfrm4_mode_transport xfrm4_mode_tunnel
   xfrm_user xfrm4_tunnel tunnel4 ipcomp xfrm_ipcomp esp4 ah4 deflate
   zlib_deflate zlib_inflate twofish twofish_common serpent aes_i586
   aes_generic blowfish des_generic cbc sha256_generic sha1_generic
   crypto_null af_key pcspkr rtc_core rtc_lib

  Pid: 27916, comm: nagios Tainted: G D (2.6.29-tip #1)
  EIP: 0061:[<c01013a7>] EFLAGS: 00000202 CPU: 3
  EIP is at _stext+0x3a7/0x1001
  EAX: 00000000 EBX: 00000003 ECX: cf555e94 EDX: 00030016
  ESI: 00000023 EDI: e200c3aa EBP: 00000000 ESP: cf555e90
   DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
  CR0: 8005003b CR2: b7c4ee74 CR3: 00380000 CR4: 00002660
  DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
  DR6: ffff0ff0 DR7: 00000400
  Call Trace:
   [<c0216835>] ? xen_poll_irq+0x4a/0x59
   [<c0105c91>] ? xen_spin_lock_slow+0x7f/0xd5
   [<c0105d57>] ? xen_spin_lock+0x2f/0x58
   [<c02c0197>] ? _spin_lock+0x5/0x7
   [<c0163411>] ? unlink_file_vma+0x1d/0x36
   [<c0166b1c>] ? anon_vma_unlink+0x38/0x55
   [<c01617c1>] ? free_pgtables+0x4e/0x76
   [<c01631f3>] ? exit_mmap+0xb8/0x108
   [<c0126dc5>] ? mmput+0x20/0x82
   [<c012a025>] ? exit_mm+0xd7/0xdf
   [<c014777a>] ? acct_collect+0x150/0x155
   [<c012b26c>] ? do_exit+0x14e/0x64e
   [<c012b7d0>] ? do_group_exit+0x64/0x8b
   [<c012b804>] ? sys_exit_group+0xd/0x10
   [<c010720d>] ? syscall_call+0x7/0xb

Greetings
jon

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
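When several soft-lockup dumps pile up on the console, it can help to check mechanically which frames they have in common. The following is only a minimal sketch, not a tool from the Xen or Linux trees: the helper names (split_traces, common_frames) and the SAMPLE string are made up for illustration, and SAMPLE contains just the two call traces pasted above. It assumes the log uses the "[<address>] ? symbol+0xoff/0xlen" frame format shown in this report.

```python
import re

# One call-trace frame, e.g. "[<c0216835>] ? xen_poll_irq+0x4a/0x59".
# The "?" marker is optional; the symbol name is captured in group 2.
FRAME_RE = re.compile(
    r"\[<([0-9a-f]+)>\]\s+(?:\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

# The two call traces from the report above (register dumps omitted).
SAMPLE = """
Call Trace:
 [<c0216835>] ? xen_poll_irq+0x4a/0x59
 [<c0105c91>] ? xen_spin_lock_slow+0x7f/0xd5
 [<c0105d57>] ? xen_spin_lock+0x2f/0x58
 [<c02c0197>] ? _spin_lock+0x5/0x7
 [<c0163b95>] ? vma_link+0x22/0x72
 [<c01642a4>] ? mmap_region+0x22f/0x338
 [<c010a47d>] ? sys_mmap2+0x59/0x77
 [<c010720d>] ? syscall_call+0x7/0xb
BUG: soft lockup - CPU#3 stuck for 61s! [nagios:27916]
Call Trace:
 [<c0216835>] ? xen_poll_irq+0x4a/0x59
 [<c0105c91>] ? xen_spin_lock_slow+0x7f/0xd5
 [<c0105d57>] ? xen_spin_lock+0x2f/0x58
 [<c02c0197>] ? _spin_lock+0x5/0x7
 [<c0163411>] ? unlink_file_vma+0x1d/0x36
 [<c0166b1c>] ? anon_vma_unlink+0x38/0x55
 [<c01617c1>] ? free_pgtables+0x4e/0x76
 [<c01631f3>] ? exit_mmap+0xb8/0x108
 [<c0126dc5>] ? mmput+0x20/0x82
 [<c012a025>] ? exit_mm+0xd7/0xdf
 [<c014777a>] ? acct_collect+0x150/0x155
 [<c012b26c>] ? do_exit+0x14e/0x64e
 [<c012b7d0>] ? do_group_exit+0x64/0x8b
 [<c012b804>] ? sys_exit_group+0xd/0x10
 [<c010720d>] ? syscall_call+0x7/0xb
"""


def split_traces(log_text):
    """Return one list of symbol names per 'Call Trace:' section."""
    sections = log_text.split("Call Trace:")[1:]
    return [[m.group(2) for m in FRAME_RE.finditer(s)] for s in sections]


def common_frames(traces):
    """Symbols present in every trace, in the order of the first trace."""
    if not traces:
        return []
    shared = set(traces[0]).intersection(*(set(t) for t in traces[1:]))
    return [sym for sym in traces[0] if sym in shared]


if __name__ == "__main__":
    traces = split_traces(SAMPLE)
    print("traces found:", len(traces))
    print("frames common to all traces:", common_frames(traces))
```

For the two traces in this report, the shared frames are the Xen spinlock path (xen_poll_irq, xen_spin_lock_slow, xen_spin_lock, _spin_lock) plus the syscall entry, i.e. both CPUs are waiting in the paravirtualised spinlock slow path, reached from different memory-management callers.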
On 21/04/2009 12:26, "jon.hart" <jon.hart@web.de> wrote:

> If a domU crash happens all I can do is to destroy this guest because it
> is inaccessible in this state.
>
> My questions are: is there another kernel tree I can try? What can I do
> to figure out what the problem is or to fix this?

You could try http://xenbits.xensource.com/linux-2.6.18-xen.hg in a domU and see if that stabilises that particular domU. The bug looks possibly like a spinlock deadlock within the domU kernel, or possibly a bug in the paravirtualised spinlock implementation.

 -- Keir
Looks likely the same problem I reported yesterday:

http://lists.xensource.com/archives/html/xen-devel/2009-04/msg00742.html

However, my stack dump did NOT report xen_poll_irq or xen_spin_lock at the top of the dump... I would have remembered or commented on that. Unfortunately I didn't save the stack dump.

From googling a bit, I suspect this might be a Linux problem, perhaps exacerbated by Xen or 2.6.29-pv.

Dan

> -----Original Message-----
> From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
> Sent: Tuesday, April 21, 2009 5:51 AM
> To: jon.hart; xen-devel@lists.xensource.com
> Subject: Re: [Xen-devel] experienced unstability
>
> On 21/04/2009 12:26, "jon.hart" <jon.hart@web.de> wrote:
>
> > If a domU crash happens all I can do is to destroy this guest because it
> > is inaccessible in this state.
> >
> > My questions are: is there another kernel tree I can try? What can I do
> > to figure out what the problem is or to fix this?
>
> You could try http://xenbits.xensource.com/linux-2.6.18-xen.hg in a domU
> and see if that stabilises that particular domU. The bug looks possibly
> like a spinlock deadlock within the domU kernel, or possibly a bug in the
> paravirtualised spinlock implementation.
>
>  -- Keir