I'm seeing this identical problem on both of my machines. One is a Dell
2650 with 4G of memory. The other is a Dell 2850 with 8G of memory
(and a PAE kernel). Both are running Xen 2.0.1 with Linux
2.6.12.6. Each machine has 2 cpus (each of which supports two
hyperthreads, so it appears to have 4 cpus).

I see this on all doms. I can mitigate it by assigning one vcpu to
each dom (including dom0) and putting no more than 4 doms on each
machine.
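
For what it's worth, here is roughly what the one-vcpu workaround looks
like in a domU config file. This is only a sketch: the domain name and
memory size are placeholders, and I'm assuming the vcpus option as it
appears in the xm example configs, so check it against the examples
shipped with your Xen version.

```python
# Sketch of a Xen domU config file (these files use Python syntax).
# The name and memory values are made-up placeholders.
name = "example-vps"
memory = 256          # MB, placeholder value

# Workaround described above: give the domain a single virtual cpu.
# Assumption: the option is spelled `vcpus`, as in the xm example configs.
vcpus = 1

# dom0 is not configured here. The dom0-cpus setting mentioned in the
# quoted message limits how many cpus dom0 uses and lives in xend's
# own configuration.
```
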
-Jeff

On Sun, May 21, 2006 at 11:56:02PM -0400, Jonathon Jones wrote:
> I am trying to troubleshoot an odd problem I am having and my main question
> is this: How can I force dom0 to use cpu #1 not cpu #0? So far I have only
> seen that setting the flag dom0-cpus allows you to control how many cpus are
> used, but not which ones specifically. Please point me in the right
> direction (file and settings).
>
> Also, here is the symptom of my problem in case any of you can help with
> this. 1 week ago I started having a vps randomly about every day or day and
> a half stop responding to apache requests as well as some other services but
> ssh would still run fine. I would reboot the vps using ssh and all was good
> for another day or two. Tonight this happened in the dom0 and both the dom0
> and that vps were unresponsive at that point to any services. In the domU I
> was able to load the hosting control panel but when I tried to login it too
> stopped responding.
>
> Here are samples of the log entries that happen when this problem shows up.
>
> From the dom0 failure:
>
> May 21 21:57:21 balder007 kernel: ------------[ cut here ]------------
> May 21 21:57:21 balder007 kernel: kernel BUG at <bad filename>:63723!
> May 21 21:57:21 balder007 kernel: invalid operand: 0000 [#9]
> May 21 21:57:21 balder007 kernel: SMP
> May 21 21:57:21 balder007 kernel: Modules linked in: ipt_physdev
> iptable_filter ip_tables bridge ipv6 autofs4 sunrpc af_packet binfmt_misc
> quota_v2 dm_mirror dm_mod video thermal processor fan button battery ac md
> i2c_nforce2 i2c_core shpchp pci_hotplug tg3 sata_sil unix ext3 jbd sata_via
> sd_mod sata_nv libata scsi_mod
> May 21 21:57:21 balder007 kernel: CPU: 1
> May 21 21:57:21 balder007 kernel: EIP: 0061:[<c01182b6>] Tainted: GF VLI
> May 21 21:57:21 balder007 kernel: EFLAGS: 00010282 (2.6.12.6-xen)
> May 21 21:57:21 balder007 kernel: EIP is at pgd_ctor+0x26/0x30
> May 21 21:57:21 balder007 kernel: eax: fffffff4 ebx: 00000001 ecx:
> f577e040 edx: 00000000
> May 21 21:57:21 balder007 kernel: esi: c066de00 edi: ec6e6818 ebp:
> ec6e6800 esp: d77eddc4
> May 21 21:57:21 balder007 kernel: ds: 007b es: 007b ss: 0069
> May 21 21:57:21 balder007 kernel: Process anacron (pid: 6349,
> threadinfo=d77ec000 task=ed712020)
> May 21 21:57:21 balder007 kernel: Stack: d7008000 00000000 00000020 c014dcc1 d7008000 c066de00 00000001 ec6e6800
> May 21 21:57:21 balder007 kernel: d7008000 c066de00 00000000 c014de3d c066de00 ec6e6800 00000001 000000d0
> May 21 21:57:21 balder007 kernel: c066de60 00000001 000000d0 c066b600 0000000c 000000d0 00000000 c014e04b
> May 21 21:57:21 balder007 kernel: Call Trace:
> May 21 21:57:21 balder007 kernel: [<c014dcc1>] cache_init_objs+0x71/0x80
> May 21 21:57:21 balder007 kernel: [<c014de3d>] cache_grow+0x10d/0x1a0
> May 21 21:57:21 balder007 kernel: [<c014e04b>] cache_alloc_refill+0x17b/0x220
> May 21 21:57:22 balder007 kernel: [<c014e30f>] kmem_cache_alloc+0x7f/0x90
> May 21 21:57:22 balder007 kernel: [<c011833d>] pgd_alloc+0x1d/0x310
> May 21 21:57:22 balder007 kernel: [<c011c840>] activate_task+0x90/0xb0
> May 21 21:57:22 balder007 kernel: [<c01216fe>] mm_init+0xce/0x100
> May 21 21:57:22 balder007 kernel: [<c0121766>] mm_alloc+0x36/0x50
> May 21 21:57:22 balder007 kernel: [<c017410c>] do_execve+0x7c/0x270
> May 21 21:57:22 balder007 kernel: [<c0109066>] sys_execve+0x46/0xa0
> May 21 21:57:22 balder007 kernel: [<c010a65d>] syscall_call+0x7/0xb
> May 21 21:57:22 balder007 kernel: Code: 00 f3 ab 5f c3 83 ec 0c b8 20 00 00 00 89 44 24 08 31 c0 89 44 24 04 8b 44 24 10 89 04 24 e8 d2 2b 00 00 85 c0 75 04 83 c4 0c c3 <0f> 0b eb f8 8d b6 00 00 00 00 83 ec 08 b8 f8 1e 36 c0 89 5c 24
>
> From the domU failure:
>
> May 21 22:32:57 secure kernel: ------------[ cut here ]------------
> May 21 22:32:57 secure kernel: kernel BUG at <bad filename>:63723!
> May 21 22:32:57 secure kernel: invalid operand: 0000 [#1]
> May 21 22:32:57 secure kernel: SMP
> May 21 22:32:57 secure kernel: Modules linked in: ipv6 ipt_TOS
> iptable_mangle ip_conntrack_ftp ip_conntrack_irc ipt_REJECT ipt_LOG
> ipt_limit iptable_filter ipt_multiport ipt_state ip_conntrack ip_tables
> i2c_dev i2c_core af_packet binfmt_misc quota_v2 ext3 jbd dm_mod unix
> May 21 22:32:57 secure kernel: CPU: 0
> May 21 22:32:57 secure kernel: EIP: 0061:[<c01182b6>] Tainted: GF VLI
> May 21 22:32:57 secure kernel: EFLAGS: 00210282 (2.6.12.6-xen)
> May 21 22:32:57 secure kernel: EIP is at pgd_ctor+0x26/0x30
> May 21 22:32:57 secure kernel: eax: fffffff4 ebx: 00000001 ecx: f577e000 edx: 00000000
> May 21 22:32:57 secure kernel: esi: c0626e00 edi: eb01f6d8 ebp: eb01f6c0 esp: e9489dc4
> May 21 22:32:57 secure kernel: ds: 007b es: 007b ss: 0069
> May 21 22:32:57 secure kernel: Process httpd (pid: 32157,
> threadinfo=e9488000 task=ea19c060)
> May 21 22:32:57 secure kernel: Stack: e9b2d000 00000000 00000020 c014dcc1
> e9b2d000 c0626e00 00000001 eb01f6c0
> May 21 22:32:57 secure kernel: e9b2d000 c0626e00 00000000 c014de3d
> c0626e00 eb01f6c0 00000001 000000d0
> May 21 22:32:57 secure kernel: c0626e60 00000001 000000d0 c0624980
> 0000000c 000000d0 00000000 c014e04b
> May 21 22:32:57 secure kernel: Call Trace:
> May 21 22:32:57 secure kernel: [<c014dcc1>] cache_init_objs+0x71/0x80
> May 21 22:32:57 secure kernel: [<c014de3d>] cache_grow+0x10d/0x1a0
> May 21 22:32:57 secure kernel: [<c014e04b>] cache_alloc_refill+0x17b/0x220
> May 21 22:32:57 secure kernel: [<c014e30f>] kmem_cache_alloc+0x7f/0x90
> May 21 22:32:57 secure kernel: [<c011833d>] pgd_alloc+0x1d/0x310
> May 21 22:32:57 secure kernel: [<c01809c3>] dput+0x33/0x1d0
> May 21 22:32:57 secure kernel: [<c01216fe>] mm_init+0xce/0x100
> May 21 22:32:57 secure kernel: [<c0121766>] mm_alloc+0x36/0x50
> May 21 22:32:57 secure kernel: [<c017410c>] do_execve+0x7c/0x270
> May 21 22:32:57 secure kernel: [<c0109066>] sys_execve+0x46/0xa0
> May 21 22:32:57 secure kernel: [<c010a65d>] syscall_call+0x7/0xb
> May 21 22:32:57 secure kernel: Code: 00 f3 ab 5f c3 83 ec 0c b8 20 00 00 00 89 44 24 08 31 c0 89 44 24 04 8b 44 24 10 89 04 24 e8 d2 2b 00 00 85 c0 75 04 83 c4 0c c3 <0f> 0b eb f8 8d b6 00 00 00 00 83 ec 08 b8 f8 1e 36 c0 89 5c 24
>
>
> Any ideas at all? I feel over my head here....
>
> Thanks,
>
> Jon
> _______________________________________________
> Xen-users mailing list
> Xen-users@lists.xensource.com
> http://lists.xensource.com/xen-users
--
=======================================================================
Jeffrey I. Schiller
MIT Network Manager
Information Services and Technology
Massachusetts Institute of Technology
77 Massachusetts Avenue Room W92-190
Cambridge, MA 02139-4307
617.253.0161 - Voice
jis@mit.edu
=======================================================================