Bryan D. Payne

2009-Sep-30 18:05 UTC

[Xen-devel] dom0 crash with unstable

I'm trying to install Xen-unstable on a new machine.  At first boot,
I'm getting a dom0 crash.  The crash output is shown below.  The
complete console log is attached.

(XEN) Std. Loglevel: All
(XEN) Guest Loglevel: All
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen)
(XEN) Freed 148kB init memory.
mapping kernel into physical memory
Xen: setup ISA identity maps
(XEN) traps.c:466:d0 Unhandled invalid opcode fault/trap [#6] on VCPU 0 [ec=000]
(XEN) domain_crash_sync called from entry.S
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
(XEN) ----[ Xen-3.5-unstable  x86_64  debug=y  Not tainted ]----
(XEN) CPU:    0
(XEN) RIP:    e033:[<ffffffff819067de>]
(XEN) RFLAGS: 0000000000000212   EM: 1   CONTEXT: pv guest
(XEN) rax: ffffffff865ce000   rbx: ffffffff81001000   rcx: 0000000000000006
(XEN) rdx: 0000000000800000   rsi: 00000000deadbeef   rdi: 00000000deadbeef
(XEN) rbp: ffffffff81807fa8   rsp: ffffffff81807f68   r8:  000000000000001d
(XEN) r9:  ffffffff81807fd8   r10: 0000000006606000   r11: 00000000818ec000
(XEN) r12: 0000000000000000   r13: 0000000000000000   r14: 0000000000000000
(XEN) r15: 0000000000000000   cr0: 000000008005003b   cr4: 00000000000026f0
(XEN) cr3: 0000000919001000   cr2: 0000000000000000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e02b   cs: e033
(XEN) Guest stack trace from rsp=ffffffff81807f68:
(XEN)    0000000000000006 00000000818ec000 ffffffff819067de 000000010000e030
(XEN)    0000000000010012 ffffffff81807fa8 000000000000e02b ffffffff81906d64
(XEN)    ffffffff81807ff8 ffffffff81906166 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000001 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000
(XEN) Domain 0 crashed: rebooting machine in 5 seconds.

I pulled from Xen unstable yesterday (bd376919f03a tip), and built it
simply using "make world".  The machine has four Intel E5540
processors and 36GB of RAM.  In the BIOS, I have VT-d disabled, and
VT-x enabled.  Any suggestions on what steps I should take for
debugging this problem and getting dom0 to boot?

Thanks,
bryan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Jeremy Fitzhardinge

2009-Sep-30 23:47 UTC

Re: [Xen-devel] dom0 crash with unstable

On 09/30/09 11:05, Bryan D. Payne wrote:
> I'm trying to install Xen-unstable on a new machine.  At first boot,
> I'm getting a dom0 crash.  The crash output is shown below.  The
> complete console log is attached.
>
> (XEN) Std. Loglevel: All
> (XEN) Guest Loglevel: All
> (XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen)
> (XEN) Freed 148kB init memory.
> mapping kernel into physical memory
> Xen: setup ISA identity maps
> (XEN) traps.c:466:d0 Unhandled invalid opcode fault/trap [#6] on VCPU 0 [ec=000]
> (XEN) domain_crash_sync called from entry.S
> (XEN) Domain 0 (vcpu#0) crashed on cpu#0:
> (XEN) ----[ Xen-3.5-unstable  x86_64  debug=y  Not tainted ]----
> (XEN) CPU:    0
> (XEN) RIP:    e033:[<ffffffff819067de>]
> (XEN) RFLAGS: 0000000000000212   EM: 1   CONTEXT: pv guest
> (XEN) rax: ffffffff865ce000   rbx: ffffffff81001000   rcx: 0000000000000006
> (XEN) rdx: 0000000000800000   rsi: 00000000deadbeef   rdi: 00000000deadbeef
> (XEN) rbp: ffffffff81807fa8   rsp: ffffffff81807f68   r8:  000000000000001d
> (XEN) r9:  ffffffff81807fd8   r10: 0000000006606000   r11: 00000000818ec000
> (XEN) r12: 0000000000000000   r13: 0000000000000000   r14: 0000000000000000
> (XEN) r15: 0000000000000000   cr0: 000000008005003b   cr4: 00000000000026f0
> (XEN) cr3: 0000000919001000   cr2: 0000000000000000
> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e02b   cs: e033
> (XEN) Guest stack trace from rsp=ffffffff81807f68:
> (XEN)    0000000000000006 00000000818ec000 ffffffff819067de 000000010000e030
> (XEN)    0000000000010012 ffffffff81807fa8 000000000000e02b ffffffff81906d64
> (XEN)    ffffffff81807ff8 ffffffff81906166 0000000000000000 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000001 0000000000000000
> (XEN)    0000000000000000 0000000000000000 0000000000000000
> (XEN) Domain 0 crashed: rebooting machine in 5 seconds.
>
> I pulled from Xen unstable yesterday (bd376919f03a tip), and built it
> simply using "make world".  The machine has four Intel E5540
> processors and 36GB of RAM.  In the BIOS, I have VT-d disabled, and
> VT-x enabled.  Any suggestions on what steps I should take for
> debugging this problem and getting dom0 to boot?
>   
It looks like something has hit a BUG_ON.  The first step is to try to
identify which one:

$ gdb vmlinux
(gdb) x/i 0xffffffff819067de
(gdb) x/i 0xffffffff81906d64
(gdb) x/i 0xffffffff81906166

should give a first clue.
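
The three lookups above can also be run non-interactively. The following is a minimal sketch in Python, assuming gdb is on the PATH and that "vmlinux" is the unstripped dom0 kernel image from the build that crashed. The helper name gdb_batch_command is made up for illustration. It only builds the command line, using gdb's -batch and -ex options, and does not execute it:

```python
import shlex


def gdb_batch_command(vmlinux, addresses):
    """Build a gdb argv that disassembles one instruction at each address.

    vmlinux   -- path to the unstripped kernel image (assumed to exist)
    addresses -- iterable of integer addresses taken from the crash dump
    """
    argv = ["gdb", "-batch"]
    for addr in addresses:
        # Same command as typing "x/i 0x..." at the (gdb) prompt.
        argv += ["-ex", f"x/i {addr:#x}"]
    argv.append(vmlinux)
    return argv


# RIP plus the two return-address-looking words from the guest stack trace.
addrs = [0xFFFFFFFF819067DE, 0xFFFFFFFF81906D64, 0xFFFFFFFF81906166]
print(shlex.join(gdb_batch_command("vmlinux", addrs)))
```

The printed line can be pasted into a shell, or the list can be handed to subprocess.run() if the output needs to be captured.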

Thanks,
    J

Bryan D. Payne

2009-Oct-01 13:35 UTC

Re: [Xen-devel] dom0 crash with unstable

> $ gdb vmlinux
> (gdb) x/i 0xffffffff819067de
> (gdb) x/i 0xffffffff81906d64
> (gdb) x/i 0xffffffff81906166
(gdb) x/i 0xffffffff819067de
0xffffffff819067de <xen_fix_mfn_list+38>:	ud2a
(gdb) x/i 0xffffffff81906d64
0xffffffff81906d64 <xen_ident_map_ISA+128>:	leaveq
(gdb) x/i 0xffffffff81906166
0xffffffff81906166 <xen_start_kernel+1343>:	
    mov    0xa028b(%rip),%rax        # 0xffffffff819a63f8 <xen_start_info>

Thanks,
-bryan

Jeremy Fitzhardinge

2009-Oct-01 21:01 UTC

Re: [Xen-devel] dom0 crash with unstable

On 10/01/09 06:35, Bryan D. Payne wrote:
> (gdb) x/i 0xffffffff819067de
> 0xffffffff819067de <xen_fix_mfn_list+38>:	ud2a
> (gdb) x/i 0xffffffff81906d64
> 0xffffffff81906d64 <xen_ident_map_ISA+128>:	leaveq
> (gdb) x/i 0xffffffff81906166
> 0xffffffff81906166 <xen_start_kernel+1343>:	
>     mov    0xa028b(%rip),%rax        # 0xffffffff819a63f8 <xen_start_info>
>   
It looks like it's encountering a pfn (page number) that's greater than
the total number of pages given to the domain.

Aaah, 36GB memory.  What happens if you configure
CONFIG_XEN_MAX_DOMAIN_MEMORY larger to match, or set dom0_mem to less
than 32GB?

(It shouldn't crash regardless, but it will confirm the diagnosis.)
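
The page arithmetic behind this guess can be checked quickly. The sketch below is in Python and rests on two assumptions that are not stated outright in the thread: the usual 4 KiB x86-64 base page size, and a 32 GB kernel-side limit, which is inferred only from the "less than 32GB" suggestion above. The helper name pages_for is made up for illustration:

```python
PAGE_SIZE = 4096          # bytes; assumed 4 KiB base pages
GIB = 1 << 30             # bytes per GiB


def pages_for(gib):
    """Number of PAGE_SIZE pages needed to cover `gib` GiB of memory."""
    return gib * GIB // PAGE_SIZE


machine_pages = pages_for(36)   # RAM reported for this machine
limit_pages = pages_for(32)     # assumed limit, inferred from the hint above

print("pages in 36GB:", machine_pages)
print("pages in 32GB:", limit_pages)
print("pfns beyond the assumed limit:", machine_pages - limit_pages)
```

If dom0 is handed close to all 36GB, roughly a million pfns would fall above what a 32GB table can describe, which is consistent with a range check tripping early in boot.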

    J

Bryan D. Payne

2009-Oct-06 15:13 UTC

Re: [Xen-devel] dom0 crash with unstable

> It looks like it's encountering a pfn (page number) that's greater than
> the total number of pages given to the domain.
>
> Aaah, 36GB memory.  What happens if you configure
> CONFIG_XEN_MAX_DOMAIN_MEMORY larger to match, or set dom0_mem to less
> than 32GB?
>
> (It shouldn't crash regardless, but it will confirm the diagnosis.)
Ok, so I tried setting dom0_mem to a variety of values less than 32GB.
I'm still getting a crash, but now it is more random.  Basically, I'm
watching the boot process via a serial line, and instead of seeing the
dom0 crash output that I posted before, I'm simply seeing the output
stop, and the boot process hanging, in a different spot with each
boot.  Removing the dom0_mem value brings me back to the behavior I
had before, where the dom0 crash output shows up reliably each time.

-bryan

Jeremy Fitzhardinge

2009-Oct-06 16:31 UTC

Re: [Xen-devel] dom0 crash with unstable

On 10/06/09 08:13, Bryan D. Payne wrote:
> Ok, so I tried setting dom0_mem to a variety of values less than 32GB.
> I'm still getting a crash, but now it is more random.  Basically, I'm
> watching the boot process via a serial line, and instead of seeing the
> dom0 crash output that I posted before, I'm simply seeing the output
> stop, and the boot process hanging, in a different spot with each
> boot.  Removing the dom0_mem value brings me back to the behavior I
> had before, where the dom0 crash output shows up reliably each time.
>   
That's mysterious.  My first thought is that this is a separate problem.

Is it ever stable?  What happens if you set dom0_mem to 4G or less?

Is the console responsive when the kernel hangs?  That is, can you type
"Ctrl-A Ctrl-A Ctrl-A" to get Xen, then enter debug keys?  '0' (zero)
should dump the context for dom0 and give some clue about where it is dying.

Thanks,
    J

Bryan D. Payne

2009-Oct-06 17:23 UTC

Re: [Xen-devel] dom0 crash with unstable

> That's mysterious.  My first thought is that this is a separate problem.
I agree... and thanks for your help in diagnosing this issue.
> Is it ever stable?  What happens if you set dom0_mem to 4G or less?
It still crashes, even with 2G of dom0 memory.
> Is the console responsive when the kernel hangs?  That is, can you type
> "Ctrl-A Ctrl-A Ctrl-A" to get Xen, then enter debug keys?  '0' (zero)
> should dump the context for dom0 and give some clue about where it is dying.
Yes, this seems to work.  Typing 0 dumped info for each of the cores
(i.e., 15 vcpu states).  I've attached the output.

-bryan


Jeremy Fitzhardinge

2009-Oct-06 17:54 UTC

Re: [Xen-devel] dom0 crash with unstable

On 10/06/09 10:23, Bryan D. Payne wrote:
>> Is it ever stable?  What happens if you set dom0_mem to 4G or less?
>>
> It still crashes, even with 2G of dom0 memory.
>
What console output do you get?
>> Is the console responsive when the kernel hangs?  That is, can you type
>> "Ctrl-A Ctrl-A Ctrl-A" to get Xen, then enter debug keys?  '0' (zero)
>> should dump the context for dom0 and give some clue about where it is dying.
>>
> Yes, this seems to work.  Typing 0 dumped info for each of the cores
> (i.e., 15 vcpu states).  I've attached the output.
>
Could you map the RIP values to a symbol with gdb?  From a quick look,
it seems that most or all of the cpus are at the same place.  No, vcpu 8
is doing something else at least.

What happens if you boot dom0 with fewer cpus?

Do you have CONFIG_PARAVIRT_SPINLOCKS enabled?

    J

Bryan D. Payne

2009-Oct-06 18:25 UTC

Re: [Xen-devel] dom0 crash with unstable

> What console output do you get?
It's the same as starting dom0 with 4G - 32G of memory: it just freezes
at a different place in the boot each time.
> Could you map the RIP values to a symbol with gdb?  From a quick look,
> it seems that most or all of the cpus are at the same place.  No, vcpu 8
> is doing something else at least.
(gdb) x/i 0xffffffff8100930a
0xffffffff8100930a <hypercall_page+778>:	add    %al,(%rax)
(gdb) x/i 0xffffffff811fea48
0xffffffff811fea48 <delay_tsc+62>:	
    cmpq   $0x0,0x632538(%rip)        # 0xffffffff81830f88 <pv_cpu_ops+264>
> What happens if you boot dom0 with fewer cpus?
I tried adding "maxcpus=1" to the linux kernel line in grub.  I used
this in conjunction with the "dom0_mem=2G" option for xen.  Dom0 still
crashes the same as without the maxcpus option.

Just for kicks, I tried a few other options...

* setting mem=4G with dom0_mem=2G still resulted in random dom0 crashes
* setting noapic resulted in a consistent crash within Xen:

(XEN) ----[ Xen-3.5-unstable  x86_64  debug=y  Not tainted ]----
(XEN) CPU:    0
(XEN) RIP:    e008:[<ffff82c48014fe29>] add_pin_to_irq+0x24/0xcc
(XEN) RFLAGS: 0000000000010296   CONTEXT: hypervisor
> Do you have CONFIG_PARAVIRT_SPINLOCKS enabled?
No.

-bryan

Jeremy Fitzhardinge

2009-Oct-06 18:56 UTC

Re: [Xen-devel] dom0 crash with unstable

On 10/06/09 11:25, Bryan D. Payne wrote:
> (gdb) x/i 0xffffffff8100930a
> 0xffffffff8100930a <hypercall_page+778>:	add    %al,(%rax)
>
778/32 = hypercall 24 = vcpuop.  Probably idling.
> (gdb) x/i 0xffffffff811fea48
> 0xffffffff811fea48 <delay_tsc+62>:	
>     cmpq   $0x0,0x632538(%rip)        # 0xffffffff81830f88 <pv_cpu_ops+264>
>
That's almost certainly a kernel panic of some kind.  Working out where
it came from will be rather tedious: you need to look through the stack
dump to find code-ish looking addresses then x/i them (they'll be the
same basic format as 0xffffffff8xxxxxxx).

(I really need to work out why they tend not to get printed.)
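
On the hypercall_page+778 reading earlier in this reply: the hypercall page is laid out as an array of 32-byte stubs, one per hypercall number, so integer division of the symbol offset by 32 gives the hypercall being executed. A small sketch of that arithmetic in Python follows. The function name and the one-entry name table are illustrative only, and the table holds just the mapping given in this message:

```python
STUB_BYTES = 32  # size of one hypercall stub within the hypercall page

# Only the entry mentioned in this thread; not a complete table.
KNOWN_HYPERCALLS = {24: "vcpu_op"}


def decode_hypercall_offset(offset):
    """Split an offset into hypercall_page into (hypercall number, byte within stub)."""
    return divmod(offset, STUB_BYTES)


number, within = decode_hypercall_offset(778)
name = KNOWN_HYPERCALLS.get(number, "unknown")
print(f"hypercall {number} ({name}), {within} bytes into its stub")
```

A vcpu sitting inside a vcpu_op stub is usually blocked or idle rather than stuck, which is why those vcpus are less interesting than the one in delay_tsc.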
>> What happens if you boot dom0 with fewer cpus?
>>     
> I tried adding "maxcpus=1" to the linux kernel line in grub.  I used
> this in conjunction with the "dom0_mem=2G" option for xen.  Dom0 still
> crashes the same as without the maxcpus option.
>
> Just for kicks, I tried a few other options...
>
> * setting mem=4G with dom0_mem=2G still resulted in random dom0 crashes
> * setting noapic resulted in a consistent crash within Xen:
>
> (XEN) ----[ Xen-3.5-unstable  x86_64  debug=y  Not tainted ]----
> (XEN) CPU:    0
> (XEN) RIP:    e008:[<ffff82c48014fe29>] add_pin_to_irq+0x24/0xcc
> (XEN) RFLAGS: 0000000000010296   CONTEXT: hypervisor
>   
That shouldn't happen.  Sounds like it might be fall-out from the recent
interrupt changes in Xen.  Did you supply "noapic" to Xen, dom0 or both?

    J

Bryan D. Payne

2009-Oct-06 19:02 UTC

Re: [Xen-devel] dom0 crash with unstable

> That's almost certainly a kernel panic of some kind.  Working out where
> it came from will be rather tedious: you need to look through the stack
> dump to find code-ish looking addresses then x/i them (they'll be the
> same basic format as 0xffffffff8xxxxxxx).
Nice ;-)  I'll let you know what I find out.
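
The address hunt described in the quoted paragraph can be partly automated. Below is a rough first-pass filter in Python, not a real unwinder. It assumes the dump uses the same "(XEN)" stack-trace layout as the crash posted at the start of this thread, and it keeps any 64-bit word of the form ffffffff8xxxxxxx. Kernel stack pointers match that pattern too, so some candidates will turn out not to be code once gdb looks at them. The function name candidate_addresses is made up for illustration:

```python
import re

# A 64-bit word that looks like a kernel virtual address (0xffffffff8xxxxxxx).
KERNEL_WORD = re.compile(r"ffffffff8[0-9a-f]{7}")

# Sample taken from the first crash dump in this thread.
SAMPLE = """\
(XEN) Guest stack trace from rsp=ffffffff81807f68:
(XEN)    0000000000000006 00000000818ec000 ffffffff819067de 000000010000e030
(XEN)    0000000000010012 ffffffff81807fa8 000000000000e02b ffffffff81906d64
(XEN)    ffffffff81807ff8 ffffffff81906166 0000000000000000 0000000000000000
"""


def candidate_addresses(dump):
    """Return unique kernel-looking stack words, in order of first appearance."""
    seen = []
    for line in dump.splitlines():
        for token in line.split():
            # fullmatch skips decorated tokens such as "rsp=...:" in the header.
            if KERNEL_WORD.fullmatch(token) and token not in seen:
                seen.append(token)
    return seen


for word in candidate_addresses(SAMPLE):
    print(f"x/i 0x{word}")
```

Each printed line is a command for the (gdb) prompt. Words that resolve to a function name plus offset are likely return addresses, and the rest can be ignored.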
>> (XEN) ----[ Xen-3.5-unstable  x86_64  debug=y  Not tainted ]----
>> (XEN) CPU:    0
>> (XEN) RIP:    e008:[<ffff82c48014fe29>] add_pin_to_irq+0x24/0xcc
>> (XEN) RFLAGS: 0000000000010296   CONTEXT: hypervisor
>
> That shouldn't happen.  Sounds like it might be fall-out from the recent
> interrupt changes in Xen.  Did you supply "noapic" to Xen, dom0 or both?
Just Xen.  Should it go to both?

-bryan

Jeremy Fitzhardinge

2009-Oct-06 19:37 UTC

Re: [Xen-devel] dom0 crash with unstable

On 10/06/09 12:02, Bryan D. Payne wrote:
> Just Xen.  Should it go to both?
>
Yeah, they need to agree what interrupt model they're using.  Still
shouldn't crash, regardless.

    J

Jan Beulich

2009-Oct-07 11:47 UTC

Re: [Xen-devel] dom0 crash with unstable

>>> Jeremy Fitzhardinge <jeremy@goop.org> 06.10.09 21:37 >>>
>On 10/06/09 12:02, Bryan D. Payne wrote:
>> Just Xen.  Should it go to both?
>>
>
>Yeah, they need to agree what interrupt model they're using.  Still
>shouldn't crash, regardless.
Not really - Xen automatically passes noapic to the kernel when it had
been passed that option (see the dom0_cmdline handling in
__start_xen()).

Jan


Jeremy Fitzhardinge

2009-Oct-07 19:09 UTC

Re: [Xen-devel] dom0 crash with unstable

On 10/07/09 04:47, Jan Beulich wrote:
> Not really - Xen automatically passes noapic to the kernel when it had
> been passed that option (see the dom0_cmdline handling in
> __start_xen()).
>

Ah, OK.  I hadn't realized that.

    J
