thr3ads.net - Xen devel - lastest xen unstable crash [Apr 2012]

Francisco Rocha

2012-Apr-05 17:37 UTC

lastest xen unstable crash

Hi everyone,

I was trying to build a new machine but the system keeps rebooting.
I used the latest unstable version from xen-unstable.hg.

I have tried with Fedora 16 (kernel 3.3.0-8) and Xubuntu 11.10
(3.0.0.17-generic).

The output to my serial console is attached.

Cheers,
Francisco

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Andrew Cooper

2012-Apr-05 17:44 UTC

Re: lastest xen unstable crash

On 05/04/12 18:37, Francisco Rocha wrote:
> Hi everyone,
>
> I was trying to build a new machine but the system keeps rebooting.
> I used the latest unstable version from xen-unstable.hg.
>
> I have tried with Fedora 16 (kernel 3.3.0-8) and Xubuntu 11.10
> (3.0.0.17-generic).
>
> The output to my serial console is attached.
>
> Cheers,
> Francisco

What is your Linux command line? Does it include "console=hvc0"?
Perhaps some early_printk settings are required.

-- 
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com

Jan Beulich

2012-Apr-10 11:08 UTC

Re: lastest xen unstable crash

>>> On 05.04.12 at 19:37, Francisco Rocha <f.e.liberal-rocha@newcastle.ac.uk> wrote:
> I was trying to build a new machine but the system keeps rebooting.
> I used the latest unstable version from xen-unstable.hg.
> 
> I have tried with Fedora 16 (kernel 3.3.0-8) and Xubuntu 11.10 
> (3.0.0.17-generic).
> 
> The output to my serial console is attached.
So as already said by someone else, this is a fault on an XSETBV
instruction. In the kernel this immediately follows the setting of
CR4.OSXSAVE, yet in Xen's emulation code the only way to get
#UD here is for that (virtual) CR4 bit to be clear; all other failure
paths result in #GP.
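
The fault behaviour described above can be modelled roughly as follows. This is a minimal Python sketch of the architectural XSETBV checks, not Xen's emulation code: the function name `xsetbv_fault` and its parameters are made up for illustration, and the privilege-level check is left out on the assumption that the instruction has already reached the emulator.

```python
X86_CR4_OSXSAVE = 1 << 18  # CR4 bit that enables XSETBV/XGETBV

XCR0_X87 = 1 << 0  # x87 state, must always be enabled in XCR0
XCR0_SSE = 1 << 1  # SSE state
XCR0_AVX = 1 << 2  # AVX state, only valid together with SSE


def xsetbv_fault(cr4, index, value, supported_mask):
    """Return '#UD', '#GP', or None for a hypothetical XSETBV attempt.

    cr4            -- the (virtual) CR4 value seen by the emulator
    index          -- XCR index from ECX (only 0 is modelled here)
    value          -- new XCR0 value from EDX:EAX
    supported_mask -- state components the platform offers (assumption)
    """
    if not cr4 & X86_CR4_OSXSAVE:
        return "#UD"  # the only path that yields #UD
    if index != 0:
        return "#GP"
    if value & ~supported_mask:
        return "#GP"  # unsupported or reserved bits requested
    if not value & XCR0_X87:
        return "#GP"  # x87 state cannot be disabled
    if value & XCR0_AVX and not value & XCR0_SSE:
        return "#GP"  # AVX without SSE is invalid
    return None


# With OSXSAVE clear, an otherwise valid request still faults with #UD.
print(xsetbv_fault(cr4=0, index=0, value=0x7, supported_mask=0x7))  # #UD
```

Every branch after the first returns #GP, which is why a #UD points at the CR4 bit rather than at the value being written.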

The emulation code handling the setting of this CR4 bit, however,
would issue a warning if the kernel was attempting to set a bit
that the hypervisor doesn't allow to be set, yet no such warning
is present in the log you provided (and you're already running at
the highest logging level).

In any case, a fundamental question is whether your CPU has
XSAVE support in the first place, and whether kernel and
hypervisor disagree about that for some reason. Could you
for that purpose post /proc/cpuinfo contents from when running
a native kernel?

Beyond that, adding some tracing to the hypervisor may be
necessary to monitor the Dom0 CR4 writes and maybe how
XSAVE support gets initialized in Xen. Would you be able to do
so on your own, and post the results?

Jan

Jan Beulich

2012-Apr-10 11:20 UTC

Re: lastest xen unstable crash

>>> On 10.04.12 at 13:08, "Jan Beulich" <JBeulich@suse.com> wrote:
> In any case, a fundamental question is whether your CPU has
> XSAVE support in the first place, and whether kernel and
> hypervisor disagree about that for some reason. Could you
> for that purpose post /proc/cpuinfo contents from when running
> a native kernel?
Just realized that this question is answered by the log you provided:

(XEN) xstate_init: using cntxt_size: 0x340 and states: 0x7

so indeed the fastest approach (short of someone seeing something
obviously wrong with the code) appears to be to add some tracing to
the CR4 handling (pv_guest_cr4_fixup() and the XSETBV handling in
emulate_privileged_op()), particularly also because the register dump
indicates that the relevant bit was not set in CR4 at the point where
the XSETBV faulted.
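
For readers unfamiliar with the xstate_init line, the two numbers can be decoded as below. This is a small illustrative sketch, not part of Xen: the helper name `parse_xstate_init` is hypothetical, and the bit names follow the architectural XCR0 layout (bit 0 x87, bit 1 SSE, bit 2 AVX).

```python
import re

# Architectural XCR0 state-component bits covered by this sketch.
XSTATE_NAMES = {0: "x87", 1: "SSE", 2: "AVX"}


def parse_xstate_init(line):
    """Extract the context size (bytes) and enabled state names from the log line."""
    m = re.search(
        r"cntxt_size: (0x[0-9a-fA-F]+) and states: (0x[0-9a-fA-F]+)", line
    )
    if m is None:
        raise ValueError("not an xstate_init line")
    size = int(m.group(1), 16)
    mask = int(m.group(2), 16)
    names = [name for bit, name in sorted(XSTATE_NAMES.items()) if mask & (1 << bit)]
    return size, names


line = "(XEN) xstate_init: using cntxt_size: 0x340 and states: 0x7"
print(parse_xstate_init(line))  # (832, ['x87', 'SSE', 'AVX'])
```

The 832-byte size matches a 512-byte legacy area, a 64-byte XSAVE header and 256 bytes of AVX state, so the hypervisor has detected XSAVE with AVX on this CPU.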

Jan

Francisco Rocha

2012-Apr-10 12:23 UTC

Re: lastest xen unstable crash

________________________________________
From: Jan Beulich [JBeulich@suse.com]
Sent: 10 April 2012 12:20
To: Francisco Rocha
Cc: xen-devel@lists.xen.org
Subject: Re: [Xen-devel] lastest xen unstable crash
> >>> On 10.04.12 at 13:08, "Jan Beulich" <JBeulich@suse.com> wrote:
> > In any case, a fundamental question is whether your CPU has
> > XSAVE support in the first place, and whether kernel and
> > hypervisor disagree about that for some reason. Could you
> > for that purpose post /proc/cpuinfo contents from when running
> > a native kernel?
> Just realized that this question is answered by the log you provided:
>
> (XEN) xstate_init: using cntxt_size: 0x340 and states: 0x7
>
> so indeed the fastest approach (short of someone seeing something
> obviously wrong with the code) appears to be to add some tracing to
> the CR4 handling (pv_guest_cr4_fixup() and the XSETBV handling in
> emulate_privileged_op()), particularly also because the register dump
> indicates that the relevant bit was not set in CR4 at the point where
> the XSETBV faulted.
>
> Jan

I have added some prints in the functions you mentioned. Is this what you need?
These are the new lines in the dmesg; the attached file contains the rest.

(XEN) domain.c:691:d0 @pv_guest_cr4_fixup-start: id=0 hv_cr4: 00002660 -> guest_cr4:00002660
(XEN) domain.c:707:d0 @pv_guest_cr4_fixup-end: id=0 hv_cr4: 00002660 guest_cr4: 00002660 return: 00002660
(XEN) domain.c:691:d0 @pv_guest_cr4_fixup-start: id=0 hv_cr4: 00002660 -> guest_cr4:00002660
(XEN) domain.c:707:d0 @pv_guest_cr4_fixup-end: id=0 hv_cr4: 00002660 guest_cr4: 00002660 return: 00002660
(XEN) domain.c:691:d0 @pv_guest_cr4_fixup-start: id=0 hv_cr4: 00002660 -> guest_cr4:00002660
(XEN) domain.c:707:d0 @pv_guest_cr4_fixup-end: id=0 hv_cr4: 00002660 guest_cr4: 00002660 return: 00002660
(XEN) traps.c:2243:d0 @XSETBV: new_xfeature: 0000000000000007
(XEN) traps.c:2246:d0 @XSETBV: (v->arch.pv_vcpu.ctrlreg[4] & X86_CR4_OSXSAVE): 0000000000000000

Here is the /proc/cpuinfo running on a native kernel:

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 42
model name	: Intel(R) Core(TM) i7-2620M CPU @ 2.70GHz
stepping	: 7
microcode	: 0x25
cpu MHz		: 800.000
cache size	: 4096 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 2
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36
clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm
constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc
aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr
pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm
ida arat epb xsaveopt pln pts dts tpr_shadow vnmi flexpriority ept vpid
bogomips	: 5382.77
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

and /proc/cpuinfo with dom0 running with xsave=0:

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 42
model name	: Intel(R) Core(TM) i7-2620M CPU @ 2.70GHz
stepping	: 7
microcode	: 0x23
cpu MHz		: 800.000
cache size	: 4096 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 1
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu de tsc msr pae cx8 apic sep cmov pat clflush acpi mmx fxsr sse sse2
ss ht syscall nx lm constant_tsc rep_good nopl nonstop_tsc aperfmperf pni
pclmulqdq est ssse3 cx16 sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes
hypervisor lahf_lm ida arat epb pln pts dts
bogomips	: 5382.58
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:
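
The two flag lists above are easier to compare as sets. The following is a minimal sketch for doing so; the function name `flag_diff` is made up for illustration, and the two strings are the flags lines copied from the listings above.

```python
NATIVE_FLAGS = (
    "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 "
    "clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm "
    "constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc "
    "aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr "
    "pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm "
    "ida arat epb xsaveopt pln pts dts tpr_shadow vnmi flexpriority ept vpid"
)

DOM0_FLAGS = (
    "fpu de tsc msr pae cx8 apic sep cmov pat clflush acpi mmx fxsr sse sse2 "
    "ss ht syscall nx lm constant_tsc rep_good nopl nonstop_tsc aperfmperf pni "
    "pclmulqdq est ssse3 cx16 sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes "
    "hypervisor lahf_lm ida arat epb pln pts dts"
)


def flag_diff(native_flags, dom0_flags):
    """Return (flags hidden from dom0, flags only visible in dom0), both sorted."""
    native = set(native_flags.split())
    dom0 = set(dom0_flags.split())
    return sorted(native - dom0), sorted(dom0 - native)


hidden, added = flag_diff(NATIVE_FLAGS, DOM0_FLAGS)
# Show only the XSAVE-related flags that disappear under Xen with xsave=0.
print([f for f in hidden if f in ("xsave", "xsaveopt", "avx")])
# ['avx', 'xsave', 'xsaveopt']
print(added)  # ['hypervisor']
```

So the native kernel sees xsave, xsaveopt and avx, while dom0 booted with xsave=0 sees none of them.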

Cheers,
Francisco

Jan Beulich

2012-Apr-10 14:32 UTC

Re: lastest xen unstable crash

>>> On 10.04.12 at 14:23, Francisco Rocha <f.e.liberal-rocha@newcastle.ac.uk> wrote:
> I have added some prints in the functions you mentioned. Is this what you 
> need? 
Yes.
> These are the new lines in the dmesg, the attached file contains the rest.
> 
> (XEN) domain.c:691:d0 @pv_guest_cr4_fixup-start: id=0 hv_cr4: 00002660 -> guest_cr4:00002660
> (XEN) domain.c:707:d0 @pv_guest_cr4_fixup-end: id=0 hv_cr4: 00002660 guest_cr4: 00002660 return: 00002660
> (XEN) domain.c:691:d0 @pv_guest_cr4_fixup-start: id=0 hv_cr4: 00002660 -> guest_cr4:00002660
> (XEN) domain.c:707:d0 @pv_guest_cr4_fixup-end: id=0 hv_cr4: 00002660 guest_cr4: 00002660 return: 00002660
> (XEN) domain.c:691:d0 @pv_guest_cr4_fixup-start: id=0 hv_cr4: 00002660 -> guest_cr4:00002660
> (XEN) domain.c:707:d0 @pv_guest_cr4_fixup-end: id=0 hv_cr4: 00002660 guest_cr4: 00002660 return: 00002660
> (XEN) traps.c:2243:d0 @XSETBV: new_xfeature: 0000000000000007
> (XEN) traps.c:2246:d0 @XSETBV: (v->arch.pv_vcpu.ctrlreg[4] & X86_CR4_OSXSAVE): 0000000000000000
So as far as Xen is concerned, there's not even an attempt from the
Dom0 kernel to set bit 18. That's rather odd given that the only
instance of XSETBV should sit right after the CR4 write. You
may want to verify that this is the case in the kernel binary, and if
so you may need to also add tracing at the kernel side (e.g. in
set_in_cr4()).
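
To make the "bit 18" observation concrete, the traced CR4 value can be decoded bit by bit. This is an illustrative sketch only: the helper `decode_cr4` is hypothetical, and the bit names are taken from the architectural CR4 layout rather than from Xen's headers.

```python
# Architectural CR4 bit positions (subset relevant around 2012).
CR4_BITS = {
    0: "VME", 1: "PVI", 2: "TSD", 3: "DE", 4: "PSE", 5: "PAE", 6: "MCE",
    7: "PGE", 8: "PCE", 9: "OSFXSR", 10: "OSXMMEXCPT", 13: "VMXE",
    14: "SMXE", 16: "FSGSBASE", 17: "PCIDE", 18: "OSXSAVE", 20: "SMEP",
}


def decode_cr4(value):
    """Return the names of the CR4 bits set in value, lowest bit first."""
    return [name for bit, name in sorted(CR4_BITS.items()) if value & (1 << bit)]


# Value reported by the pv_guest_cr4_fixup tracing in the log.
print(decode_cr4(0x00002660))
# ['PAE', 'MCE', 'OSFXSR', 'OSXMMEXCPT', 'VMXE']
```

OSXSAVE would show up as 0x00040000, which never appears in any of the traced CR4 values.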

Jan
