Pasi Kärkkäinen
2009-Sep-21 19:57 UTC
[Fedora-xen] Fedora 11 PV Xen domU (2.6.29.4-167.fc11.x86_64) crash with invalid opcode in xsave_cntxt_init (xsetbv)
Hello Jeremy and others,

A friend of mine is trying to install x86_64 Fedora 11 Xen PV domU on CentOS 5.4 x86_64 dom0. He's getting a kernel crash, so I was wondering if you've seen this one before.

A couple of kernel boot/crash logs:
http://v6.fi/misc/f11_64_kernel.debug2.txt
http://v6.fi/misc/f11_64_kernel.debug4.txt

(early) Kernel command line: root=/dex/xvda1 ro earlyprintk=xen
(early) Initializing CPU#0
(early) invalid opcode: 0000 [#1] (early) SMP (early)
(early) last sysfs file:
(early) CPU 0 (early)
(early) Modules linked in:(early)
(early) Pid: 0, comm: swapper Not tainted 2.6.29.4-167.fc11.x86_64 #1
(early) RIP: e030:[<ffffffff81397963>] (early) [<ffffffff81397963>] xsave_cntxt_init+0xba/0x199

[root@dom0 RH]# gdb vmlinux
(gdb) x/i 0xffffffff81397963
0xffffffff81397963 <xsave_cntxt_init+186>: xsetbv
(gdb) x/60i xsave_cntxt_init
0xffffffff813978a9 <xsave_cntxt_init>: push %rbp
0xffffffff813978aa <xsave_cntxt_init+1>: mov %rsp,%rbp
0xffffffff813978ad <xsave_cntxt_init+4>: push %r14
0xffffffff813978af <xsave_cntxt_init+6>: push %r13
0xffffffff813978b1 <xsave_cntxt_init+8>: push %r12
0xffffffff813978b3 <xsave_cntxt_init+10>: push %rbx
0xffffffff813978b4 <xsave_cntxt_init+11>: sub $0x10,%rsp
0xffffffff813978b8 <xsave_cntxt_init+15>: callq 0xffffffff81011000 <mcount>
0xffffffff813978bd <xsave_cntxt_init+20>: lea -0x24(%rbp),%rbx
0xffffffff813978c1 <xsave_cntxt_init+24>: lea -0x28(%rbp),%r12
0xffffffff813978c5 <xsave_cntxt_init+28>: lea -0x2c(%rbp),%r13
0xffffffff813978c9 <xsave_cntxt_init+32>: lea -0x30(%rbp),%r14
0xffffffff813978cd <xsave_cntxt_init+36>: movl $0xd,-0x24(%rbp)
0xffffffff813978d4 <xsave_cntxt_init+43>: movl $0x0,-0x2c(%rbp)
0xffffffff813978db <xsave_cntxt_init+50>: mov %rbx,%rdi
0xffffffff813978de <xsave_cntxt_init+53>: mov %r12,%rsi
0xffffffff813978e1 <xsave_cntxt_init+56>: mov %r13,%rdx
0xffffffff813978e4 <xsave_cntxt_init+59>: mov %r14,%rcx
0xffffffff813978e7 <xsave_cntxt_init+62>: callq *0x1f24b3(%rip) # 0xffffffff81589da0 <pv_cpu_ops+224>
0xffffffff813978ed <xsave_cntxt_init+68>: mov -0x30(%rbp),%eax
0xffffffff813978f0 <xsave_cntxt_init+71>: mov -0x24(%rbp),%esi
0xffffffff813978f3 <xsave_cntxt_init+74>: shl $0x20,%rax
0xffffffff813978f7 <xsave_cntxt_init+78>: lea (%rax,%rsi,1),%rsi
0xffffffff813978fb <xsave_cntxt_init+82>: mov %rsi,%rax
0xffffffff813978fe <xsave_cntxt_init+85>: mov %rsi,0x4357ab(%rip) # 0xffffffff817cd0b0 <pcntxt_mask>
0xffffffff81397905 <xsave_cntxt_init+92>: and $0x3,%eax
0xffffffff81397908 <xsave_cntxt_init+95>: cmp $0x3,%rax
0xffffffff8139790c <xsave_cntxt_init+99>: je 0xffffffff81397920 <xsave_cntxt_init+119>
0xffffffff8139790e <xsave_cntxt_init+101>: mov $0xffffffff814ca823,%rdi
0xffffffff81397915 <xsave_cntxt_init+108>: xor %eax,%eax
0xffffffff81397917 <xsave_cntxt_init+110>: callq 0xffffffff813a95be <printk>
0xffffffff8139791c <xsave_cntxt_init+115>: ud2a
0xffffffff8139791e <xsave_cntxt_init+117>: jmp 0xffffffff8139791e <xsave_cntxt_init+117>
0xffffffff81397920 <xsave_cntxt_init+119>: testb $0x4,0x26a700(%rip) # 0xffffffff81602027 <boot_cpu_data+39>
0xffffffff81397927 <xsave_cntxt_init+126>: movq $0x3,0x43577e(%rip) # 0xffffffff817cd0b0 <pcntxt_mask>
0xffffffff81397932 <xsave_cntxt_init+137>: je 0xffffffff81397966 <xsave_cntxt_init+189>
0xffffffff81397934 <xsave_cntxt_init+139>: orq $0x40000,0x423721(%rip) # 0xffffffff817bb060 <mmu_cr4_features>
0xffffffff8139793f <xsave_cntxt_init+150>: callq *0x1f23ab(%rip) # 0xffffffff81589cf0 <pv_cpu_ops+48>
0xffffffff81397945 <xsave_cntxt_init+156>: mov %eax,%edi
0xffffffff81397947 <xsave_cntxt_init+158>: or $0x40000,%edi
0xffffffff8139794d <xsave_cntxt_init+164>: callq *0x1f23a5(%rip) # 0xffffffff81589cf8 <pv_cpu_ops+56>
0xffffffff81397953 <xsave_cntxt_init+170>: mov 0x435756(%rip),%rax # 0xffffffff817cd0b0 <pcntxt_mask>
0xffffffff8139795a <xsave_cntxt_init+177>: xor %ecx,%ecx
0xffffffff8139795c <xsave_cntxt_init+179>: mov %rax,%rdx
0xffffffff8139795f <xsave_cntxt_init+182>: shr $0x20,%rdx
0xffffffff81397963 <xsave_cntxt_init+186>: xsetbv
0xffffffff81397966 <xsave_cntxt_init+189>: movl $0xd,-0x24(%rbp)
0xffffffff8139796d <xsave_cntxt_init+196>: movl $0x0,-0x2c(%rbp)
0xffffffff81397974 <xsave_cntxt_init+203>: mov %rbx,%rdi
0xffffffff81397977 <xsave_cntxt_init+206>: mov %r12,%rsi
0xffffffff8139797a <xsave_cntxt_init+209>: mov %r13,%rdx
0xffffffff8139797d <xsave_cntxt_init+212>: mov %r14,%rcx
0xffffffff81397980 <xsave_cntxt_init+215>: callq *0x1f241a(%rip) # 0xffffffff81589da0 <pv_cpu_ops+224>
0xffffffff81397986 <xsave_cntxt_init+221>: mov $0xffffffff817cd0c0,%rsi
0xffffffff8139798d <xsave_cntxt_init+228>: xor %eax,%eax
0xffffffff8139798f <xsave_cntxt_init+230>: mov $0xc,%ecx
0xffffffff81397994 <xsave_cntxt_init+235>: mov %rsi,%rdi
0xffffffff81397997 <xsave_cntxt_init+238>: mov -0x28(%rbp),%ebx
0xffffffff8139799a <xsave_cntxt_init+241>: rep stos %eax,%es:(%rdi)
0xffffffff8139799c <xsave_cntxt_init+243>: mov 0x43570d(%rip),%rax # 0xffffffff817cd0b0 <pcntxt_mask>
(gdb)

-- Pasi
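The offset in the oops (+0xba) and the offset gdb prints for xsetbv (+186) are the same number in hex and decimal, which a few lines of Python can confirm. This is only a sketch for cross-checking: the two addresses are copied from the log and the gdb session in this message, and the helper name `symbolize` is made up for the illustration.

```python
# Addresses copied from the oops and the gdb session in this message.
FUNC_START = 0xffffffff813978a9   # start of xsave_cntxt_init, from "x/60i"
FAULT_RIP = 0xffffffff81397963    # RIP reported in the oops

# The oops prints the offset in hex (+0xba); gdb prints it in decimal (+186).
offset = FAULT_RIP - FUNC_START


def symbolize(addr, start=FUNC_START, name="xsave_cntxt_init"):
    """Format an address the way gdb labels it: <name+decimal_offset>."""
    return f"<{name}+{addr - start}>"


print(hex(offset), offset)      # 0xba 186
print(symbolize(FAULT_RIP))     # <xsave_cntxt_init+186>
```

So the faulting instruction is the xsetbv itself, not one of the instructions around it.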
Jeremy Fitzhardinge
2009-Sep-21 20:29 UTC
[Fedora-xen] Re: Fedora 11 PV Xen domU (2.6.29.4-167.fc11.x86_64) crash with invalid opcode in xsave_cntxt_init (xsetbv)
On 09/21/09 12:57, Pasi Kärkkäinen wrote:
> Hello Jeremy and others,
>
> A friend of mine is trying to install x86_64 Fedora 11 Xen PV domU on
> CentOS 5.4 x86_64 dom0. He's getting a kernel crash, so I was wondering
> if you've seen this one before.
>
> A couple of kernel boot/crash logs:
> http://v6.fi/misc/f11_64_kernel.debug2.txt
> http://v6.fi/misc/f11_64_kernel.debug4.txt
>
> (early) Kernel command line: root=/dex/xvda1 ro earlyprintk=xen
> (early) Initializing CPU#0
> (early) invalid opcode: 0000 [#1] (early) SMP (early)
> (early) last sysfs file:
> (early) CPU 0 (early)
> (early) Modules linked in:(early)
> (early) Pid: 0, comm: swapper Not tainted 2.6.29.4-167.fc11.x86_64 #1
> (early) RIP: e030:[<ffffffff81397963>] (early) [<ffffffff81397963>] xsave_cntxt_init+0xba/0x199
>

This looks like the bug where Xen doesn't filter out the xsave feature flag and the kernel tries to use it, fixed in change ef7616ff29ad. Kernel change e826fe1ba15 works around the Xen bug by doing its own masking (I think it got into 2.6.30).

There's also a "noxsave" kernel parameter, but that may have been added after 2.6.29.

J
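To make the masking idea concrete, here is a rough Python illustration. It is not the code from either change mentioned above; it only shows what "filtering out the feature flag" means for a CPUID leaf 1 ECX value. The bit position (ECX bit 26 for XSAVE) is the architectural one; the function names and the sample value are invented for the example.

```python
XSAVE_BIT = 26  # CPUID leaf 1, ECX bit 26 advertises XSAVE


def advertises_xsave(leaf1_ecx):
    """True if a raw CPUID.1:ECX value has the XSAVE bit set."""
    return bool(leaf1_ecx & (1 << XSAVE_BIT))


def mask_xsave(leaf1_ecx):
    """Return the ECX value with the XSAVE bit cleared, as a 32-bit number."""
    return leaf1_ecx & ~(1 << XSAVE_BIT) & 0xFFFFFFFF


# Hypothetical ECX value: XSAVE (bit 26) plus one unrelated low bit.
raw = (1 << XSAVE_BIT) | 0x1
filtered = mask_xsave(raw)
print(advertises_xsave(raw), advertises_xsave(filtered))
```

A guest that only ever sees the filtered value never takes the code path that ends in xsetbv.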
Pasi Kärkkäinen
2009-Sep-21 20:44 UTC
[Fedora-xen] Re: Fedora 11 PV Xen domU (2.6.29.4-167.fc11.x86_64) crash with invalid opcode in xsave_cntxt_init (xsetbv)
On Mon, Sep 21, 2009 at 01:29:21PM -0700, Jeremy Fitzhardinge wrote:
> On 09/21/09 12:57, Pasi Kärkkäinen wrote:
> > Hello Jeremy and others,
> >
> > A friend of mine is trying to install x86_64 Fedora 11 Xen PV domU on
> > CentOS 5.4 x86_64 dom0. He's getting a kernel crash, so I was wondering
> > if you've seen this one before.
> >
> > A couple of kernel boot/crash logs:
> > http://v6.fi/misc/f11_64_kernel.debug2.txt
> > http://v6.fi/misc/f11_64_kernel.debug4.txt
> >
> > (early) Kernel command line: root=/dex/xvda1 ro earlyprintk=xen
> > (early) Initializing CPU#0
> > (early) invalid opcode: 0000 [#1] (early) SMP (early)
> > (early) last sysfs file:
> > (early) CPU 0 (early)
> > (early) Modules linked in:(early)
> > (early) Pid: 0, comm: swapper Not tainted 2.6.29.4-167.fc11.x86_64 #1
> > (early) RIP: e030:[<ffffffff81397963>] (early) [<ffffffff81397963>] xsave_cntxt_init+0xba/0x199
> >
>
> This looks like the bug where Xen doesn't filter out the xsave feature
> flag and the kernel tries to use it, fixed in change ef7616ff29ad.
> Kernel change e826fe1ba15 works around the Xen bug by doing its own
> masking (I think it got into 2.6.30).
>
> There's also a "noxsave" kernel parameter, but that may have been added
> after 2.6.29.
>

Thanks for the fast reply. I'll ask him to try "noxsave".

Other than that there's not much to do about it: we can't really update the F11 GA installer kernel (and I don't think doing his own custom re-spin of F11 is a good solution for him).

-- Pasi
Jeremy Fitzhardinge
2009-Sep-21 21:10 UTC
[Fedora-xen] Re: [Xen-devel] Re: Fedora 11 PV Xen domU (2.6.29.4-167.fc11.x86_64) crash with invalid opcode in xsave_cntxt_init (xsetbv)
On 09/21/09 13:44, Pasi Kärkkäinen wrote:
> Thanks for the fast reply. I'll ask him to try "noxsave".
>
> Other than that there's not much to do about it: we can't really update the F11 GA
> installer kernel (and I don't think doing his own custom re-spin of F11 is
> a good solution for him).
>

Oh, it's a domU install? You can mask it in the domain config with the oh-so-intuitive cpuid=[] setting (leaf 1, ecx bit 26). I think that's:

cpuid=['1:ecx=xxxxx0xxxxxxxxxxxxxxxxxxxxxxxxxx,
        eax=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' ]

(I'm assuming it works for PV as well as HVM domains.)

J
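Counting x characters by hand is error-prone, so the mask string can be generated instead. The sketch below assumes the format used in the example above: 32 characters per register, most significant bit first, with 'x' meaning "leave as is" and '0' meaning "force to zero". The helper name `cpuid_mask` is made up, and whether the hypervisor in question honours the setting for PV guests is not something this code can tell you.

```python
def cpuid_mask(clear_bits=(), width=32):
    """Build a register mask string, most significant bit first.

    'x' keeps whatever the hardware reports; '0' forces the bit to zero.
    """
    chars = ["x"] * width
    for bit in clear_bits:
        if not 0 <= bit < width:
            raise ValueError(f"bit {bit} out of range for {width}-bit register")
        chars[width - 1 - bit] = "0"   # bit 31 is the leftmost character
    return "".join(chars)


ecx = cpuid_mask(clear_bits=[26])      # clear XSAVE (leaf 1, ecx bit 26)
eax = cpuid_mask()                     # leave eax untouched
entry = f"1:ecx={ecx},eax={eax}"
print(entry)
```

The generated ecx string has its single '0' in the sixth position from the left, matching the hand-written one.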
Pasi Kärkkäinen
2009-Sep-21 21:15 UTC
Re: [Fedora-xen] Re: Fedora 11 PV Xen domU (2.6.29.4-167.fc11.x86_64) crash with invalid opcode in xsave_cntxt_init (xsetbv)
On Mon, Sep 21, 2009 at 11:44:44PM +0300, Pasi Kärkkäinen wrote:
> On Mon, Sep 21, 2009 at 01:29:21PM -0700, Jeremy Fitzhardinge wrote:
> > On 09/21/09 12:57, Pasi Kärkkäinen wrote:
> > > Hello Jeremy and others,
> > >
> > > A friend of mine is trying to install x86_64 Fedora 11 Xen PV domU on
> > > CentOS 5.4 x86_64 dom0. He's getting a kernel crash, so I was wondering
> > > if you've seen this one before.
> > >
> > > A couple of kernel boot/crash logs:
> > > http://v6.fi/misc/f11_64_kernel.debug2.txt
> > > http://v6.fi/misc/f11_64_kernel.debug4.txt
> > >
> > > (early) Kernel command line: root=/dex/xvda1 ro earlyprintk=xen
> > > (early) Initializing CPU#0
> > > (early) invalid opcode: 0000 [#1] (early) SMP (early)
> > > (early) last sysfs file:
> > > (early) CPU 0 (early)
> > > (early) Modules linked in:(early)
> > > (early) Pid: 0, comm: swapper Not tainted 2.6.29.4-167.fc11.x86_64 #1
> > > (early) RIP: e030:[<ffffffff81397963>] (early) [<ffffffff81397963>] xsave_cntxt_init+0xba/0x199
> > >
> >
> > This looks like the bug where Xen doesn't filter out the xsave feature
> > flag and the kernel tries to use it, fixed in change ef7616ff29ad.
> > Kernel change e826fe1ba15 works around the Xen bug by doing its own
> > masking (I think it got into 2.6.30).
> >
> > There's also a "noxsave" kernel parameter, but that may have been added
> > after 2.6.29.
> >
>
> Thanks for the fast reply. I'll ask him to try "noxsave".
>

It seems the "noxsave" kernel parameter was added in Linux 2.6.30, so that doesn't help for 2.6.29.

-- Pasi
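The version comparison behind that conclusion can be sketched in Python. The 2.6.30 threshold is taken from this message; the function names are made up, and the parser only looks at the first three numeric components of a release string such as the one in the oops.

```python
import re

NOXSAVE_SINCE = (2, 6, 30)   # per this thread: "noxsave" appeared in 2.6.30


def kernel_version(release):
    """Extract (major, minor, patch) from a release string like
    '2.6.29.4-167.fc11.x86_64'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def supports_noxsave(release):
    """True if the kernel is new enough to know the noxsave parameter."""
    return kernel_version(release) >= NOXSAVE_SINCE


print(supports_noxsave("2.6.29.4-167.fc11.x86_64"))   # False
```

Tuples compare element by element, so (2, 6, 29) sorts below (2, 6, 30) without any string tricks.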
Pasi Kärkkäinen
2009-Sep-21 21:34 UTC
[Fedora-xen] Re: [Xen-devel] Re: Fedora 11 PV Xen domU (2.6.29.4-167.fc11.x86_64) crash with invalid opcode in xsave_cntxt_init (xsetbv)
On Mon, Sep 21, 2009 at 02:10:51PM -0700, Jeremy Fitzhardinge wrote:
> On 09/21/09 13:44, Pasi Kärkkäinen wrote:
> > Thanks for the fast reply. I'll ask him to try "noxsave".
> >
> > Other than that there's not much to do about it: we can't really update the F11 GA
> > installer kernel (and I don't think doing his own custom re-spin of F11 is
> > a good solution for him).
> >
>
> Oh, it's a domU install? You can mask it in the domain config with the
> oh-so-intuitive cpuid=[] setting (leaf 1, ecx bit 26). I think that's:
>
> cpuid=['1:ecx=xxxxx0xxxxxxxxxxxxxxxxxxxxxxxxxx,
>         eax=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' ]
>
> (I'm assuming it works for PV as well as HVM domains.)
>

Hmm, yeah. Except I think the hypervisor in RHEL5/CentOS5 is too old to support cpuid masking.

Thanks for the tip anyway :)

-- Pasi
Jeremy Fitzhardinge
2009-Sep-21 21:42 UTC
[Fedora-xen] Re: [Xen-devel] Re: Fedora 11 PV Xen domU (2.6.29.4-167.fc11.x86_64) crash with invalid opcode in xsave_cntxt_init (xsetbv)
On 09/21/09 14:34, Pasi Kärkkäinen wrote:
> Hmm, yeah. Except I think the hypervisor in RHEL5/CentOS5 is too old to
> support cpuid masking.
>
> Thanks for the tip anyway :)

BIOS options to disable the feature? Install on another machine?

J
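When picking another machine, one rough way to see whether its CPU advertises the feature is to look for the "xsave" flag in /proc/cpuinfo. The sketch below only parses text in that format; it assumes the running kernel is new enough to name the flag, and the sample input in it is invented for illustration, not taken from the machine in this thread.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of flag names from the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def advertises_xsave(cpuinfo_text):
    """True if the cpuinfo text lists the xsave feature flag."""
    return "xsave" in cpu_flags(cpuinfo_text)


# Hypothetical, heavily shortened cpuinfo excerpts.
sample_with = "processor\t: 0\nflags\t\t: fpu vme sse2 xsave lahf_lm\n"
sample_without = "processor\t: 0\nflags\t\t: fpu vme sse2 lahf_lm\n"
print(advertises_xsave(sample_with), advertises_xsave(sample_without))
```

On a real host the input would come from reading /proc/cpuinfo. A machine where the check comes back false should not hit the xsetbv path during the install.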
Pasi Kärkkäinen
2009-Sep-22 12:43 UTC
[Fedora-xen] Re: [Xen-devel] Re: Fedora 11 PV Xen domU (2.6.29.4-167.fc11.x86_64) crash with invalid opcode in xsave_cntxt_init (xsetbv)
On Mon, Sep 21, 2009 at 02:42:15PM -0700, Jeremy Fitzhardinge wrote:
> On 09/21/09 14:34, Pasi Kärkkäinen wrote:
> > Hmm, yeah. Except I think the hypervisor in RHEL5/CentOS5 is too old to
> > support cpuid masking.
> >
> > Thanks for the tip anyway :)
>
> BIOS options to disable the feature? Install on another machine?
>

Didn't find anything matching in the BIOS. Installing and updating on another machine is an option, since the F11 updates do contain a 2.6.30 kernel, and that contains the bugfix.

Thanks!

-- Pasi