Rafal Wojtczuk
2010-Nov-22 13:02 UTC
[Xen-devel] Bug: Xen panics with #UD in acpi_processor_idle()
Hello,

There seems to be an issue with the handling of the
MSR_IA32_MISC_ENABLE_MONITOR_ENABLE bit in MSR_IA32_MISC_ENABLE. The effect
is that Xen panics with #UD after trying to execute the "monitor"
instruction. Most likely Xen does not bother to set this bit, leaving
it to the BIOS. If I understand correctly, this is merely a reliability
issue, not triggerable from any domain.

The problem can be reproduced with Xen-4.0.1 or the tip of xen-4.0-testing.hg,
with dom0 being 2.6.32.23-170.xendom0.fc12.x86_64 taken from
http://fedorapeople.org/~myoung/dom0. The processor is an i5-650 on Intel Q57.
The panic happens just after starting dom0.

I have verified (by adding rdmsr(MSR_IA32_MISC_ENABLE) to the panic code)
that MSR_IA32_MISC_ENABLE_MONITOR_ENABLE is indeed not set (the MSR reads
0x810088). The same value is present in the MSR in the init_done() function.
If I hardcode setting this bit just before __monitor(), all is fine. So it
looks like my crappy BIOS has not set this bit, and Xen has not either, yet
it tried to use monitor/mwait?

Somewhat off-topic: what is the purpose of cheating about the
MSR_IA32_MISC_ENABLE_MONITOR_ENABLE state to dom0 in the rdmsr emulation?

xen-4.0-testing.hg/xen/arch/x86/traps.c:2364
    case MSR_IA32_MISC_ENABLE:
        if ( rdmsr_safe(regs->ecx, regs->eax, regs->edx) )
            goto fail;
        regs->eax &= ~(MSR_IA32_MISC_ENABLE_PERF_AVAIL |
                       MSR_IA32_MISC_ENABLE_MONITOR_ENABLE);
        regs->eax |= MSR_IA32_MISC_ENABLE_BTS_UNAVAIL |
                     MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL |
                     MSR_IA32_MISC_ENABLE_XTPR_DISABLE;
        break;

Regards,
Rafal Wojtczuk

(XEN) ----[ Xen-4.0.2-rc1-pre x86_64 debug=n Not tainted ]----
(XEN) CPU: 0
(XEN) RIP: e008:[<ffff82c48018db7f>] acpi_processor_idle+0x3cf/0x810
(XEN) RFLAGS: 0000000000010086 CONTEXT: hypervisor
(XEN) rax: ffff8300cacfc000 rbx: ffff830121634840 rcx: 0000000000000000
(XEN) rdx: 0000000000000000 rsi: 0000000000000020 rdi: 0000000000000000
(XEN) rbp: ffff830121634900 rsp: ffff82c480367e80 r8: 000000020bc8ef41
(XEN) r9: 0000000000000004 r10: ffff82c480397120 r11: 0000000000000001
(XEN) r12: 00000000000000c0 r13: 0000000000f977e3 r14: ffff8300cacfc000
(XEN) r15: 0000000000000002 cr0: 000000008005003b cr4: 00000000000026f0
(XEN) cr3: 0000000119001000 cr2: ffff880000736860
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
(XEN) Xen stack trace from rsp=ffff82c480367e80:
(XEN)    ffffffffffffffff ffff82c48025f100 ffff82c48011f537 ffff8300cacfc000
(XEN)    ffffffffffffffff ffffffff817db6b0 0000000000000000 0000000000000000
(XEN)    000000008037a980 000003c7000003c7 ffff82c48037e980 ffff82c480367f28
(XEN)    ffff82c48025fb00 ffff82c480367f28 ffff82c480367e18 ffff8300cacfc000
(XEN)    0000000000000002 ffff82c48014f02e ffff8300cacfc000 ffff8300cb4f8000
(XEN)    00000000ffffffff 0000000000000000 ffffffffffffffff 0000000000000000
(XEN)    ffffffff817db6b0 ffffffff81683f10 ffffffff81682000 0000000000000246
(XEN)    0000000000015740 ffff8800077e6160 0000000000000000 0000000000000000
(XEN)    ffffffff810093aa 0000000000000246 0000000000000000 0000000000000001
(XEN)    0000010000000000 ffffffff810093aa 000000000000e033 0000000000000246
(XEN)    ffffffff81683ef8 000000000000e02b 0000000000000000 0000000000000000
(XEN)    0000000000000000 0000000000000000 0000000000000000 ffff8300cacfc000
(XEN) Xen call trace:
(XEN)    [<ffff82c48018db7f>] acpi_processor_idle+0x3cf/0x810
(XEN)    [<ffff82c48011f537>] timer_softirq_action+0x227/0x370
(XEN)    [<ffff82c48014f02e>] idle_loop+0x2e/0x90
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) FATAL TRAP: vector = 6 (invalid opcode)
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...

[root@f13q45 xen]# gdb xen-syms
Reading symbols from /root/xen/xen-4.0-testing.hg/xen/xen-syms...done.
(gdb) x/2i 0xffff82c48018db7f
0xffff82c48018db7f <acpi_processor_idle+975>: monitor %rax,%rcx,%rdx
0xffff82c48018db82 <acpi_processor_idle+978>: mfence

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
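The MSR check described in the report comes down to testing a single bit of the value that was read. Below is a minimal sketch in plain C, assuming the monitor-enable flag sits at bit 18 of IA32_MISC_ENABLE (Intel documents that bit as "ENABLE MONITOR FSM"). The macro and helper names are hypothetical, and the code only decodes a value that has already been read; it does not touch the MSR itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position: bit 18 of IA32_MISC_ENABLE gates MONITOR/MWAIT. */
#define MISC_ENABLE_MONITOR_ENABLE (UINT64_C(1) << 18)

/* Hypothetical helper: does this MSR value have MONITOR/MWAIT enabled? */
static bool monitor_enabled(uint64_t misc_enable)
{
    return (misc_enable & MISC_ENABLE_MONITOR_ENABLE) != 0;
}

/* Hypothetical helper: the value a "just set the bit" workaround would
 * write back, leaving every other bit unchanged. */
static uint64_t with_monitor_enabled(uint64_t misc_enable)
{
    return misc_enable | MISC_ENABLE_MONITOR_ENABLE;
}
```

Applied to the reported value 0x810088, monitor_enabled() returns false, which matches the #UD raised by the "monitor" instruction in the trace.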
Jan Beulich
2010-Nov-22 14:04 UTC
Re: [Xen-devel] Bug: Xen panics with #UD in acpi_processor_idle()
>>> On 22.11.10 at 14:02, Rafal Wojtczuk <rafal@invisiblethingslab.com> wrote:
> Somewhat off-topic: what is the purpose of cheating about the
> MSR_IA32_MISC_ENABLE_MONITOR_ENABLE state to dom0 in the rdmsr emulation?
> xen-4.0-testing.hg/xen/arch/x86/traps.c:2364
>     case MSR_IA32_MISC_ENABLE:
>         if ( rdmsr_safe(regs->ecx, regs->eax, regs->edx) )
>             goto fail;
>         regs->eax &= ~(MSR_IA32_MISC_ENABLE_PERF_AVAIL |
>                        MSR_IA32_MISC_ENABLE_MONITOR_ENABLE);
>         regs->eax |= MSR_IA32_MISC_ENABLE_BTS_UNAVAIL |
>                      MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL |
>                      MSR_IA32_MISC_ENABLE_XTPR_DISABLE;
>         break;

This is dealing with bits corresponding to features no domain should use
(directly).

Jan
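To make the effect of the quoted masking concrete, here is a small standalone sketch that applies the same two operations to a raw value. It is an illustration only, not the hypervisor code: the function name is hypothetical, and the bit positions (7, 11, 12, 18, 23) are assumptions taken from Intel's documented IA32_MISC_ENABLE layout rather than from anything stated in this thread.

```c
#include <stdint.h>

/* Assumed bit positions in the low 32 bits of IA32_MISC_ENABLE. */
#define MISC_ENABLE_PERF_AVAIL     (UINT32_C(1) << 7)
#define MISC_ENABLE_BTS_UNAVAIL    (UINT32_C(1) << 11)
#define MISC_ENABLE_PEBS_UNAVAIL   (UINT32_C(1) << 12)
#define MISC_ENABLE_MONITOR_ENABLE (UINT32_C(1) << 18)
#define MISC_ENABLE_XTPR_DISABLE   (UINT32_C(1) << 23)

/* Hypothetical helper mirroring the quoted case statement: hide the
 * "available/enabled" bits and force the "unavailable/disabled" ones. */
static uint32_t guest_view_of_misc_enable(uint32_t eax)
{
    eax &= ~(MISC_ENABLE_PERF_AVAIL | MISC_ENABLE_MONITOR_ENABLE);
    eax |= MISC_ENABLE_BTS_UNAVAIL |
           MISC_ENABLE_PEBS_UNAVAIL |
           MISC_ENABLE_XTPR_DISABLE;
    return eax;
}
```

Under these assumptions the result does not depend on whether the hardware has the monitor bit set: both 0x810088 and 0x850088 map to the same guest-visible value, so the domain never sees those features as usable.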
Keir Fraser
2010-Nov-22 15:15 UTC
Re: [Xen-devel] Bug: Xen panics with #UD in acpi_processor_idle()
On 22/11/2010 13:02, "Rafal Wojtczuk" <rafal@invisiblethingslab.com> wrote:

> Hello,
>
> There seems to be an issue with the handling of the
> MSR_IA32_MISC_ENABLE_MONITOR_ENABLE bit in MSR_IA32_MISC_ENABLE. The effect
> is that Xen panics with #UD after trying to execute the "monitor"
> instruction. Most likely Xen does not bother to set this bit, leaving
> it to the BIOS. If I understand correctly, this is merely a reliability
> issue, not triggerable from any domain.

The issue appears to be that we do not check for MWAIT support in CPUID,
the way native Linux does (the flag in the MISC_ENABLE MSR also affects
CPUID appropriately). Can you please try the attached patch? If it works I
will apply it to all our trees.

Thanks,
Keir
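The attached patch is not reproduced in this archive. As a rough userspace illustration of the kind of check being described, the sketch below tests the MONITOR/MWAIT feature flag, which is CPUID leaf 1, ECX bit 3. It assumes GCC or Clang on x86 (for `<cpuid.h>` and `__get_cpuid`); the helper names are hypothetical, and it is not the hypervisor change itself.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper header providing __get_cpuid() */
#endif

/* CPUID.01H:ECX bit 3 advertises MONITOR/MWAIT. */
#define CPUID_1_ECX_MONITOR (UINT32_C(1) << 3)

/* Hypothetical helper: decode the feature flag from a leaf-1 ECX value. */
static bool ecx_reports_monitor(uint32_t ecx)
{
    return (ecx & CPUID_1_ECX_MONITOR) != 0;
}

/* Hypothetical helper: query the running CPU. Returns false when the
 * leaf is unavailable or the build target is not x86. */
static bool cpu_reports_monitor(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return false;
    return ecx_reports_monitor(ecx);
#else
    return false;
#endif
}
```

An idle routine guarded this way would fall back to another idle method instead of executing "monitor" on a CPU where the BIOS left the feature disabled, since, as noted in the reply, the MISC_ENABLE flag is reflected in CPUID.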
Rafal Wojtczuk
2010-Nov-22 15:43 UTC
Re: [Xen-devel] Bug: Xen panics with #UD in acpi_processor_idle()
On Mon, Nov 22, 2010 at 03:15:03PM +0000, Keir Fraser wrote:
> On 22/11/2010 13:02, "Rafal Wojtczuk" <rafal@invisiblethingslab.com> wrote:
>
> > Hello,
> >
> > There seems to be an issue with the handling of the
> > MSR_IA32_MISC_ENABLE_MONITOR_ENABLE bit in MSR_IA32_MISC_ENABLE. The effect
> > is that Xen panics with #UD after trying to execute the "monitor"
> > instruction. Most likely Xen does not bother to set this bit, leaving
> > it to the BIOS. If I understand correctly, this is merely a reliability
> > issue, not triggerable from any domain.
>
> The issue appears to be that we do not check for MWAIT support in CPUID,
> the way native Linux does (the flag in the MISC_ENABLE MSR also affects
> CPUID appropriately). Can you please try the attached patch? If it works I
> will apply it to all our trees.

Yes, with this patch the test machine boots fine.

Regards,
Rafal Wojtczuk