Ke, Liping
2009-Jun-09 05:10 UTC
[Xen-devel] [PATCH] DOM0: Adding MCA Loging support in DOM0
Hi, all This patch is to support MCA info logging in DOM0. When an MCE/CMCI error happens (or by polling), the related error info will be sent to DOM0 by XEN. This patch will help to fetch the xen-logged information by hypercall and then convert XEN-format log into Linux format MCELOG. So we can reuse current available mcelog tools for native Linux now. With this patch, after mce/cmci error log information is sent to DOM0, running mcelog tools in DOM0, you will get same detailed decoded mce information as in Native Linux. We test this patch on latest Intel platform with newest version of mcelog getting from git://git.kernel.org/pub/scm/utils/cpu/mce/mcelog.git Thanks& Regards, Criping _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2009-Jun-09 09:16 UTC
Re: [Xen-devel] [PATCH] DOM0: Adding MCA Loging support in DOM0
>>> "Ke, Liping" <liping.ke@intel.com> 09.06.09 07:10 >>> >This patch is to support MCA info logging in DOM0. When an MCE/CMCI error happens > (or by polling), the related error info will be sent to DOM0 by XEN. This patch will help >to fetch the xen-logged information by hypercall and then convert XEN-format log into >Linux format MCELOG. So we can reuse current available mcelog tools for native Linux >now.Could this patch please be re-worked to not needlessly touch non-Xen code? This would include a separate config option for the new code, while disabling the native sub-options (which at once would avoid the hacks you appear to need to make mce_amd.c build). Thanks, Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ke, Liping
2009-Jun-09 09:20 UTC
RE: [Xen-devel] [PATCH] DOM0: Adding MCA Loging support in DOM0
sure. I will resend the patch a little later. Thanks a lot! Criping -----Original Message----- From: Jan Beulich [mailto:JBeulich@novell.com] Sent: 2009年6月9日 17:17 To: Ke, Liping Cc: Keir Fraser; xen-devel Subject: Re: [Xen-devel] [PATCH] DOM0: Adding MCA Loging support in DOM0>>> "Ke, Liping" <liping.ke@intel.com> 09.06.09 07:10 >>> >This patch is to support MCA info logging in DOM0. When an MCE/CMCI error happens > (or by polling), the related error info will be sent to DOM0 by XEN. This patch will help >to fetch the xen-logged information by hypercall and then convert XEN-format log into >Linux format MCELOG. So we can reuse current available mcelog tools for native Linux >now.Could this patch please be re-worked to not needlessly touch non-Xen code? This would include a separate config option for the new code, while disabling the native sub-options (which at once would avoid the hacks you appear to need to make mce_amd.c build). Thanks, Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jiang, Yunhong
2009-Jun-09 09:30 UTC
RE: [Xen-devel] [PATCH] DOM0: Adding MCA Loging support in DOM0
xen-devel-bounces@lists.xensource.com wrote:>>>> "Ke, Liping" <liping.ke@intel.com> 09.06.09 07:10 >>> >> This patch is to support MCA info logging in DOM0. When an MCE/CMCI >> error happens (or by polling), the related error info will be sent >> to DOM0 by XEN. This patch will help to fetch the xen-logged >> information by hypercall and then convert XEN-format log into Linux >> format MCELOG. So we can reuse current available mcelog tools for >> native Linux now. > > Could this patch please be re-worked to not needlessly touch non-Xen > code? This would include a separate config option for the new code, > while disabling the native sub-options (which at once would avoid the > hacks you appear to need to make mce_amd.c build).Jan, I think the native sub-options (X86_MCE_INTEL and X86_MCE_AMD) needs be enabled still. When a MCE happens, and dom0 is effected, that MCE will be injected to dom0 as virtual MCE, and that requires to enable these otions. When a MCE happens and dom0 is not effected, that MCE will be sent to dom0 as vIRQ so that dom0 can log that event for analysis. But yes, maybe following hunk can be done in xen-code to avoid needless touch. + + /*Register vIRQ handler for MCE LOG processing*/ +#if defined (CONFIG_XEN) && defined(CONFIG_X86_MCE_INTEL) + printk(KERN_DEBUG "MCE: bind virq for DOM0 Logging\n"); + bind_virq_for_mce(); +#endif + return err; And following code can be removed, although it may cause dom0''s needless polling. +#if defined (CONFIG_XEN) && defined(CONFIG_X86_MCE_INTEL) +static int check_interval = 0; /* disable polling */ +#else static int check_interval = 5 * 60; /* 5 minutes */ +#endif + Thanks Yunhong Jiang> > Thanks, Jan > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2009-Jun-09 12:25 UTC
RE: [Xen-devel] [PATCH] DOM0: Adding MCA Loging support in DOM0
>>> "Jiang, Yunhong" <yunhong.jiang@intel.com> 09.06.09 11:30 >>> >xen-devel-bounces@lists.xensource.com wrote: >>>>> "Ke, Liping" <liping.ke@intel.com> 09.06.09 07:10 >>> >>> This patch is to support MCA info logging in DOM0. When an MCE/CMCI >>> error happens (or by polling), the related error info will be sent >>> to DOM0 by XEN. This patch will help to fetch the xen-logged >>> information by hypercall and then convert XEN-format log into Linux >>> format MCELOG. So we can reuse current available mcelog tools for >>> native Linux now. >> >> Could this patch please be re-worked to not needlessly touch non-Xen >> code? This would include a separate config option for the new code, >> while disabling the native sub-options (which at once would avoid the >> hacks you appear to need to make mce_amd.c build). > >Jan, I think the native sub-options (X86_MCE_INTEL and X86_MCE_AMD) needs be enabled still. >When a MCE happens, and dom0 is effected, that MCE will be injected to dom0 as virtual MCE, and that requires to enable these >otions. >When a MCE happens and dom0 is not effected, that MCE will be sent to dom0 as vIRQ so that dom0 can log that event for >analysis.Oh, okay, if the code is indeed needed, than that''s fine. But the hacks needed to get mce_amd.c to compile look suspicious. As much as e.g. changing the defaults for the MCE Kconfig sub-options - those should be kept as is, and if the new code has any kind of dependency on them, the to-be-added new Kconfig option should express this accurately. Basically, behavior for a native kernel built from the same sources should not be modified at all.>But yes, maybe following hunk can be done in xen-code to avoid needless touch. > >+ >+ /*Register vIRQ handler for MCE LOG processing*/ >+#if defined (CONFIG_XEN) && defined(CONFIG_X86_MCE_INTEL) >+ printk(KERN_DEBUG "MCE: bind virq for DOM0 Logging\n"); >+ bind_virq_for_mce(); >+#endif >+ > return err; > >And following code can be removed, although it may cause dom0''s needless polling. > >+#if defined (CONFIG_XEN) && defined(CONFIG_X86_MCE_INTEL) >+static int check_interval = 0; /* disable polling */ >+#else > static int check_interval = 5 * 60; /* 5 minutes */ >+#endif >+Actually, these two hunks actually represent examples of what I''m not concerned about . Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jiang, Yunhong
2009-Jun-10 02:45 UTC
RE: [Xen-devel] [PATCH] DOM0: Adding MCA Loging support in DOM0
Jan Beulich wrote:>>>> "Jiang, Yunhong" <yunhong.jiang@intel.com> 09.06.09 11:30 >>> >> xen-devel-bounces@lists.xensource.com wrote: >>>>>> "Ke, Liping" <liping.ke@intel.com> 09.06.09 07:10 >>> >>>> This patch is to support MCA info logging in DOM0. When an MCE/CMCI >>>> error happens (or by polling), the related error info will be sent >>>> to DOM0 by XEN. This patch will help to fetch the xen-logged >>>> information by hypercall and then convert XEN-format log into Linux >>>> format MCELOG. So we can reuse current available mcelog tools for >>>> native Linux now. >>> >>> Could this patch please be re-worked to not needlessly touch non-Xen >>> code? This would include a separate config option for the new code, >>> while disabling the native sub-options (which at once would avoid >>> the hacks you appear to need to make mce_amd.c build). >> >> Jan, I think the native sub-options (X86_MCE_INTEL and > X86_MCE_AMD) needs be enabled still. >> When a MCE happens, and dom0 is effected, that MCE will be > injected to dom0 as virtual MCE, and that requires to enable these >> otions. >> When a MCE happens and dom0 is not effected, that MCE will be > sent to dom0 as vIRQ so that dom0 can log that event for >> analysis. > > Oh, okay, if the code is indeed needed, than that''s fine. But > the hacks needed > to get mce_amd.c to compile look suspicious. As much as e.g. changing > the defaults for the MCE Kconfig sub-options - those should be > kept as is, and if > the new code has any kind of dependency on them, the to-be-added new > Kconfig option should express this accurately.Maybe we should change the "add" to "copy back". That two code are from native kernel, but it is missed when copy the apic.c to apic-xen.c, we just add it back. (After all, these hunks just change xen specific code, apic-xen.c and mach-xen/asm/hw-irq.h)> > Basically, behavior for a native kernel built from the same > sources should not > be modified at all. > >> But yes, maybe following hunk can be done in xen-code to avoid >> needless touch. >> >> + >> + /*Register vIRQ handler for MCE LOG processing*/ >> +#if defined (CONFIG_XEN) && defined(CONFIG_X86_MCE_INTEL) >> + printk(KERN_DEBUG "MCE: bind virq for DOM0 Logging\n"); + >> bind_virq_for_mce(); +#endif >> + >> return err; >> >> And following code can be removed, although it may cause dom0''s >> needless polling. >> >> +#if defined (CONFIG_XEN) && defined(CONFIG_X86_MCE_INTEL) >> +static int check_interval = 0; /* disable polling */ +#else >> static int check_interval = 5 * 60; /* 5 minutes */ +#endif >> + > > Actually, these two hunks actually represent examples of what I''m not > concerned about .Remove it to xen-specific file will help us in future for the PV_ops dom0.> > Jan_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ke, Liping
2009-Jun-10 06:52 UTC
RE: [Xen-devel] [PATCH] DOM0: Adding MCA Loging support in DOM0
Hi, Jan and yunhong According to the discussion result, I modified the patch 1) use Kconfig to identify when mce_dom.c and mce_XXX.c will be built. mce_dom.c will only be build when it is under XEN MCE environment. 2) virq bind function will only be called when mce_dom.c is used. I test the compiling both under native/dom0 with all possible combinations. Thanks& Regards, Criping -----Original Message----- From: Jiang, Yunhong Sent: 2009年6月10日 10:46 To: Jan Beulich; Ke, Liping Cc: Keir Fraser; xen-devel Subject: RE: [Xen-devel] [PATCH] DOM0: Adding MCA Loging support in DOM0 Jan Beulich wrote:>>>> "Jiang, Yunhong" <yunhong.jiang@intel.com> 09.06.09 11:30 >>> >> xen-devel-bounces@lists.xensource.com wrote: >>>>>> "Ke, Liping" <liping.ke@intel.com> 09.06.09 07:10 >>> >>>> This patch is to support MCA info logging in DOM0. When an MCE/CMCI >>>> error happens (or by polling), the related error info will be sent >>>> to DOM0 by XEN. This patch will help to fetch the xen-logged >>>> information by hypercall and then convert XEN-format log into Linux >>>> format MCELOG. So we can reuse current available mcelog tools for >>>> native Linux now. >>> >>> Could this patch please be re-worked to not needlessly touch non-Xen >>> code? This would include a separate config option for the new code, >>> while disabling the native sub-options (which at once would avoid >>> the hacks you appear to need to make mce_amd.c build). >> >> Jan, I think the native sub-options (X86_MCE_INTEL and > X86_MCE_AMD) needs be enabled still. >> When a MCE happens, and dom0 is effected, that MCE will be > injected to dom0 as virtual MCE, and that requires to enable these >> otions. >> When a MCE happens and dom0 is not effected, that MCE will be > sent to dom0 as vIRQ so that dom0 can log that event for >> analysis. > > Oh, okay, if the code is indeed needed, than that's fine. But > the hacks needed > to get mce_amd.c to compile look suspicious. As much as e.g. changing > the defaults for the MCE Kconfig sub-options - those should be > kept as is, and if > the new code has any kind of dependency on them, the to-be-added new > Kconfig option should express this accurately.Maybe we should change the "add" to "copy back". That two code are from native kernel, but it is missed when copy the apic.c to apic-xen.c, we just add it back. (After all, these hunks just change xen specific code, apic-xen.c and mach-xen/asm/hw-irq.h)> > Basically, behavior for a native kernel built from the same > sources should not > be modified at all. > >> But yes, maybe following hunk can be done in xen-code to avoid >> needless touch. >> >> + >> + /*Register vIRQ handler for MCE LOG processing*/ >> +#if defined (CONFIG_XEN) && defined(CONFIG_X86_MCE_INTEL) >> + printk(KERN_DEBUG "MCE: bind virq for DOM0 Logging\n"); + >> bind_virq_for_mce(); +#endif >> + >> return err; >> >> And following code can be removed, although it may cause dom0's >> needless polling. >> >> +#if defined (CONFIG_XEN) && defined(CONFIG_X86_MCE_INTEL) >> +static int check_interval = 0; /* disable polling */ +#else >> static int check_interval = 5 * 60; /* 5 minutes */ +#endif >> + > > Actually, these two hunks actually represent examples of what I'm not > concerned about .Remove it to xen-specific file will help us in future for the PV_ops dom0.> > Jan_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2009-Jun-10 06:53 UTC
RE: [Xen-devel] [PATCH] DOM0: Adding MCA Loging support in DOM0
>>>> "Jiang, Yunhong" <yunhong.jiang@intel.com> 10.06.09 04:45 >>> >Jan Beulich wrote: >>>>> "Jiang, Yunhong" <yunhong.jiang@intel.com> 09.06.09 11:30 >>> >>> Jan, I think the native sub-options (X86_MCE_INTEL and >> X86_MCE_AMD) needs be enabled still. >>> When a MCE happens, and dom0 is effected, that MCE will be >> injected to dom0 as virtual MCE, and that requires to enable these >>> otions. >>> When a MCE happens and dom0 is not effected, that MCE will be >> sent to dom0 as vIRQ so that dom0 can log that event for >>> analysis. >> >> Oh, okay, if the code is indeed needed, than that''s fine. But >> the hacks needed >> to get mce_amd.c to compile look suspicious. As much as e.g. changing >> the defaults for the MCE Kconfig sub-options - those should be >> kept as is, and if >> the new code has any kind of dependency on them, the to-be-added new >> Kconfig option should express this accurately. > >Maybe we should change the "add" to "copy back". That two code are from >native kernel, but it is missed when copy the apic.c to apic-xen.c, we just >add it back. (After all, these hunks just change xen specific code, >apic-xen.c and mach-xen/asm/hw-irq.h)But both changes are wrong for Xen: Neither can you do APIC writes in Dom0 (or setup an APIC local vector entry in particular), nor do vectors 0xf9 and 0xfa have any significance there. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jiang, Yunhong
2009-Jun-10 06:58 UTC
RE: [Xen-devel] [PATCH] DOM0: Adding MCA Loging support in DOM0
Jan Beulich wrote:>>>>> "Jiang, Yunhong" <yunhong.jiang@intel.com> 10.06.09 04:45 >>> >> Jan Beulich wrote: >>>>>> "Jiang, Yunhong" <yunhong.jiang@intel.com> 09.06.09 11:30 >>> >>>> Jan, I think the native sub-options (X86_MCE_INTEL and >>> X86_MCE_AMD) needs be enabled still. >>>> When a MCE happens, and dom0 is effected, that MCE will be >>> injected to dom0 as virtual MCE, and that requires to enable these >>>> otions. >>>> When a MCE happens and dom0 is not effected, that MCE will be >>> sent to dom0 as vIRQ so that dom0 can log that event for >>>> analysis. >>> >>> Oh, okay, if the code is indeed needed, than that''s fine. But >>> the hacks needed >>> to get mce_amd.c to compile look suspicious. As much as e.g. >>> changing the defaults for the MCE Kconfig sub-options - those >>> should be >>> kept as is, and if >>> the new code has any kind of dependency on them, the to-be-added new >>> Kconfig option should express this accurately. >> >> Maybe we should change the "add" to "copy back". That two code are >> from native kernel, but it is missed when copy the apic.c to >> apic-xen.c, we just add it back. (After all, these hunks just change >> xen specific code, apic-xen.c and mach-xen/asm/hw-irq.h) > > But both changes are wrong for Xen: Neither can you do APIC writes in > Dom0 (or setup an APIC local vector entry in particular), nor do > vectors 0xf9 and 0xfa have any significance there.Aha, yes. We need check our patch again for similar potential issue. Thanks for pointing that.> > Jan_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jiang, Yunhong
2009-Jun-10 07:00 UTC
RE: [Xen-devel] [PATCH] DOM0: Adding MCA Loging support in DOM0
The issue here is, Kernel has same config option/file for MCE/Thermal. Although Xen HV provide virtual MCE to dom0 so that we don''t need anything changes for dom0, but that''s not ture forthermal. We will consider that. Really thanks for pointing that. --jyh Jan Beulich wrote:>>>>> "Jiang, Yunhong" <yunhong.jiang@intel.com> 10.06.09 04:45 >>> >> Jan Beulich wrote: >>>>>> "Jiang, Yunhong" <yunhong.jiang@intel.com> 09.06.09 11:30 >>> >>>> Jan, I think the native sub-options (X86_MCE_INTEL and >>> X86_MCE_AMD) needs be enabled still. >>>> When a MCE happens, and dom0 is effected, that MCE will be >>> injected to dom0 as virtual MCE, and that requires to enable these >>>> otions. >>>> When a MCE happens and dom0 is not effected, that MCE will be >>> sent to dom0 as vIRQ so that dom0 can log that event for >>>> analysis. >>> >>> Oh, okay, if the code is indeed needed, than that''s fine. But >>> the hacks needed >>> to get mce_amd.c to compile look suspicious. As much as e.g. >>> changing the defaults for the MCE Kconfig sub-options - those >>> should be >>> kept as is, and if >>> the new code has any kind of dependency on them, the to-be-added new >>> Kconfig option should express this accurately. >> >> Maybe we should change the "add" to "copy back". That two code are >> from native kernel, but it is missed when copy the apic.c to >> apic-xen.c, we just add it back. (After all, these hunks just change >> xen specific code, apic-xen.c and mach-xen/asm/hw-irq.h) > > But both changes are wrong for Xen: Neither can you do APIC writes in > Dom0 (or setup an APIC local vector entry in particular), nor do > vectors 0xf9 and 0xfa have any significance there. > > Jan_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2009-Jun-10 08:17 UTC
RE: [Xen-devel] [PATCH] DOM0: Adding MCA Loging support in DOM0
>>> "Ke, Liping" <liping.ke@intel.com> 10.06.09 08:52 >>> >According to the discussion result, I modified the patch >1) use Kconfig to identify when mce_dom.c and mce_XXX.c will be built. mce_dom.c > will only be build when it is under XEN MCE environment. >2) virq bind function will only be called when mce_dom.c is used. > >I test the compiling both under native/dom0 with all possible combinations.I think that''s not covering all the issue I pointed out:>--- a/arch/x86_64/Kconfig Tue Jun 09 09:58:11 2009 +0800 >+++ b/arch/x86_64/Kconfig Wed Jun 10 14:12:55 2009 +0800 >@@ -472,7 +472,6 @@ > > config X86_MCE > bool "Machine check support" if EMBEDDED >- depends on !X86_64_XEN > default y > help > Include a machine check error handler to report hardware errors.Shouldn''t this rather change the dependency to XEN_PRIVILEGED_GUEST?>@@ -483,7 +482,7 @@ > config X86_MCE_INTEL > bool "Intel MCE features" > depends on X86_MCE && X86_LOCAL_APIC >- default y >+ default n > help > Additional support for intel specific MCE features such as > the thermal monitor.This hunk should go.>@@ -491,11 +490,15 @@ > config X86_MCE_AMD > bool "AMD MCE features" > depends on X86_MCE && X86_LOCAL_APIC >- default y >+ default n > help > Additional support for AMD specific MCE features such as > the DRAM Error Threshold.And likewise this part.> >+config X86_64_XEN_MCE >+ def_bool y >+ depends on X86_64_XEN && (X86_MCE_INTEL || X86_MCE_AMD) >+I wonder if this wouldn''t better be named X86_XEN_MCE (for consistency with a potential 32-bit implementation).> config KEXEC > bool "kexec system call (EXPERIMENTAL)" > depends on EXPERIMENTAL && !XEN_UNPRIVILEGED_GUEST >... >--- a/arch/x86_64/kernel/apic-xen.c Tue Jun 09 09:58:11 2009 +0800 >+++ b/arch/x86_64/kernel/apic-xen.c Wed Jun 10 14:12:55 2009 +0800 >@@ -195,3 +195,13 @@ > > return 1; > } >+ >+/* Pull back for compiling arch/x86_64/kernel/mce_amd.c */ >+void setup_APIC_extened_lvt(unsigned char lvt_off, unsigned char vector, >+ unsigned char msg_type, unsigned char mask) >+{ >+ unsigned long reg = (lvt_off << 4) + K8_APIC_EXT_LVT_BASE; >+ unsigned int v = (mask << 16) | (msg_type << 8) | vector; >+ apic_write(reg, v); >+}This continues to be wrong (yes, I realize arch/x86_64/kernel/apic-xen.c is full of such broken code, but that doesn''t mean more should be added).>... >--- a/arch/x86_64/kernel/mce.c Tue Jun 09 09:58:11 2009 +0800 >+++ b/arch/x86_64/kernel/mce.c Wed Jun 10 14:12:55 2009 +0800 >@@ -165,7 +165,7 @@ > * The actual machine check handler > */ > >-void do_machine_check(struct pt_regs * regs, long error_code) >+asmlinkage void do_machine_check(struct pt_regs * regs, long error_code) > { > struct mce m, panicm; > int nowayout = (tolerant < 1);Why?>@@ -276,9 +276,16 @@ > > /* > * Periodic polling timer for "silent" machine check errors. >- */ >+ * We will disable polling in DOM0 since all CMCI/Polling >+ * mechanism will be done in XEN for Intel CPUs >+*/ > >+#if defined (CONFIG_XEN) && defined(CONFIG_X86_MCE_INTEL) >+static int check_interval = 0; /* disable polling */ >+#else > static int check_interval = 5 * 60; /* 5 minutes */ >+#endif >+ > static void mcheck_timer(void *data); > static DECLARE_WORK(mcheck_work, mcheck_timer, NULL);Shouldn''t that now simply be #ifdef CONFIG_X86_64_XEN_MCE?>... >--- a/include/asm-x86_64/mach-xen/asm/hw_irq.h Tue Jun 09 09:58:11 2009 +0800 >+++ b/include/asm-x86_64/mach-xen/asm/hw_irq.h Wed Jun 10 14:12:55 2009 +0800 >@@ -60,6 +60,9 @@ > #define NUM_INVALIDATE_TLB_VECTORS 8 > #endif > >+/* Pull back for compiling arch/x86_64/kernel/mce_amd.c */ >+#define THRESHOLD_APIC_VECTOR 0xf9 >+#define THERMAL_APIC_VECTOR 0xfa > /* > * Local APIC timer IRQ vector is on a different priority level, > * to work around the ''lost local interrupt if more than 2 IRQThese vectors mean nothing under Xen. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ke, Liping
2009-Jun-10 09:14 UTC
RE: [Xen-devel] [PATCH] DOM0: Adding MCA Loging support in DOM0
Hi, Jan When enabing mce in DOM0, we only want to fully reuse machine check handler in mce.c as guest virtual mca handler. mce_intel.c and mce_amd.c contain two AMD and Intel hardware features which is no use to (both mce virq log and virtual mca enabling) actually. Problem is that mce_XXX_feature_init in (mce_intel.c and mce_amd.c) is called in mce.c when do mce init. So how about: 1) Make those two files only compile under native linux 2) in mce.c, don''t call mce_XXX_feature_init under XEN envrionment? Then many of problems will not exist, please see below answers. Thanks! Criping Jan Beulich wrote:>>>> "Ke, Liping" <liping.ke@intel.com> 10.06.09 08:52 >>> >> According to the discussion result, I modified the patch >> 1) use Kconfig to identify when mce_dom.c and mce_XXX.c will be >> built. mce_dom.c will only be build when it is under XEN MCE >> environment. 2) virq bind function will only be called when >> mce_dom.c is used. >> >> I test the compiling both under native/dom0 with all possible >> combinations. > > I think that''s not covering all the issue I pointed out: > >> --- a/arch/x86_64/Kconfig Tue Jun 09 09:58:11 2009 +0800 >> +++ b/arch/x86_64/Kconfig Wed Jun 10 14:12:55 2009 +0800 @@ -472,7 >> +472,6 @@ >> >> config X86_MCE >> bool "Machine check support" if EMBEDDED >> - depends on !X86_64_XEN >> default y >> help >> Include a machine check error handler to report hardware errors. > > Shouldn''t this rather change the dependency to XEN_PRIVILEGED_GUEST? >Native depends on this config too, so have about change it like this depends on !X86_64_XEN || ( X86_64_XEN && XEN_PRIVILEGED_GUEST)>> @@ -483,7 +482,7 @@ >> config X86_MCE_INTEL >> bool "Intel MCE features" >> depends on X86_MCE && X86_LOCAL_APIC >> - default y >> + default n >> help >> Additional support for intel specific MCE features such as >> the thermal monitor. > > This hunk should go.This option will not appear under XEN, only for native? depends on X86_MCE && X86_LOCAL_APIC && !X86_64_XEN> >> @@ -491,11 +490,15 @@ >> config X86_MCE_AMD >> bool "AMD MCE features" >> depends on X86_MCE && X86_LOCAL_APIC >> - default y >> + default n >> help >> Additional support for AMD specific MCE features such as >> the DRAM Error Threshold. > > And likewise this part. >This option will not appear under XEN, only for native. depends on X86_MCE && X86_LOCAL_APIC && !X86_64_XEN So above two files will not be compiled for DOM0. Compiling hunks for them will not be needed any more.>> >> +config X86_64_XEN_MCE >> + def_bool y >> + depends on X86_64_XEN && (X86_MCE_INTEL || X86_MCE_AMD) >> + > > I wonder if this wouldn''t better be named X86_XEN_MCE (for consistency > with a potential 32-bit implementation). >This option will depends on X86_64_XEN && X86_MCE. Yes, we could rename it. Yet this config file is under X86_64 sub dir, also, seems 32bit system will not have mca support?>> config KEXEC >> bool "kexec system call (EXPERIMENTAL)" >> depends on EXPERIMENTAL && !XEN_UNPRIVILEGED_GUEST ... >> --- a/arch/x86_64/kernel/apic-xen.c Tue Jun 09 09:58:11 2009 +0800 >> +++ b/arch/x86_64/kernel/apic-xen.c Wed Jun 10 14:12:55 2009 +0800 >> @@ -195,3 +195,13 @@ >> >> return 1; >> } >> + >> +/* Pull back for compiling arch/x86_64/kernel/mce_amd.c */ >> +void setup_APIC_extened_lvt(unsigned char lvt_off, unsigned char >> vector, + unsigned char msg_type, unsigned char mask) +{ >> + unsigned long reg = (lvt_off << 4) + K8_APIC_EXT_LVT_BASE; >> + unsigned int v = (mask << 16) | (msg_type << 8) | vector; >> + apic_write(reg, v); +} > > This continues to be wrong (yes, I realize > arch/x86_64/kernel/apic-xen.c is full of such broken code, but that > doesn''t mean more should be added). >We no long need this change.>> ... >> --- a/arch/x86_64/kernel/mce.c Tue Jun 09 09:58:11 2009 +0800 >> +++ b/arch/x86_64/kernel/mce.c Wed Jun 10 14:12:55 2009 +0800 @@ >> -165,7 +165,7 @@ >> * The actual machine check handler >> */ >> >> -void do_machine_check(struct pt_regs * regs, long error_code) >> +asmlinkage void do_machine_check(struct pt_regs * regs, long >> error_code) { struct mce m, panicm; >> int nowayout = (tolerant < 1); > > Why?Yes, this change is not necessary, I will remove it.> >> @@ -276,9 +276,16 @@ >> >> /* >> * Periodic polling timer for "silent" machine check errors. - */ >> + * We will disable polling in DOM0 since all CMCI/Polling >> + * mechanism will be done in XEN for Intel CPUs >> +*/ >> >> +#if defined (CONFIG_XEN) && defined(CONFIG_X86_MCE_INTEL) >> +static int check_interval = 0; /* disable polling */ +#else >> static int check_interval = 5 * 60; /* 5 minutes */ +#endif >> + >> static void mcheck_timer(void *data); >> static DECLARE_WORK(mcheck_work, mcheck_timer, NULL); > > Shouldn''t that now simply be #ifdef CONFIG_X86_64_XEN_MCE? >yes, you''re right. Missed this chunk.>> ... >> --- a/include/asm-x86_64/mach-xen/asm/hw_irq.h Tue Jun 09 09:58:11 >> 2009 +0800 +++ b/include/asm-x86_64/mach-xen/asm/hw_irq.h Wed Jun 10 >> 14:12:55 2009 +0800 @@ -60,6 +60,9 @@ #define >> NUM_INVALIDATE_TLB_VECTORS 8 #endif >> >> +/* Pull back for compiling arch/x86_64/kernel/mce_amd.c */ >> +#define THRESHOLD_APIC_VECTOR 0xf9 >> +#define THERMAL_APIC_VECTOR 0xfa >> /* >> * Local APIC timer IRQ vector is on a different priority level, >> * to work around the ''lost local interrupt if more than 2 IRQ > > These vectors mean nothing under Xen.no longer need this change.> > Jan_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ke, Liping
2009-Jun-11 02:42 UTC
RE: [Xen-devel] [PATCH] DOM0: Adding MCA Loging support in DOM0
Hi, Jan Thanks a lot for the detailed review! I resend the verified patch according to the below agreement. Regards, Criping Jan Beulich wrote:>>>> --- a/arch/x86_64/Kconfig Tue Jun 09 09:58:11 2009 +0800 >>>> +++ b/arch/x86_64/Kconfig Wed Jun 10 14:12:55 2009 +0800 @@ -472,7 >>>> +472,6 @@ >>>> >>>> config X86_MCE >>>> bool "Machine check support" if EMBEDDED >>>> - depends on !X86_64_XEN >>>> default y >>>> help >>>> Include a machine check error handler to report hardware >>>> errors. >>> >>> Shouldn''t this rather change the dependency to XEN_PRIVILEGED_GUEST? >>> >> Native depends on this config too, so have about change it like this >> depends on !X86_64_XEN || ( X86_64_XEN && XEN_PRIVILEGED_GUEST) > > Oh, right. But then why not simply using !XEN_UNPRIVILEGED_GUEST?Yes, it''s very nice.> >>>> @@ -483,7 +482,7 @@ >>>> config X86_MCE_INTEL >>>> bool "Intel MCE features" >>>> depends on X86_MCE && X86_LOCAL_APIC >>>> - default y >>>> + default n >>>> help >>>> Additional support for intel specific MCE features such as >>>> the thermal monitor. >>> >>> This hunk should go. >> >> This option will not appear under XEN, only for native? >> depends on X86_MCE && X86_LOCAL_APIC && !X86_64_XEN > > Whether it appears is not the question - you''re changing the default > value for no good reason. >Yes. I need not change the default value now, it is true only in Native. We will not build those two specifc files.>>>> >>>> +config X86_64_XEN_MCE >>>> + def_bool y >>>> + depends on X86_64_XEN && (X86_MCE_INTEL || X86_MCE_AMD) >>>> + >>> >>> I wonder if this wouldn''t better be named X86_XEN_MCE (for >>> consistency with a potential 32-bit implementation). >>> >> >> This option will depends on X86_64_XEN && X86_MCE. >> Yes, we could rename it. Yet this config file is under X86_64 sub >> dir, also, seems 32bit system will not have mca support? > > They do - see arch/i386/kernel/cpu/mcheck/.Actually I mean 32 bit system might have rare hardware support. But for consistency, yes, it''s better to rename it as X86_XEN_MCE.> > Jan_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2009-Jun-15 07:13 UTC
RE: [Xen-devel] [PATCH] DOM0: Adding MCA Loging support in DOM0
>>> "Ke, Liping" <liping.ke@intel.com> 11.06.09 04:42 >>>A few small issue remain:>--- a/arch/x86_64/Kconfig Thu Jun 11 10:14:50 2009 +0800 >+++ b/arch/x86_64/Kconfig Thu Jun 11 10:25:16 2009 +0800 >@@ -471,9 +471,9 @@ > bool > > config X86_MCE >- bool "Machine check support" if EMBEDDED >- depends on !X86_64_XEN >- default y >+ bool "Machine check support" >+ depends on (!XEN_UNPRIVILEGED_GUEST) >+ default n > help > Include a machine check error handler to report hardware errors. > This version will require the mcelog utility to decode someOnce again you change a default value here for no apparent reason.>--- a/arch/x86_64/kernel/mce.c Thu Jun 11 10:14:50 2009 +0800 >+++ b/arch/x86_64/kernel/mce.c Thu Jun 11 10:25:16 2009 +0800 >... >@@ -395,7 +403,9 @@ > return; > > mce_init(NULL); >+#ifndef CONFIG_X86_64_XEN > mce_cpu_features(c); >+#endif > } > > /*This conditional can be avoided if the preceding one would be moved inside the function (to cover just the entire function body).>@@ -649,6 +659,8 @@ > }; > #endif > >+struct mc_info *g_mi; >+extern void bind_virq_for_mce(void); > static __init int mce_init_device(void) > { > int err;g_mi appears to be no longer used outside of arch/x86_64/kernel/mce_dom0.c, so it could be (statically) defined there instead of here. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ke, Liping
2009-Jun-15 07:49 UTC
RE: [Xen-devel] [PATCH] DOM0: Adding MCA Loging support in DOM0
Hi, Jan Thanks a lot! I modified them and resent it. Regards, Criping>> config X86_MCE >> - bool "Machine check support" if EMBEDDED >> - depends on !X86_64_XEN >> - default y >> + bool "Machine check support" >> + depends on (!XEN_UNPRIVILEGED_GUEST) >> + default n >> help >> Include a machine check error handler to report hardware errors. >> This version will require the mcelog utility to decode some > > Once again you change a default value here for no apparent reason.I originally plan not enable this option by default. But yes, it''s fine to enable it.>> +#ifndef CONFIG_X86_64_XEN >> mce_cpu_features(c); >> +#endif >> } >> >> /* > > This conditional can be avoided if the preceding one would be moved > inside the function (to cover just the entire function body).Yes. I will change it.> > g_mi appears to be no longer used outside of > arch/x86_64/kernel/mce_dom0.c, so it could be (statically) defined > there instead of here. >Yes. I need to clean this code after changes.> Jan_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2009-Jun-15 08:11 UTC
RE: [Xen-devel] [PATCH] DOM0: Adding MCA Loging support in DOM0
Thanks - looks goo now to me (i.e. Acked-by: Jan Beulich <jbeulich@novell.com> if we have such a concept in Xen). Jan>>> "Ke, Liping" <liping.ke@intel.com> 15.06.09 09:49 >>>Hi, Jan Thanks a lot! I modified them and resent it. Regards, Criping>> config X86_MCE >> - bool "Machine check support" if EMBEDDED >> - depends on !X86_64_XEN >> - default y >> + bool "Machine check support" >> + depends on (!XEN_UNPRIVILEGED_GUEST) >> + default n >> help >> Include a machine check error handler to report hardware errors. >> This version will require the mcelog utility to decode some > > Once again you change a default value here for no apparent reason.I originally plan not enable this option by default. But yes, it''s fine to enable it.>> +#ifndef CONFIG_X86_64_XEN >> mce_cpu_features(c); >> +#endif >> } >> >> /* > > This conditional can be avoided if the preceding one would be moved > inside the function (to cover just the entire function body).Yes. I will change it.> > g_mi appears to be no longer used outside of > arch/x86_64/kernel/mce_dom0.c, so it could be (statically) defined > there instead of here. >Yes. I need to clean this code after changes.> Jan_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ke, Liping
2009-Jun-15 08:44 UTC
RE: [Xen-devel] [PATCH] DOM0: Adding MCA Loging support in DOM0
Hi, keir This is the final patch for adding MCA logging support in DOM0 which is Acked by Jan. And thanks for Jan''s help! Regards, Criping Jan Beulich wrote:> Thanks - looks goo now to me (i.e. Acked-by: Jan Beulich > <jbeulich@novell.com> > if we have such a concept in Xen). > > Jan >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2009-Jun-30 15:55 UTC
RE: [Xen-devel] [PATCH] DOM0: Adding MCA Loging support in DOM0
Actually, there are still two issues with this: First, shouldn''t bind_virq_for_mce() and the associated message printing really only happen on Dom0 (i.e. be guarded by is_initial_xendomain()), just like the name suggests? Second, I''m now seeing (XEN) traps.c:2263:d0 Domain attempted WRMSR 000000000000017b from 00000000:0000003f to ffffffff:ffffffff. (XEN) traps.c:2263:d0 Domain attempted WRMSR 0000000000000400 from 00000000:000000ff to ffffffff:ffffffff. (XEN) traps.c:2263:d0 Domain attempted WRMSR 0000000000000404 from 00000000:ffffffff to ffffffff:ffffffff. (XEN) traps.c:2263:d0 Domain attempted WRMSR 0000000000000408 from 00000000:00000fff to ffffffff:ffffffff. (XEN) traps.c:2263:d0 Domain attempted WRMSR 000000000000040c from 00000000:00000003 to ffffffff:ffffffff. (XEN) traps.c:2263:d0 Domain attempted WRMSR 0000000000000410 from 00000000:3fffffff to ffffffff:ffffffff. (XEN) traps.c:2263:d0 Domain attempted WRMSR 0000000000000414 from 00000000:00000001 to ffffffff:ffffffff. which isn''t really nice. Jan>>> "Ke, Liping" <liping.ke@intel.com> 15.06.09 10:44 >>>Hi, keir This is the final patch for adding MCA logging support in DOM0 which is Acked by Jan. And thanks for Jan''s help! Regards, Criping Jan Beulich wrote:> Thanks - looks goo now to me (i.e. Acked-by: Jan Beulich > <jbeulich@novell.com> > if we have such a concept in Xen). > > Jan >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ke, Liping
2009-Jul-01 01:42 UTC
RE: [Xen-devel] [PATCH] DOM0: Adding MCA Loging support in DOM0
Hi, Jan Please see below answers. Regards, Criping Jan Beulich wrote:> Actually, there are still two issues with this: > > First, shouldn''t bind_virq_for_mce() and the associated message > printing > really only happen on Dom0 (i.e. be guarded by > is_initial_xendomain()), > just like the name suggests?Yes. DOM0 is the only one who''s responsible for MCE logging (Even the error address belongs to other Guest). Currently mce_dom0.o is only compiled when it is a privileged guest [depends on (!XEN_UNPRIVILEGED_GUEST) according to Kconfig]. bind_virq_for_mce() is also protected by the same thing [#if defined(CONFIG_X86_XEN_MCE)] You mean we still need other guard?> > Second, I''m now seeing > > (XEN) traps.c:2263:d0 Domain attempted WRMSR 000000000000017b from > 00000000:0000003f to ffffffff:ffffffff. (XEN) traps.c:2263:d0 Domain > attempted WRMSR 0000000000000400 from 00000000:000000ff to > ffffffff:ffffffff. (XEN) traps.c:2263:d0 Domain attempted WRMSR > 0000000000000404 from 00000000:ffffffff to ffffffff:ffffffff. (XEN) > traps.c:2263:d0 Domain attempted WRMSR 0000000000000408 from > 00000000:00000fff to ffffffff:ffffffff. (XEN) traps.c:2263:d0 Domain > attempted WRMSR 000000000000040c from 00000000:00000003 to > ffffffff:ffffffff. (XEN) traps.c:2263:d0 Domain attempted WRMSR > 0000000000000410 from 00000000:3fffffff to ffffffff:ffffffff. (XEN) > traps.c:2263:d0 Domain attempted WRMSR 0000000000000414 from > 00000000:00000001 to ffffffff:ffffffff. > > which isn''t really nice.I looked into the code, current CONFIG_X86_MCE is defaultly ''y'', so mce.o DOM0 will be compiled by default. Yet we now only support intel 64 bit for DOM0 vMSR. I guess this is the reason why you see those printings. Shall I sent a patch to make stricker limitations for compiling mce.o?> > Jan > >>>> "Ke, Liping" <liping.ke@intel.com> 15.06.09 10:44 >>> Hi, keir > > This is the final patch for adding MCA logging support in DOM0 which > is Acked by Jan. > And thanks for Jan''s help! > > Regards, > Criping > > Jan Beulich wrote: >> Thanks - looks goo now to me (i.e. Acked-by: Jan Beulich >> <jbeulich@novell.com> if we have such a concept in Xen). >> >> Jan_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2009-Jul-01 06:42 UTC
RE: [Xen-devel] [PATCH] DOM0: Adding MCA Loging support in DOM0
>>> "Ke, Liping" <liping.ke@intel.com> 01.07.09 03:42 >>> >Jan Beulich wrote: >> Actually, there are still two issues with this: >> >> First, shouldn''t bind_virq_for_mce() and the associated message >> printing >> really only happen on Dom0 (i.e. be guarded by >> is_initial_xendomain()), >> just like the name suggests? > >Yes. DOM0 is the only one who''s responsible for MCE logging (Even >the error address belongs to other Guest). Currently mce_dom0.o is >only compiled when it is a privileged guest [depends on (!XEN_UNPRIVILEGED_GUEST) >according to Kconfig]. bind_virq_for_mce() is also protected by the same thing > [#if defined(CONFIG_X86_XEN_MCE)] You mean we still need other guard?!XEN_UNPRIVILEGED_GUEST doesn''t mean the kernel can''t run as DomU, it''s only the other way around. Hence a build-time check only isn''t sufficient.>> >> Second, I''m now seeing >> >> (XEN) traps.c:2263:d0 Domain attempted WRMSR 000000000000017b from >> 00000000:0000003f to ffffffff:ffffffff. (XEN) traps.c:2263:d0 Domain >> attempted WRMSR 0000000000000400 from 00000000:000000ff to >> ffffffff:ffffffff. (XEN) traps.c:2263:d0 Domain attempted WRMSR >> 0000000000000404 from 00000000:ffffffff to ffffffff:ffffffff. (XEN) >> traps.c:2263:d0 Domain attempted WRMSR 0000000000000408 from >> 00000000:00000fff to ffffffff:ffffffff. (XEN) traps.c:2263:d0 Domain >> attempted WRMSR 000000000000040c from 00000000:00000003 to >> ffffffff:ffffffff. (XEN) traps.c:2263:d0 Domain attempted WRMSR >> 0000000000000410 from 00000000:3fffffff to ffffffff:ffffffff. (XEN) >> traps.c:2263:d0 Domain attempted WRMSR 0000000000000414 from >> 00000000:00000001 to ffffffff:ffffffff. >> >> which isn''t really nice. > >I looked into the code, current CONFIG_X86_MCE is defaultly ''y'', so mce.o >DOM0 will be compiled by default. Yet we now only support intel 64 bit for >DOM0 vMSR. I guess this is the reason why you see those printings. >Shall I sent a patch to make stricker limitations for compiling mce.o?I''m not sure if you can avoid compiling mce.o altogether - if that''s possible, it would certainly be the best approach. If not, some other mechanism to suppress the actual hardware touching bits would be desirable. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ke, Liping
2009-Jul-01 07:01 UTC
RE: [Xen-devel] [PATCH] DOM0: Adding MCA Loging support in DOM0
Jan Beulich wrote:>>>> "Ke, Liping" <liping.ke@intel.com> 01.07.09 03:42 >>> >> Jan Beulich wrote: >>> Actually, there are still two issues with this: >>> >>> First, shouldn''t bind_virq_for_mce() and the associated message >>> printing really only happen on Dom0 (i.e. be guarded by >>> is_initial_xendomain()), >>> just like the name suggests? >> >> Yes. DOM0 is the only one who''s responsible for MCE logging (Even >> the error address belongs to other Guest). Currently mce_dom0.o is >> only compiled when it is a privileged guest [depends on >> (!XEN_UNPRIVILEGED_GUEST) according to Kconfig]. bind_virq_for_mce() >> is also protected by the same thing [#if >> defined(CONFIG_X86_XEN_MCE)] You mean we still need other guard? > > !XEN_UNPRIVILEGED_GUEST doesn''t mean the kernel can''t run as DomU, > it''s only the other way around. Hence a build-time check only isn''t > sufficient. >Ok, I will send out a patch adding guard before calling bind_virq_for_mce.>>> >>> Second, I''m now seeing >>> >>> (XEN) traps.c:2263:d0 Domain attempted WRMSR 000000000000017b from >>> 00000000:0000003f to ffffffff:ffffffff. (XEN) traps.c:2263:d0 Domain >>> attempted WRMSR 0000000000000400 from 00000000:000000ff to >>> ffffffff:ffffffff. (XEN) traps.c:2263:d0 Domain attempted WRMSR >>> 0000000000000404 from 00000000:ffffffff to ffffffff:ffffffff. (XEN) >>> traps.c:2263:d0 Domain attempted WRMSR 0000000000000408 from >>> 00000000:00000fff to ffffffff:ffffffff. (XEN) traps.c:2263:d0 Domain >>> attempted WRMSR 000000000000040c from 00000000:00000003 to >>> ffffffff:ffffffff. (XEN) traps.c:2263:d0 Domain attempted WRMSR >>> 0000000000000410 from 00000000:3fffffff to ffffffff:ffffffff. (XEN) >>> traps.c:2263:d0 Domain attempted WRMSR 0000000000000414 from >>> 00000000:00000001 to ffffffff:ffffffff. >>> >>> which isn''t really nice. >> >> I looked into the code, current CONFIG_X86_MCE is defaultly ''y'', so >> mce.o >> DOM0 will be compiled by default. Yet we now only support intel 64 >> bit for >> DOM0 vMSR. I guess this is the reason why you see those printings. >> Shall I sent a patch to make stricker limitations for compiling >> mce.o? > > I''m not sure if you can avoid compiling mce.o altogether - if that''s > possible, it would certainly be the best approach. If not, some other > mechanism to suppress the actual hardware touching bits would be > desirable.I plan first send a patch making mce.o compiled for x86_64 Intel arch. Since we do plan to make AMD and Intel have a common MCA handling logic. Both AMD and Intel arch shall support vMSR then. This common handler might need some time. How do you think?> > Jan_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel