SUZUKI Kazuhiro
2008-Sep-26 06:04 UTC
[Xen-devel] [RFC][PATCH 0/2] MCA support for Intel64
Hi,

I am interested in MCA/MCE on x86 systems. Since I don't have an AMD machine, I tried porting the AMD K8 MCA handler to the Intel P4; the result is attached. I also attach a patch that adds MCA handler support to linux-2.6.18-xen/x86_64.

Unfortunately, I don't know how to test MCA/MCE on an Intel P4 CPU and chipset, so I could not test with a real MCA. I tested only in fake mode (plain function calls). Does anyone know how to test with a real MCA and could explain it to me?

[1/2] xen part: mca-support-for-intel-xen.patch
[2/2] linux/x86_64 part: mca-support-for-intel-linux.patch

Signed-off-by: Kazuhiro Suzuki <kaz@jp.fujitsu.com>

Thanks,
KAZ

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Jiang, Yunhong
2008-Sep-26 14:35 UTC
RE: [Xen-devel] [RFC][PATCH 0/2] MCA support for Intel64
Glad to know you are working on MCA too. There has already been some discussion of it; a search turned up http://article.gmane.org/gmane.comp.emulators.xen.devel/56284, so you can check that thread.

Some points that need discussion:

1) How do we detect the impacted components on the Xen side? As discussed in the thread above, the context could be checked more precisely. For example, if the stack tells us the MCE happened while Xen was running, we should treat Xen as impacted even if current is idle. We can also check the page owner to decide which guest is impacted. Personally, I don't think "current" is very helpful.

2) How do we split the work between Xen and Dom0? Maybe we can reuse dom0's MCE handler as much as possible and have Xen do only some initial containment, especially since Linux is also enhancing its MCE handler. Of course, another option is to place the whole handler in Xen.

3) We need to consider what happens if multiple MCAs occur on multiple CPUs simultaneously; some monarch algorithm may be needed.

I hope to get your input on these.

BTW, I checked your kernel patch for mce-xen.c, and it does not seem to change much. You will add more changes to it, right?

Thanks
Yunhong Jiang
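Point 1 can be illustrated with a minimal sketch. It is not taken from either patch: the names `mce_context`, `classify_mce`, `OWNER_XEN` and `OWNER_UNKNOWN` are hypothetical, and the sketch assumes the handler has already worked out from the saved stack frame whether Xen code was interrupted and has already looked up the owner of the page named by the error address. It also assumes that an error taken while Xen was running is attributed to Xen even when the page belongs to a guest. "current" is deliberately not consulted.

```c
#include <stdbool.h>

/* Hypothetical owner codes for the page named by the error address. */
#define OWNER_UNKNOWN (-1)  /* no valid address, or the page has no owner */
#define OWNER_XEN     (-2)  /* the page belongs to the hypervisor itself */

enum mce_impact { IMPACT_UNKNOWN, IMPACT_XEN, IMPACT_GUEST };

struct mce_context {
    bool in_hypervisor;  /* saved frame shows the CPU was executing Xen code */
    int  page_owner;     /* domain id >= 0, OWNER_XEN, or OWNER_UNKNOWN */
};

struct mce_verdict {
    enum mce_impact impact;
    int domid;           /* meaningful only when impact == IMPACT_GUEST */
};

/* Decide who is impacted from the interrupted context and the page owner. */
struct mce_verdict classify_mce(const struct mce_context *ctx)
{
    struct mce_verdict v = { IMPACT_UNKNOWN, OWNER_UNKNOWN };

    if (ctx->in_hypervisor || ctx->page_owner == OWNER_XEN) {
        /* Xen was running or owns the page: treat Xen as impacted. */
        v.impact = IMPACT_XEN;
    } else if (ctx->page_owner >= 0) {
        /* The page owner, not "current", identifies the impacted guest. */
        v.impact = IMPACT_GUEST;
        v.domid = ctx->page_owner;
    }
    return v;
}
```

A real handler would need a more conservative answer for the unknown case than this sketch gives.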
SUZUKI Kazuhiro
2008-Oct-02 07:27 UTC
Re: [Xen-devel] [RFC][PATCH 0/2] MCA support for Intel64
Hi,

Thank you for your comments and information. However, I am not sure where to start on these issues.

> BTW, I checked you patch to kernel of mce-xen.c, seems not much change to it. You will add more changes on it, right?

Yes, I plan to implement memory offlining: if MCA detects impacted physical memory, that memory will be offlined and never reused. Xen then notifies Dom0/DomU, whose handler should kill the affected process or take some other action.

Thanks,
KAZ
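The offlining plan can be sketched as a toy frame allocator. This is only an illustration under stated assumptions, not code from the patches: `offline_map`, `inuse_map`, `alloc_frame`, `free_frame` and `offline_frame` are hypothetical names, and the 64-frame size is chosen so that one 64-bit word holds each bitmap. It shows the two properties described above: an offlined frame is never handed out again, and the caller learns whether an owner still has to be notified.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_FRAMES 64u            /* toy size so one 64-bit word is enough */

static uint64_t offline_map;     /* bit n set: frame n is offlined for good */
static uint64_t inuse_map;       /* bit n set: frame n is currently allocated */

static uint64_t frame_bit(unsigned int pfn) { return UINT64_C(1) << pfn; }

/* Return the lowest frame that is neither in use nor offlined, or -1. */
int alloc_frame(void)
{
    for (unsigned int pfn = 0; pfn < NR_FRAMES; pfn++) {
        if (!((inuse_map | offline_map) & frame_bit(pfn))) {
            inuse_map |= frame_bit(pfn);
            return (int)pfn;
        }
    }
    return -1;
}

/* Release a frame. An offlined frame stays unavailable for allocation. */
void free_frame(unsigned int pfn)
{
    if (pfn < NR_FRAMES)
        inuse_map &= ~frame_bit(pfn);
}

/*
 * Mark a frame offline. Returns true when the frame was in use, meaning
 * the caller still has to notify the owning domain.
 */
bool offline_frame(unsigned int pfn)
{
    if (pfn >= NR_FRAMES)
        return false;
    offline_map |= frame_bit(pfn);
    return (inuse_map & frame_bit(pfn)) != 0;
}
```

Keeping the offline bitmap separate from the in-use bitmap means freeing a bad frame cannot accidentally return it to the pool.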
Jiang, Yunhong
2008-Oct-06 03:19 UTC
RE: [Xen-devel] [RFC][PATCH 0/2] MCA support for Intel64
SUZUKI Kazuhiro <mailto:kaz@jp.fujitsu.com> wrote:
> Yes, I plan to implement memory offlining, if the impacted physical
> memory is detected by MCA, it will be offlined and never reused. Then
> Xen notifies it to Dom0/DomU. Dom0/DomU's handler should kill such
> a process or do something.

Sorry for the slow response; last week was a PRC holiday.

What I meant is that dom0's MCA handler needs more changes. For example, it needs to check who owns the memory.

As for memory offlining, we are also discussing it internally, and we can work together to add the support. Here are some ideas; I hope to get your feedback on them:

1) Memory offlining can be used not only for MCA but also for other purposes, such as memory PM and memory hot add/remove, so maybe we can implement it as a generic feature (although I suspect MCE will be the first user).

2) I think MCA has two types of memory offlining requirement. For a correctable error that hits the same page multiple times, the page may need to be offlined (AFAIK, Solaris has such a feature). In that situation the page can still be accessed, and the hypervisor can replace it transparently to the guest. For a non-correctable error (i.e. one signaled through a machine-check exception), the page may not be accessible any more (as in the data poisoning situation).

3) It may be difficult to offline all types of pages, so we need to categorize page usage and support only some categories. Currently we categorize page usage as free pages, non-critical pages, and critical pages. Non-critical pages are memory used as guest RAM; critical pages are pages for Xen's own use, including Xen's data/code and pages used to control guests, such as p2m and EPT tables. It would be much simpler in the first stage to support only free and non-critical pages. Any ideas?

4) We also need to consider guests with assigned devices, so that the error is not propagated to permanent storage.

Thanks
Yunhong Jiang
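Points 2 and 3 can be combined into a small decision function. This is a minimal sketch, not an agreed design: the names `page_class`, `offline_action` and `decide_offline` are hypothetical, and the threshold of three correctable errors per page is an arbitrary placeholder, since the thread only says "multiple times".

```c
#include <stdbool.h>

/* The three page usage categories from point 3. */
enum page_class { PAGE_FREE, PAGE_NONCRITICAL, PAGE_CRITICAL };

enum offline_action {
    ACT_NONE,          /* keep using the page */
    ACT_OFFLINE,       /* free page: just remove it from the allocator */
    ACT_REPLACE,       /* copy to a new page, remap for the guest, offline */
    ACT_NOTIFY_OWNER,  /* contents lost: offline and tell the owning guest */
    ACT_UNSUPPORTED    /* critical page: not handled in a first stage */
};

#define CE_THRESHOLD 3u  /* hypothetical count of correctable errors per page */

enum offline_action decide_offline(enum page_class cls, bool correctable,
                                   unsigned int ce_count)
{
    /* A few correctable errors on a page are tolerated. */
    if (correctable && ce_count < CE_THRESHOLD)
        return ACT_NONE;

    switch (cls) {
    case PAGE_FREE:
        return ACT_OFFLINE;
    case PAGE_NONCRITICAL:
        /* Still readable after a correctable error, so it can be replaced
         * transparently; after an uncorrectable one the owner must act. */
        return correctable ? ACT_REPLACE : ACT_NOTIFY_OWNER;
    case PAGE_CRITICAL:
    default:
        return ACT_UNSUPPORTED;
    }
}
```

For example, `decide_offline(PAGE_NONCRITICAL, true, 3)` selects transparent replacement, while the same page with an uncorrectable error selects owner notification.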
Christoph Egger
2008-Oct-06 10:20 UTC
Re: [Xen-devel] [RFC][PATCH 0/2] MCA support for Intel64
When I submitted the MCA support patches, I also sent a design document as a PDF file to this list. It should answer the questions raised in Yunhong's message of Friday 26 September 2008.

Christoph

-- 
AMD Saxony, Dresden, Germany
Operating System Research Center