flyfan05
2010-Dec-12  07:08 UTC
[Xen-devel] CMCI exceptions happened and MCE entry state transition made Xen crashed.
Hi all, 
Three days ago, the server reported a large number of CMCI exceptions, and
Xen 3.4.2 printed hundreds of "CMCI: send CMCI to DOM0 through virq" messages
to the console. From the console output, I can see that Dom0 then tried to
read the MSR_CAP registers via the #GP trap in order to log the MCA error.
I am not sure why so many CMCIs occurred; maybe something is wrong with
the hardware. Unfortunately, the server crashed in the end. Xen hit a BUG_ON
at
    mctelem_append_processing()
    ->  MCTE_TRANSITION_STATE(tep, COMMITTED, PROCESSING)
    ->  BUG_ON(MCTE_STATE(tep) != (MCTE_F_STATE_##old));
The console output looks like this:
(XEN) Xen bug on at mctelem.c : Line 437
Why is the state of the entry incorrect? Did something change it unexpectedly?
If anybody has resolved this kind of problem before, please let me know.
--Van
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
flyfan05
2010-Dec-12  07:11 UTC
[Xen-users] CMCI exceptions happened and MCE entry state transition made Xen crashed.
Hi all, 
Three days ago, the server reported a large number of CMCI exceptions, and
Xen 3.4.2 printed hundreds of "CMCI: send CMCI to DOM0 through virq" messages
to the console. From the console output, I can see that Dom0 then tried to
read the MSR_CAP registers via the #GP trap in order to log the MCA error.
I am not sure why so many CMCIs occurred; maybe something is wrong with
the hardware. Unfortunately, the server crashed in the end. Xen hit a BUG_ON
at
    mctelem_append_processing()
    ->  MCTE_TRANSITION_STATE(tep, COMMITTED, PROCESSING)
    ->  BUG_ON(MCTE_STATE(tep) != (MCTE_F_STATE_##old));
The console output looks like this:
(XEN) Xen bug on at mctelem.c : Line 437
Why is the state of the entry incorrect? Did something change it unexpectedly?
If anybody has resolved this kind of problem before, please let me know.
--Van
_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
Keir Fraser
2010-Dec-13  09:53 UTC
Re: [Xen-devel] CMCI exceptions happened and MCE entry state transition made Xen crashed.
Sounds like your system probably has a bad memory DIMM. If the MCE logic is
causing problems you can turn it off with the mce=0 Xen boot parameter. This
will cause all these correctable errors to be ignored rather than logged, and
uncorrectable errors will cause an immediate hypervisor stop-and-crash rather
than have dom0 attempt to fix up.

 K.

On 12/12/2010 07:08, "flyfan05" <flyfan05@163.com> wrote:

> Hi all,
> Three days ago, the server reported lots of CMCI exceptions and Xen 3.4.2
> printed hundreds of "CMCI: send CMCI to DOM0 through virq" messages to the
> console. From the console output, Then I can see that Dom0 try to read the
> MSR_CAP regs by #GP trap in order to log the MCA error.
>
> I am not sure why so many CMCI happened, maybe there were some thing wrong
> with the hardware. But unfortunately the server crashed at the end. The Xen
> BUG ON at
>     mctelem_append_processing()
>     -> MCTE_TRANSITION_STATE(tep, COMMITTED, PROCESSING)
>     -> BUG_ON(MCTE_STATE(tep) != (MCTE_F_STATE_##old));
> The output of the console is like this:
> (XEN) Xen bug on at mctelem.c : Line 437
>
> Why the state of the entry is not correct ? Some one change that unexpected?
> If any body even resolve this kind problems, Pls do me a favor.
>
> --Van