Trenta sis
2013-Sep-08 14:35 UTC
Re: IBM HS20 Xen 4.1 and 4.2 Critical Interrupt - Front panel NMI crash
Hello,

I have the same error: the server reboots automatically during every boot with the Xen kernel. On an HS20 with Debian Wheezy, Xen hangs and the AMM management module shows the same errors described in the previous mails. Debian Wheezy with the non-Xen kernel boots correctly, so the problem seems to be with the Xen kernel. The same HS20 server works perfectly with Debian Lenny + Xen 3.2 or Debian Squeeze + Xen 4.0.

I upgraded to Debian testing and unstable, with the same results on Xen 4.1 and 4.2.

If you need more information, just ask. How can this bug be solved?

Thanks

> On Fri, Feb 08, 2013 at 03:08:08PM +0100, agya naila wrote:
> > Hello all,
> >
> > Today Xen is finally running on the IBM blade server machine. I tried adding
> > nmi=dom0, then found the Base Board Management Controller in the BIOS
> > configuration and disabled the 'reboot system on nmi' attribute. This step
> > won't eliminate the NMI problem, since I still find NMI error interrupts in
> > my blade server log, but Xen ignores them and keeps running. If anyone
> > finds a better solution, that would be great.
>
> Thanks for the 'workaround' info.
>
> We still should find out what exactly generates/causes that NMI with Xen..
>
> -- Pasi
>
> > Agya
> >
> > On Thu, Feb 7, 2013 at 9:51 PM, Fabian Arrotin <arr...@centos.org> wrote:
> > > On 02/06/2013 02:39 PM, agya naila wrote:
> > > > I configured it by adding nmi=ignore to my /boot/grub/grub.cfg
> > >
> > > Just to add that I also tried the nmi=ignore parameter for Xen, and it
> > > still hard reboots/resets automatically during the dom0 kernel boot.
> > >
> > > Fabian

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
Trenta sis
2013-Sep-08 14:41 UTC
IBM HS20 Xen 4.1 and 4.2 Critical Interrupt - Front panel NMI crash
Hello,

I have the same error: the server reboots automatically during every boot with the Xen kernel. On an HS20 with Debian Wheezy, Xen hangs and the AMM management module shows the same errors described in the previous mails. Debian Wheezy with the non-Xen kernel boots correctly, so the problem seems to be with the Xen kernel. The same HS20 server works perfectly with Debian Lenny + Xen 3.2 or Debian Squeeze + Xen 4.0.

I upgraded to Debian testing and unstable, with the same results on Xen 4.1 and 4.2.

If you need more information, just ask. How can this bug be solved?

Thanks

> On Fri, Feb 08, 2013 at 03:08:08PM +0100, agya naila wrote:
> > Hello all,
> >
> > Today Xen is finally running on the IBM blade server machine. I tried adding
> > nmi=dom0, then found the Base Board Management Controller in the BIOS
> > configuration and disabled the 'reboot system on nmi' attribute. This step
> > won't eliminate the NMI problem, since I still find NMI error interrupts in
> > my blade server log, but Xen ignores them and keeps running. If anyone
> > finds a better solution, that would be great.
>
> Thanks for the 'workaround' info.
>
> We still should find out what exactly generates/causes that NMI with Xen..
>
> -- Pasi
>
> > Agya
> >
> > On Thu, Feb 7, 2013 at 9:51 PM, Fabian Arrotin <arr...@centos.org> wrote:
> > > On 02/06/2013 02:39 PM, agya naila wrote:
> > > > I configured it by adding nmi=ignore to my /boot/grub/grub.cfg
> > >
> > > Just to add that I also tried the nmi=ignore parameter for Xen, and it
> > > still hard reboots/resets automatically during the dom0 kernel boot.
> > >
> > > Fabian
Konrad Rzeszutek Wilk
2013-Sep-09 19:15 UTC
Re: IBM HS20 Xen 4.1 and 4.2 Critical Interrupt - Front panel NMI crash
On Sun, Sep 08, 2013 at 04:41:02PM +0200, Trenta sis wrote:
> Hello,
>
> I have the same error: the server reboots automatically during every boot
> with the Xen kernel. On an HS20 with Debian Wheezy, Xen hangs and the AMM
> management module shows the same errors described in the previous mails.
> Debian Wheezy with the non-Xen kernel boots correctly, so the problem seems
> to be with the Xen kernel. The same HS20 server works perfectly with Debian
> Lenny + Xen 3.2 or Debian Squeeze + Xen 4.0.
>
> I upgraded to Debian testing and unstable, with the same results on Xen 4.1
> and 4.2.
>
> If you need more information, just ask.
> How can this bug be solved?

Did the workaround help?

And in regards to finding out exactly what causes it: there are logs in the BMC that can point to the PCI device. Did you check those? Do they say whether any device has a PCI SERR on it?

Thanks.
Trenta sis
2013-Sep-12 12:47 UTC
Re: IBM HS20 Xen 4.1 and 4.2 Critical Interrupt - Front panel NMI crash
Hello,

We need this server, so we have downgraded it to Debian Squeeze. I hope to have another HS20 in a few days to run some additional tests; I'll try to collect all the information you asked for and send it.

Sorry, one question: what is PCI SERR, and where do I find it?

Thanks for everything

2013/9/9 Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Did the workaround help?
>
> And in regards to finding out exactly what causes it: there are logs in the
> BMC that can point to the PCI device. Did you check those? Do they say
> whether any device has a PCI SERR on it?
>
> Thanks.
Konrad Rzeszutek Wilk
2013-Sep-23 14:02 UTC
Re: IBM HS20 Xen 4.1 and 4.2 Critical Interrupt - Front panel NMI crash
On Thu, Sep 12, 2013 at 02:47:39PM +0200, Trenta sis wrote:
> Hello,
>
> We need this server, so we have downgraded it to Debian Squeeze.
> I hope to have another HS20 in a few days to run some additional tests; I'll
> try to collect all the information you asked for and send it.
> Sorry, one question: what is PCI SERR, and where do I find it?

If you log in to the BladeCenter web frontend you should see logs for each blade. Some of them are 'User XYZ logged in'. But in some cases there are more serious ones, such as an NMI or PCI SERR. If you could copy-n-paste them, it could help in figuring out which PCI device is responsible for causing the NMI.

> Thanks for everything
Trenta sis
2013-Sep-29 10:47 UTC
Re: IBM HS20 Xen 4.1 and 4.2 Critical Interrupt - Front panel NMI crash
Hello,

The BladeCenter web frontend shows:

27 I Blade_09 09/08/13 13:25:17 0x806f0013 Chassis, (NMI State) diagnostic interrupt
28 E Blade_09 09/08/13 13:25:12 0x10000002 SMI Hdlr: 00151743 HI Fatal Error, HI_FERR/NERR Value= 0020
29 I Blade_09 09/08/13 13:09:14 0x806f0013 Recovery Chassis, (NMI State) diagnostic interrupt
30 I Blade_09 09/08/13 13:09:03 0x806f0013 Chassis, (NMI State) diagnostic interrupt
31 E Blade_09 09/08/13 13:08:58 0x10000002 SMI Hdlr: 00151743 HI Fatal Error, HI_FERR/NERR Value= 0020
32 I Blade_09 09/08/13 12:46:26 0x806f0013 Recovery Chassis, (NMI State) diagnostic interrupt
33 I Blade_09 09/08/13 12:46:15 0x806f0013 Chassis, (NMI State) diagnostic interrupt
34 E Blade_09 09/08/13 12:46:11 0x10000002 SMI Hdlr: 00151743 HI Fatal Error, HI_FERR/NERR Value= 0020
35 I Blade_09 09/08/13 12:34:13 0x806f0013 Recovery Chassis, (NMI State) diagnostic interrupt
36 I Blade_09 09/08/13 12:34:03 0x806f0013 Chassis, (NMI State) diagnostic interrupt
37 E Blade_09 09/08/13 12:33:58 0x10000002 SMI Hdlr: 00151743 HI Fatal Error, HI_FERR/NERR Value= 0020
38 I Blade_09 09/08/13 12:27:25 0x806f0013 Recovery Chassis, (NMI State) diagnostic interrupt
39 I Blade_09 09/08/13 12:27:14 0x806f0013 Chassis, (NMI State) diagnostic interrupt
40 E Blade_09 09/08/13 12:27:10 0x10000002 SMI Hdlr: 00151743 HI Fatal Error, HI_FERR/NERR Value= 0020
41 I Blade_09 09/08/13 12:20:45 0x806f0013 Recovery Chassis, (NMI State) diagnostic interrupt
42 I Blade_09 09/08/13 12:20:34 0x806f0013 Chassis, (NMI State) diagnostic interrupt
43 E Blade_09 09/08/13 12:20:30 0x10000002 SMI Hdlr: 00151743 HI Fatal Error, HI_FERR/NERR Value= 0020
44 I Blade_09 09/08/13 12:18:20 0x806f0013 Recovery Chassis, (NMI State) diagnostic interrupt
45 I Blade_09 09/08/13 12:18:10 0x806f0013 Chassis, (NMI State) diagnostic interrupt
46 E Blade_09 09/08/13 12:18:05 0x10000002 SMI Hdlr: 00151743 HI Fatal Error, HI_FERR/NERR Value= 0020
47 I Blade_09 09/08/13 12:15:47 0x806f0013 Recovery Chassis, (NMI State) diagnostic interrupt
48 I Blade_09 09/08/13 12:15:37 0x806f0013 Chassis, (NMI State) diagnostic interrupt
49 E Blade_09 09/08/13 12:15:32 0x10000002 SMI Hdlr: 00151743 HI Fatal Error, HI_FERR/NERR Value= 0020

Thanks

2013/9/23 Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> If you log in to the BladeCenter web frontend you should see logs for each
> blade. Some of them are 'User XYZ logged in'. But in some cases there are
> more serious ones, such as an NMI or PCI SERR. If you could copy-n-paste
> them, it could help in figuring out which PCI device is responsible for
> causing the NMI.
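The log above repeats one pattern: an 'E' entry from the SMI handler, an NMI entry a few seconds later, then a 'Recovery' entry. As a rough sketch for anyone who wants to check that pattern on their own blade, the following Python parses three of the entries copied from the log and measures the gap between the fatal error and the NMI. The function names are made up for this illustration, and the date format is assumed to be MM/DD/YY.

```python
from datetime import datetime

# Three entries copied from the log above (the AMM lists newest first).
LOG = """\
29 I Blade_09 09/08/13 13:09:14 0x806f0013 Recovery Chassis, (NMI State) diagnostic interrupt
30 I Blade_09 09/08/13 13:09:03 0x806f0013 Chassis, (NMI State) diagnostic interrupt
31 E Blade_09 09/08/13 13:08:58 0x10000002 SMI Hdlr: 00151743 HI Fatal Error, HI_FERR/NERR Value= 0020
"""


def parse_entry(line):
    """Split one log line into fields. Assumption: the date is MM/DD/YY."""
    index, severity, source, date, time, event_id, message = line.split(None, 6)
    when = datetime.strptime(f"{date} {time}", "%m/%d/%y %H:%M:%S")
    return {
        "index": int(index),
        "severity": severity,
        "source": source,
        "when": when,
        "event_id": event_id,
        "message": message,
    }


def parse_log(text):
    """Parse every non-empty line of a pasted log."""
    return [parse_entry(line) for line in text.splitlines() if line.strip()]


def seconds_from_error_to_nmi(entries):
    """For each 'E' entry, return the seconds until the next NMI assertion."""
    ordered = sorted(entries, key=lambda entry: entry["when"])
    gaps = []
    for position, entry in enumerate(ordered):
        if entry["severity"] != "E":
            continue
        for later in ordered[position + 1:]:
            is_nmi = "NMI State" in later["message"]
            is_recovery = later["message"].startswith("Recovery")
            if is_nmi and not is_recovery:
                gaps.append((later["when"] - entry["when"]).total_seconds())
                break
    return gaps


if __name__ == "__main__":
    print(seconds_from_error_to_nmi(parse_log(LOG)))
```

For these three entries the gap is 5 seconds (13:08:58 to 13:09:03), which fits the reading that the HI fatal error is what raises the NMI rather than the other way round.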
Konrad Rzeszutek Wilk
2013-Sep-30 14:13 UTC
Is: 0xCF8 on extended config space instead of MCONF? Was:Re: IBM HS20 Xen 4.1 and 4.2 Critical Interrupt - Front panel NMI crash
> Hdlr: 00151743 HI Fatal Error, HI_FERR/NERR Value= 0020
> 27 I Blade_09 09/08/13 13:25:17 0x806f0013 Chassis, (NMI State) diagnostic interrupt
> 28 E Blade_09 09/08/13 13:25:12 0x10000002 SMI Hdlr: 00151743 HI Fatal Error, HI_FERR/NERR Value= 0020

Doing a simple Google search on HI_FERR tells me that it is described in:

http://www.intel.com/content/dam/doc/datasheet/e7525-memory-controller-hub-datasheet.pdf

and that

3.6.14 HI_FERR – Hub Interface First Error Register (D0:F1)

has something in it. The value is 0020 (is that decimal or hex?). If it is decimal, it is 10100 in binary, which is bits 2 and 4:

bit 2:
HI Internal Parity Error Detected. This bit is sticky through reset. System software clears this bit by writing a ‘1’ to the location.
0 = No Internal Parity error detected.
1 = MCH HI bridge has detected an Internal Parity error. Non-fatal.

and bit 4:
HI Data Parity Error Detected. This bit is sticky through reset. System software clears this bit by writing a ‘1’ to the location.
0 = No HI data parity error.
1 = MCH has detected a parity error on the data phase of a HI transaction.

But that is unlikely as these are 'non-fatal'. So if this is hex, then it would be bit 5, which is:

Enhanced Configuration Access Error. This bit is sticky through reset. System software clears this bit by writing a ‘1’ to the location.
0 = No Enhanced Configuration Access error
1 = A PCI Express* Enhanced Configuration access was mistakenly targeting the legacy interface. Fatal

That sounds more like it. So we touched a PCIe Enhanced Configuration (MMCONFIG?) using the legacy interface (cf8?).

Jan, any thoughts? Is there a particular bug-fix we are missing in Xen 4.1 or Xen 4.2 for this? Xen 4.0 seems to work.

Trenta,

When you used Xen 4.0 did you use the same kernel as with Xen 4.1 or Xen 4.2?
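The decimal-versus-hex arithmetic above is easy to double-check. Here is a small Python sketch that decodes the logged value both ways and lists the set bits. The bit names are only the three quoted from the datasheet in this message; the helper names are invented for the illustration.

```python
# Bit descriptions exactly as quoted in this thread from the E7525 datasheet.
# Other bits of HI_FERR exist but were not quoted here, so they are not named.
QUOTED_BITS = {
    2: "HI Internal Parity Error Detected",
    4: "HI Data Parity Error Detected",
    5: "Enhanced Configuration Access Error",
}


def set_bits(value):
    """Return the positions of the bits set in value, lowest first."""
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


def describe(value):
    """Pair each set bit with its quoted name, if the thread quoted one."""
    return [(bit, QUOTED_BITS.get(bit, "not quoted in this thread"))
            for bit in set_bits(value)]


if __name__ == "__main__":
    raw = "0020"  # the value as printed in the AMM log
    for base, label in ((10, "decimal"), (16, "hex")):
        value = int(raw, base)
        print(f"{label}: {value} = 0b{value:b}")
        for bit, name in describe(value):
            print(f"  bit {bit}: {name}")
```

Read as decimal, 0020 is 0b10100 (bits 2 and 4); read as hex it is 0b100000 (bit 5 only), matching the two cases worked out above.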
Jan Beulich
2013-Sep-30 15:40 UTC
Is: 0xCF8 on extended config space instead of MCONF? Was:Re: IBM HS20 Xen 4.1 and 4.2 Critical Interrupt - Front panel NMI crash
>>> On 30.09.13 at 16:13, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> But that is unlikely as these are 'non-fatal'. So if this is hex, then it
> would be bit 5, which is:
>
> Enhanced Configuration Access Error. This bit is sticky through reset.
> System software clears this bit by writing a ‘1’ to the location.
> 0 = No Enhanced Configuration Access error
> 1 = A PCI Express* Enhanced Configuration access was mistakenly targeting
> the legacy interface. Fatal
>
> That sounds more like it. So we touched a PCIe Enhanced Configuration
> (MMCONFIG?) using the legacy interface (cf8?).
>
> Jan, any thoughts? Is there a particular bug-fix we are missing in Xen 4.1
> or Xen 4.2 for this? Xen 4.0 seems to work.

Possibly MMCONF just didn't get used on 4.0? And no, I don't think I recall any possibly relevant change. Even more, the description above sounds more like an error resulting from device misbehavior than from software incorrectly doing some access.

Jan
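For readers who have not met the two config-space access methods being discussed, here is a minimal Python sketch of how each one encodes a bus/device/function/register address. It shows only the standard address layouts (legacy mechanism via port 0xCF8, and the memory-mapped MMCONFIG window); it is not taken from Xen or Linux, and the function names are invented for the illustration.

```python
def cf8_address(bus, dev, fn, reg):
    """Value written to I/O port 0xCF8 for a legacy config-space access.

    Layout: bit 31 enable, bits 23:16 bus, 15:11 device, 10:8 function,
    7:2 register (dword aligned). Only registers 0x00-0xFF are reachable.
    """
    if not 0 <= reg < 0x100:
        raise ValueError("legacy access cannot reach extended config space")
    return 0x80000000 | (bus << 16) | (dev << 11) | (fn << 8) | (reg & 0xFC)


def mmconfig_offset(bus, dev, fn, reg):
    """Byte offset of a register inside the memory-mapped config window.

    Layout: bits 27:20 bus, 19:15 device, 14:12 function, 11:0 register,
    so each function gets 4 KiB and registers 0x000-0xFFF are reachable.
    """
    if not 0 <= reg < 0x1000:
        raise ValueError("register offset out of range")
    return (bus << 20) | (dev << 15) | (fn << 12) | reg


if __name__ == "__main__":
    # Device 0, function 1 is where the thread says HI_FERR lives (D0:F1).
    # The register offsets below are arbitrary examples, not HI_FERR's offset.
    print(hex(cf8_address(0, 0, 1, 0x40)))
    print(hex(mmconfig_offset(0, 0, 1, 0x100)))
```

The point of the sketch is the range check: a register at offset 0x100 or above has no encoding in the legacy format, which is the kind of mismatch the "Enhanced Configuration Access Error" bit description refers to. Whether software or the device is at fault here is exactly what is still open in this thread.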
Trenta sis
2013-Oct-04 16:31 UTC
Re: Is: 0xCF8 on extended config space instead of MCONF? Was:Re: IBM HS20 Xen 4.1 and 4.2 Critical Interrupt - Front panel NMI crash
Hi,

With Xen 4.0, the kernel used was 2.6.32, the default kernel in Debian 6 (Squeeze).

Thanks

2013/9/30 Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Trenta,
>
> When you used Xen 4.0 did you use the same kernel as with Xen 4.1 or Xen
> 4.2?
Konrad Rzeszutek Wilk
2013-Oct-04 16:55 UTC
Re: Is: 0xCF8 on extended config space instead of MCONF? Was:Re: IBM HS20 Xen 4.1 and 4.2 Critical Interrupt - Front panel NMI crash
On Fri, Oct 04, 2013 at 06:31:37PM +0200, Trenta sis wrote:
> Hi,
>
> With Xen 4.0, the kernel used was 2.6.32, the default kernel in Debian 6
> (Squeeze).
> Thanks

So if you swap either the kernel or the hypervisor, do you see this? Meaning if you run Xen 4.2 + 2.6.32, or Xen 4.0 + the current kernel.