On Wednesday, January 29, 2014 6:49:21 pm Tim Daneliuk wrote:
> Resending in hopes that people on one of the other lists will have some
> insight here:
>
> On 01/27/2014 10:50 PM, Tim Daneliuk wrote:
> > I am running 9.2 stable i386 r261207. As noted earlier:
> >
> >> I just replaced mobo/CPU on FBSD server (Gigabyte Z-87-D3HP with
> >> an Intel i3-4130). I am not overclocking ... but I continue to
> >> see this sort of thing:
> >
> >> MCA: CPU 0 COR (1) internal parity error
> >
> > Dmesg shows:
> >
> >> MCA: Vendor "GenuineIntel", ID 0x306c3, APIC ID 0
> >> MCA: CPU 0 COR (1) internal parity error
> >> MCA: Bank 0, Status 0x90000040000f0005
> >> MCA: Global Cap 0x0000000000000c07, Status 0x0000000000000000
> >
> > I've swapped CPUs (i5). I've fiddled with an endless supply of
> > mobo settings. I've switched power supplies. I've moved mem
> > sticks around .... No joy.
> >
> > So, I dug through the sources and found this:
> >
> >
> >
> > mca_log(const struct mca_record *rec)
> > {
> >     uint16_t mca_error;
> >
> >     printf("MCA: Bank %d, Status 0x%016llx\n", rec->mr_bank,
> >         (long long)rec->mr_status);
> >     printf("MCA: Global Cap 0x%016llx, Status 0x%016llx\n",
> >         (long long)rec->mr_mcg_cap, (long long)rec->mr_mcg_status);
> >     printf("MCA: Vendor \"%s\", ID 0x%x, APIC ID %d\n", cpu_vendor,
> >         rec->mr_cpu_id, rec->mr_apic_id);
> >     printf("MCA: CPU %d ", rec->mr_cpu);
> >     if (rec->mr_status & MC_STATUS_UC)
> >         printf("UNCOR ");
> >     else {
> >         printf("COR ");
> >         if (rec->mr_mcg_cap & MCG_CAP_CMCI_P)
> >             printf("(%lld) ", ((long long)rec->mr_status &
> >                 MC_STATUS_COR_COUNT) >> 38);
> >     }
> >
> >
> > It looks like the trailing else clause is kicking out the error, but I am
> > unclear what the error means, beyond the fact that it appears to be a parity
> > error somewhere within the CPU's internal memory (cache?). Is this error
> > getting corrected? Is this benign? Should I get a different mobo?
> >
> > Um .... Haaaaalp :)
>
>
> I have now tried different motherboards, CPUs, memory, and power supplies and
> this error is still showing up now and then.
>
> This points strongly to either FreeBSD bogus reporting, or these errors being
> benign. It's hard to believe that the exact same error might occur with
> completely different hardware ... unless it's being caused by the case.

Are they all the same model CPU? Since it is a corrected error you can
probably ignore it, but it is not bogus reporting. FreeBSD only reports
these errors because they show up in registers on your CPU.
--
John Baldwin