On Sat, Aug 12, 2017 at 05:51:33PM -0400, Steven Tardy wrote:
>
> > On Aug 12, 2017, at 3:50 PM, Fred Smith <fredex at fcshome.stoneham.ma.us> wrote:
> >
> > I had a series of kernel hardware error reports today while I was away
> > from my computer:
> >
> > Message from syslogd at fcshome at Aug 12 10:12:24 ...
> > kernel:[Hardware Error]: MC2 Error: VB Data ECC or parity error.
> >
> > Message from syslogd at fcshome at Aug 12 10:12:24 ...
> > kernel:[Hardware Error]: Error Status: Corrected error, no action required.
> >
> > Message from syslogd at fcshome at Aug 12 10:12:24 ...
> > kernel:[Hardware Error]: CPU:2 (15:2:0) MC2_STATUS[-|CE|MiscV|-|-|-|-|CECC]: 0x98444000010c0176
> >
> > Message from syslogd at fcshome at Aug 12 10:12:24 ...
> > kernel:[Hardware Error]: cache level: L2, tx: DATA, mem-tx: EV
> >
> > never saw anything like that before.
> >
> > cpu is:
> >
> > $ cat /proc/cpuinfo
> > processor : 0
> > vendor_id : AuthenticAMD
> > cpu family : 21
> > model : 2
> > model name : AMD FX(tm)-6300 Six-Core Processor
> > stepping : 0
> > microcode : 0x600084f
> > cpu MHz : 1400.000
> > cache size : 2048 KB
> > physical id : 0
> > siblings : 6
> > core id : 0
> > cpu cores : 3
> > apicid : 16
> > initial apicid : 0
> > fpu : yes
> > fpu_exception : yes
> > cpuid level : 13
> > wp : yes
> > flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc art rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold bmi1
> > bogomips : 7023.90
> > TLB size : 1536 4K pages
> > clflush size : 64
> > cache_alignment : 64
> > address sizes : 48 bits physical, 48 bits virtual
> > power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro
> >
> >
> > six core AMD, above is one of the cores.
> >
> > Any clues to figure out the errors, and/or mitigate?
> >
> > thanks!
> >
> > Fred
>
> MC == Machine check exception.
> The important part of a MC is the "status" code.
> One can use the Intel doc "Architecture Software Developers Manual" to decode this (4000 page .pdf).
> Unsure but it looks like AMD does similar MC codes.
> Luckily Linux does some heavy lifting and decodes to "cache hierarchy error L2 data eviction".
> The next most important part is the "corrected" bit.
>
> Now what does that really mean?
> *shrug*, could be firmware/drivers/overheating/poor-CPU-seating/DIMM-seating/faulty-motherboard/faulty-CPU/faulty-DIMM.

Well. overheating is possible... we don't live in the cleanest possible
house, AND we have cats. so, in general I open up this box twice a year
and vacuum out the house dirt and cat fuzzies. I'm probably overdue for
this task.

This is the first one of these I've had. Hope it's the last. but a
little PM is in order either way.

thanks for the reply.

Fred

>
> Hope that doesn't confuse too much. (:
> _______________________________________________
> CentOS mailing list
> CentOS at centos.org
> https://lists.centos.org/mailman/listinfo/centos

--
---- Fred Smith -- fredex at fcshome.stoneham.ma.us -----------------------------
              The Lord detests the way of the wicked
          but he loves those who pursue righteousness.
----------------------------- Proverbs 15:9 (niv) -----------------------------
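For anyone who wants to check the kernel's decoding by hand, here is a rough sketch that picks apart the MC2_STATUS value quoted above. It assumes the usual machine-check status layout (flag bits 63..57, and a low 16-bit error code of the form 0000 0001 RRRR TTLL for cache hierarchy errors) and treats bits 46/45 as the CECC/UECC flags that the Linux decoder prints on AMD. The function and table names are made up for illustration; this is not the kernel's code.

```python
# Sketch only: decode the status value from the report above.
STATUS = 0x98444000010C0176

# Flag bits in the upper part of MCi_STATUS.
FLAGS = {63: "VAL", 62: "OVER", 61: "UC", 60: "EN",
         59: "MISCV", 58: "ADDRV", 57: "PCC"}

# Assumption: on this AMD family, bit 46 = corrected ECC, bit 45 = uncorrected ECC.
AMD_FLAGS = {46: "CECC", 45: "UECC"}

# Field tables for a memory (cache) hierarchy error code: 0000 0001 RRRR TTLL
LEVELS = ["L0", "L1", "L2", "LG"]
TX_TYPES = ["INSTR", "DATA", "GEN", "RESV"]
MEM_TX = ["GEN", "RD", "WR", "DRD", "DWR", "IRD", "PREFETCH", "EVICT", "SNOOP"]


def decode(status):
    """Return the set flags and, for cache hierarchy errors, the decoded fields."""
    all_flags = {**FLAGS, **AMD_FLAGS}
    flags = [name for bit, name in all_flags.items() if (status >> bit) & 1]
    code = status & 0xFFFF
    result = {"flags": flags, "error_code": code, "corrected": "UC" not in flags}
    if (code & 0xFF00) == 0x0100:  # memory hierarchy error format
        rrrr = (code >> 4) & 0xF
        result["mem_tx"] = MEM_TX[rrrr] if rrrr < len(MEM_TX) else "unknown"
        result["tx"] = TX_TYPES[(code >> 2) & 0x3]
        result["level"] = LEVELS[code & 0x3]
    return result


if __name__ == "__main__":
    for key, value in decode(STATUS).items():
        print(key, "=", value)
```

With the value from the log, the UC bit is clear and the low bits come out as level L2, transaction type DATA, and memory transaction EVICT, which lines up with the "cache level: L2, tx: DATA, mem-tx: EV" line the kernel printed.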
On 08/12/2017 07:24 PM, Fred Smith wrote:
> Well. overheating is possible... we don't live in the cleanest possible
> house, AND we have cats. so, in general I open up this box twice a year
> and vacuum out the house dirt and cat fuzzies. I'm probably overdue for
> this task.

Cleaning is a good thing to do, but not with a vacuum... the vacuum
could loosen components, even make them disappear. Much better
would be to use a blower or bellows of some kind.

Also, cowboys scoff, but I always wear a grounded wrist strap when
handling electronics.
On 08/13/2017 05:18 AM, ken wrote:
> Also, cowboys scoff, but I always wear a grounded wrist strap when
> handling electronics.

It's a good idea, especially in low-humidity climates.

Also noteworthy: the air moving through a hose can cause a vacuum's
hose or attachment to build up a static charge, which is another reason
it can be a bad idea to use a vacuum in a computer.
On Sun, Aug 13, 2017 at 08:18:24AM -0400, ken wrote:
> On 08/12/2017 07:24 PM, Fred Smith wrote:
> >Well. overheating is possible... we don't live in the cleanest possible
> >house, AND we have cats. so, in general I open up this box twice a year
> >and vacuum out the house dirt and cat fuzzies. I'm probably overdue for
> >this task.
>
> Cleaning is a good thing to do, but not with a vacuum... the vacuum
> could loosen components, even make them disappear. Much better
> would be to use a blower or bellows of some kind.

thanks for the reminder. I don't actually use a vacuum, I was just
being, er, loose with my terminology. I use a can of compressed "air"
where possible, remove fans on heatsinks and blow or wipe/brush out
the clogs, remove the inlet filters and wash 'em. I get amazing amounts
of cat fur.

--
-------------------------------------------------------------------------------
 Under no circumstances will I ever purchase anything offered to me as
 the result of an unsolicited e-mail message. Nor will I forward chain
 letters, petitions, mass mailings, or virus warnings to large numbers of
 others. This is my contribution to the survival of the online community.
                        --Roger Ebert, December, 1996
----------------------------- The Boulder Pledge -----------------------------