Displaying 5 results from an estimated 5 matches for "mcgstatus".

2011 Mar 21
1
Cant find out MCE reason (CPU 35 BANK 8)
...n SuperMicro SYS-6026T-3RF with 2x Intel Xeon E5630 and 8x Kingston KVR1333D3D4R9S/4G. For some time we have had lots of MCEs in mcelog and we can't find out the reason. An "ordinary" MCE message looks like: CPU 51 BANK 8 TSC 8511e3ca77dc MISC 274d587f00006141 ADDR 807044840 STATUS cc0055000001009f MCGSTATUS 0 Decoded with mcelog --ascii --cpu p4 (because there is no xeon56xx in the list): HARDWARE ERROR. This is *NOT* a software problem! Please contact your hardware vendor CPU 53 BANK 8 TSC 1982d8f72b1f MISC e1742eac00006242 ADDR 7ffd78a80 MCG status: MCi status: Error overflow MCi_MISC register valid MCi_A...
2010 Jun 22
4
New kernel causes hardware error?
...upt MCA: MEMORY CONTROLLER AC_CHANNEL0_ERR Transaction: Address/Command error Memory address parity error Memory corrected error count (CORE_ERR_CNT): 911 Memory transaction Tracker ID (RTId): 41 Memory DIMM ID of error: 0 Memory channel ID of error: 0 Memory ECC syndrome: 0 STATUS ea10e3c0008000b0 MCGSTATUS 0 MCE 0 HARDWARE ERROR. This is *NOT* a software problem! Please contact your hardware vendor CPU 2 BANK 8 MISC 41 MCG status: MCi status: Error overflow Uncorrected error MCi_MISC register valid Processor context corrupt MCA: MEMORY CONTROLLER AC_CHANNEL0_ERR Transaction: Address/Command error Mem...
2011 Jul 22
0
[PATCH] Dump mce log by ERST when mc panic
...mcinfo_bank *mc_bank)
+{
+    struct mce m;
+
+    memset(&m, 0, sizeof(struct mce));
+
+    m.cpu = mc_global->mc_coreid;
+    m.cpuvendor = boot_cpu_data.x86_vendor;
+    m.cpuid = cpuid_eax(1);
+    m.socketid = mc_global->mc_socketid;
+    m.apicid = mc_global->mc_apicid;
+
+    m.mcgstatus = mc_global->mc_gstatus;
+    m.status = mc_bank->mc_status;
+    m.misc = mc_bank->mc_misc;
+    m.addr = mc_bank->mc_addr;
+    m.bank = mc_bank->mc_bank;
+
+    apei_write_mce(&m);
+}
+
 /* Dump machine check information in a format,
  * mcelog can parse. This is used only whe...
2011 Mar 24
6
Kernel Panic on HP/Compaq ProLiant G7
Hello Everyone, I recently installed CentOS 5.5 x86_64 on a brand new ProLiant DL380 G7. I have identical OS software running rock-solid on two other DL380 ProLiant servers, but they are G6 models, not G7. On the G7, the installation went perfectly and the machine ran great for about 2 weeks, when it just seemed to "stop". The system stopped responding on the network, and there was
2010 Jul 07
1
kernel: Machine check events logged
...rror bit35 = err cpu3 bit42 = L3 subcache in error bit 0 bit43 = L3 subcache in error bit 1 bit46 = corrected ecc error bit59 = misc error valid memory/cache error 'generic read mem transaction, generic transaction, level generic' STATUS 9c1f4cf8001c011b MCGSTATUS 0 No DIMM found for 1148f5940 in SMBIOS My machine (a CentOS 5.5 64-bit server rented from the German hosting provider strato.de) seems to run OK as a LAMP server though... What do these messages actually mean, is the RAM defective, and how critical is it (because I have an important event this Friday and would prefer no...