Displaying 5 results from an estimated 5 matches for "mcgstatus".
2011 Mar 21
1
Can't find out MCE reason (CPU 35 BANK 8)
...n SuperMicro SYS-6026T-3RF with 2xIntel Xeon
E5630 and 8xKingston KVR1333D3D4R9S/4G
For some time we have been seeing lots of MCEs in mcelog and we can't find out the reason.
An "ordinary" MCE message looks like:
CPU 51 BANK 8 TSC 8511e3ca77dc
MISC 274d587f00006141 ADDR 807044840
STATUS cc0055000001009f MCGSTATUS 0
Decoded with mcelog --ascii --cpu p4 (because there is no xeon56xx in the list):
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 53 BANK 8 TSC 1982d8f72b1f
MISC e1742eac00006242 ADDR 7ffd78a80
MCG status:
MCi status:
Error overflow
MCi_MISC register valid
MCi_A...
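The "MCi status:" lines in the decoded output above come from the high bits of the STATUS value. As a rough sketch of how those flags can be read by hand, assuming the architectural IA32_MCi_STATUS layout from the Intel SDM (bits 63 down to 57): the struct and function names below are hypothetical helpers written for illustration, not part of mcelog or the kernel.

```c
#include <stdint.h>
#include <stdio.h>

/* Architectural flag bits at the top of IA32_MCi_STATUS (Intel SDM layout). */
#define MCI_STATUS_VAL   (1ULL << 63)  /* register holds a valid error */
#define MCI_STATUS_OVER  (1ULL << 62)  /* error overflow */
#define MCI_STATUS_UC    (1ULL << 61)  /* uncorrected error */
#define MCI_STATUS_EN    (1ULL << 60)  /* error reporting enabled */
#define MCI_STATUS_MISCV (1ULL << 59)  /* MCi_MISC register valid */
#define MCI_STATUS_ADDRV (1ULL << 58)  /* MCi_ADDR register valid */
#define MCI_STATUS_PCC   (1ULL << 57)  /* processor context corrupt */

/* Hypothetical helper type for illustration only. */
struct mci_flags {
    int val, over, uc, en, miscv, addrv, pcc;
};

/* Split the flag bits of a raw STATUS value into 0/1 fields. */
struct mci_flags mci_decode_flags(uint64_t status)
{
    struct mci_flags f;

    f.val   = !!(status & MCI_STATUS_VAL);
    f.over  = !!(status & MCI_STATUS_OVER);
    f.uc    = !!(status & MCI_STATUS_UC);
    f.en    = !!(status & MCI_STATUS_EN);
    f.miscv = !!(status & MCI_STATUS_MISCV);
    f.addrv = !!(status & MCI_STATUS_ADDRV);
    f.pcc   = !!(status & MCI_STATUS_PCC);
    return f;
}

/* Print the set flags, using wording similar to the mcelog output. */
void mci_print_flags(uint64_t status)
{
    struct mci_flags f = mci_decode_flags(status);

    printf("STATUS %016llx\n", (unsigned long long)status);
    if (!f.val) {
        puts("  (no valid error logged)");
        return;
    }
    if (f.over)  puts("  Error overflow");
    if (f.uc)    puts("  Uncorrected error");
    if (f.en)    puts("  Error enabled");
    if (f.miscv) puts("  MCi_MISC register valid");
    if (f.addrv) puts("  MCi_ADDR register valid");
    if (f.pcc)   puts("  Processor context corrupt");
}
```

Calling mci_print_flags(0xcc0055000001009fULL) on the STATUS value quoted above should report the overflow, MISC-valid and ADDR-valid flags and nothing else, i.e. a corrected error. The model-specific bits below bit 57 are deliberately left alone here.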
2010 Jun 22
4
New kernel causes hardware error?
...upt
MCA: MEMORY CONTROLLER AC_CHANNEL0_ERR
Transaction: Address/Command error
Memory address parity error
Memory corrected error count (CORE_ERR_CNT): 911
Memory transaction Tracker ID (RTId): 41
Memory DIMM ID of error: 0
Memory channel ID of error: 0
Memory ECC syndrome: 0
STATUS ea10e3c0008000b0 MCGSTATUS 0
MCE 0
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 2 BANK 8 MISC 41
MCG status:
MCi status:
Error overflow
Uncorrected error
MCi_MISC register valid
Processor context corrupt
MCA: MEMORY CONTROLLER AC_CHANNEL0_ERR
Transaction: Address/Command error
Mem...
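The "MCA: MEMORY CONTROLLER AC_CHANNEL0_ERR" line above corresponds to the low 16 bits of STATUS (0x00b0 in this report). A minimal sketch of that decoding follows, assuming the Intel SDM compound error code format for memory controller errors, 000F 0000 1MMM CCCC, where MMM is the transaction type and CCCC the channel. The function names are hypothetical and the strings are paraphrased, not mcelog's own.

```c
#include <stdint.h>

/*
 * Memory controller errors use the pattern 000F 0000 1MMM CCCC.
 * Bit 12 (F) is ignored when matching, so the mask is 0xef80.
 */
int mca_is_memctrl(uint16_t code)
{
    return (code & 0xef80) == 0x0080;
}

/* MMM field (bits 6:4): type of memory transaction that failed. */
const char *mca_memctrl_type(uint16_t code)
{
    static const char *const names[] = {
        "generic undefined request",
        "memory read error",
        "memory write error",
        "address/command error",
        "memory scrubbing error",
    };
    unsigned mmm = (code >> 4) & 0x7;

    return mmm < 5 ? names[mmm] : "reserved";
}

/* CCCC field (bits 3:0): channel number, or -1 if not specified (1111). */
int mca_memctrl_channel(uint16_t code)
{
    unsigned cccc = code & 0xf;

    return cccc == 0xf ? -1 : (int)cccc;
}
```

For 0x00b0 this yields an address/command error on channel 0, which agrees with the AC_CHANNEL0_ERR text in the report. The same functions applied to 0x009f (the low word of the STATUS in the first result) give a memory read error with the channel not specified.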
2011 Jul 22
0
[PATCH] Dump mce log by ERST when mc panic
...mcinfo_bank *mc_bank)
+{
+ struct mce m;
+
+ memset(&m, 0, sizeof(struct mce));
+
+ m.cpu = mc_global->mc_coreid;
+ m.cpuvendor = boot_cpu_data.x86_vendor;
+ m.cpuid = cpuid_eax(1);
+ m.socketid = mc_global->mc_socketid;
+ m.apicid = mc_global->mc_apicid;
+
+ m.mcgstatus = mc_global->mc_gstatus;
+ m.status = mc_bank->mc_status;
+ m.misc = mc_bank->mc_misc;
+ m.addr = mc_bank->mc_addr;
+ m.bank = mc_bank->mc_bank;
+
+ apei_write_mce(&m);
+}
+
/* Dump machine check information in a format,
* mcelog can parse. This is used only whe...
2011 Mar 24
6
Kernel Panic on HP/Compaq ProLiant G7
Hello Everyone,
I recently installed CentOS 5.5 x86_64 on a brand new ProLiant DL380 G7. I have identical OS software running rock-solid on two other DL380 ProLiant servers, but they are G6 models, not G7. On the G7, the installation went perfectly and the machine ran great for about 2 weeks, when it just seemed to "stop". The system stopped responding on the network, and there was
2010 Jul 07
1
kernel: Machine check events logged
...rror
bit35 = err cpu3
bit42 = L3 subcache in error bit 0
bit43 = L3 subcache in error bit 1
bit46 = corrected ecc error
bit59 = misc error valid
memory/cache error 'generic read mem transaction, generic
transaction, level generic'
STATUS 9c1f4cf8001c011b MCGSTATUS 0
No DIMM found for 1148f5940 in SMBIOS
My machine (a CentOS 5.5/64bit server rented from the German
hoster strato.de) seems to run OK as a LAMP server, though...
What do these messages actually mean?
Is the RAM defective, and how critical is it
(because I have an important event this Friday
and would prefer no...