Vnpenguin
2009-Feb-13 08:22 UTC
[CentOS] After electric breaking: HARDWARE ERROR Kernel panic
Hi all,

After a power failure, my server (CentOS 5.2 x86_64 with all updates) cannot boot. The error message on screen is:

-------------------------------------------------------------------------
Memory for crash kernel (0x0 to 0x0) notwithin permissible range
<0>
HARDWARE ERROR
CPU 1: Machine Check Exception: 7 Bank 4: ....
RIP 10:<.....>
TSC 133eab63c9 ADDR 24fe3d028
This is not a software problem!
Run through mcelog --ascii to decode and contact your hardware vendor
Kernel panic - not syncing: Uncorrected machine check
-------------------------------------------------------------------------

Could anyone tell me how to fix this, please? Help!

Thank you
John R Pierce
2009-Feb-13 08:35 UTC
[CentOS] After electric breaking: HARDWARE ERROR Kernel panic
Vnpenguin wrote:
> Hi all,
> After a power failure, my server (CentOS 5.2 x86_64 with all
> updates) cannot boot. The error message on screen is:
>
> -------------------------------------------------------------------------
> Memory for crash kernel (0x0 to 0x0) notwithin permissible range
> <0>
> HARDWARE ERROR
> CPU 1: Machine Check Exception: 7 Bank 4: ....
> RIP 10:<.....>
> TSC 133eab63c9 ADDR 24fe3d028
> This is not a software problem!
> Run through mcelog --ascii to decode and contact your hardware vendor
> Kernel panic - not syncing: Uncorrected machine check
> -------------------------------------------------------------------------
>
> Could anyone tell me how to fix this, please? Help!
>

You have a hardware problem. Something fried on the motherboard, possibly the RAM, maybe something else. If the server is on some sort of service contract or warranty, call the hardware or support vendor. If not, find someone skilled at troubleshooting x86_64 server hardware.

I believe the "Machine Check Exception: 7 Bank 4" indicates a memory ECC issue with DIMM bank 4 on CPU 1 (I'm guessing this is an Opteron system?). You might try booting a memtest86 CD and seeing if that runs.
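If it helps to keep a record of the panic before passing it to mcelog --ascii, the fields that are visible in the message can be pulled out with a few lines of Python. This is only a rough sketch: parse_mce and PANIC_TEXT are made-up names, the text is copied from the post with its "...." elisions left as they are, and treating the number after "Machine Check Exception:" as a hex status value is my assumption. It does not decode the error; that is still mcelog's job.

```python
import re

# Panic text as transcribed in the original post. The "...." and "<.....>"
# elisions are kept as-is because the full values were never posted.
PANIC_TEXT = """\
CPU 1: Machine Check Exception: 7 Bank 4: ....
RIP 10:<.....>
TSC 133eab63c9 ADDR 24fe3d028
"""


def parse_mce(text):
    """Return the fields that are visible in an MCE panic message.

    Hypothetical helper: it only extracts numbers, it does not interpret them.
    """
    fields = {}

    m = re.search(
        r"CPU (\d+): Machine Check Exception:\s*([0-9a-fA-F]+)\s+Bank\s+(\d+)",
        text,
    )
    if m:
        fields["cpu"] = int(m.group(1))
        # Assumption: this value is printed in hex by the kernel.
        fields["mce_value"] = int(m.group(2), 16)
        fields["bank"] = int(m.group(3))

    m = re.search(r"TSC ([0-9a-fA-F]+)", text)
    if m:
        fields["tsc"] = int(m.group(1), 16)

    m = re.search(r"ADDR ([0-9a-fA-F]+)", text)
    if m:
        fields["addr"] = int(m.group(1), 16)

    return fields


if __name__ == "__main__":
    info = parse_mce(PANIC_TEXT)
    for name, value in info.items():
        print(name, hex(value))
    if "addr" in info:
        # Reported address expressed as an offset in GiB, as a rough hint
        # about which part of memory was involved.
        print("addr offset: %.2f GiB" % (info["addr"] / 2**30))
```

The address offset is only a hint; mapping it to a specific DIMM depends on the board and memory layout, which the thread does not give.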
Lanny Marcus
2009-Feb-13 15:17 UTC
[CentOS] After electric breaking: HARDWARE ERROR Kernel panic
On Fri, Feb 13, 2009 at 3:35 AM, John R Pierce <pierce at hogranch.com> wrote:
> Vnpenguin wrote:
>> After a power failure, my server (CentOS 5.2 x86_64 with all
>> updates) cannot boot. The error message on screen is:
>> Memory for crash kernel (0x0 to 0x0) notwithin permissible range
>> <0>
>> HARDWARE ERROR
>> CPU 1: Machine Check Exception: 7 Bank 4: ....
>> RIP 10:<.....>
>> TSC 133eab63c9 ADDR 24fe3d028
>> This is not a software problem!
>> Run through mcelog --ascii to decode and contact your hardware vendor
>> Kernel panic - not syncing: Uncorrected machine check
>
> You have a hardware problem. Something fried on the motherboard,
> possibly the RAM, maybe something else. If the server is on some sort
> of service contract or warranty, call the hardware or support vendor.
> If not, find someone skilled at troubleshooting x86_64 server hardware.

<snip>

I doubt a warranty will cover electrical damage, but you can ask. An insurance policy is more likely to cover this. :-)

Run diagnostics on the RAM (which is where the error was reported) and on your motherboard. The PSU may also be damaged. Once the server is up and running again, run diagnostics on the hard drives.