Theo Band
2007-Jan-18 17:58 UTC
[CentOS] Machine all of a suddens freezes. Any suggestions?
This is an A8N32-SLI Deluxe motherboard with an AMD Athlon(tm) 64 X2 Dual Core 4400+ processor. The machine freezes and cannot be reached over the network. Keyboard, mouse, and display are all stuck. It had an uptime of 17 days after I reconnected all cables and boards following a similar crash.

My feeling is that this must be a hardware-related problem. I inspected the logs under /var/log but found nothing. What would be the best approach to debug this problem? What is the most likely cause? Or should I just accept it (it's a single-user desktop machine)?

The machine has FC4 installed. Is there a chance that it would become more stable with CentOS? (As I said, I don't actually suspect the OS.) For new machines I have started installing CentOS, and I'm happy with it, especially because my EDA software vendors support it well.

I monitor the machine with Zabbix, but nothing unusual shows up there. Just before the freeze the machine had a processor load of 1, running a user simulation.

Any suggestions would be welcome.

Thanks,
Theo
Theo Band
2007-Jan-18 19:52 UTC
Fwd: [CentOS] Machine all of a suddens freezes. Any suggestions?
Temperature could indeed be the cause. The last time it happened was during heavy load. The fans are spinning; I checked that already. I will also add temperature monitoring to the logs. Dirt is possible, but this machine is not even a year old. I'll check the CPU cooler assembly.

Thanks,
Theo

Dave K wrote:
> This bounced when sent to the list, so I'm copying you directly.
>
> ---------- Forwarded message ----------
> From: Dave K <davek08054 at gmail.com>
> Date: Jan 18, 2007 1:37 PM
> Subject: Re: [CentOS] Machine all of a suddens freezes. Any suggestions?
> To: CentOS mailing list <centos at centos.org>
>
> On 1/18/07, Theo Band <theo.band at xanadu-wireless.com> wrote:
>> ... It freezes and cannot be accessed from the network....
>> ...
>> My feeling is, this must be a hardware related problem. I inspected the
>> logs under /var/log, but nothing.
>> ...
>> What would be the best approach to debug this problem? What can be the
>> most likely cause?
>
> The first thing I would suspect is some sort of cooling problem.
> Verify that all of your fans are actually spinning. If the BIOS has
> the option to always run the fans at full speed, try it for a while
> (if you can stand the noise). Use a flashlight and check that the
> airspace between the fins of any heatsinks aren't clogged with dust.
> If there is room (and power) for additional fans, try adding one.
>
> The last system I had with "random" freezes runs fine now after a
> replacement CPU fan, an additional case fan, and suitable application
> of half a can of air-duster.
>
> --
> Dave K
> Unix Systems & Network Administrator
> Mount Laurel NJ
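A minimal sketch of the temperature logging mentioned above, in case it helps. It assumes the kernel exposes readings as /sys/class/thermal/thermal_zone*/temp in millidegrees Celsius, which may not hold on an older install such as FC4; there, lm_sensors or /proc/acpi would likely be needed instead. The function names and the log path in the usage note are made up for illustration.

```shell
#!/bin/sh
# Sketch only: print timestamped temperature readings, one line per zone.
# Assumption: sensors appear as <base>/thermal_zone*/temp, in millidegrees C.

# Convert a millidegree reading (e.g. 45000) to degrees with one decimal.
format_temp() {
    awk -v m="$1" 'BEGIN { printf "%.1f", m / 1000 }'
}

# Print one log line per readable thermal zone under the given base directory.
# The base directory defaults to /sys/class/thermal.
log_temps() {
    base="${1:-/sys/class/thermal}"
    for zone in "$base"/thermal_zone*/temp; do
        # Skip unreadable zones (also covers the case where the glob matches nothing).
        [ -r "$zone" ] || continue
        printf '%s %s %sC\n' \
            "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
            "$(basename "$(dirname "$zone")")" \
            "$(format_temp "$(cat "$zone")")"
    done
}
```

Calling `log_temps >> /var/log/temps.log` once a minute from cron or a sleep loop would leave a trail to inspect after the next freeze. The last reading before a hard lockup may never reach the disk, so the trend leading up to it is more useful than the final line.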
Mike Fedyk
2007-Jan-18 20:05 UTC
[CentOS] Machine all of a suddens freezes. Any suggestions?
Theo Band wrote:
> What would be the best approach to debug this problem? What can be the
> most likely cause? Or should I just accept it (it's a single user
> desktop machine). The machine has FC4 installed. Is there a chance that
> it becomes more stable with Centos (as said, I actually don't suspect
> the OS). For new machines I started installing Centos (and I'm happy
> with it, especially the fact that my EDA software vendors support it well).
> I monitor the machine with Zabbix, but no weird things there. Just
> before the freeze the machine had a processor load of 1 running a user
> simulation.

Turn on nmi_watchdog:
http://www.mjmwired.net/kernel/Documentation/nmi_watchdog.txt#34

Run memtest86+ for 24 to 48 hours.

Stress your disks:

    for i in $(seq -w 20); do cp -ax / /tmp/$i & done

(Be sure to stop the 20 cp processes before your disk(s) fill up)

Mike
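For the nmi_watchdog suggestion, here is a rough sketch of how one might confirm the watchdog is actually active after a reboot. It assumes the watchdog was enabled with a kernel boot parameter such as nmi_watchdog=1 (the form described in the linked kernel document) and that /proc/interrupts carries an "NMI:" row. The function names are hypothetical helpers, not part of any tool.

```shell
#!/bin/sh
# Sketch only: helpers to check that the NMI watchdog is on and ticking.

# Succeed if the given kernel command line enables the NMI watchdog.
# A value of 0 counts as disabled; any other nmi_watchdog= value as enabled.
nmi_watchdog_enabled() {
    case " $1 " in
        *" nmi_watchdog=0 "*) return 1 ;;
        *" nmi_watchdog="*)   return 0 ;;
        *)                    return 1 ;;
    esac
}

# Sum the per-CPU NMI counts from /proc/interrupts-style text on stdin.
nmi_count() {
    awk '$1 == "NMI:" {
             for (i = 2; i <= NF; i++)
                 if ($i ~ /^[0-9]+$/) sum += $i
         }
         END { print sum + 0 }'
}
```

Typical use would be `nmi_watchdog_enabled "$(cat /proc/cmdline)"` followed by two runs of `nmi_count < /proc/interrupts` a few seconds apart. If the count is not increasing, the watchdog is probably not working in that mode on this board, and the other mode described in the kernel document may be worth trying.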