Hi,

I have a CentOS 5.0 box running as a web server.

# uname -a
Linux hostnamehidden.net 2.6.18-53.1.14.el5 #1 SMP Wed Mar 5 11:37:38 EST 2008 x86_64 x86_64 x86_64 GNU/Linux

Every 59 minutes (maybe every hour) it reboots without any logs, without any traces and, unfortunately, breaking the software RAID. After the reboot, dmesg does not have any strange entries.

I double-checked the cron jobs and looked for any strange services; nothing suspicious.

I did a "yum update" recently.

I went to the datacenter and waited in front of the monitor, but during the reboot I did not see anything strange. I guess the reboot is a cold reboot.

I replaced all system and CPU fans, upgraded the power supply to a more powerful one, and placed the server in front of the air conditioner.

Do you have any idea?

Thanks.
A little birdy told me that Linux said:
] I have a CentOS 5.0 running as a web server.
]
] # uname -a
] Linux hostnamehidden.net 2.6.18-53.1.14.el5 #1 SMP Wed Mar 5 11:37:38
] EST 2008 x86_64 x86_64 x86_64 GNU/Linux
]
] Every 59 minutes (maybe every hour) it reboots without any logs,
] without any traces and unfortunately with breaking software raid.
] After reboot dmesg does not have any strange entries.
]
] I double-checked crons, any strange services, nothing suspicious.
]
] I did "yum update" recently.
]
] I went to Datacenter and waited before the monitor but during reboot I
] did not see anything strange. I guess reboot is cold reboot.
]
] I changed all system and cpu fans. Upgraded system powersupply with a
] more powerful one. Placed server infront of air-conditioner.
]
] Do you have any idea?

i have only recently started to believe CentOS 5 (not CentOS' fault at all, but really RHEL 5) is stable on a large enough scope of hardware to begin moving from CentOS 4 (which has been rock solid for my job's organization and my home use for years)...

the issue you describe was one of the many symptoms that would manifest on some systems running 5... especially early on...

as a suggestion, try disabling (really not installing) the X-server and see if the problem doesn't vanish... although i wouldn't consider that an acceptable "solution" for my own long term use (and thus the hesitance to move from 4 to 5) that WAS often a culprit in "periodic spontaneous crash/reboots"...

i've found the simplest way to test this without massive software removal or reinstallation is to change the initdefault in /etc/inittab to "3"... and then to remove (or rename) /etc/X11/xorg.conf (to prevent the X-server from running during the boot notification sequence and possibly hanging at exit, thus preventing even console logins)

this really only helps you if you have X11 installed/enabled to begin with...

B. Karhan simon at pop.psu.edu
PRI/SSRI Unix Administrator
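For anyone wanting to try the runlevel test described above, here is a rough sketch of the two steps (set initdefault to 3, move xorg.conf aside). The function names are made up for illustration, and both take the file path as an argument so they can be tried on a copy of /etc/inittab first. It assumes the usual "id:N:initdefault:" line format.

```shell
#!/bin/sh
# Sketch only. show_initdefault and set_initdefault_3 are hypothetical
# helper names, not standard tools.

# Print the current default runlevel from an inittab-style file.
show_initdefault() {
    awk -F: '$1 !~ /^#/ && $3 == "initdefault" { print $2 }' "$1"
}

# Rewrite the initdefault line to runlevel 3, keeping a .bak copy
# of the original next to it.
set_initdefault_3() {
    cp -p "$1" "$1.bak" &&
    sed 's/^\([^#:]*\):[0-9]:initdefault:/\1:3:initdefault:/' "$1.bak" > "$1"
}

# Typical use as root (deliberately left commented out):
#   show_initdefault /etc/inittab
#   set_initdefault_3 /etc/inittab
#   mv /etc/X11/xorg.conf /etc/X11/xorg.conf.disabled
```

Renaming xorg.conf rather than deleting it makes the change easy to undo once the test is over.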
Linux wrote on Fri, 11 Apr 2008 00:06:40 +0300:

> Every 59 minutes (maybe every hour) it reboots without any logs,
> without any traces and unfortunately with breaking software raid.
> After reboot dmesg does not have any strange entries.
>
> I double-checked crons, any strange services, nothing suspicious.

Disable cron and at completely for two hours or so and see what happens.

> guess reboot is cold reboot.

Guess? You would see that if you sit at the console. You do not see it shut down, just suddenly the BIOS screen? Then it's cold ...

For what do you need that mem line for the kernel? Doesn't it recognize the RAM?

Kai

--
Kai Schätzl, Berlin, Germany
Get your web at Conactive Internet Services: http://www.conactive.com
Kai Schaetzl <maillists at conactive.com> wrote:
> Linux wrote on Fri, 11 Apr 2008 00:06:40 +0300:
>
>> Every 59 minutes (maybe every hour) it reboots without any logs,
>> without any traces and unfortunately with breaking software raid.
>> After reboot dmesg does not have any strange entries.
>>
>> I double-checked crons, any strange services, nothing suspicious.
>
> Disable cron and at completely for two hours or so and see what happens.
>
>> guess reboot is cold reboot.
>
> Guess? You would see that if you sit at the console. You do not see it
> shut down, just suddenly the BIOS screen? Then it's cold ...
>
> For what do you need that mem line for the kernel? Doesn't it recognize
> the RAM?
>
> Kai

Some interesting information would be:

as root:
    crontab -l

and the output from:
    ls /etc/cron.hourly/
    ls /var/spool/at/spool/

You might also try to keep top running when the system is due to reboot. If you're sshed in, the last update of top will be preserved at least until the system is back up.

My old server led me on a wild goose chase when it rebooted whenever backups ran. Turned out that running amanda for backups was the heaviest load the system saw so it aggravated a hardware problem (bad capacitors). Not saying that that's specifically what's going on for your system but there are circumstances where software running causes a load change that then triggers a hardware fault.

Cheers,
Dave

--
Politics, n. Strife of interests masquerading as a contest of principles.
-- Ambrose Bierce
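A variation on the "keep top running" idea, in case no ssh session can be left open: append timestamped snapshots to a file and sync after each one, so the last sample has a better chance of surviving a cold reboot. This is only a sketch. log_snapshot is a made-up helper name, and the log path and the 10 second interval in the usage comment are arbitrary assumptions.

```shell
#!/bin/sh
# Sketch only. log_snapshot is a hypothetical helper, not a standard tool.

# log_snapshot LOGFILE COMMAND [ARGS...]
# Append a timestamp header and the command's output to LOGFILE, then
# flush filesystem buffers so the data is on disk before a sudden reset.
log_snapshot() {
    log=$1
    shift
    {
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
        "$@" 2>&1
    } >> "$log"
    sync
}

# Typical use as root (deliberately left commented out):
# one batch-mode top sample every 10 seconds.
#   while :; do log_snapshot /root/top.log top -b -n 1; sleep 10; done
```

After the next reboot, the tail of the log shows what was running in the last few seconds before the reset.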
On Thu, Apr 10, 2008 at 5:06 PM, Linux <linuxlist at gmail.com> wrote:
>
> Every 59 minutes (maybe every hour) it reboots without any logs,
> without any traces and unfortunately with breaking software raid.
> After reboot dmesg does not have any strange entries.

I have several servers, both CentOS 4.1 and 5.1, in production with no unintended reboot issues. I generally install with as few packages as I can and add back what I need, and I carefully review which daemons/services are running and run as few as possible.

You might try reviewing what services you have running:

    chkconfig --list | grep 3:on

and try disabling a few at a time to see if shutting down any particular service changes the behavior, which would help narrow down the issue.

I did have an issue with NFS under CentOS 4.x that under some conditions locked up the NFS host, and it had to be manually rebooted.

Brett
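To work through the services "a few at a time", it may help to reduce the chkconfig listing to bare service names first. A small sketch, assuming the usual "name 0:off 1:off ... 6:off" column layout of chkconfig --list. The function name services_on_in_3 is made up for this example.

```shell
#!/bin/sh
# Sketch only. services_on_in_3 is a hypothetical helper name.

# Read `chkconfig --list` output on stdin and print the name of every
# service that is switched on in runlevel 3.
services_on_in_3() {
    awk '{ for (i = 2; i <= NF; i++) if ($i == "3:on") { print $1; break } }'
}

# Typical use as root (deliberately left commented out):
#   chkconfig --list | services_on_in_3
# Then, for each candidate NAME, one at a time:
#   service NAME stop
#   chkconfig NAME off
```

Turning off one small batch per test period and noting the uptime afterwards keeps it clear which change made the difference.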
On Thu, Apr 10, 2008 at 6:06 PM, Linux <linuxlist at gmail.com> wrote:
> Every 59 minutes (maybe every hour) it reboots without any logs,
> without any traces and unfortunately with breaking software raid.
> After reboot dmesg does not have any strange entries.

I have^Whad exactly the same problem with cold reboots every hour.

Trying to watch more closely, I installed lm_sensors, ran sensors-detect and rebooted. "sensors -s" now tells me "No sensors found!" BUT now I have 19 hours of uptime!!

Maybe someone can tell me where to look for the reason for this *unexpected* solution...

--
Marcelo
"¿No será acaso que ésta vida moderna está teniendo más de moderna que de vida?" (Mafalda)