Greetings everyone,

I've got a problem with my fileserver running CentOS 3.1. The kernel seems to crash about once a day (inevitably when I'm not present), forcing me to reboot. This behavior started after I applied the new kernel patch (2.4.21-15.0.3.EL.c0).

I've left it running 2.4.21-15.0.2 for a little while to see whether it crashes tonight or tomorrow. As this may be a hardware problem, I'm reluctant to put it into bugzilla until I can pin it on something.

The server has an A-Bit KR7A motherboard with an Athlon XP 1700+, 1.5 GB of RAM, four 120 GB ATA-100 hard drives mirrored in two sets (hence I only have 240 GB available), a Cirrus Logic GD 5430 video card, and an Intel EtherExpress Pro 100.

There is nothing useful in /var/log/messages, and the only way I know when it crashes is from my fetchmail log. After a crash, the screen is blank and the keyboard lights are flashing.

Has anyone else seen anything like this? Did I patch too soon? Any suggestions on where I might look for info on why it's crashing?

Thanks in advance,

Shawn M. Jones
"SMJ" == Shawn M Jones <smj at littleprojects.org>

SMJ> Greetings everyone, I've got a problem with my fileserver
SMJ> running CentOS 3.1. The kernel seems to crash about once
SMJ> a day (inevitably when I'm not present), forcing me to
SMJ> reboot. This behavior started after I applied the new
SMJ> kernel patch (2.4.21-15.0.3.EL.c0).

SMJ> There is nothing useful in /var/log/messages and the only
SMJ> reason I know when it crashes is due to my fetchmail log.

SMJ> Has anyone else seen anything like this? Did I patch too
SMJ> soon?

My workstation (running the new kernel) did something similar yesterday -- I left my office, came back a while later, and the screen was blank. I wasn't able to switch to a VC or log in remotely (the system was pingable, and the first chunk of an SSH connection would succeed, but not complete). I was forced to power cycle.

What seemed to be the same problem happened to our main file server, still running RHL9 with kernel 2.4.20-31.9smp, the day before. With no X, I was able to type a username at the login: prompt, but never got a password prompt. The machine behaved similarly with ssh connections.

There was nothing obvious in the logs. I could figure out roughly when the crash happened because nothing appeared in the logs after that time.

Claire

Claire Connelly                              cmc at math.hmc.edu
Systems Administrator                        (909) 621-8754
Department of Mathematics                    Harvey Mudd College
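The "nothing appeared in the logs after that time" trick can be automated by looking for the point where the log goes silent. What follows is only a minimal sketch in Python, assuming the classic syslog line format ("Jul  8 02:10:01 host ...") with English month names. The function names, the host name fs1, the sample lines, and the 30-minute threshold are all made up for illustration and are not taken from anyone's actual logs.

```python
from datetime import datetime, timedelta


def parse_syslog_time(line, year):
    """Return the timestamp at the start of a classic syslog line, or None.

    Classic syslog lines carry no year, so the caller supplies one.
    Assumes English month abbreviations (C locale).
    """
    fields = line.split(None, 3)
    if len(fields) < 3:
        return None
    try:
        return datetime.strptime(
            f"{year} {fields[0]} {fields[1]} {fields[2]}",
            "%Y %b %d %H:%M:%S",
        )
    except ValueError:
        return None


def find_gaps(lines, year, threshold=timedelta(minutes=30)):
    """Return (last_seen, next_seen) pairs where the log was silent
    for longer than threshold. Lines without a timestamp are skipped.
    A log that wraps from December into January is not handled.
    """
    gaps = []
    previous = None
    for line in lines:
        stamp = parse_syslog_time(line, year)
        if stamp is None:
            continue
        if previous is not None and stamp - previous > threshold:
            gaps.append((previous, stamp))
        previous = stamp
    return gaps


if __name__ == "__main__":
    # Hypothetical sample; on a real machine, pass the lines of
    # /var/log/messages instead.
    sample = [
        "Jul  8 02:00:01 fs1 crond[123]: (root) CMD (run-parts /etc/cron.hourly)",
        "Jul  8 02:10:01 fs1 crond[124]: (root) CMD (/usr/lib/sa/sa1 1 1)",
        "Jul  8 09:15:42 fs1 syslogd 1.4.1: restart.",
        "Jul  8 09:15:50 fs1 kernel: klogd 1.4.1, log source = /proc/kmsg started.",
    ]
    for last_seen, next_seen in find_gaps(sample, 2004):
        print("silent from", last_seen, "until", next_seen)
```

The first timestamp of each pair is the last thing logged before the hang and the second is the first entry after the reboot, so the crash falls somewhere in between. On a quiet machine the window can be wide, which is why a frequently written log such as the fetchmail one gives a tighter estimate.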
On Thu, Jul 08, 2004 at 08:51:14PM -0400, Shawn M. Jones wrote:
> Greetings everyone,
>
> I've got a problem with my fileserver running CentOS 3.1. The kernel
> seems to crash about once a day (inevitably when I'm not present),
> forcing me to reboot. This behavior started after I applied the new
> kernel patch (2.4.21-15.0.3.EL.c0).
>
> I've left it on 2.4.21-15.0.2 for a little bit to see if it will crash
> tonight/tomorrow. As this may be a hardware problem, I'm reluctant to
> put it into bugzilla until I can pin it on something.
<SNIP>

Update:

The machine has been running 2.4.21-15.0.2 since yesterday and hasn't crashed yet. I see nothing wrong so far.

As there is no indication of hardware problems, I'm thinking something is hokey with the 15.0.3 kernel.

Seeing the Linux kernel crash is not a familiar experience. I've only seen it crash from hardware failure or really bad proprietary modules (drivers).

Thoughts? Comments?

--Shawn