Argh. After all the fixes done on the 5.4-STABLE and 6.0 codebases, my Dell PE1750 still reboots randomly. Again last night at 03.03 :-(

messages still shows nothing, and nothing special was going on at the time (loadavg ~ 0.00).

It's running:

FreeBSD xyz 6.0-RELEASE-p4 FreeBSD 6.0-RELEASE-p4 #0: Sun Feb 19 21:15:01 CET 2006 root@xyz:/usr/obj/usr/src/sys/SMP i386

What I've tried to fix the problems:
- kern_proc.c patch submitted to freebsd-stable by Don Lewis.
- disable HTT
- upgrade to 5-STABLE
- upgrade to 6.0-RELEASE-p1,2,3,4

What we've _not_ tried:
- Swap memory

This is because we have 2850's that experience exactly the same problems, just less frequently (about once every 4 months).

I'm completely at a loss, and inclined to remove FreeBSD and install "another OS", as it is an important management machine for us that reboots about monthly.

Any clues, tips, help, known bugs?

Regards
Rutger Bevaart

On Apr 4, 2006, at 4:37 AM, Rutger Bevaart wrote:

> I'm completely at a loss, and inclined to remove FreeBSD and
> install "another OS" as it is an important management machine for
> us, that reboots about monthly.
>
> Any clues, tips, help, known bugs?
>

Either bad hardware or pilot error. Here are some stats for you:

[morebiz]% grep DELL /var/run/dmesg.boot
ACPI APIC Table: <DELL PE1750 >
acpi0: <DELL PE1750> on motherboard
[morebiz]% sysctl kern.boottime
kern.boottime: { sec = 1130521993, usec = 140021 } Fri Oct 28 13:53:13 2005
[morebiz]% date
Tue Apr 4 09:58:18 EDT 2006
[morebiz]% uptime
9:58AM up 157 days, 20:05, 1 user, load averages: 0.00, 0.00, 0.00
[morebiz]% uname -r
5.4-RELEASE-p8

This machine runs two instances of apache on two IPs, a postgres server and a mysql server to run a few different web sites. It gets a fair number of hits, many of which hit the dbs.

I run with hyperthreading enabled, but when I next upgrade this box to 6.1, I will turn it off.

I don't have any 2850's, but the one 1850 I have has been 100% stable since it went into production last October running FreeBSD 6.0. I'd buy it again in a heartbeat.

Are you sure your electrical power is stable?
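
For anyone who wants to cross-check reboot times on their own boxes, the kern.boottime and date figures quoted above can be turned into an uptime with a few lines of Python. This is only a minimal sketch: the function name parse_boottime is made up for illustration, the EDT offset (UTC-4) is taken from the "EDT" label in the date output, and the regex assumes the "{ sec = ..., usec = ... }" layout shown in the transcript.

```python
import re
from datetime import datetime, timedelta, timezone


def parse_boottime(sysctl_output: str) -> datetime:
    """Return the boot instant (UTC) from `sysctl kern.boottime` output.

    Hypothetical helper: assumes the '{ sec = N, usec = M }' layout
    shown in the transcript above.
    """
    match = re.search(r"sec\s*=\s*(\d+),\s*usec\s*=\s*(\d+)", sysctl_output)
    if match is None:
        raise ValueError("unrecognised kern.boottime format")
    sec, usec = int(match.group(1)), int(match.group(2))
    return datetime.fromtimestamp(sec, tz=timezone.utc) + timedelta(microseconds=usec)


# Values copied from the message above.
sample = "kern.boottime: { sec = 1130521993, usec = 140021 } Fri Oct 28 13:53:13 2005"
boot = parse_boottime(sample)

# "Tue Apr 4 09:58:18 EDT 2006"; EDT is assumed to be UTC-4.
edt = timezone(timedelta(hours=-4))
now = datetime(2006, 4, 4, 9, 58, 18, tzinfo=edt)

up = now - boot
hours, minutes = up.seconds // 3600, (up.seconds % 3600) // 60
print(f"{up.days} days, {hours}:{minutes:02d}")  # prints "157 days, 20:05"
```

Running the same parse from cron and logging the result whenever the boot time changes would give an exact timestamp for each unexpected reboot, which can then be compared across machines and datacenters.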

On Apr 4, 2006, at 4:37 AM, Rutger Bevaart wrote:

> This because we have 2850's that experience exactly the same
> problems, just less frequently (about once every 4 months).
>
> I'm completely at a loss, and inclined to remove FreeBSD and
> install "another OS" as it is an important management machine for
> us, that reboots about monthly.

By all means, feel free to see whether the problem reoccurs using another OS, but it sounds like an intermittent hardware failure or power drop to me.

I've got a dozen or so Dell 2800 or 2850 machines which have no problems reaching 6+ months of uptime.

--
-Chuck

It doesn't look like a power problem. We have it with several systems in different datacenters.

I've tried the "giantlock" setting, let's hope it works! Am I right to assume that it can (negatively) impact performance of the system? And why would "fine grained locking" be causing the crashes?

I'm willing to let a developer play around with one of the affected machines...

Thanks again for the suggestion, Ulrich.

Met vriendelijke groet / Kind Regards,
Rutger Bevaart

On Apr 5, 2006, at 1:53 AM, Ulrich Keil wrote:

>
> We solved the problem by running the network stack with Giant lock
> (set "debug.mpsafenet=0" in loader.conf).
> Since then the machine runs rock stable.
>
> Ulrich
>