On Wed, 9 Mar 2005, Tony Arcieri wrote:
> I have a dual Opteron which seems to only stay up approximately two
> weeks at a time then spontaneously reboots. It's colocated so I can't ever
> see panic messages, and I don't have another system colocated at the same
> place I can use to gather debugging info.
You may want to consider finding a small system with a free serial port to
serve as a temporary serial console. Without output from the crash, it's
impossible to tell what went wrong.
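A rough sketch of the setup on the Opteron side, assuming the first serial
port at 9600 baud. The ttyd0 device name and the 9600 speed are assumptions
for a stock 5.x install; check sio(4) and ttys(5) on your system before
relying on them:

```sh
# /boot/loader.conf
# Send loader and kernel console output (including panics) to the serial port.
console="comconsole"

# /etc/ttys (optional: also gives a login prompt on the same port)
# ttyd0  "/usr/libexec/getty std.9600"  vt100  on secure
```

On the small system at the other end of the null-modem cable, something like
"cu -l /dev/cuaa0 -s 9600" run under script(1) would keep a log of whatever
the box prints before it reboots (cuaa0 is likewise an assumed device name).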
> I've never managed to get the system to generate a crash dump either. It
> has a 1GB swap partition and 2GB of physical RAM but through the last
> few reboots I've been setting hw.physmem to 896M as the only custom parameter
> in loader.conf. The swap partition is labeled as follows:
>
> twed0s1b swap 1024MB SWAP
>
> And dumpdev is set in rc.conf as follows:
>
> dumpdev="/dev/twed0s1b"
>
> /var/crash/minfree is set to 2048
>
> Lately I built a kernel from GENERIC using the latest RELENG_5 sources and
> without SMP support and experienced a reboot after approximately 16 days uptime,
> roughly equivalent to how long it took the system to crash with SMP enabled.
> No core file was generated.
>
> The kernel was built using source checked out from RELENG_5 on February 18th.
> I'm not sure if any Opteron specific fixes have been applied to the branch
> since then.
Make sure you're actually running this kernel, since crashdump support for
twe was added on February 12, in rev 1.22.2.1 of src/sys/dev/twe/twe.c.
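Something along these lines should tell you whether the source tree and the
running kernel are new enough. This is only a sketch: src_rev is a
hypothetical helper name, and /usr/src is assumed to be where the tree you
built from lives:

```sh
#!/bin/sh
# Print the CVS revision recorded in a source file's $FreeBSD$ ident line.
# (src_rev is a made-up helper name for this example.)
src_rev() {
    sed -n 's/.*\$FreeBSD: [^ ]* \([0-9][0-9.]*\) .*/\1/p' "$1" | head -n 1
}

# On the colocated box (paths assume a stock layout):
#   uname -v                             # build date of the running kernel
#   sysctl kern.bootfile                 # which kernel file was booted
#   src_rev /usr/src/sys/dev/twe/twe.c   # want 1.22.2.1 or later
#   dumpon -v /dev/twed0s1b              # should fail if the driver cannot dump
```

If uname -v shows a build date before the checkout, or dumpon complains,
the machine is most likely still booting an older kernel.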
> Are there any other means of gathering debugging data that would work in
> my situation? As is I'm still unsure if my problems are hardware or
> software related as I've still never seen a panic message from the
> system (hardware is a Tyan K8S motherboard in a Tyan Transport system)
You really, really want a serial console.
> Should I look into using KTR ALQ to log KTR data to the swap partition, and
> if it fills up will it wrap over to the beginning? I've never used that
> feature before...
If you don't have a serial console to manipulate ddb from, or working
crashdumps, then there is no way to retrieve the KTR data.
--
Doug White | FreeBSD: The Power to Serve
dwhite@gumbysoft.com | www.FreeBSD.org