Hi, I'm thinking of enabling the watchdog on our Dell PowerEdge 2950 / FreeBSD 8.0 amd64, so that it reboots the machine in case of lockups. Right now it doesn't work:

# watchdog
watchdog: patting the dog: Operation not supported
#

Looking through the kernel configuration I found two relevant settings:

In /sys/conf/NOTES:
#
# Add software watchdog routines.
#
options SW_WATCHDOG

and in /sys/amd64/conf/NOTES:
#
# Watchdog routines.
#
options MP_WATCHDOG

Which of them should I rebuild the kernel with? BTW, the existing kernel is built with the default "options SCHED_ULE" to make good use of multiple CPUs, does watchdog work with it?

Thanks.

On Sat, May 08, 2010 at 06:06:15PM +0500, rihad wrote:
> Hi, I'm thinking of enabling the watchdog on our Dell PowerEdge 2950
> / FreeBSD 8.0 amd64, so that it reboots the machine in case of
> lockups.
> Right now it doesn't work:
>
> # watchdog
> watchdog: patting the dog: Operation not supported
> #

This is almost certainly a failed WDIOCPATPAT ioctl() call, indicating you don't have support for watchdog stuff in your kernel. Once you add that, be aware you'll need to run watchdogd(8) as well.

> Looking through the kernel configuration I found two relevant settings:
> In /sys/conf/NOTES:
> #
> # Add software watchdog routines.
> #
> options SW_WATCHDOG
>
>
> and in /sys/amd64/conf/NOTES:
> #
> # Watchdog routines.
> #
> options MP_WATCHDOG
>
>
> Which of them should I rebuild the kernel with? BTW, the existing
> kernel is built with the default "options SCHED_ULE" to make good use
> of multiple CPUs, does watchdog work with it?

I think what you want is SW_WATCHDOG, but I have no idea if this works properly or effectively on multiprocessor machines. MP_WATCHDOG may address that, but does not work with SCHED_ULE[1].

I would recommend reading the watchdog(4) and watchdogd(8) man pages. I would also recommend reading [2], since it sheds some light on how MP_WATCHDOG works.

[1]: See src/sys/amd64/amd64/mp_watchdog.c, around line 32. There's an #error statement that gets hit if SCHED_ULE is used.
[2]: See src/sys/amd64/amd64/mp_watchdog.c, around line 51. There's a large comment explaining what MP_WATCHDOG does.

--
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |

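For reference, a minimal sketch of the SW_WATCHDOG route suggested above. It assumes the kernel sources live under /usr/src and uses a custom config named WATCHDOG; that name is hypothetical and does not come from the thread. Treat it as an illustration, not a tested recipe for the PowerEdge 2950.

```conf
# /usr/src/sys/amd64/conf/WATCHDOG  (hypothetical config name, an assumption)
include GENERIC        # start from the stock config, which already has SCHED_ULE
ident   WATCHDOG       # kernel ident; any name works
options SW_WATCHDOG    # software watchdog routines, as listed in /sys/conf/NOTES
```

After the usual rebuild (make buildkernel KERNCONF=WATCHDOG followed by make installkernel KERNCONF=WATCHDOG, run from /usr/src) and a reboot, watchdogd(8) can be started at boot by adding watchdogd_enable="YES" to /etc/rc.conf. Whether this catches every kind of lockup on a multiprocessor machine is the open question raised above, and this sketch does not settle it.
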
On Sat, May 8, 2010 at 2:06 PM, rihad <rihad@mail.ru> wrote:
> Hi, I'm thinking of enabling the watchdog on our Dell PowerEdge 2950 /
> FreeBSD 8.0 amd64, so that it reboots the machine in case of lockups.
> Right now it doesn't work:
>

I installed watchdogd on a few 8-core Dell PowerEdge 1950s, which I assume are similar to the 2950s. I wrote about how to do this on my blog[1] last year. However, something in the last couple of months has broken watchdog on my machines, causing them to lock up regularly. I'm unsure what has changed, and I didn't have time to investigate, so I just turned it off and everything is now fine.

Andrew

[1] http://bramp.net/blog/freebsd-software-watchdog

rihad writes:
| Hi, I'm thinking of enabling the watchdog on our Dell PowerEdge 2950 /
| FreeBSD 8.0 amd64, so that it reboots the machine in case of lockups.
| Right now it doesn't work:
|
| # watchdog
| watchdog: patting the dog: Operation not supported
| #
| Looking through the kernel configuration I found two relevant settings:
| In /sys/conf/NOTES:
| #
| # Add software watchdog routines.
| #
| options SW_WATCHDOG
|
| and in /sys/amd64/conf/NOTES:
| #
| # Watchdog routines.
| #
| options MP_WATCHDOG
|
| Which of them should I rebuild the kernel with? BTW, the existing kernel
| is built with the default "options SCHED_ULE" to make good use of
| multiple CPUs, does watchdog work with it?

If no one has said yet,
	kldload ipmi
then run watchdogd. ... or compile it into the kernel. This will enable the IPMI HW watchdog. If it triggers, it will appear in the IPMI SEL (ipmitool sel list).

Doug A.
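
To make the IPMI route above survive a reboot, something along these lines should work. This is a sketch under the assumption that the stock ipmi module and the base-system watchdogd rc script are used; the two file locations are the standard ones but are not named in the thread.

```conf
# /boot/loader.conf: load the IPMI driver at boot
# (the persistent equivalent of running "kldload ipmi" by hand)
ipmi_load="YES"

# /etc/rc.conf: start watchdogd(8) at boot so the hardware watchdog keeps being patted
watchdogd_enable="YES"
```

For the compiled-in variant, the matching kernel config line is "device ipmi" instead of the loader.conf entry. ipmitool is not part of the base system and would need to be installed separately (for example from the sysutils/ipmitool port) before "ipmitool sel list" can be used to check for watchdog events.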