At 07:55 AM 26/06/2006, Marc G. Fournier wrote:
>For the server that I'm fighting with right now, where Dmitry
>pointed out that it looks like a deadlock issue ... I have
>dumpdev/savecore enabled, is there some way of forcing it to panic
>when I know I actually have the deadlock, so that it will dump a core?
>
>DDB is a difficult option, since a keyboard isn't always attached to
>the server when it boots ...
These are ugly quick hacks, but they might work for you... If the
network still continues to function, you might be able to hack up a
quick script to force a panic. Hack up some kld (e.g. ichwd) with
something like
# diff -u /usr/src/sys/dev/ichwd/ichwd.c.orig /usr/src/sys/dev/ichwd/ichwd.c
--- /usr/src/sys/dev/ichwd/ichwd.c.orig Mon Jun 26 09:50:33 2006
+++ /usr/src/sys/dev/ichwd/ichwd.c Mon Jun 26 09:51:04 2006
@@ -225,6 +225,7 @@
         device_t ich = NULL;
         device_t dev;
+        panic("I played panicky idiot no 3 on the Poseidon Adventure");
         /* look for an ICH LPC interface bridge */
         for (id = ichwd_devices; id->desc != NULL; ++id)
                 if ((ich = pci_find_device(id->vendor, id->device)) != NULL)
Then run a script something like the one below. Set target to be an
IP that you control and that is always up. When you think your box
has deadlocked, add a firewall rule on the target machine to block
ICMP echos from the problem machine. Once the pings have failed more
than max_tries times in a row, the script loads the hacked kld, which
panics the box. You might need to fiddle with max_tries to make it
more aggressive. If the target machine is on the local LAN you can
make it a nice low value like 2 or 3. Ideally, you would want to make
a kld that would instead do the test for you, or you could perhaps
hack up the software watchdog to call a panic for you. I don't know
whether that works, as I have only used hardware watchdogs.
#!/bin/sh
timeout=5            # seconds to wait for each ping reply
no_resp_sleep=10     # seconds between pings while the target is silent
max_tries=25         # consecutive failures tolerated before the panic
normal_sleep=300     # seconds between pings while all is well
con_cnt=0
no_resp=0
target=1.1.1.1

while true; do
        # try and make sure these binaries are cached
        strings /boot/kernel/ichwd.ko > /dev/null
        strings /sbin/kldload > /dev/null
        if /sbin/ping -c1 -t$timeout $target > /dev/null 2>&1; then
                no_resp=0
        else
                no_resp=$(($no_resp + 1))
        fi
        if [ $no_resp -gt $max_tries ]; then
                /sbin/kldload ichwd     # the hacked module panics on load
        fi
        if [ $no_resp -gt 0 ]; then
                sleep $no_resp_sleep
        else
                sleep $normal_sleep
                if [ $con_cnt -lt 25 ]; then
                        con_cnt=$(($con_cnt + 1))
                fi
        fi
done &
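
Since the real script ends in a panic, it may be worth dry-running
the counter logic first. The sketch below is only an illustration:
the simulate function and its "ok"/"fail" arguments are made up for
this example and stand in for real ping results, and nothing here
calls ping or kldload. It uses the same reset-on-success counter and
the same -gt comparison as the script above, with max_tries set to 3.

```shell
#!/bin/sh
# Dry run of the failure counter. No ping, no kldload, no panic.
# "simulate" is a hypothetical helper, not part of the original script.
max_tries=3

# Each argument is one simulated ping: "ok" = reply, "fail" = no reply.
simulate() {
        no_resp=0
        for result in "$@"; do
                if [ "$result" = ok ]; then
                        no_resp=0       # any reply resets the counter
                else
                        no_resp=$((no_resp + 1))
                fi
                if [ "$no_resp" -gt "$max_tries" ]; then
                        echo "panic after $no_resp consecutive failures"
                        return 0
                fi
        done
        echo "no panic (no_resp=$no_resp)"
}

simulate fail fail ok fail fail fail
simulate fail fail fail fail
```

Because the test is -gt rather than -ge, it takes max_tries + 1
failed pings in a row to trigger. With the values in the script above
(max_tries=25, timeout=5, no_resp_sleep=10) that is 26 failures at up
to 15 seconds each, so roughly six and a half minutes after the first
failed ping, and the first failed ping itself can be up to
normal_sleep seconds after you block the echos.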
---Mike