Guy Helmer wrote:
> (reposting since this involves 6-stable)
>
> I'm investigating a problem where a pretty much stock 6.2 SMP kernel
> randomly hangs on multiple Supermicro X7DBR-i+ and X7DBR-8+ systems.
> The system syncs the filesystems and prints "Uptime: ...", then
> hangs.
>
> So far, I've narrowed it down to the MOD_SHUTDOWN request to the
> "rootbus" module. Adding a printf() before and after the
> "device_shutdown(child);" line in subr_bus.c method
> bus_generic_shutdown() seems to make the problem go away, as does
> running a kernel with INVARIANTS, WITNESS, and DDB/KDB. I'm trying to
> reproduce the hang on a plain SMP kernel with just DDB/KDB, but it
> hasn't hung yet.
>
I'm still not clear as to why, but the following change to
bus_generic_shutdown() has kept the machines in question successfully
rebooting continuously without hanging for the past three days:
Index: sys/kern/subr_bus.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/subr_bus.c,v
retrieving revision 1.184.2.4
diff -u -r1.184.2.4 subr_bus.c
--- sys/kern/subr_bus.c 22 Sep 2006 18:49:14 -0000 1.184.2.4
+++ sys/kern/subr_bus.c 16 Mar 2007 17:59:04 -0000
@@ -2913,7 +2913,11 @@
 	device_t child;
 
 	TAILQ_FOREACH(child, &dev->children, link) {
+		//printf(" Calling device_shutdown on child '%s':\n", child->nameunit);
+		DELAY(1000);
 		device_shutdown(child);
+		//printf(" Returned from device_shutdown on child '%s'.\n", child->nameunit);
+		DELAY(1000);
 	}
 
 	return (0);