I suspect I have a machine that appears to be suffering from a severe interrupt problem: it becomes non-responsive for all practical purposes. Entering 'Ctrl-Alt-Esc' eventually results in a very slow dribble of characters, with 'db> ' being reached after about half an hour. Typing characters at the prompt resulted in nothing being echoed after about 2 hours (at which point it got reset). This is a 2.4GHz Xeon CPU so I would expect a slightly zippier response :-).

My working hypothesis is that an interrupt storm is preventing the system from doing anything other than acknowledge interrupts. Does this sound reasonable? If not, can anyone suggest any alternate hypotheses?

In either case, does anyone have any ideas on how to get a crashdump when there's (effectively) no response to DDB? Is enabling the logical CPUs (machdep.cpu_idle_hlt=0) likely to have any effect?

The machine is running 4.9p1. It was previously running 4.8p7 and this problem was not noticed, though it's not clear that it wasn't there.

-- 
Peter Jeremy
On Wed, 11 Feb 2004, Peter Jeremy wrote:

> My working hypothesis is that an interrupt storm is preventing the
> system from doing anything other than acknowledge interrupts. Does
> this sound reasonable? If not, can anyone suggest any alternate
> hypotheses?

Sounds about right. Monitor vmstat -i or 'show intr' from ddb; if any of the numbers seems really high, there's your culprit.

-- 
Doug White | FreeBSD: The Power to Serve
dwhite@gumbysoft.com | www.FreeBSD.org
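The "monitor vmstat -i" suggestion can be made a little more mechanical by comparing two snapshots taken a known number of seconds apart, since the totals are cumulative since boot and a recent storm may not yet dominate them. The following is only a rough sketch, not something from the thread: it assumes each data line of `vmstat -i` output ends with a total column followed by a rate column, and the function names and the threshold are hypothetical choices for illustration.

```python
def parse_vmstat_i(text):
    """Map each interrupt source name to its cumulative total.

    Assumes every data line ends with two numeric columns (total, rate)
    and that everything before them is the source name.
    """
    counts = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue
        name = " ".join(parts[:-2])
        # Skip the column header and the summary line.
        if name.lower() in ("interrupt", "total"):
            continue
        try:
            counts[name] = int(parts[-2])
        except ValueError:
            continue
    return counts


def interrupt_rates(before, after, seconds):
    """Interrupts per second for each source between two snapshots."""
    return {
        name: (total - before.get(name, 0)) / seconds
        for name, total in after.items()
    }


def suspects(rates, threshold):
    """Sources at or above the threshold (per second), busiest first."""
    hits = [(name, rate) for name, rate in rates.items() if rate >= threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

To use it, save the output of `vmstat -i` twice, some seconds apart, pass each through `parse_vmstat_i`, and hand the results to `interrupt_rates` with the elapsed time. What counts as "really high" depends on the hardware, so the threshold given to `suspects` is a judgment call. None of this helps once the machine has stopped responding; it is only useful for catching a storm while it is building.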