Oleg Derevenetz
2007-Oct-19 08:17 UTC
kern/104406: [ufs] Processes get stuck in "ufs" state under persistent CPU load
Hi all,

Can anyone take a look at PR kern/104406? I have a repeatable hang, but I
can't obtain a kernel dump to get the results of all the show commands
listed here:

http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html

After I break into the debugger with the Ctrl+Alt+Esc sequence and enter
the "panic" command, the kernel does not write a kernel dump but seems to
hang. Can anyone describe how to obtain a kernel dump in this situation, or
at least say which show commands' output is needed first to debug this? The
output of all the suggested commands is huge, and I am afraid of making a
mistake when copying it from the screen to a sheet of paper and back :-)

--
Oleg Derevenetz <oleg@vsi.ru> OOD3-RIPE
Phone: +7 4732 539880
Fax: +7 4732 531415
http://www.vsi.ru
CenterTelecom Voronezh ISP
http://isp.vsi.ru
Alfred Perlstein
2007-Oct-19 15:05 UTC
kern/104406: [ufs] Processes get stuck in "ufs" state under persistent CPU load
* Oleg Derevenetz <oleg@vsi.ru> [071019 08:17] wrote:
> Hi all,
>
> Can anyone take a look on PR kern/104406 ? I got repeatable hang situation,
> but I can't obtain a kernel dump to get result of all show commands from
> here:
>
> http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html
>
> After my break to debugger using Ctrl+Alt+Esc sequence and entering a
> "panic" command kernel does not wrote a kernel dump but seems to hang. Can
> anyone describe how to obtain a kernel dump in this situation, or at least
> say - which output of show commands need in first place to debug this ?
> Output of all suggested commands is huge and I afraid of making mistake
> when carrying this output from screen to list of paper and back :-)

Oleg, one thing you can do to make this less painful is to run your
machine's console over a serial port.

First get a crossover serial cable and make sure it works from one box to
the other; it should be easy to run "tip com1" on both boxes to check that
it works.

Then you just need to add

    console=comconsole

to /boot/loader.conf and your box's console should come over serial.

Then on the machine watching the console, you can just do this:

    % script
    Script started, output file is typescript
    % tip com1
    ...do ddb stuff now...
    ...stop tip
    % exit

Now you should have everything logged into a file called "typescript",
which should save you a big headache.

As far as getting a dump from ddb, try this:

    ddb> call doadump

I'm completely at a loss why this isn't a base ddb command "dump" but
whatever... :)

-Alfred
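For "call doadump" to have somewhere to write, a dump device has to be configured before the hang. A minimal sketch of the usual rc.conf settings follows; it assumes the swap partition is large enough to hold the dump, and the directory shown is only the conventional default, not something stated in this thread.

```sh
# /etc/rc.conf fragment (assumed setup, adjust to the machine)
dumpdev="AUTO"        # let the system pick a swap device as the dump device
dumpdir="/var/crash"  # where savecore(8) stores vmcore.N on the next boot
```

After the next boot, savecore(8) should copy the dump out of swap into the directory above, provided the dump was actually written.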
Rainer Hurling
2007-Oct-31 06:45 UTC
kern/104406: [ufs] Processes get stuck in "ufs" state under persistent CPU load
Thanks for your answer.

Kris Kennaway wrote:
> Rainer Hurling wrote:
>> Looking into PR kern/104406 it seems, that this describes exactly what
>> I am experiencing on three of my systems over the last weeks. They are
>> running FreeBSD 8.0-CURRENT (known as 7.0-CURRENT not long ago ;-) ).
>
> Actually it sounds nothing like it at all ;)
>
>> On these machines I often observe hangings, sometimes only a few
>> seconds, on other times 20-30 seconds before input/output is back.
>> This seems to happen when more extensive disk usage is needed
>> (portupgrade, buildworld, browsing complicated websites etc.). During
>> the hang even xterm is not responding any more, other (diskless)
>> applications like xclock keep to continue. I have no panics, only UFS
>> (and MSDOSFS) are mounted, no NTFS. About two months ago none of my
>> systems showed these hangings.
>
> Is your system swapping? This is the usual cause of pauses during high
> application (actually memory) load.
>
> Kris

No, I am working with 2GB RAM, without swapping at all.

In the meantime I tested the behaviour described above a little more. The
hangs even appeared without Xorg, working only on consoles under heavy disk
usage (portupgrade etc.).

Rainer
Kris Kennaway
2007-Oct-31 14:44 UTC
kern/104406: [ufs] Processes get stuck in "ufs" state under persistent CPU load
Rainer Hurling wrote:
> Thanks for your answer.
>
> Kris Kennaway wrote:
>> Rainer Hurling wrote:
>>> Looking into PR kern/104406 it seems, that this describes exactly
>>> what I am experiencing on three of my systems over the last weeks.
>>> They are running FreeBSD 8.0-CURRENT (known as 7.0-CURRENT not long
>>> ago ;-) ).
>>
>> Actually it sounds nothing like it at all ;)
>>
>>> On these machines I often observe hangings, sometimes only a few
>>> seconds, on other times 20-30 seconds before input/output is back.
>>> This seems to happen when more extensive disk usage is needed
>>> (portupgrade, buildworld, browsing complicated websites etc.). During
>>> the hang even xterm is not responding any more, other (diskless)
>>> applications like xclock keep to continue. I have no panics, only UFS
>>> (and MSDOSFS) are mounted, no NTFS. About two months ago none of my
>>> systems showed these hangings.
>>
>> Is your system swapping? This is the usual cause of pauses during
>> high application (actually memory) load.
>>
>> Kris
>
> No, I am working with 2GB RAM, without swapping at all.
>
> In the meantime I tested the above described behaviour a little more.
> The hangings even appeared without using Xorg, only working on consoles
> under heavy disk usage (portupgrade etc.).

OK, configure the system with the debugger and when it is "hung", break to
DDB and obtain the data requested in the developers handbook to try and
investigate what is going on. You may want to do this a few times to make
sure you capture a representative sample.

Kris
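As a rough sketch of what such a session might look like: the kernel needs at least "options KDB" and "options DDB", and the lock-related commands below only produce output when WITNESS is compiled in as well. The command list is recalled from the deadlock section of the developers handbook and may not match it exactly, so check it against the handbook page linked earlier in the thread.

```
db> ps
db> show pcpu
db> show allpcpu
db> show locks
db> show alllocks
db> show lockedvnods
db> alltrace
```

Capturing this over a serial console into a typescript file, as described earlier in the thread, avoids copying the output by hand.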
Kris Kennaway
2007-Nov-04 07:04 UTC
kern/104406: [ufs] Processes get stuck in "ufs" state under persistent CPU load
Oleg Derevenetz wrote:
>>> Dumpdev is swap partition on da0 (single physical disk) that
>>> connected to Mylex AcceleRAID 170 RAID controller. The problem
>>> arrives when I copy large amount of files from FTP to another disk
>>> (da1) that is connected to the same RAID controller.
>>
>> If the driver or controller is misbehaving it could explain both
>> problems. Any chance you can get another disk in there on a different
>> controller to dump onto?
>
> Yes, I got IDE disk and saved kernel dump for another static hang state
> on it. Here is the dump:
>
> ftp://oleg.vsi.ru/private/vmcore.0.zip

Is this just the vmcore, or the debugging kernel also? Both are needed to
make sense of the dump.

Kris
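To illustrate why both files matter: kgdb takes the debugging kernel and the vmcore together, and it cannot resolve symbols from the vmcore alone. The helper below is a hypothetical sketch, not part of FreeBSD. The function name and the example paths are assumptions; it only checks that both files are readable and prints the kgdb command line one would then run.

```shell
#!/bin/sh
# Hypothetical helper: confirm that both halves of a crash dump are present
# before handing them to kgdb. Names and paths are illustrative assumptions.
check_dump_pair() {
    kernel="$1"   # debugging kernel, e.g. kernel.debug from the build directory
    vmcore="$2"   # dump saved by savecore, e.g. /var/crash/vmcore.0
    if [ ! -r "$kernel" ]; then
        echo "missing kernel: $kernel"
        return 1
    fi
    if [ ! -r "$vmcore" ]; then
        echo "missing vmcore: $vmcore"
        return 1
    fi
    # Both files are readable: print the command to run, kernel first.
    echo "kgdb $kernel $vmcore"
}
```

For example, "check_dump_pair kernel.debug /var/crash/vmcore.0" prints the kgdb invocation when both files exist, and a "missing ..." message with a non-zero exit status otherwise.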
Oleg Derevenetz
2007-Nov-04 12:15 UTC
kern/104406: [ufs] Processes get stuck in "ufs" state under persistent CPU load
>>>> Dumpdev is swap partition on da0 (single physical disk) that
>>>> connected to Mylex AcceleRAID 170 RAID controller. The problem
>>>> arrives when I copy large amount of files from FTP to another disk
>>>> (da1) that is connected to the same RAID controller.
>>>
>>> If the driver or controller is misbehaving it could explain both
>>> problems. Any chance you can get another disk in there on a different
>>> controller to dump onto?
>>
>> Yes, I got IDE disk and saved kernel dump for another static hang state
>> on it. Here is the dump:
>>
>> ftp://oleg.vsi.ru/private/vmcore.0.zip
>
> Is this just the vmcore, or the debugging kernel also? Both are needed
> to make sense of the dump.

The kernel binary, together with the kernel config, is here:

ftp://oleg.vsi.ru/private/kernel.zip

This kernel was built statically, and no modules are loaded at boot at all.

--
Oleg Derevenetz <oleg@vsi.ru> OOD3-RIPE
Phone: +7 4732 539880
Fax: +7 4732 531415
http://www.vsi.ru
CenterTelecom Voronezh ISP
http://isp.vsi.ru