Hi,

I have several servers configured as follows:

  an unpatched 2.4.19-pre7 kernel, IDE discs, 1Gb RAM (not all used),
  load avg around 1-4.
  a large partition, formatted reiserfs
  within large partition are 80 300Mb loopback file systems, formatted
  ext3, quotas enabled

From time to time, the processes on the server get stuck in D state.
The processes can be attributed to one loopback filesystem, i.e.
processes using other loopback filesystems remain unaffected.

Since there's no kernel oops, I've provided the (gzipped) SysRQ-T
output that was dumped by the kernel.

You'll notice several sendmail processes stuck like this:

sendmail  D F0E31F9C     0 32680    436   328 32617 (NOTLB)
Call Trace: [page_getlink+34/176] [__down+104/208] [__down_failed+8/12] [.text.lock.namei+53/1221] [link_path_walk+1822/2496]
Call Trace: [<c0147402>] [<c01076f8>] [<c01078a4>] [<c0147730>] [<c014453e>]
   [path_release+16/48] [getname+94/160] [__user_walk+51/80] [sys_lstat64+20/112] [system_call+51/56]
   [<c0143b90>] [<c014390e>] [<c0144c73>] [<c0141474>] [<c0108b0b>]

Any thoughts?

Regards,

Nick.

Hi,

On Thu, Jun 13, 2002 at 05:18:18PM +0100, Nick Burrett wrote:
> Hi,
>
> I have several servers configured as follows:
>   an unpatched 2.4.19-pre7 kernel, IDE discs, 1Gb RAM (not all used),
>   load avg around 1-4.
>   a large partition, formatted reiserfs
>   within large partition are 80 300Mb loopback file systems, formatted
>   ext3, quotas enabled

There are a lot of quota problems in mainline 2.4 which are fixed in
the -ac kernels: trying that would eliminate at least one possible
cause here. There are a few processes stuck inside dquot operations
in your trace, and they may be innocent victims of the lockup, but
maybe not.

Cheers,
 Stephen
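
For anyone sifting through a similar SysRQ-T dump, here is a minimal sketch of pulling out the tasks that are in D state. It assumes each task header line has the same shape as the sendmail line quoted above (command name, state letter, an 8-digit hex address, a free-stack number, then the PID); that field layout is inferred from the one sample line, not taken from kernel documentation. The function name d_state_tasks and the second sample line are made up for illustration.

```python
import re

# Assumed layout of a task header line, inferred from the sendmail line
# in the report: comm, state, hex address (or "current"), free stack, pid.
TASK_LINE = re.compile(
    r"^(?P<comm>\S+)\s+(?P<state>[A-Z])\s+"
    r"(?P<addr>[0-9A-Fa-f]{8}|current)\s+(?P<free>\d+)\s+(?P<pid>\d+)\b"
)


def d_state_tasks(dump_text):
    """Return (command, pid) pairs for tasks whose state letter is D."""
    tasks = []
    for line in dump_text.splitlines():
        match = TASK_LINE.match(line.strip())
        if match and match.group("state") == "D":
            tasks.append((match.group("comm"), int(match.group("pid"))))
    return tasks


# First two lines are from the report (trace shortened); the last line
# is a hypothetical non-D task added only for contrast.
sample = """\
sendmail  D F0E31F9C     0 32680    436   328 32617 (NOTLB)
Call Trace: [page_getlink+34/176] [__down+104/208] [__down_failed+8/12]
example   S C0000000     0   999      1   998  1000 (NOTLB)
"""

if __name__ == "__main__":
    for comm, pid in d_state_tasks(sample):
        print(comm, pid)
```

Grouping the resulting PIDs by the loopback mount they are working under would then have to be done by hand or from other data, since the dump line itself does not name a filesystem.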