Hi,
I have several servers configured as follows:
an unpatched 2.4.19-pre7 kernel, IDE discs, 1Gb RAM (not all used),
load avg around 1-4.
a large partition, formatted reiserfs
within large partition are 80 300Mb loopback file systems, formatted
ext3, quotas enabled
From time to time, the processes on the server get stuck in D state.
The processes can be attributed to one loopback filesystem i.e.
processes using other loopback filesystems remain unaffected.
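
To see which processes are in D state at a given moment, the per-process stat files under /proc can be scanned. The following is only a minimal sketch: it assumes a Linux /proc layout and a Python interpreter on the box, and the function names parse_stat and d_state_processes are made up for illustration. It does not map a process to a particular loopback filesystem.

```python
import os

def parse_stat(line):
    """Return (pid, comm, state) from the text of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the last ')' rather than on whitespace.
    head, _, tail = line.rpartition(")")
    pid_text, _, comm = head.partition("(")
    return int(pid_text), comm, tail.split()[0]

def d_state_processes(proc_root="/proc"):
    """List (pid, comm) for every process currently in D state."""
    stuck = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or stat was unreadable
        if state == "D":
            stuck.append((pid, comm))
    return sorted(stuck)

if __name__ == "__main__":
    for pid, comm in d_state_processes():
        print(pid, comm)
```

Run from cron or a loop, this would give a rough record of when the hang starts and which commands are caught in it.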
Since there's no kernel oops, I've provided the (gzipped) SysRQ-T output
that was dumped by the kernel. You'll notice several sendmail processes
stuck like this:
sendmail D F0E31F9C 0 32680 436 328 32617 (NOTLB)
Call Trace: [page_getlink+34/176] [__down+104/208] [__down_failed+8/12]
[.text.lock.namei+53/1221] [link_path_walk+1822/2496]
[path_release+16/48] [getname+94/160] [__user_walk+51/80]
[sys_lstat64+20/112] [system_call+51/56]
Call Trace: [<c0147402>] [<c01076f8>] [<c01078a4>] [<c0147730>] [<c014453e>]
[<c0143b90>] [<c014390e>] [<c0144c73>] [<c0141474>] [<c0108b0b>]
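
Because a full SysRq-T dump is long, it can help to summarise which commands are in D state before reading individual traces. The sketch below is an assumption-laden illustration, not a kernel tool: it assumes each task line starts with "<command> <state>" and carries a (NOTLB) or (L-TLB) marker as in the sendmail line above, that command names contain no spaces, and that any syslog prefix has already been stripped. The function name count_d_state is hypothetical.

```python
from collections import Counter

TLB_MARKERS = ("(NOTLB)", "(L-TLB)")

def count_d_state(dump_text):
    """Count D-state tasks per command name in SysRq-T style output."""
    counts = Counter()
    for line in dump_text.splitlines():
        fields = line.split()
        # Assumed task-line shape: command, state letter, further columns,
        # with a TLB marker somewhere on the line. Call Trace lines and
        # continuation lines do not match and are skipped.
        if len(fields) < 3 or fields[1] != "D":
            continue
        if any(f in TLB_MARKERS for f in fields):
            counts[fields[0]] += 1
    return counts

if __name__ == "__main__":
    import sys
    # Reads the (already gunzipped) dump from standard input.
    for name, n in count_d_state(sys.stdin.read()).most_common():
        print(n, name)
```

A typical invocation would pipe the decompressed dump into the script on standard input.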
Any thoughts?
Regards,
Nick.
Hi,

On Thu, Jun 13, 2002 at 05:18:18PM +0100, Nick Burrett wrote:
> Hi,
>
> I have several servers configured as follows:
> an unpatched 2.4.19-pre7 kernel, IDE discs, 1Gb RAM (not all used),
> load avg around 1-4.
> a large partition, formatted reiserfs
> within large partition are 80 300Mb loopback file systems, formatted
> ext3, quotas enabled

There are a lot of quota problems in mainline 2.4 which are fixed in the
-ac kernels: trying that would eliminate at least one possible cause
here. There are a few processes stuck inside dquot operations in your
trace, and they may be innocent victims of the lockup, but maybe not.

Cheers,
Stephen