I'm exporting a 33 TB Lustre mount to various Linux and Tru64 clients. The NFS daemon frequently hangs for no apparent reason. A trace of the hung daemons looks like this:

Dec 18 11:28:20 cpu1 kernel: nfsd D ffff81081a8e7a70 0 30252 1 30255 30251 (L-TLB)
Dec 18 11:28:20 cpu1 kernel: ffff81081a8e7a70 0000000000000000 0000000000000282 000000000000000a
Dec 18 11:28:20 cpu1 kernel: ffff81082ae66040 ffffffff802d0ae0 00016d72d81489b7 000000000000bb71
Dec 18 11:28:20 cpu1 kernel: ffff81082ae66228 0000000000000000 7fffffffffffffff ffffffffffffffff
Dec 18 11:28:20 cpu1 kernel: Call Trace:
Dec 18 11:28:20 cpu1 kernel: [<ffffffff80062113>] __down+0xc3/0xd8
Dec 18 11:28:20 cpu1 kernel: [<ffffffff800867b0>] default_wake_function+0x0/0xe
Dec 18 11:28:20 cpu1 kernel: [<ffffffff8875ff28>] :lustre:ll_iget_for_nfs+0x6a8/0x760
Dec 18 11:28:20 cpu1 kernel: [<ffffffff80061dd1>] __down_failed+0x35/0x3a
Dec 18 11:28:20 cpu1 kernel: [<ffffffff8874f489>] :lustre:.text.lock.file+0x69/0xa0
Dec 18 11:28:20 cpu1 kernel: [<ffffffff887603dd>] :lustre:ras_reset+0x2d/0xf0
Dec 18 11:28:20 cpu1 kernel: [<ffffffff887f1848>] :nfsd:nfsd_acceptable+0x0/0xd8
Dec 18 11:28:20 cpu1 kernel: [<ffffffff8021663e>] __qdisc_run+0x36/0x1bb
Dec 18 11:28:20 cpu1 kernel: [<ffffffff800d094b>] do_readv_writev+0x198/0x295
Dec 18 11:28:20 cpu1 kernel: [<ffffffff8874a840>] :lustre:ll_file_write+0x0/0xa20
Dec 18 11:28:20 cpu1 kernel: [<ffffffff8874ddd8>] :lustre:ll_file_open+0xbc8/0xd60
Dec 18 11:28:20 cpu1 kernel: [<ffffffff887f2725>] :nfsd:nfsd_vfs_write+0xf2/0x2e1
Dec 18 11:28:20 cpu1 kernel: [<ffffffff8874d210>] :lustre:ll_file_open+0x0/0xd60
Dec 18 11:28:20 cpu1 kernel: [<ffffffff8001dea8>] __dentry_open+0x101/0x1dc
Dec 18 11:28:20 cpu1 kernel: [<ffffffff887f2f99>] :nfsd:nfsd_write+0xb5/0xd5
Dec 18 11:28:20 cpu1 kernel: [<ffffffff887f97a0>] :nfsd:nfsd3_proc_write+0xea/0x109
Dec 18 11:28:20 cpu1 kernel: [<ffffffff887ef0e9>] :nfsd:nfsd_dispatch+0xd7/0x198
Dec 18 11:28:20 cpu1 kernel: [<ffffffff883e94b5>] :sunrpc:svc_process+0x43c/0x6fa
Dec 18 11:28:20 cpu1 kernel: [<ffffffff80061d1c>] __down_read+0x12/0x92
Dec 18 11:28:20 cpu1 kernel: [<ffffffff887ef471>] :nfsd:nfsd+0x0/0x327
Dec 18 11:28:20 cpu1 kernel: [<ffffffff887ef624>] :nfsd:nfsd+0x1b3/0x327
Dec 18 11:28:20 cpu1 kernel: [<ffffffff8005bc25>] child_rip+0xa/0x11
Dec 18 11:28:20 cpu1 kernel: [<ffffffff887ef471>] :nfsd:nfsd+0x0/0x327
Dec 18 11:28:20 cpu1 kernel: [<ffffffff887ef471>] :nfsd:nfsd+0x0/0x327
Dec 18 11:28:20 cpu1 kernel: [<ffffffff8005bc1b>] child_rip+0x0/0x11

Any ideas?
Hello!

On Dec 18, 2007, at 11:30 AM, Aaron Knister wrote:
> I'm exporting a 33 TB Lustre mount to various Linux and Tru64 clients.
> The NFS daemon frequently hangs for no apparent reason. A trace of
> the hung daemons looks like this:

Please apply the patch from bug 14360. You might also want to try the other patches referenced from there.

Bye,
Oleg