Hi,

I run 1.6.4.3-patchless against a vanilla 2.6.18 - but with OpenVZ patches on my clients. OpenVZ is virtualization (like vserver, BSD jails etc.). On a regular basis a process gets stuck in __d_lookup. When I dig in, it seems I'm caught in the hlist_for_each_entry_rcu loop, never satisfying the exit-from-loop condition.

Any ideas on how to solve this? Is this even a Lustre issue? dmesg and gdb output below.

Thanks,
/Jakob

(gdb) list *__d_lookup+0x10b
0x12f0 is in __d_lookup (fs/dcache.c:1153).
1148            struct dentry *dentry, *found;
1149
1150            rcu_read_lock();
1151
1152            found = NULL;
1153            hlist_for_each_entry_rcu(dentry, node, head, d_hash) {
1154                    struct qstr *qstr;
1155
1156                    if (dentry->d_name.hash != hash)
1157                            continue;

[186124.494329] SysRq: Show Regs
[186124.495218] ----------- IPI show regs -----------
[186124.496136] CPU 3, VCPU 0:1
[186124.496804] Modules linked in: simfs vznetdev vzethdev vzrst ip_nat vzcpt ip_conntrack nfnetlink vzdquota vzmon vzdev xt_length ipt_ttl xt_tcpmss ipt_TCPMSS iptable_mangle xt_multiport xt_limit ipt_tos ipt_REJECT iptable_filter ip_tables x_tables 8021q osc mgc lustre lov lquota mdc ksocklnd ptlrpc obdclass lnet lvfs libcfs bonding xfs
[186124.503636] Pid: 22699, comm: find Not tainted 2.6.18.8-openvz-028stab053-bnx2-1.6.7b-arpannounce1 #3 028stab053
[186124.505535] RIP: 0060:[<ffffffff8029b314>]  [<ffffffff8029b314>] __d_lookup+0x10b/0x142
[186124.507265] RSP: 0068:ffff810073d63bc8  EFLAGS: 00000282
[186124.508296] RAX: ffff8101016dc298 RBX: ffff8101016dc270 RCX: 0000000000000013
[186124.509768] RDX: 0000000000025ff5 RSI: 00c38320c56a5ff5 RDI: ffff810118b056b0
[186124.511480] RBP: ffff810073d63c08 R08: ffff8100ac9e8000 R09: ffff810118b056b0
[186124.512963] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8101016dc298
[186124.514452] R13: ffff810073d63e38 R14: ffff810118b056b0 R15: ffff810073d63c78
[186124.515931] FS:  00002ba786cb56d0(0000) GS:ffff81012a693340(0000) knlGS:0000000000000000
[186124.517538] CS:  0060 DS: 0000 ES: 0000 CR0: 0000000080050033
[186124.518587] CR2: 0000000000539938 CR3: 0000000073f06000 CR4: 00000000000006e0
[186124.520022]
[186124.520023] Call Trace:
[186124.521245]  [<ffffffff8029105d>] do_lookup+0x2c/0x193
[186124.522363]  [<ffffffff80293122>] __link_path_walk+0xb07/0x10ac
[186124.523642]  [<ffffffff8029374e>] link_path_walk+0x87/0x140
[186124.524818]  [<ffffffff80293c76>] do_path_lookup+0x2d3/0x2f8
[186124.526000]  [<ffffffff802945e2>] __user_walk_fd+0x41/0x62
[186124.527156]  [<ffffffff8028cecb>] vfs_lstat_fd+0x24/0x5a
[186124.528278]  [<ffffffff8028cf23>] sys_newlstat+0x22/0x3c
[186124.529383]  [<ffffffff80209902>] system_call+0x7e/0x83
[186124.530362] DWARF2 unwinder stuck at system_call+0x7e/0x83
[186124.531460] Leftover inexact backtrace:
[186124.532563]
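[For context: the loop at fs/dcache.c:1153 walks a singly linked hash chain and only exits when it reaches the end of the chain (NULL) or finds a matching dentry. Below is a minimal userspace sketch - not the kernel macro itself, just the same walking pattern - of why a corrupted chain that loops back on itself never satisfies the exit condition.]

#include <stdio.h>

/* stand-in for a dcache hash-chain entry */
struct node { struct node *next; unsigned int hash; };

int main(void)
{
        struct node a = { NULL, 1 }, b = { NULL, 2 }, c = { NULL, 3 };
        unsigned int wanted = 42;       /* a hash that is not on the chain */
        int steps = 0;

        a.next = &b;
        b.next = &c;
        c.next = &b;    /* corruption: chain loops back instead of ending in NULL */

        /* same shape as hlist_for_each_entry_rcu(): walk until NULL or a match */
        for (struct node *p = &a; p != NULL; p = p->next) {
                if (p->hash == wanted)
                        break;
                if (++steps > 1000000) {        /* guard so the demo terminates */
                        printf("no chain end after %d steps - the walker spins forever\n",
                               steps);
                        break;
                }
        }
        return 0;
}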
Hello!

On May 15, 2008, at 5:34 AM, Jakob Goldbach wrote:
> On a regular basis a process gets stuck in __d_lookup. When I dig in, it
> seems I'm caught in the hlist_for_each_entry_rcu loop, never satisfying
> the exit-from-loop condition.

Hm, so you actually have a circular loop? I wonder if you can e.g. run your (or a similar) setup in vmware or another environment that allows you to attach gdb to a running kernel, reproduce, and then see what's going on.

> Any ideas on how to solve this? Is this even a Lustre issue?

I did some searches and while I remember some similar thing being fixed, I cannot find it, and I think it was only happening on the server anyway. Might be somebody else can remember more.

Bye,
Oleg
>> Hm, so you actually have a circular loop?

Yes - I've asked for help on the OpenVZ list as well - Pavel Emelyanov provided me with a debug patch. This patch has now confirmed the loop in __d_lookup.

>> I wonder if you can e.g. run your (or a similar) setup in vmware [snip]

Hmm, I'm not really vmware enabled but I'll see what I can do. What should I look for in the debugger?

Thanks,
/Jakob
On Tue, May 20, 2008 at 10:24:25PM +0200, Jakob Goldbach wrote:
>> Hm, so you actually have a circular loop?
>
> Yes - I've asked for help on the OpenVZ list as well - Pavel Emelyanov
> provided me with a debug patch. This patch has now confirmed the loop in
> __d_lookup.

we're also seeing __d_lookup soft lockups. 4 are attached. one was a cached_lookup, but the first line of that fn is a __d_lookup, which is where I suspect the real soft lockup occurred. after a while the node is toast and has to be rebooted.

kernel is 2.6.23.17 with patchless lustre 1.6.4.3, modified with patches from bz 14250 (attachment 14109) and 13378 (attachment 12276). using o2ib, x86_64, rhel5.1

cheers,
robin

ps. I'm glad to see some Lustre support for modern linux kernels and would like to see a lot more! the VMs of distro kernels sometimes behave erratically enough that we can't use them on Lustre client nodes in production.

-------------- next part --------------
May 6 14:36:53 x6 kernel: BUG: soft lockup - CPU#2 stuck for 11s! [rsync:14200]
May 6 14:36:53 x6 kernel: CPU 2:
May 6 14:36:53 x6 kernel: Modules linked in: ext3 jbd loop osc mgc lustre lov lquota mdc ko2iblnd ptlrpc obdclass lnet lvfs libcfs rdma_ucm ib_ucm rdma_cm iw_cm ib_addr ib_srp ib_ipoib ib_cm ib_sa ib_uverbs ib_umad binfmt_misc dm_mirror dm_multipath dm_mod battery ac sg sd_mod i2c_i801 i2c_core ahci ata_piix libata ib_mthca scsi_mod i5000_edac ib_mad edac_core ib_core ehci_hcd shpchp uhci_hcd rng_core button nfs nfs_acl lockd sunrpc e1000
May 6 14:36:53 x6 kernel: Pid: 14200, comm: rsync Not tainted 2.6.23.17 #1
May 6 14:36:53 x6 kernel: RIP: 0010:[<ffffffff80282cef>]  [<ffffffff80282cef>] __d_lookup+0xed/0x110
May 6 14:36:53 x6 kernel: RSP: 0018:ffff81011b7ebbe8  EFLAGS: 00000286
May 6 14:36:53 x6 kernel: RAX: ffff81011d71aea0 RBX: ffff81011d71aea0 RCX: 0000000000000014
May 6 14:36:53 x6 kernel: RDX: 00000000000cb941 RSI: ffff81011b7ebcb8 RDI: ffff810120522cb8
May 6 14:36:53 x6 kernel: RBP: ffff8101d2707408 R08: 0000000000000007 R09: 0000000000000007
May 6 14:36:53 x6 kernel: R10: 00007fffffffb750 R11: ffffffff802c7d3e R12: ffffffff8853dc60
May 6 14:36:53 x6 kernel: R13: ffff8100992a5470 R14: ffff81024adb3af8 R15: ffff810219defe80
May 6 14:36:53 x6 kernel: FS:  00002aaaab6296e0(0000) GS:ffff81025fc6d840(0000) knlGS:0000000000000000
May 6 14:36:53 x6 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
May 6 14:36:53 x6 kernel: CR2: 000000000079f018 CR3: 000000018c1d0000 CR4: 00000000000006e0
May 6 14:36:53 x6 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 6 14:36:54 x6 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
May 6 14:36:54 x6 kernel:
May 6 14:36:54 x6 kernel: Call Trace:
May 6 14:36:54 x6 kernel:  [<ffffffff80282cd5>] __d_lookup+0xd3/0x110
May 6 14:36:54 x6 kernel:  [<ffffffff80279b1e>] do_lookup+0x2a/0x1ae
May 6 14:36:54 x6 kernel:  [<ffffffff8027bc4d>] __link_path_walk+0x924/0xde9
May 6 14:36:54 x6 kernel:  [<ffffffff8027c16a>] link_path_walk+0x58/0xe0
May 6 14:36:54 x6 kernel:  [<ffffffff8027c536>] do_path_lookup+0x1ab/0x1cf
May 6 14:36:54 x6 kernel:  [<ffffffff8027afe4>] getname+0x14c/0x191
May 6 14:36:54 x6 kernel:  [<ffffffff8027cd67>] __user_walk_fd+0x37/0x53
May 6 14:36:54 x6 kernel:  [<ffffffff80275a7b>] vfs_lstat_fd+0x18/0x47
May 6 14:36:54 x6 kernel:  [<ffffffff80275c6d>] sys_newlstat+0x19/0x31
May 6 14:36:54 x6 kernel:  [<ffffffff8020b3ae>] system_call+0x7e/0x83

May 8 09:32:36 x10 kernel: BUG: soft lockup - CPU#1 stuck for 11s! [bonnie++.mpi:31720]
May 8 09:32:36 x10 kernel: CPU 1:
May 8 09:32:36 x10 kernel: Modules linked in: loop osc mgc lustre lov lquota mdc ko2iblnd ptlrpc obdclass lnet lvfs libcfs rdma_ucm ib_ucm rdma_cm iw_cm ib_addr ib_srp ib_ipoib ib_cm ib_sa ib_uverbs ib_umad binfmt_misc dm_mirror dm_multipath dm_mod battery ac sg sd_mod i5000_edac edac_core ehci_hcd ahci rng_core ata_piix libata i2c_i801 scsi_mod i2c_core ib_mthca ib_mad shpchp uhci_hcd ib_core button nfs nfs_acl lockd sunrpc e1000
May 8 09:32:36 x10 kernel: Pid: 31720, comm: bonnie++.mpi Not tainted 2.6.23.17 #1
May 8 09:32:36 x10 kernel: RIP: 0010:[<ffffffff80282cef>]  [<ffffffff80282cef>] __d_lookup+0xed/0x110
May 8 09:32:36 x10 kernel: RSP: 0018:ffff81014248fd78  EFLAGS: 00000286
May 8 09:32:36 x10 kernel: RAX: ffff8101a79cc500 RBX: ffff8101a79cc500 RCX: 0000000000000014
May 8 09:32:36 x10 kernel: RDX: 00000000000ac16f RSI: ffff81014248feb8 RDI: ffff8101ff617898
May 8 09:32:36 x10 kernel: RBP: ffff8101ff617898 R08: 0000000000000000 R09: ffff81025fd3d5c0
May 8 09:32:37 x10 kernel: R10: 0000000000000001 R11: ffffffff802c7d3e R12: ffffffff8027c1e0
May 8 09:32:37 x10 kernel: R13: ffff81014248fea8 R14: 0000000000000000 R15: ffff81025fd3d5c0
May 8 09:32:37 x10 kernel: FS:  00002aaaac95a990(0000) GS:ffff81025fc6de40(0000) knlGS:0000000000000000
May 8 09:32:37 x10 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
May 8 09:32:37 x10 kernel: CR2: 00002aaaab1abe20 CR3: 00000001b5525000 CR4: 00000000000006e0
May 8 09:32:37 x10 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 8 09:32:37 x10 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
May 8 09:32:37 x10 kernel:
May 8 09:32:37 x10 kernel: Call Trace:
May 8 09:32:37 x10 kernel:  [<ffffffff80279cba>] cached_lookup+0x18/0x86
May 8 09:32:37 x10 kernel:  [<ffffffff80279d87>] lookup_hash+0x5f/0xb5
May 8 09:32:37 x10 kernel:  [<ffffffff8027d146>] open_namei+0xf4/0x631
May 8 09:32:37 x10 kernel:  [<ffffffff802738b1>] file_move+0x1d/0x51
May 8 09:32:37 x10 kernel:  [<ffffffff8027148a>] __dentry_open+0x101/0x1ad
May 8 09:32:37 x10 kernel:  [<ffffffff802715dc>] do_filp_open+0x1c/0x38
May 8 09:32:37 x10 kernel:  [<ffffffff80271371>] get_unused_fd_flags+0x115/0x126
May 8 09:32:37 x10 kernel:  [<ffffffff8027163e>] do_sys_open+0x46/0xca
May 8 09:32:37 x10 kernel:  [<ffffffff8020b3ae>] system_call+0x7e/0x83

May 16 06:10:27 x3 kernel: BUG: soft lockup - CPU#1 stuck for 11s! [python:2684]
May 16 06:10:27 x3 kernel: CPU 1:
May 16 06:10:27 x3 kernel: Modules linked in: loop osc mgc lustre lov lquota mdc ko2iblnd ptlrpc obdclass lnet lvfs libcfs rdma_ucm ib_ucm rdma_cm iw_cm ib_addr ib_srp ib_ipoib ib_cm ib_sa ib_uverbs ib_umad binfmt_misc dm_mirror dm_multipath dm_mod battery ac sg sd_mod i5000_edac rng_core ata_piix ahci edac_core ib_mthca uhci_hcd i2c_i801 libata i2c_core ib_mad scsi_mod ib_core button shpchp ehci_hcd nfs nfs_acl lockd sunrpc e1000
May 16 06:10:28 x3 kernel: Pid: 2684, comm: python Not tainted 2.6.23.17 #1

May 20 12:27:55 x14 kernel: BUG: soft lockup - CPU#0 stuck for 11s! [ps:18064]
May 20 12:27:55 x14 kernel: CPU 0:
May 20 12:27:55 x14 kernel: Modules linked in: loop osc mgc lustre lov lquota mdc ko2iblnd ptlrpc obdclass lnet lvfs libcfs rdma_ucm ib_ucm rdma_cm iw_cm ib_addr ib_srp ib_ipoib ib_cm ib_sa ib_uverbs ib_umad binfmt_misc dm_mirror dm_multipath dm_mod battery ac sg sd_mod i5000_edac edac_core rng_core ata_piix ehci_hcd ahci libata ib_mthca shpchp uhci_hcd scsi_mod ib_mad i2c_i801 ib_core i2c_core button nfs nfs_acl lockd sunrpc e1000
May 20 12:27:55 x14 kernel: Pid: 18064, comm: ps Not tainted 2.6.23.17 #1
May 20 12:27:55 x14 kernel: RIP: 0010:[<ffffffff80282cef>]  [<ffffffff80282cef>] __d_lookup+0xed/0x110
May 20 12:27:55 x14 kernel: RSP: 0018:ffff81023bea7bb8  EFLAGS: 00000286
May 20 12:27:55 x14 kernel: RAX: ffff81018ce277a0 RBX: ffff81018ce277a0 RCX: 0000000000000014
May 20 12:27:55 x14 kernel: RDX: 000000000008af04 RSI: ffff81023bea7c88 RDI: ffff810202099dd8
May 20 12:27:55 x14 kernel: RBP: ffff8101c18c49c0 R08: 0000000000000007 R09: 0000000000000007
May 20 12:27:55 x14 kernel: R10: ffff81025fef7a18 R11: ffffffff802c7d3e R12: ffffffff88061600
May 20 12:27:55 x14 kernel: R13: ffffffff88061640 R14: ffff8101c18c49c0 R15: ffffffff8045a22d
May 20 12:27:55 x14 kernel: FS:  00002aaaab43cf30(0000) GS:ffffffff804d4000(0000) knlGS:0000000000000000
May 20 12:27:55 x14 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
May 20 12:27:55 x14 kernel: CR2: 00002aaaab17a548 CR3: 000000012dad3000 CR4: 00000000000006e0
May 20 12:27:56 x14 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 20 12:27:56 x14 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
May 20 12:27:56 x14 kernel:
May 20 12:27:56 x14 kernel: Call Trace:
May 20 12:27:56 x14 kernel:  [<ffffffff80282cd5>] __d_lookup+0xd3/0x110
May 20 12:27:56 x14 kernel:  [<ffffffff80279b1e>] do_lookup+0x2a/0x1ae
May 20 12:27:56 x14 kernel:  [<ffffffff8027bc4d>] __link_path_walk+0x924/0xde9
May 20 12:27:56 x14 kernel:  [<ffffffff8027c16a>] link_path_walk+0x58/0xe0
May 20 12:27:56 x14 kernel:  [<ffffffff80271371>] get_unused_fd_flags+0x115/0x126
May 20 12:27:56 x14 kernel:  [<ffffffff802d742e>] bitmap_scnprintf+0x8d/0xac
May 20 12:27:56 x14 kernel:  [<ffffffff8027c536>] do_path_lookup+0x1ab/0x1cf
May 20 12:27:56 x14 kernel:  [<ffffffff8027cf26>] __path_lookup_intent_open+0x56/0x96
May 20 12:27:56 x14 kernel:  [<ffffffff8027d0ca>] open_namei+0x78/0x631
May 20 12:27:56 x14 kernel:  [<ffffffff80257561>] __alloc_pages+0x6c/0x2c1
May 20 12:27:56 x14 kernel:  [<ffffffff802715dc>] do_filp_open+0x1c/0x38
May 20 12:27:56 x14 kernel:  [<ffffffff80271371>] get_unused_fd_flags+0x115/0x126
May 20 12:27:56 x14 kernel:  [<ffffffff8027163e>] do_sys_open+0x46/0xca
May 20 12:27:56 x14 kernel:  [<ffffffff8020b3ae>] system_call+0x7e/0x83
May 20 12:27:56 x14 kernel:
> kernel is 2.6.23.17 with patchless lustre 1.6.4.3,

I'm running 1.6.4.3 patchless as well, against a 2.6.18 vanilla kernel. Or at least that is what I thought. The OpenVZ patch effectively makes it a 2.6.18++ kernel, because OpenVZ backports features from newer kernels into their maintained 2.6.18-based kernel.

So the lockup in __d_lookup may just relate to newer patchless clients.

I got a debug patch from the OpenVZ community which indicates dcache chain corruption in a Lustre code path.

Patch snippet:

--- ./fs/dcache.c.ddebug2	2008-05-21 14:52:15.000000000 +0400
+++ ./fs/dcache.c	2008-05-21 15:10:06.000000000 +0400
@@ -1350,6 +1350,18 @@ static void __d_rehash(struct dentry * e
 {
 
 	entry->d_flags &= ~DCACHE_UNHASHED;
+	if (!spin_is_locked(&dcache_lock)) {
+		printk(KERN_ERR "Dcache lock is not taken on add\n");
+		dump_stack();
+	} else if (list->first != NULL &&
+			list->first->pprev != &list->first) {
+		printk(KERN_ERR "Dcache chain corruption:\n");
+		printk(KERN_ERR "Chain %p --next-> %p\n",
+				list, list->first);
+		printk(KERN_ERR "First %p <-pprev- %p\n",
+				list->first, list->first->pprev);
+		dump_stack();
+	}
 	hlist_add_head_rcu(&entry->d_hash, list);
 }

and stack trace:

[ 6447.548789] Dcache chain corruption:
[ 6447.549529] Chain ffff8100010de880 --next-> ffff8100b4ce00b0
[ 6447.550699] First ffff8100b4ce00b0 <-pprev- 0000000000200200
[ 6447.551711]
[ 6447.551713] Call Trace:
[ 6447.552809]  [<ffffffff8020ae20>] show_trace+0xae/0x360
[ 6447.553784]  [<ffffffff8020b0e7>] dump_stack+0x15/0x17
[ 6447.554727]  [<ffffffff8029ee94>] __d_rehash+0x75/0x97
[ 6447.555797]  [<ffffffff8029ef2a>] d_rehash+0x74/0x91
[ 6447.556846]  [<ffffffff883b4c6a>] :lustre:ll_revalidate_it+0xa1a/0xd90
[ 6447.557966]  [<ffffffff883b529c>] :lustre:ll_revalidate_nd+0x2bc/0x360
[ 6447.559082]  [<ffffffff80295741>] do_lookup+0x15d/0x193
[ 6447.560142]  [<ffffffff80296fd9>] __link_path_walk+0x409/0x10ac
[snip]

See details in http://bugzilla.openvz.org/show_bug.cgi?id=895

/Jakob
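[A note on the pprev value 0000000000200200 in the trace above: in kernels of this era that is LIST_POISON2 (include/linux/poison.h), the value hlist_del_rcu() writes into ->pprev when a node is taken off its chain, which is what __d_drop() does to a dentry. So the bucket head still points at a dentry that has already been unhashed. The snippet below is a hedged, userspace-only illustration - simplified hlist helpers, with the inconsistency injected by hand rather than by the real race - showing how the OpenVZ debug check fires on exactly that state.]

#include <stdio.h>

#define LIST_POISON2 ((void *) 0x00200200)      /* the kernel's list-poison value */

struct hnode { struct hnode *next, **pprev; };
struct hhead { struct hnode *first; };

/* simplified equivalent of hlist_add_head() */
static void hadd(struct hnode *n, struct hhead *h)
{
        n->next = h->first;
        if (h->first)
                h->first->pprev = &n->next;
        h->first = n;
        n->pprev = &h->first;
}

/* simplified equivalent of hlist_del_rcu(): unlink, poison ->pprev, keep ->next */
static void hdel_rcu(struct hnode *n)
{
        struct hnode **pprev = n->pprev;

        *pprev = n->next;
        if (n->next)
                n->next->pprev = pprev;
        n->pprev = (struct hnode **) LIST_POISON2;
}

int main(void)
{
        struct hhead bucket = { NULL };
        struct hnode d1, d2;

        hadd(&d1, &bucket);
        hadd(&d2, &bucket);
        hdel_rcu(&d2);          /* d2 is now unhashed, ->pprev poisoned */
        bucket.first = &d2;     /* inject the inconsistency seen in the trace */

        /* same consistency check the OpenVZ debug patch adds to __d_rehash() */
        if (bucket.first != NULL && bucket.first->pprev != &bucket.first)
                printf("Dcache chain corruption: first %p <-pprev- %p\n",
                       (void *) bucket.first, (void *) bucket.first->pprev);
        return 0;
}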
On May 21, 2008  21:05 +0200, Jakob Goldbach wrote:
> I'm running 1.6.4.3 patchless as well, against a 2.6.18 vanilla kernel.
> Or at least that is what I thought. The OpenVZ patch effectively makes it
> a 2.6.18++ kernel, because OpenVZ backports features from newer kernels
> into their maintained 2.6.18-based kernel.
>
> So the lockup in __d_lookup may just relate to newer patchless clients.
>
> I got a debug patch from the OpenVZ community which indicates dcache
> chain corruption in a Lustre code path.

Do you have the fixes for the statahead patches, disable statahead via
"echo 0 > /proc/fs/lustre/llite/*/statahead_max", or can you try out the
v1_6_5_RC4 tag from CVS (which also contains those patches)?

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
On Wed, 2008-05-21 at 20:04 -0600, Andreas Dilger wrote:
> Do you have the fixes for the statahead patches, disable statahead via
> "echo 0 > /proc/fs/lustre/llite/*/statahead_max", or can you try out the
> v1_6_5_RC4 tag from CVS (which also contains those patches)?

I'm running without the statahead fixes. We'll try disabling statahead for now and see if I can trigger the lockup again.

Thanks,
Jakob
On Wed, May 21, 2008 at 08:04:27PM -0600, Andreas Dilger wrote:
> On May 21, 2008  21:05 +0200, Jakob Goldbach wrote:
>> I'm running 1.6.4.3 patchless as well, against a 2.6.18 vanilla kernel.
>> Or at least that is what I thought. The OpenVZ patch effectively makes it
>> a 2.6.18++ kernel, because OpenVZ backports features from newer kernels
>> into their maintained 2.6.18-based kernel.
>>
>> So the lockup in __d_lookup may just relate to newer patchless clients.
>>
>> I got a debug patch from the OpenVZ community which indicates dcache
>> chain corruption in a Lustre code path.
>
> Do you have the fixes for the statahead patches, disable statahead via
> "echo 0 > /proc/fs/lustre/llite/*/statahead_max", or can you try out the
> v1_6_5_RC4 tag from CVS (which also contains those patches)?

all of our __d_lookup soft lockups have been when running with 0 in
/proc/fs/lustre/llite/*/statahead_max

I'll try v1_6_5_RC4 now - should be fun :-)

cheers,
robin
On Thu, 2008-05-22 at 05:46 +0200, Jakob Goldbach wrote:
> On Wed, 2008-05-21 at 20:04 -0600, Andreas Dilger wrote:
>
>> Do you have the fixes for the statahead patches, disable statahead via
>> "echo 0 > /proc/fs/lustre/llite/*/statahead_max", or can you try out the
>> v1_6_5_RC4 tag from CVS (which also contains those patches)?
>
> I'm running without the statahead fixes. We'll try disabling statahead for
> now and see if I can trigger the lockup again.

Unfortunately, I was able to trigger this error with statahead disabled.

/Jakob
On Wed, 2008-05-21 at 21:05 +0200, Jakob Goldbach wrote:
>> kernel is 2.6.23.17 with patchless lustre 1.6.4.3,
>
> I'm running 1.6.4.3 patchless as well, against a 2.6.18 vanilla kernel.
> Or at least that is what I thought. The OpenVZ patch effectively makes it
> a 2.6.18++ kernel, because OpenVZ backports features from newer kernels
> into their maintained 2.6.18-based kernel.
>
> So the lockup in __d_lookup may just relate to newer patchless clients.
>
> I got a debug patch from the OpenVZ community which indicates dcache
> chain corruption in a Lustre code path.
>
> Patch snippet:
>
> --- ./fs/dcache.c.ddebug2	2008-05-21 14:52:15.000000000 +0400
> +++ ./fs/dcache.c	2008-05-21 15:10:06.000000000 +0400
> @@ -1350,6 +1350,18 @@ static void __d_rehash(struct dentry * e
>  {
>  
>  	entry->d_flags &= ~DCACHE_UNHASHED;
> +	if (!spin_is_locked(&dcache_lock)) {
> +		printk(KERN_ERR "Dcache lock is not taken on add\n");
> +		dump_stack();
> +	} else if (list->first != NULL &&
> +			list->first->pprev != &list->first) {
> +		printk(KERN_ERR "Dcache chain corruption:\n");
> +		printk(KERN_ERR "Chain %p --next-> %p\n",
> +				list, list->first);
> +		printk(KERN_ERR "First %p <-pprev- %p\n",
> +				list->first, list->first->pprev);
> +		dump_stack();
> +	}
>  	hlist_add_head_rcu(&entry->d_hash, list);
>  }
>
> and stack trace:
>
> [ 6447.548789] Dcache chain corruption:
> [ 6447.549529] Chain ffff8100010de880 --next-> ffff8100b4ce00b0
> [ 6447.550699] First ffff8100b4ce00b0 <-pprev- 0000000000200200
> [ 6447.551711]
> [ 6447.551713] Call Trace:
> [ 6447.552809]  [<ffffffff8020ae20>] show_trace+0xae/0x360
> [ 6447.553784]  [<ffffffff8020b0e7>] dump_stack+0x15/0x17
> [ 6447.554727]  [<ffffffff8029ee94>] __d_rehash+0x75/0x97
> [ 6447.555797]  [<ffffffff8029ef2a>] d_rehash+0x74/0x91
> [ 6447.556846]  [<ffffffff883b4c6a>] :lustre:ll_revalidate_it+0xa1a/0xd90
> [ 6447.557966]  [<ffffffff883b529c>] :lustre:ll_revalidate_nd+0x2bc/0x360
> [ 6447.559082]  [<ffffffff80295741>] do_lookup+0x15d/0x193
> [ 6447.560142]  [<ffffffff80296fd9>] __link_path_walk+0x409/0x10ac
> [snip]

This patch and backtrace say the dcache chain was damaged _before_ entering Lustre: Lustre starts adding an entry at a new position in the dentry cache and finds an already-damaged entry in the list.

--
Alex Lyashkov <Alexey.lyashkov at sun.com>
Lustre Group, Sun Microsystems
> This patch and backtrace say the dcache chain was damaged _before_
> entering Lustre: Lustre starts adding an entry at a new position in the
> dentry cache and finds an already-damaged entry in the list.

All right. Any idea on how to detect the bad entry upon insertion?

Thanks,
Jakob
On Thu, May 22, 2008 at 12:51:42AM -0400, Robin Humble wrote:
> On Wed, May 21, 2008 at 08:04:27PM -0600, Andreas Dilger wrote:
>> On May 21, 2008  21:05 +0200, Jakob Goldbach wrote:
>>> I'm running 1.6.4.3 patchless as well, against a 2.6.18 vanilla kernel.
>>> Or at least that is what I thought. The OpenVZ patch effectively makes it
>>> a 2.6.18++ kernel, because OpenVZ backports features from newer kernels
>>> into their maintained 2.6.18-based kernel.
>>>
>>> So the lockup in __d_lookup may just relate to newer patchless clients.
>>>
>>> I got a debug patch from the OpenVZ community which indicates dcache
>>> chain corruption in a Lustre code path.
>>
>> Do you have the fixes for the statahead patches, disable statahead via
>> "echo 0 > /proc/fs/lustre/llite/*/statahead_max", or can you try out the
>> v1_6_5_RC4 tag from CVS (which also contains those patches)?
>
> all of our __d_lookup soft lockups have been when running with 0 in
> /proc/fs/lustre/llite/*/statahead_max
>
> I'll try v1_6_5_RC4 now - should be fun :-)

unfortunately patched v1_6_5_RC4 doesn't appear to be workable on 2.6.23 and 2.6.24 kernels. see the other thread.

cheers,
robin
> On Wed, 2008-05-21 at 21:05 +0200, Jakob Goldbach wrote:
>>
>> So the lockup in __d_lookup may just relate to newer patchless clients.

I got rid of my dcache chain corruption by adding the patch below and exporting _d_rehash from the kernel (of course, no longer patchless).

Could this fix a race in the patchless code?

/Jakob

p.s.: A kind soul came up with this. I'm not a kernel hacker.

--- ./lustre/llite/dcache.c.xxx	2007-09-27 16:04:08.000000000 +0400
+++ ./lustre/llite/dcache.c	2008-05-29 11:53:07.000000000 +0400
@@ -470,8 +470,8 @@ revalidate_finish:
         spin_lock(&dcache_lock);
         lock_dentry(de);
         __d_drop(de);
+        _d_rehash(de);
         unlock_dentry(de);
-        __d_rehash(de, 0);
         spin_unlock(&dcache_lock);
On Jun 04, 2008  00:19 +0200, Jakob Goldbach wrote:
>> On Wed, 2008-05-21 at 21:05 +0200, Jakob Goldbach wrote:
>>>
>>> So the lockup in __d_lookup may just relate to newer patchless clients.
>
> I got rid of my dcache chain corruption by adding the patch below and
> exporting _d_rehash from the kernel (of course, no longer patchless).
>
> Could this fix a race in the patchless code?
>
> /Jakob
>
> p.s.: A kind soul came up with this. I'm not a kernel hacker.
>
> --- ./lustre/llite/dcache.c.xxx	2007-09-27 16:04:08.000000000 +0400
> +++ ./lustre/llite/dcache.c	2008-05-29 11:53:07.000000000 +0400
> @@ -470,8 +470,8 @@ revalidate_finish:
>          spin_lock(&dcache_lock);
>          lock_dentry(de);
>          __d_drop(de);
> +        _d_rehash(de);
>          unlock_dentry(de);
> -        __d_rehash(de, 0);
>          spin_unlock(&dcache_lock);

It seems (with 2.6.18-RHEL5 at least) that it is possible to implement _d_rehash() without any exported kernel functions:

static inline void _d_rehash(struct dentry *dentry)
{
	dentry->d_flags &= ~DCACHE_UNHASHED;
	hlist_add_head_rcu(&dentry->d_hash,
			   d_hash(dentry->d_parent, dentry->d_name.hash));
}

This should be added to lustre_patchless_compat.h.

We have an existing function "d_rehash_cond(dentry, lock)" which is only ever called from Lustre when we don't want to take the dcache_lock, but the above usage is racy. This should be replaced with _d_rehash() as above for patchless clients (where d_rehash_cond() is never implemented).

Can you please file a bug with the original details, so that this gets fixed in the next release.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
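[Putting the pieces together: with that compat helper in place, the rehash in Jakob's patch above happens entirely under dcache_lock and the dentry lock. A hypothetical sketch of the resulting sequence, not a patch from the Lustre tree; lock_dentry()/unlock_dentry() are Lustre's wrappers around de->d_lock, and the locking rules assumed are the 2.6.18-era ones:]

	/* sketch only - re-hash the revalidated dentry without dropping any lock */
	spin_lock(&dcache_lock);
	lock_dentry(de);
	__d_drop(de);           /* unhash the dentry from its current chain */
	_d_rehash(de);          /* compat helper: re-insert while both locks are held */
	unlock_dentry(de);
	spin_unlock(&dcache_lock);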
On Tue, 2008-06-03 at 17:05 -0600, Andreas Dilger wrote:
> Can you please file a bug with the original details, so that this gets
> fixed in the next release.

https://bugzilla.lustre.org/show_bug.cgi?id=15975
On Tue, 2008-06-03 at 17:05 -0600, Andreas Dilger wrote:
> On Jun 04, 2008  00:19 +0200, Jakob Goldbach wrote:
>>> On Wed, 2008-05-21 at 21:05 +0200, Jakob Goldbach wrote:
>>>>
>>>> So the lockup in __d_lookup may just relate to newer patchless clients.
>>
>> I got rid of my dcache chain corruption by adding the patch below and
>> exporting _d_rehash from the kernel (of course, no longer patchless).
>>
>> Could this fix a race in the patchless code?
>>
>> /Jakob
>>
>> p.s.: A kind soul came up with this. I'm not a kernel hacker.
>>
>> --- ./lustre/llite/dcache.c.xxx	2007-09-27 16:04:08.000000000 +0400
>> +++ ./lustre/llite/dcache.c	2008-05-29 11:53:07.000000000 +0400
>> @@ -470,8 +470,8 @@ revalidate_finish:
>>          spin_lock(&dcache_lock);
>>          lock_dentry(de);
>>          __d_drop(de);
>> +        _d_rehash(de);
>>          unlock_dentry(de);
>> -        __d_rehash(de, 0);
>>          spin_unlock(&dcache_lock);

This will deadlock with the RHEL4 kernel.

Cut from RHEL4:

void __d_rehash(struct dentry * entry, int lock)
{
	struct hlist_head *list = d_hash(entry->d_parent, entry->d_name.hash);

	if (lock)
		spin_lock(&dcache_lock);

	spin_lock(&entry->d_lock);
	entry->d_flags &= ~DCACHE_UNHASHED;
	spin_unlock(&entry->d_lock);

	entry->d_bucket = list;
	hlist_add_head_rcu(&entry->d_hash, list);

	if (lock)
		spin_unlock(&dcache_lock);
}
On Wed, 2008-06-04 at 12:10 +0300, Alex Lyashkov wrote:
>>>> --- ./lustre/llite/dcache.c.xxx	2007-09-27 16:04:08.000000000 +0400
>>>> +++ ./lustre/llite/dcache.c	2008-05-29 11:53:07.000000000 +0400
>>>> @@ -470,8 +470,8 @@ revalidate_finish:
>>>>          spin_lock(&dcache_lock);
>>>>          lock_dentry(de);
>>>>          __d_drop(de);
>>>> +        _d_rehash(de);
>>>>          unlock_dentry(de);
>>>> -        __d_rehash(de, 0);
>>>>          spin_unlock(&dcache_lock);
>
> This will deadlock with the RHEL4 kernel.
>
> Cut from RHEL4:
>
> void __d_rehash(struct dentry * entry, int lock)
> {

But I'm not calling __d_rehash. I'm calling _d_rehash, which Andreas suggests should be defined in lustre_patchless_compat.h. Your cut is from the kernel patches anyway. My race happens on a _patchless_ client.

/Jakob
On Wed, Jun 04, 2008 at 11:25:34AM +0200, Jakob Goldbach wrote:
> On Wed, 2008-06-04 at 12:10 +0300, Alex Lyashkov wrote:
>>>>>> --- ./lustre/llite/dcache.c.xxx	2007-09-27 16:04:08.000000000 +0400
>>>>>> +++ ./lustre/llite/dcache.c	2008-05-29 11:53:07.000000000 +0400
>>>>>> @@ -470,8 +470,8 @@ revalidate_finish:
>>>>>>          spin_lock(&dcache_lock);
>>>>>>          lock_dentry(de);
>>>>>>          __d_drop(de);
>>>>>> +        _d_rehash(de);
>>>>>>          unlock_dentry(de);
>>>>>> -        __d_rehash(de, 0);
>>>>>>          spin_unlock(&dcache_lock);
>>
>> This will deadlock with the RHEL4 kernel.
>>
>> Cut from RHEL4:
>>
>> void __d_rehash(struct dentry * entry, int lock)
>> {
>
> But I'm not calling __d_rehash.

Actually, _d_rehash() is not exported by the kernel, so I assume that you have patched the kernel. At least, that's what I understand from the OpenVZ ticket:
http://bugzilla.openvz.org/show_bug.cgi?id=895#c27

Of course, we cannot do this for patchless clients :(

> I'm calling _d_rehash, which Andreas suggests should be defined in
> lustre_patchless_compat.h. Your cut is from the kernel patches anyway.
> My race happens on a _patchless_ client.

As Yangsheng pointed out, the problem is that d_hash() cannot be used by Lustre since the dentry_hashtable symbol is not exported.

Johann
> Actually, _d_rehash() is not exported by the kernel, so I assume that you have
> patched the kernel. At least, that's what I understand from the OpenVZ ticket:
> http://bugzilla.openvz.org/show_bug.cgi?id=895#c27
>
> Of course, we cannot do this for patchless clients :(

True - I patched my kernel, breaking the concept of patchless. Initially just to test that it fixed the problem.

>> I'm calling _d_rehash, which Andreas suggests should be defined in
>> lustre_patchless_compat.h. Your cut is from the kernel patches anyway.
>> My race happens on a _patchless_ client.
>
> As Yangsheng pointed out, the problem is that d_hash() cannot be used
> by Lustre since the dentry_hashtable symbol is not exported.

I ran into this one too while trying to avoid patching the kernel. Decided to stick with the patched kernel and let you do the Lustre magic :-)

/Jakob
On Jun 13, 2008  10:16 +0200, Johann Lombardi wrote:
> On Wed, Jun 04, 2008 at 11:25:34AM +0200, Jakob Goldbach wrote:
>>>>>> @@ -470,8 +470,8 @@ revalidate_finish:
>>>>>>          spin_lock(&dcache_lock);
>>>>>>          lock_dentry(de);
>>>>>>          __d_drop(de);
>>>>>> +        _d_rehash(de);
>>>>>>          unlock_dentry(de);
>>>>>> -        __d_rehash(de, 0);
>>>>>>          spin_unlock(&dcache_lock);
>
> Actually, _d_rehash() is not exported by the kernel, so I assume that you have
> patched the kernel. At least, that's what I understand from the OpenVZ ticket:
> http://bugzilla.openvz.org/show_bug.cgi?id=895#c27
>
> Of course, we cannot do this for patchless clients :(
>
>> I'm calling _d_rehash, which Andreas suggests should be defined in
>> lustre_patchless_compat.h. Your cut is from the kernel patches anyway.
>> My race happens on a _patchless_ client.
>
> As Yangsheng pointed out, the problem is that d_hash() cannot be used
> by Lustre since the dentry_hashtable symbol is not exported.

Doh!  I looked at this code a couple of times, but didn't think that d_hash() would be using an internal pointer.

What could be pretty easy to do is to extract the value of the dentry_hashtable pointer during the filesystem init and make our own pointer with the same value:

struct hlist_head *ll_dentry_hashtable;

struct hlist_head *ll_d_hash(struct dentry *parent, unsigned long hash)
{
	hash += ((unsigned long) parent ^ GOLDEN_RATIO_PRIME) / L1_CACHE_BYTES;
	hash = hash ^ ((hash ^ GOLDEN_RATIO_PRIME) >> D_HASHBITS);
	return ll_dentry_hashtable + (hash & D_HASHMASK);
}

static inline void ll_d_rehash(struct dentry *dentry)
{
	dentry->d_flags &= ~DCACHE_UNHASHED;
	hlist_add_head_rcu(&dentry->d_hash,
			   ll_d_hash(dentry->d_parent, dentry->d_name.hash));
}

static int init_lustre_lite(void)
{
	struct dentry dummy;
	unsigned long dummy_hash;
	struct hlist_node **pprev = NULL, **pprev2;

	/* The first time ll_d_hash() is called, ll_dentry_hashtable == NULL,
	 * so it returns only the hash table bucket offset.  We use this to
	 * figure out the un-exported dentry_hashtable pointer.  This might
	 * be racy (we aren't guaranteed to still be the first entry on the
	 * hash chain), so it should be called a few times to verify. */
	memset(&dummy, 0, sizeof(dummy));
	spin_lock_init(&dummy.d_lock);
	dummy.d_parent = &dummy;	/* so d_rehash() hashes with the same parent */
	dummy_hash = (unsigned long) ll_d_hash(&dummy, 0);

	do {
		d_rehash(&dummy);
		pprev2 = pprev;
		pprev = dummy.d_hash.pprev;
		hlist_del_rcu(&dummy.d_hash);
	} while (pprev != pprev2);

	ll_dentry_hashtable = (struct hlist_head *)
		((unsigned long) pprev - dummy_hash);
	return 0;
}

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
Hi everybody,

Why can't this bug be accessed? Even when I log in with a bugzilla account I get the message "You are not authorized to access bug #15975".

I experience this bug with gentoo-sources-2.6.22 and lustre-1.6.3 and would like to fix it, because of lacking hardware support in the patched 2.6.18 kernel.

Kind regards,
Reto Gantenbein

On Wed, 2008-06-04 at 08:21 +0200, Jakob Goldbach wrote:
> On Tue, 2008-06-03 at 17:05 -0600, Andreas Dilger wrote:
>> Can you please file a bug with the original details, so that this gets
>> fixed in the next release.
>
> https://bugzilla.lustre.org/show_bug.cgi?id=15975
On Nov 17, 2008  16:07 +0100, Reto Gantenbein wrote:
> Why can't this bug be accessed? Even when I log in with a bugzilla
> account I get the message "You are not authorized to access bug #15975".

This bug was marked private by the customer that filed it.

The patch was included in the recently-released 1.6.6 version, which also contains fixes for a few other statahead problems among other things, so that is the preferred route to solve this problem.

> I experience this bug with gentoo-sources-2.6.22 and lustre-1.6.3 and
> would like to fix it, because of lacking hardware support in the patched
> 2.6.18 kernel.
>
> Kind regards,
> Reto Gantenbein
>
> On Wed, 2008-06-04 at 08:21 +0200, Jakob Goldbach wrote:
>> On Tue, 2008-06-03 at 17:05 -0600, Andreas Dilger wrote:
>>> Can you please file a bug with the original details, so that this gets
>>> fixed in the next release.
>>
>> https://bugzilla.lustre.org/show_bug.cgi?id=15975

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
On Mon, 2008-11-17 at 10:59 -0600, Andreas Dilger wrote:
> On Nov 17, 2008  16:07 +0100, Reto Gantenbein wrote:
>> Why can't this bug be accessed? Even when I log in with a bugzilla
>> account I get the message "You are not authorized to access bug #15975".
>
> This bug was marked private by the customer that filed it.

I filed that bug. I never intended it to be private. It is now public (again).

/Jakob