Hi,

I run 1.6.4.3-patchless against a vanilla 2.6.18 - but with OpenVZ patches on my clients. OpenVZ is virtualization (like vserver, BSD jails etc.). On a regular basis a process gets stuck in __d_lookup. When I dig in, it seems I'm caught in the hlist_for_each_entry_rcu loop, never satisfying the exit-from-loop condition.

Any ideas on how to solve this? Is this even a Lustre issue? dmesg and gdb output below.

Thanks,
/Jakob

(gdb) list *__d_lookup+0x10b
0x12f0 is in __d_lookup (fs/dcache.c:1153).
1148            struct dentry *dentry, *found;
1149
1150            rcu_read_lock();
1151
1152            found = NULL;
1153            hlist_for_each_entry_rcu(dentry, node, head, d_hash) {
1154                    struct qstr *qstr;
1155
1156                    if (dentry->d_name.hash != hash)
1157                            continue;

[186124.494329] SysRq: Show Regs
[186124.495218] ----------- IPI show regs -----------
[186124.496136] CPU 3, VCPU 0:1
[186124.496804] Modules linked in: simfs vznetdev vzethdev vzrst ip_nat vzcpt ip_conntrack nfnetlink vzdquota vzmon vzdev xt_length ipt_ttl xt_tcpmss ipt_TCPMSS iptable_mangle xt_multiport xt_limit ipt_tos ipt_REJECT iptable_filter ip_tables x_tables 8021q osc mgc lustre lov lquota mdc ksocklnd ptlrpc obdclass lnet lvfs libcfs bonding xfs
[186124.503636] Pid: 22699, comm: find Not tainted 2.6.18.8-openvz-028stab053-bnx2-1.6.7b-arpannounce1 #3 028stab053
[186124.505535] RIP: 0060:[<ffffffff8029b314>]  [<ffffffff8029b314>] __d_lookup+0x10b/0x142
[186124.507265] RSP: 0068:ffff810073d63bc8  EFLAGS: 00000282
[186124.508296] RAX: ffff8101016dc298 RBX: ffff8101016dc270 RCX: 0000000000000013
[186124.509768] RDX: 0000000000025ff5 RSI: 00c38320c56a5ff5 RDI: ffff810118b056b0
[186124.511480] RBP: ffff810073d63c08 R08: ffff8100ac9e8000 R09: ffff810118b056b0
[186124.512963] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8101016dc298
[186124.514452] R13: ffff810073d63e38 R14: ffff810118b056b0 R15: ffff810073d63c78
[186124.515931] FS:  00002ba786cb56d0(0000) GS:ffff81012a693340(0000) knlGS:0000000000000000
[186124.517538] CS:  0060 DS: 0000 ES: 0000 CR0: 0000000080050033
[186124.518587] CR2: 0000000000539938 CR3: 0000000073f06000 CR4: 00000000000006e0
[186124.520022]
[186124.520023] Call Trace:
[186124.521245]  [<ffffffff8029105d>] do_lookup+0x2c/0x193
[186124.522363]  [<ffffffff80293122>] __link_path_walk+0xb07/0x10ac
[186124.523642]  [<ffffffff8029374e>] link_path_walk+0x87/0x140
[186124.524818]  [<ffffffff80293c76>] do_path_lookup+0x2d3/0x2f8
[186124.526000]  [<ffffffff802945e2>] __user_walk_fd+0x41/0x62
[186124.527156]  [<ffffffff8028cecb>] vfs_lstat_fd+0x24/0x5a
[186124.528278]  [<ffffffff8028cf23>] sys_newlstat+0x22/0x3c
[186124.529383]  [<ffffffff80209902>] system_call+0x7e/0x83
[186124.530362] DWARF2 unwinder stuck at system_call+0x7e/0x83
[186124.531460] Leftover inexact backtrace:
[186124.532563]
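[For context: the loop at fs/dcache.c:1153 walks a singly linked hash chain and only exits when it reaches the end of the chain (NULL) or finds a matching dentry. Below is a minimal userspace sketch - not the kernel macro itself, just the same walking pattern - of why a corrupted chain that loops back on itself never satisfies the exit condition.]

#include <stdio.h>

/* stand-in for a dcache hash-chain entry */
struct node { struct node *next; unsigned int hash; };

int main(void)
{
        struct node a = { NULL, 1 }, b = { NULL, 2 }, c = { NULL, 3 };
        unsigned int wanted = 42;       /* a hash that is not on the chain */
        int steps = 0;

        a.next = &b;
        b.next = &c;
        c.next = &b;    /* corruption: chain loops back instead of ending in NULL */

        /* same shape as hlist_for_each_entry_rcu(): walk until NULL or a match */
        for (struct node *p = &a; p != NULL; p = p->next) {
                if (p->hash == wanted)
                        break;
                if (++steps > 1000000) {        /* guard so the demo terminates */
                        printf("no chain end after %d steps - the walker spins forever\n",
                               steps);
                        break;
                }
        }
        return 0;
}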
Hello!

On May 15, 2008, at 5:34 AM, Jakob Goldbach wrote:
> On a regular basis a process gets stuck in __d_lookup. When I dig in, it
> seems I'm caught in the hlist_for_each_entry_rcu loop, never satisfying
> the exit-from-loop condition.

Hm, so you actually have a circular loop? I wonder if you can e.g. run your (or a similar) setup in vmware or another environment that allows you to attach gdb to a running kernel, reproduce, and then see what's going on.

> Any ideas on how to solve this? Is this even a Lustre issue?

I did some searches and while I remember some similar thing being fixed, I cannot find it, and I think it was only happening on the server anyway. Might be somebody else can remember more.

Bye,
Oleg
>> Hm, so you actually have a circular loop?

Yes - I've asked for help on the OpenVZ list as well - Pavel Emelyanov provided me with a debug patch. This patch has now confirmed the loop in __d_lookup.

>> I wonder if you can e.g. run your (or a similar) setup in vmware [snip]

Hmm, I'm not really vmware enabled but I'll see what I can do. What should I look for in the debugger?

Thanks,
/Jakob
On Tue, May 20, 2008 at 10:24:25PM +0200, Jakob Goldbach wrote:
>> Hm, so you actually have a circular loop?
>
> Yes - I've asked for help on the OpenVZ list as well - Pavel Emelyanov
> provided me with a debug patch. This patch has now confirmed the loop in
> __d_lookup.

we're also seeing __d_lookup soft lockups. 4 are attached. one was a cached_lookup, but the first line of that fn is a __d_lookup, which is where I suspect the real soft lockup occurred. after a while the node is toast and has to be rebooted.

kernel is 2.6.23.17 with patchless lustre 1.6.4.3, modified with patches from bz 14250 (attachment 14109) and 13378 (attachment 12276). using o2ib, x86_64, rhel5.1

cheers,
robin

ps. I'm glad to see some Lustre support for modern linux kernels and would like to see a lot more! the VMs of distro kernels sometimes behave erratically enough that we can't use them on Lustre client nodes in production.

-------------- next part --------------
May 6 14:36:53 x6 kernel: BUG: soft lockup - CPU#2 stuck for 11s! [rsync:14200]
May 6 14:36:53 x6 kernel: CPU 2:
May 6 14:36:53 x6 kernel: Modules linked in: ext3 jbd loop osc mgc lustre lov lquota mdc ko2iblnd ptlrpc obdclass lnet lvfs libcfs rdma_ucm ib_ucm rdma_cm iw_cm ib_addr ib_srp ib_ipoib ib_cm ib_sa ib_uverbs ib_umad binfmt_misc dm_mirror dm_multipath dm_mod battery ac sg sd_mod i2c_i801 i2c_core ahci ata_piix libata ib_mthca scsi_mod i5000_edac ib_mad edac_core ib_core ehci_hcd shpchp uhci_hcd rng_core button nfs nfs_acl lockd sunrpc e1000
May 6 14:36:53 x6 kernel: Pid: 14200, comm: rsync Not tainted 2.6.23.17 #1
May 6 14:36:53 x6 kernel: RIP: 0010:[<ffffffff80282cef>]  [<ffffffff80282cef>] __d_lookup+0xed/0x110
May 6 14:36:53 x6 kernel: RSP: 0018:ffff81011b7ebbe8  EFLAGS: 00000286
May 6 14:36:53 x6 kernel: RAX: ffff81011d71aea0 RBX: ffff81011d71aea0 RCX: 0000000000000014
May 6 14:36:53 x6 kernel: RDX: 00000000000cb941 RSI: ffff81011b7ebcb8 RDI: ffff810120522cb8
May 6 14:36:53 x6 kernel: RBP: ffff8101d2707408 R08: 0000000000000007 R09: 0000000000000007
May 6 14:36:53 x6 kernel: R10: 00007fffffffb750 R11: ffffffff802c7d3e R12: ffffffff8853dc60
May 6 14:36:53 x6 kernel: R13: ffff8100992a5470 R14: ffff81024adb3af8 R15: ffff810219defe80
May 6 14:36:53 x6 kernel: FS:  00002aaaab6296e0(0000) GS:ffff81025fc6d840(0000) knlGS:0000000000000000
May 6 14:36:53 x6 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
May 6 14:36:53 x6 kernel: CR2: 000000000079f018 CR3: 000000018c1d0000 CR4: 00000000000006e0
May 6 14:36:53 x6 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 6 14:36:54 x6 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
May 6 14:36:54 x6 kernel:
May 6 14:36:54 x6 kernel: Call Trace:
May 6 14:36:54 x6 kernel:  [<ffffffff80282cd5>] __d_lookup+0xd3/0x110
May 6 14:36:54 x6 kernel:  [<ffffffff80279b1e>] do_lookup+0x2a/0x1ae
May 6 14:36:54 x6 kernel:  [<ffffffff8027bc4d>] __link_path_walk+0x924/0xde9
May 6 14:36:54 x6 kernel:  [<ffffffff8027c16a>] link_path_walk+0x58/0xe0
May 6 14:36:54 x6 kernel:  [<ffffffff8027c536>] do_path_lookup+0x1ab/0x1cf
May 6 14:36:54 x6 kernel:  [<ffffffff8027afe4>] getname+0x14c/0x191
May 6 14:36:54 x6 kernel:  [<ffffffff8027cd67>] __user_walk_fd+0x37/0x53
May 6 14:36:54 x6 kernel:  [<ffffffff80275a7b>] vfs_lstat_fd+0x18/0x47
May 6 14:36:54 x6 kernel:  [<ffffffff80275c6d>] sys_newlstat+0x19/0x31
May 6 14:36:54 x6 kernel:  [<ffffffff8020b3ae>] system_call+0x7e/0x83

May 8 09:32:36 x10 kernel: BUG: soft lockup - CPU#1 stuck for 11s! [bonnie++.mpi:31720]
May 8 09:32:36 x10 kernel: CPU 1:
May 8 09:32:36 x10 kernel: Modules linked in: loop osc mgc lustre lov lquota mdc ko2iblnd ptlrpc obdclass lnet lvfs libcfs rdma_ucm ib_ucm rdma_cm iw_cm ib_addr ib_srp ib_ipoib ib_cm ib_sa ib_uverbs ib_umad binfmt_misc dm_mirror dm_multipath dm_mod battery ac sg sd_mod i5000_edac edac_core ehci_hcd ahci rng_core ata_piix libata i2c_i801 scsi_mod i2c_core ib_mthca ib_mad shpchp uhci_hcd ib_core button nfs nfs_acl lockd sunrpc e1000
May 8 09:32:36 x10 kernel: Pid: 31720, comm: bonnie++.mpi Not tainted 2.6.23.17 #1
May 8 09:32:36 x10 kernel: RIP: 0010:[<ffffffff80282cef>]  [<ffffffff80282cef>] __d_lookup+0xed/0x110
May 8 09:32:36 x10 kernel: RSP: 0018:ffff81014248fd78  EFLAGS: 00000286
May 8 09:32:36 x10 kernel: RAX: ffff8101a79cc500 RBX: ffff8101a79cc500 RCX: 0000000000000014
May 8 09:32:36 x10 kernel: RDX: 00000000000ac16f RSI: ffff81014248feb8 RDI: ffff8101ff617898
May 8 09:32:36 x10 kernel: RBP: ffff8101ff617898 R08: 0000000000000000 R09: ffff81025fd3d5c0
May 8 09:32:37 x10 kernel: R10: 0000000000000001 R11: ffffffff802c7d3e R12: ffffffff8027c1e0
May 8 09:32:37 x10 kernel: R13: ffff81014248fea8 R14: 0000000000000000 R15: ffff81025fd3d5c0
May 8 09:32:37 x10 kernel: FS:  00002aaaac95a990(0000) GS:ffff81025fc6de40(0000) knlGS:0000000000000000
May 8 09:32:37 x10 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
May 8 09:32:37 x10 kernel: CR2: 00002aaaab1abe20 CR3: 00000001b5525000 CR4: 00000000000006e0
May 8 09:32:37 x10 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 8 09:32:37 x10 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
May 8 09:32:37 x10 kernel:
May 8 09:32:37 x10 kernel: Call Trace:
May 8 09:32:37 x10 kernel:  [<ffffffff80279cba>] cached_lookup+0x18/0x86
May 8 09:32:37 x10 kernel:  [<ffffffff80279d87>] lookup_hash+0x5f/0xb5
May 8 09:32:37 x10 kernel:  [<ffffffff8027d146>] open_namei+0xf4/0x631
May 8 09:32:37 x10 kernel:  [<ffffffff802738b1>] file_move+0x1d/0x51
May 8 09:32:37 x10 kernel:  [<ffffffff8027148a>] __dentry_open+0x101/0x1ad
May 8 09:32:37 x10 kernel:  [<ffffffff802715dc>] do_filp_open+0x1c/0x38
May 8 09:32:37 x10 kernel:  [<ffffffff80271371>] get_unused_fd_flags+0x115/0x126
May 8 09:32:37 x10 kernel:  [<ffffffff8027163e>] do_sys_open+0x46/0xca
May 8 09:32:37 x10 kernel:  [<ffffffff8020b3ae>] system_call+0x7e/0x83

May 16 06:10:27 x3 kernel: BUG: soft lockup - CPU#1 stuck for 11s! [python:2684]
May 16 06:10:27 x3 kernel: CPU 1:
May 16 06:10:27 x3 kernel: Modules linked in: loop osc mgc lustre lov lquota mdc ko2iblnd ptlrpc obdclass lnet lvfs libcfs rdma_ucm ib_ucm rdma_cm iw_cm ib_addr ib_srp ib_ipoib ib_cm ib_sa ib_uverbs ib_umad binfmt_misc dm_mirror dm_multipath dm_mod battery ac sg sd_mod i5000_edac rng_core ata_piix ahci edac_core ib_mthca uhci_hcd i2c_i801 libata i2c_core ib_mad scsi_mod ib_core button shpchp ehci_hcd nfs nfs_acl lockd sunrpc e1000
May 16 06:10:28 x3 kernel: Pid: 2684, comm: python Not tainted 2.6.23.17 #1

May 20 12:27:55 x14 kernel: BUG: soft lockup - CPU#0 stuck for 11s! [ps:18064]
May 20 12:27:55 x14 kernel: CPU 0:
May 20 12:27:55 x14 kernel: Modules linked in: loop osc mgc lustre lov lquota mdc ko2iblnd ptlrpc obdclass lnet lvfs libcfs rdma_ucm ib_ucm rdma_cm iw_cm ib_addr ib_srp ib_ipoib ib_cm ib_sa ib_uverbs ib_umad binfmt_misc dm_mirror dm_multipath dm_mod battery ac sg sd_mod i5000_edac edac_core rng_core ata_piix ehci_hcd ahci libata ib_mthca shpchp uhci_hcd scsi_mod ib_mad i2c_i801 ib_core i2c_core button nfs nfs_acl lockd sunrpc e1000
May 20 12:27:55 x14 kernel: Pid: 18064, comm: ps Not tainted 2.6.23.17 #1
May 20 12:27:55 x14 kernel: RIP: 0010:[<ffffffff80282cef>]  [<ffffffff80282cef>] __d_lookup+0xed/0x110
May 20 12:27:55 x14 kernel: RSP: 0018:ffff81023bea7bb8  EFLAGS: 00000286
May 20 12:27:55 x14 kernel: RAX: ffff81018ce277a0 RBX: ffff81018ce277a0 RCX: 0000000000000014
May 20 12:27:55 x14 kernel: RDX: 000000000008af04 RSI: ffff81023bea7c88 RDI: ffff810202099dd8
May 20 12:27:55 x14 kernel: RBP: ffff8101c18c49c0 R08: 0000000000000007 R09: 0000000000000007
May 20 12:27:55 x14 kernel: R10: ffff81025fef7a18 R11: ffffffff802c7d3e R12: ffffffff88061600
May 20 12:27:55 x14 kernel: R13: ffffffff88061640 R14: ffff8101c18c49c0 R15: ffffffff8045a22d
May 20 12:27:55 x14 kernel: FS:  00002aaaab43cf30(0000) GS:ffffffff804d4000(0000) knlGS:0000000000000000
May 20 12:27:55 x14 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
May 20 12:27:55 x14 kernel: CR2: 00002aaaab17a548 CR3: 000000012dad3000 CR4: 00000000000006e0
May 20 12:27:56 x14 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 20 12:27:56 x14 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
May 20 12:27:56 x14 kernel:
May 20 12:27:56 x14 kernel: Call Trace:
May 20 12:27:56 x14 kernel:  [<ffffffff80282cd5>] __d_lookup+0xd3/0x110
May 20 12:27:56 x14 kernel:  [<ffffffff80279b1e>] do_lookup+0x2a/0x1ae
May 20 12:27:56 x14 kernel:  [<ffffffff8027bc4d>] __link_path_walk+0x924/0xde9
May 20 12:27:56 x14 kernel:  [<ffffffff8027c16a>] link_path_walk+0x58/0xe0
May 20 12:27:56 x14 kernel:  [<ffffffff80271371>] get_unused_fd_flags+0x115/0x126
May 20 12:27:56 x14 kernel:  [<ffffffff802d742e>] bitmap_scnprintf+0x8d/0xac
May 20 12:27:56 x14 kernel:  [<ffffffff8027c536>] do_path_lookup+0x1ab/0x1cf
May 20 12:27:56 x14 kernel:  [<ffffffff8027cf26>] __path_lookup_intent_open+0x56/0x96
May 20 12:27:56 x14 kernel:  [<ffffffff8027d0ca>] open_namei+0x78/0x631
May 20 12:27:56 x14 kernel:  [<ffffffff80257561>] __alloc_pages+0x6c/0x2c1
May 20 12:27:56 x14 kernel:  [<ffffffff802715dc>] do_filp_open+0x1c/0x38
May 20 12:27:56 x14 kernel:  [<ffffffff80271371>] get_unused_fd_flags+0x115/0x126
May 20 12:27:56 x14 kernel:  [<ffffffff8027163e>] do_sys_open+0x46/0xca
May 20 12:27:56 x14 kernel:  [<ffffffff8020b3ae>] system_call+0x7e/0x83
May 20 12:27:56 x14 kernel:
> kernel is 2.6.23.17 with patchless lustre 1.6.4.3,

I'm running 1.6.4.3 patchless as well, against a 2.6.18 vanilla kernel. Or at least that is what I thought. The OpenVZ patch effectively makes it a 2.6.18++ kernel, because OpenVZ backports features from newer kernels into their maintained 2.6.18-based kernel.

So the lockup in __d_lookup may just relate to newer patchless clients.

I got a debug patch from the OpenVZ community which indicates dcache chain corruption in a Lustre code path.

Patch snippet:

--- ./fs/dcache.c.ddebug2	2008-05-21 14:52:15.000000000 +0400
+++ ./fs/dcache.c	2008-05-21 15:10:06.000000000 +0400
@@ -1350,6 +1350,18 @@ static void __d_rehash(struct dentry * e
 {
 
 	entry->d_flags &= ~DCACHE_UNHASHED;
+	if (!spin_is_locked(&dcache_lock)) {
+		printk(KERN_ERR "Dcache lock is not taken on add\n");
+		dump_stack();
+	} else if (list->first != NULL &&
+			list->first->pprev != &list->first) {
+		printk(KERN_ERR "Dcache chain corruption:\n");
+		printk(KERN_ERR "Chain %p --next-> %p\n",
+				list, list->first);
+		printk(KERN_ERR "First %p <-pprev- %p\n",
+				list->first, list->first->pprev);
+		dump_stack();
+	}
 	hlist_add_head_rcu(&entry->d_hash, list);
 }

and stack trace:

[ 6447.548789] Dcache chain corruption:
[ 6447.549529] Chain ffff8100010de880 --next-> ffff8100b4ce00b0
[ 6447.550699] First ffff8100b4ce00b0 <-pprev- 0000000000200200
[ 6447.551711]
[ 6447.551713] Call Trace:
[ 6447.552809]  [<ffffffff8020ae20>] show_trace+0xae/0x360
[ 6447.553784]  [<ffffffff8020b0e7>] dump_stack+0x15/0x17
[ 6447.554727]  [<ffffffff8029ee94>] __d_rehash+0x75/0x97
[ 6447.555797]  [<ffffffff8029ef2a>] d_rehash+0x74/0x91
[ 6447.556846]  [<ffffffff883b4c6a>] :lustre:ll_revalidate_it+0xa1a/0xd90
[ 6447.557966]  [<ffffffff883b529c>] :lustre:ll_revalidate_nd+0x2bc/0x360
[ 6447.559082]  [<ffffffff80295741>] do_lookup+0x15d/0x193
[ 6447.560142]  [<ffffffff80296fd9>] __link_path_walk+0x409/0x10ac
[snip]

See details in http://bugzilla.openvz.org/show_bug.cgi?id=895

/Jakob
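[A note on the pprev value 0000000000200200 in the trace above: in kernels of this era that is LIST_POISON2 (include/linux/poison.h), the value hlist_del_rcu() writes into ->pprev when a node is taken off its chain, which is what __d_drop() does to a dentry. So the bucket head still points at a dentry that has already been unhashed. The snippet below is a hedged, userspace-only illustration - simplified hlist helpers, with the inconsistency injected by hand rather than by the real race - showing how the OpenVZ debug check fires on exactly that state.]

#include <stdio.h>

#define LIST_POISON2 ((void *) 0x00200200)      /* the kernel's list-poison value */

struct hnode { struct hnode *next, **pprev; };
struct hhead { struct hnode *first; };

/* simplified equivalent of hlist_add_head() */
static void hadd(struct hnode *n, struct hhead *h)
{
        n->next = h->first;
        if (h->first)
                h->first->pprev = &n->next;
        h->first = n;
        n->pprev = &h->first;
}

/* simplified equivalent of hlist_del_rcu(): unlink, poison ->pprev, keep ->next */
static void hdel_rcu(struct hnode *n)
{
        struct hnode **pprev = n->pprev;

        *pprev = n->next;
        if (n->next)
                n->next->pprev = pprev;
        n->pprev = (struct hnode **) LIST_POISON2;
}

int main(void)
{
        struct hhead bucket = { NULL };
        struct hnode d1, d2;

        hadd(&d1, &bucket);
        hadd(&d2, &bucket);
        hdel_rcu(&d2);          /* d2 is now unhashed, ->pprev poisoned */
        bucket.first = &d2;     /* inject the inconsistency seen in the trace */

        /* same consistency check the OpenVZ debug patch adds to __d_rehash() */
        if (bucket.first != NULL && bucket.first->pprev != &bucket.first)
                printf("Dcache chain corruption: first %p <-pprev- %p\n",
                       (void *) bucket.first, (void *) bucket.first->pprev);
        return 0;
}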
On May 21, 2008  21:05 +0200, Jakob Goldbach wrote:
> I'm running 1.6.4.3 patchless as well, against a 2.6.18 vanilla kernel.
> Or at least that is what I thought. The OpenVZ patch effectively makes it
> a 2.6.18++ kernel, because OpenVZ backports features from newer kernels
> into their maintained 2.6.18-based kernel.
>
> So the lockup in __d_lookup may just relate to newer patchless clients.
>
> I got a debug patch from the OpenVZ community which indicates dcache
> chain corruption in a Lustre code path.

Do you have the fixes for the statahead patches, disable statahead via
"echo 0 > /proc/fs/lustre/llite/*/statahead_max", or can you try out the
v1_6_5_RC4 tag from CVS (which also contains those patches)?

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
On Wed, 2008-05-21 at 20:04 -0600, Andreas Dilger wrote:
> Do you have the fixes for the statahead patches, disable statahead via
> "echo 0 > /proc/fs/lustre/llite/*/statahead_max", or can you try out the
> v1_6_5_RC4 tag from CVS (which also contains those patches)?

I'm running without the statahead fixes. We'll try disabling statahead for now and see if I can trigger the lockup again.

Thanks,
Jakob
On Wed, May 21, 2008 at 08:04:27PM -0600, Andreas Dilger wrote:
> On May 21, 2008  21:05 +0200, Jakob Goldbach wrote:
>> I'm running 1.6.4.3 patchless as well, against a 2.6.18 vanilla kernel.
>> Or at least that is what I thought. The OpenVZ patch effectively makes it
>> a 2.6.18++ kernel, because OpenVZ backports features from newer kernels
>> into their maintained 2.6.18-based kernel.
>>
>> So the lockup in __d_lookup may just relate to newer patchless clients.
>>
>> I got a debug patch from the OpenVZ community which indicates dcache
>> chain corruption in a Lustre code path.
>
> Do you have the fixes for the statahead patches, disable statahead via
> "echo 0 > /proc/fs/lustre/llite/*/statahead_max", or can you try out the
> v1_6_5_RC4 tag from CVS (which also contains those patches)?

all of our __d_lookup soft lockups have been when running with 0 in
/proc/fs/lustre/llite/*/statahead_max

I'll try v1_6_5_RC4 now - should be fun :-)

cheers,
robin
On Thu, 2008-05-22 at 05:46 +0200, Jakob Goldbach wrote:
> On Wed, 2008-05-21 at 20:04 -0600, Andreas Dilger wrote:
>
>> Do you have the fixes for the statahead patches, disable statahead via
>> "echo 0 > /proc/fs/lustre/llite/*/statahead_max", or can you try out the
>> v1_6_5_RC4 tag from CVS (which also contains those patches)?
>
> I'm running without the statahead fixes. We'll try disabling statahead for
> now and see if I can trigger the lockup again.

Unfortunately, I was able to trigger this error with statahead disabled.

/Jakob
On Wed, 2008-05-21 at 21:05 +0200, Jakob Goldbach wrote:
>> kernel is 2.6.23.17 with patchless lustre 1.6.4.3,
>
> I'm running 1.6.4.3 patchless as well, against a 2.6.18 vanilla kernel.
> Or at least that is what I thought. The OpenVZ patch effectively makes it
> a 2.6.18++ kernel, because OpenVZ backports features from newer kernels
> into their maintained 2.6.18-based kernel.
>
> So the lockup in __d_lookup may just relate to newer patchless clients.
>
> I got a debug patch from the OpenVZ community which indicates dcache
> chain corruption in a Lustre code path.
>
> Patch snippet:
>
> --- ./fs/dcache.c.ddebug2	2008-05-21 14:52:15.000000000 +0400
> +++ ./fs/dcache.c	2008-05-21 15:10:06.000000000 +0400
> @@ -1350,6 +1350,18 @@ static void __d_rehash(struct dentry * e
>  {
>  
>  	entry->d_flags &= ~DCACHE_UNHASHED;
> +	if (!spin_is_locked(&dcache_lock)) {
> +		printk(KERN_ERR "Dcache lock is not taken on add\n");
> +		dump_stack();
> +	} else if (list->first != NULL &&
> +			list->first->pprev != &list->first) {
> +		printk(KERN_ERR "Dcache chain corruption:\n");
> +		printk(KERN_ERR "Chain %p --next-> %p\n",
> +				list, list->first);
> +		printk(KERN_ERR "First %p <-pprev- %p\n",
> +				list->first, list->first->pprev);
> +		dump_stack();
> +	}
>  	hlist_add_head_rcu(&entry->d_hash, list);
>  }
>
> and stack trace:
>
> [ 6447.548789] Dcache chain corruption:
> [ 6447.549529] Chain ffff8100010de880 --next-> ffff8100b4ce00b0
> [ 6447.550699] First ffff8100b4ce00b0 <-pprev- 0000000000200200
> [ 6447.551711]
> [ 6447.551713] Call Trace:
> [ 6447.552809]  [<ffffffff8020ae20>] show_trace+0xae/0x360
> [ 6447.553784]  [<ffffffff8020b0e7>] dump_stack+0x15/0x17
> [ 6447.554727]  [<ffffffff8029ee94>] __d_rehash+0x75/0x97
> [ 6447.555797]  [<ffffffff8029ef2a>] d_rehash+0x74/0x91
> [ 6447.556846]  [<ffffffff883b4c6a>] :lustre:ll_revalidate_it+0xa1a/0xd90
> [ 6447.557966]  [<ffffffff883b529c>] :lustre:ll_revalidate_nd+0x2bc/0x360
> [ 6447.559082]  [<ffffffff80295741>] do_lookup+0x15d/0x193
> [ 6447.560142]  [<ffffffff80296fd9>] __link_path_walk+0x409/0x10ac
> [snip]

This patch and backtrace say the dcache chain was damaged _before_ entering Lustre: Lustre starts adding an entry at a new position in the dentry cache and finds an already-damaged entry in the list.

--
Alex Lyashkov <Alexey.lyashkov at sun.com>
Lustre Group, Sun Microsystems
> This patch and backtrace say the dcache chain was damaged _before_
> entering Lustre: Lustre starts adding an entry at a new position in the
> dentry cache and finds an already-damaged entry in the list.

All right. Any idea on how to detect the bad entry upon insertion?

Thanks,
Jakob
On Thu, May 22, 2008 at 12:51:42AM -0400, Robin Humble wrote:
> On Wed, May 21, 2008 at 08:04:27PM -0600, Andreas Dilger wrote:
>> On May 21, 2008  21:05 +0200, Jakob Goldbach wrote:
>>> I'm running 1.6.4.3 patchless as well, against a 2.6.18 vanilla kernel.
>>> Or at least that is what I thought. The OpenVZ patch effectively makes it
>>> a 2.6.18++ kernel, because OpenVZ backports features from newer kernels
>>> into their maintained 2.6.18-based kernel.
>>>
>>> So the lockup in __d_lookup may just relate to newer patchless clients.
>>>
>>> I got a debug patch from the OpenVZ community which indicates dcache
>>> chain corruption in a Lustre code path.
>>
>> Do you have the fixes for the statahead patches, disable statahead via
>> "echo 0 > /proc/fs/lustre/llite/*/statahead_max", or can you try out the
>> v1_6_5_RC4 tag from CVS (which also contains those patches)?
>
> all of our __d_lookup soft lockups have been when running with 0 in
> /proc/fs/lustre/llite/*/statahead_max
>
> I'll try v1_6_5_RC4 now - should be fun :-)

unfortunately patched v1_6_5_RC4 doesn't appear to be workable on 2.6.23 and 2.6.24 kernels. see the other thread.

cheers,
robin
> On Wed, 2008-05-21 at 21:05 +0200, Jakob Goldbach wrote:
>>
>> So the lockup in __d_lookup may just relate to newer patchless clients.

I got rid of my dcache chain corruption by adding the patch below and exporting _d_rehash from the kernel (of course, no longer patchless).

Could this fix a race in the patchless code?

/Jakob

p.s.: A kind soul came up with this. I'm not a kernel hacker.

--- ./lustre/llite/dcache.c.xxx	2007-09-27 16:04:08.000000000 +0400
+++ ./lustre/llite/dcache.c	2008-05-29 11:53:07.000000000 +0400
@@ -470,8 +470,8 @@ revalidate_finish:
         spin_lock(&dcache_lock);
         lock_dentry(de);
         __d_drop(de);
+        _d_rehash(de);
         unlock_dentry(de);
-        __d_rehash(de, 0);
         spin_unlock(&dcache_lock);
On Jun 04, 2008  00:19 +0200, Jakob Goldbach wrote:
>> On Wed, 2008-05-21 at 21:05 +0200, Jakob Goldbach wrote:
>>>
>>> So the lockup in __d_lookup may just relate to newer patchless clients.
>
> I got rid of my dcache chain corruption by adding the patch below and
> exporting _d_rehash from the kernel (of course, no longer patchless).
>
> Could this fix a race in the patchless code?
>
> /Jakob
>
> p.s.: A kind soul came up with this. I'm not a kernel hacker.
>
> --- ./lustre/llite/dcache.c.xxx	2007-09-27 16:04:08.000000000 +0400
> +++ ./lustre/llite/dcache.c	2008-05-29 11:53:07.000000000 +0400
> @@ -470,8 +470,8 @@ revalidate_finish:
>          spin_lock(&dcache_lock);
>          lock_dentry(de);
>          __d_drop(de);
> +        _d_rehash(de);
>          unlock_dentry(de);
> -        __d_rehash(de, 0);
>          spin_unlock(&dcache_lock);

It seems (with 2.6.18-RHEL5 at least) that it is possible to implement _d_rehash() without any exported kernel functions:

static inline void _d_rehash(struct dentry *dentry)
{
	dentry->d_flags &= ~DCACHE_UNHASHED;
	hlist_add_head_rcu(&dentry->d_hash,
			   d_hash(dentry->d_parent, dentry->d_name.hash));
}

This should be added to lustre_patchless_compat.h.

We have an existing function "d_rehash_cond(dentry, lock)" which is only ever called from Lustre when we don't want to take the dcache_lock, but the above usage is racy. This should be replaced with _d_rehash() as above for patchless clients (where d_rehash_cond() is never implemented).

Can you please file a bug with the original details, so that this gets fixed in the next release.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
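[Putting the pieces together: with that compat helper in place, the rehash in Jakob's patch above happens entirely under dcache_lock and the dentry lock. A hypothetical sketch of the resulting sequence, not a patch from the Lustre tree; lock_dentry()/unlock_dentry() are Lustre's wrappers around de->d_lock, and the locking rules assumed are the 2.6.18-era ones:]

	/* sketch only - re-hash the revalidated dentry without dropping any lock */
	spin_lock(&dcache_lock);
	lock_dentry(de);
	__d_drop(de);           /* unhash the dentry from its current chain */
	_d_rehash(de);          /* compat helper: re-insert while both locks are held */
	unlock_dentry(de);
	spin_unlock(&dcache_lock);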
On Tue, 2008-06-03 at 17:05 -0600, Andreas Dilger wrote:
> Can you please file a bug with the original details, so that this gets
> fixed in the next release.

https://bugzilla.lustre.org/show_bug.cgi?id=15975
On Tue, 2008-06-03 at 17:05 -0600, Andreas Dilger wrote:
> On Jun 04, 2008  00:19 +0200, Jakob Goldbach wrote:
>>> On Wed, 2008-05-21 at 21:05 +0200, Jakob Goldbach wrote:
>>>>
>>>> So the lockup in __d_lookup may just relate to newer patchless clients.
>>
>> I got rid of my dcache chain corruption by adding the patch below and
>> exporting _d_rehash from the kernel (of course, no longer patchless).
>>
>> Could this fix a race in the patchless code?
>>
>> /Jakob
>>
>> p.s.: A kind soul came up with this. I'm not a kernel hacker.
>>
>> --- ./lustre/llite/dcache.c.xxx	2007-09-27 16:04:08.000000000 +0400
>> +++ ./lustre/llite/dcache.c	2008-05-29 11:53:07.000000000 +0400
>> @@ -470,8 +470,8 @@ revalidate_finish:
>>          spin_lock(&dcache_lock);
>>          lock_dentry(de);
>>          __d_drop(de);
>> +        _d_rehash(de);
>>          unlock_dentry(de);
>> -        __d_rehash(de, 0);
>>          spin_unlock(&dcache_lock);

This will deadlock with the RHEL4 kernel.

Cut from RHEL4:

void __d_rehash(struct dentry * entry, int lock)
{
	struct hlist_head *list = d_hash(entry->d_parent, entry->d_name.hash);

	if (lock)
		spin_lock(&dcache_lock);

	spin_lock(&entry->d_lock);
	entry->d_flags &= ~DCACHE_UNHASHED;
	spin_unlock(&entry->d_lock);

	entry->d_bucket = list;
	hlist_add_head_rcu(&entry->d_hash, list);

	if (lock)
		spin_unlock(&dcache_lock);
}
On Wed, 2008-06-04 at 12:10 +0300, Alex Lyashkov wrote:
>>>> --- ./lustre/llite/dcache.c.xxx	2007-09-27 16:04:08.000000000 +0400
>>>> +++ ./lustre/llite/dcache.c	2008-05-29 11:53:07.000000000 +0400
>>>> @@ -470,8 +470,8 @@ revalidate_finish:
>>>>          spin_lock(&dcache_lock);
>>>>          lock_dentry(de);
>>>>          __d_drop(de);
>>>> +        _d_rehash(de);
>>>>          unlock_dentry(de);
>>>> -        __d_rehash(de, 0);
>>>>          spin_unlock(&dcache_lock);
>
> This will deadlock with the RHEL4 kernel.
>
> Cut from RHEL4:
>
> void __d_rehash(struct dentry * entry, int lock)
> {

But I'm not calling __d_rehash. I'm calling _d_rehash, which Andreas suggests should be defined in lustre_patchless_compat.h. Your cut is from the kernel patches anyway. My race happens on a _patchless_ client.

/Jakob
On Wed, Jun 04, 2008 at 11:25:34AM +0200, Jakob Goldbach wrote:
> On Wed, 2008-06-04 at 12:10 +0300, Alex Lyashkov wrote:
>>>>>> --- ./lustre/llite/dcache.c.xxx	2007-09-27 16:04:08.000000000 +0400
>>>>>> +++ ./lustre/llite/dcache.c	2008-05-29 11:53:07.000000000 +0400
>>>>>> @@ -470,8 +470,8 @@ revalidate_finish:
>>>>>>          spin_lock(&dcache_lock);
>>>>>>          lock_dentry(de);
>>>>>>          __d_drop(de);
>>>>>> +        _d_rehash(de);
>>>>>>          unlock_dentry(de);
>>>>>> -        __d_rehash(de, 0);
>>>>>>          spin_unlock(&dcache_lock);
>>
>> This will deadlock with the RHEL4 kernel.
>>
>> Cut from RHEL4:
>>
>> void __d_rehash(struct dentry * entry, int lock)
>> {
>
> But I'm not calling __d_rehash.

Actually, _d_rehash() is not exported by the kernel, so I assume that you have patched the kernel. At least, that's what I understand from the OpenVZ ticket:
http://bugzilla.openvz.org/show_bug.cgi?id=895#c27

Of course, we cannot do this for patchless clients :(

> I'm calling _d_rehash, which Andreas suggests should be defined in
> lustre_patchless_compat.h. Your cut is from the kernel patches anyway.
> My race happens on a _patchless_ client.

As Yangsheng pointed out, the problem is that d_hash() cannot be used by Lustre since the dentry_hashtable symbol is not exported.

Johann
> Actually, _d_rehash() is not exported by the kernel, so I assume that you have
> patched the kernel. At least, that's what I understand from the OpenVZ ticket:
> http://bugzilla.openvz.org/show_bug.cgi?id=895#c27
>
> Of course, we cannot do this for patchless clients :(

True - I patched my kernel, breaking the concept of patchless. Initially just to test that it fixed the problem.

>> I'm calling _d_rehash, which Andreas suggests should be defined in
>> lustre_patchless_compat.h. Your cut is from the kernel patches anyway.
>> My race happens on a _patchless_ client.
>
> As Yangsheng pointed out, the problem is that d_hash() cannot be used
> by Lustre since the dentry_hashtable symbol is not exported.

I ran into this one too while trying to avoid patching the kernel. Decided to stick with the patched kernel and let you do the Lustre magic :-)

/Jakob
On Jun 13, 2008  10:16 +0200, Johann Lombardi wrote:
> On Wed, Jun 04, 2008 at 11:25:34AM +0200, Jakob Goldbach wrote:
>>>>>> @@ -470,8 +470,8 @@ revalidate_finish:
>>>>>>          spin_lock(&dcache_lock);
>>>>>>          lock_dentry(de);
>>>>>>          __d_drop(de);
>>>>>> +        _d_rehash(de);
>>>>>>          unlock_dentry(de);
>>>>>> -        __d_rehash(de, 0);
>>>>>>          spin_unlock(&dcache_lock);
>
> Actually, _d_rehash() is not exported by the kernel, so I assume that you have
> patched the kernel. At least, that's what I understand from the OpenVZ ticket:
> http://bugzilla.openvz.org/show_bug.cgi?id=895#c27
>
> Of course, we cannot do this for patchless clients :(
>
>> I'm calling _d_rehash, which Andreas suggests should be defined in
>> lustre_patchless_compat.h. Your cut is from the kernel patches anyway.
>> My race happens on a _patchless_ client.
>
> As Yangsheng pointed out, the problem is that d_hash() cannot be used
> by Lustre since the dentry_hashtable symbol is not exported.

Doh!  I looked at this code a couple of times, but didn't think that d_hash() would be using an internal pointer.

What could be pretty easy to do is to extract the value of the dentry_hashtable pointer during the filesystem init and make our own pointer with the same value:

struct hlist_head *ll_dentry_hashtable;

struct hlist_head *ll_d_hash(struct dentry *parent, unsigned long hash)
{
	hash += ((unsigned long) parent ^ GOLDEN_RATIO_PRIME) / L1_CACHE_BYTES;
	hash = hash ^ ((hash ^ GOLDEN_RATIO_PRIME) >> D_HASHBITS);
	return ll_dentry_hashtable + (hash & D_HASHMASK);
}

static inline void ll_d_rehash(struct dentry *dentry)
{
	dentry->d_flags &= ~DCACHE_UNHASHED;
	hlist_add_head_rcu(&dentry->d_hash,
			   ll_d_hash(dentry->d_parent, dentry->d_name.hash));
}

static int init_lustre_lite(void)
{
	struct dentry dummy;
	unsigned long dummy_hash;
	struct hlist_node **pprev = NULL, **pprev2;

	/* The first time ll_d_hash() is called, ll_dentry_hashtable == NULL,
	 * so it returns only the hash table bucket offset.  We use this to
	 * figure out the un-exported dentry_hashtable pointer.  This might
	 * be racy (we aren't guaranteed to still be the first entry on the
	 * hash chain), so it should be called a few times to verify. */
	memset(&dummy, 0, sizeof(dummy));
	spin_lock_init(&dummy.d_lock);
	dummy.d_parent = &dummy;	/* so d_rehash() hashes with the same parent */
	dummy_hash = (unsigned long) ll_d_hash(&dummy, 0);

	do {
		d_rehash(&dummy);
		pprev2 = pprev;
		pprev = dummy.d_hash.pprev;
		hlist_del_rcu(&dummy.d_hash);
	} while (pprev != pprev2);

	ll_dentry_hashtable = (struct hlist_head *)
		((unsigned long) pprev - dummy_hash);
	return 0;
}

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
Hi everybody,

Why can't this bug be accessed? Even when I log in with a bugzilla account I get the message "You are not authorized to access bug #15975".

I experience this bug with gentoo-sources-2.6.22 and lustre-1.6.3 and would like to fix it, because of lacking hardware support in the patched 2.6.18 kernel.

Kind regards,
Reto Gantenbein

On Wed, 2008-06-04 at 08:21 +0200, Jakob Goldbach wrote:
> On Tue, 2008-06-03 at 17:05 -0600, Andreas Dilger wrote:
>> Can you please file a bug with the original details, so that this gets
>> fixed in the next release.
>
> https://bugzilla.lustre.org/show_bug.cgi?id=15975
On Nov 17, 2008  16:07 +0100, Reto Gantenbein wrote:
> Why can't this bug be accessed? Even when I log in with a bugzilla
> account I get the message "You are not authorized to access bug #15975".

This bug was marked private by the customer that filed it.

The patch was included in the recently-released 1.6.6 version, which also contains fixes for a few other statahead problems among other things, so that is the preferred route to solve this problem.

> I experience this bug with gentoo-sources-2.6.22 and lustre-1.6.3 and
> would like to fix it, because of lacking hardware support in the patched
> 2.6.18 kernel.
>
> Kind regards,
> Reto Gantenbein
>
> On Wed, 2008-06-04 at 08:21 +0200, Jakob Goldbach wrote:
>> On Tue, 2008-06-03 at 17:05 -0600, Andreas Dilger wrote:
>>> Can you please file a bug with the original details, so that this gets
>>> fixed in the next release.
>>
>> https://bugzilla.lustre.org/show_bug.cgi?id=15975

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
On Mon, 2008-11-17 at 10:59 -0600, Andreas Dilger wrote:
> On Nov 17, 2008  16:07 +0100, Reto Gantenbein wrote:
>> Why can't this bug be accessed? Even when I log in with a bugzilla
>> account I get the message "You are not authorized to access bug #15975".
>
> This bug was marked private by the customer that filed it.

I filed that bug. I never intended it to be private. It is now public (again).

/Jakob