michael.a.jaquays at verizon.com
2010-Feb-18 16:33 UTC
[Ocfs2-users] Kernel Panic ocfs2_inode_lock_full
All,

I have a 3 node cluster that is experiencing kernel panics once every few days. We are sharing some of the ocfs2 filesystems via nfs to some web app servers. The app servers mount the filesystems with the nordirplus option. Are there any known pitfalls with using an nfs4 server and ocfs2? I haven't seen a case where all three nodes are down at the same time, but the issue seems to travel from node to node. Here are the node details:

OS: RHEL5.4
Kernel: 2.6.18-164.11.1.el5 #1 SMP Wed Jan 6 13:26:04 EST 2010 x86_64 x86_64 x86_64 GNU/Linux

OCFS2 Packages:
ocfs2console-1.4.3-1.el5
ocfs2-tools-1.4.3-1.el5
ocfs2-2.6.18-164.11.1.el5-1.4.4-1.el5

The following is always logged in /var/log/messages right before the node panics:

kernel: (11915,0):ocfs2_inode_lock_update:1970 ERROR: bug expression: inode->i_generation != le32_to_cpu(fe->i_generation)
kernel: (11915,0):ocfs2_inode_lock_update:1970 ERROR: Invalid dinode 446146 disk generation: 1276645928 inode->i_generation: 1276645926
kernel: ----------- [cut here ] --------- [please bite here ] ---------

The following is part of the kernel panic:

Call Trace:
 [<ffffffff885a2940>] :ocfs2:ocfs2_delete_inode+0x187/0x73f
 [<ffffffff885a27b9>] :ocfs2:ocfs2_delete_inode+0x0/0x73f
 [<ffffffff8002f463>] generic_delete_inode+0xc6/0x143
 [<ffffffff885a22e3>] :ocfs2:ocfs2_drop_inode+0xca/0x12b
 [<ffffffff885a693f>] :ocfs2:ocfs2_complete_recovery+0x77e/0x910
 [<ffffffff885a61c1>] :ocfs2:ocfs2_complete_recovery+0x0/0x910
 [<ffffffff8004d8ed>] run_workqueue+0x94/0xe4
 [<ffffffff8004a12f>] worker_thread+0x0/0x122
 [<ffffffff8009fe9f>] keventd_create_kthread+0x0/0xc4
 [<ffffffff8004a21f>] worker_thread+0xf0/0x122
 [<ffffffff8008c86c>] default_wake_function+0x0/0xe
 [<ffffffff8009fe9f>] keventd_create_kthread+0x0/0xc4
 [<ffffffff80032950>] kthread+0xfe/0x132
 [<ffffffff8005dfb1>] child_rip+0xa/0x11
 [<ffffffff8009fe9f>] keventd_create_kthread+0x0/0xc4
 [<ffffffff80032852>] kthread+0x0/0x132
 [<ffffffff8005dfa6>] child_rip+0x0/0x11

Code: 0f 0b 68 1b 3d 5c 88 c2 b2 07 48 83 7b 48 00 75 0a f6 43 2c
RIP [<ffffffff885928f5>] :ocfs2:ocfs2_inode_lock_full+0x99e/0xe3c
RSP <ffff810c0af0fc70>
<0>Kernel panic - not syncing: Fatal exception

Any help anyone could provide would be appreciated.

-Mike
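For reference, the "bug expression" in the log above corresponds roughly to the check sketched below. This is a simplified standalone rendering, not the actual RHEL ocfs2 source: it only replays the generation comparison from ocfs2_inode_lock_update() with the values reported in the log, to show that the in-memory inode and the on-disk dinode disagree about which generation of the inode slot they refer to.

    /*
     * Minimal sketch (not the OCFS2 kernel code) of the generation check
     * that fires in the log above.  The in-memory inode remembers the
     * generation number it was created with; the dinode on disk carries
     * the generation currently stored in that inode slot.  If an NFS
     * file handle points at an inode number that has since been freed
     * and reused, the two differ, and the 1.4 code BUGs (panics the
     * node) instead of returning a stale-handle error.
     */
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    struct cached_inode {            /* stand-in for struct inode */
            uint64_t i_ino;
            uint32_t i_generation;   /* generation when the in-memory inode was set up */
    };

    struct disk_dinode {             /* stand-in for struct ocfs2_dinode */
            uint32_t i_generation;   /* generation currently on disk for this slot */
    };

    /* Rough equivalent of the "bug expression" in the panic message. */
    static void inode_lock_update(const struct cached_inode *inode,
                                  const struct disk_dinode *fe)
    {
            if (inode->i_generation != fe->i_generation) {
                    /* In the kernel this path is a BUG(), which panics the node. */
                    fprintf(stderr,
                            "ERROR: Invalid dinode %llu disk generation: %u "
                            "inode->i_generation: %u\n",
                            (unsigned long long)inode->i_ino,
                            (unsigned)fe->i_generation,
                            (unsigned)inode->i_generation);
                    abort();
            }
    }

    int main(void)
    {
            /* Values taken from the log in this thread. */
            struct cached_inode inode = { .i_ino = 446146, .i_generation = 1276645926 };
            struct disk_dinode  fe    = { .i_generation = 1276645928 };

            inode_lock_update(&inode, &fe);   /* mismatch -> simulated panic */
            return 0;
    }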
Yes, this is a known issue. It only occurs when NFS is in the equation. It was fixed in mainline quite some time ago, and we are in the process of backporting that fix to 1.4.

michael.a.jaquays at verizon.com wrote:
> All,
>
> I have a 3 node cluster that is experiencing kernel panics once every few days. We are sharing some of the ocfs2 filesystems via nfs to some web app servers. The app servers mount the filesystems with the nordirplus option. Are there any known pitfalls with using an nfs4 server and ocfs2? I haven't seen a case where all three nodes are down at the same time, but the issue seems to travel from node to node.
> [...]