Junxiao Bi
2016-Oct-11 02:58 UTC
[Ocfs2-devel] [Question] deadlock on chmod when running discontigous block group multiple node testing
Hi Eric,

On 10/11/2016 10:42 AM, Eric Ren wrote:
> Hi Junxiao,
>
> As the subject, the testing hung there on a kernel without your patches:
>
> "ocfs2: revert using ocfs2_acl_chmod to avoid inode cluster lock hang"
> and
> "ocfs2: fix posix_acl_create deadlock"
>
> The stack trace is:
> ```
> ocfs2cts1:~ # pstree -pl 24133
> discontig_runne(24133)───activate_discon(21156)───mpirun(15146)───fillup_contig_b(15149)───sudo(15231)───chmod(15232)
>
> ocfs2cts1:~ # pgrep -a chmod
> 15232 /bin/chmod -R 777 /mnt/ocfs2
>
> ocfs2cts1:~ # cat /proc/15232/stack
> [<ffffffffa05377ef>] __ocfs2_cluster_lock.isra.39+0x1bf/0x620 [ocfs2]
> [<ffffffffa053856d>] ocfs2_inode_lock_full_nested+0x12d/0x840 [ocfs2]
> [<ffffffffa0538dbb>] ocfs2_inode_lock_atime+0xcb/0x170 [ocfs2]
> [<ffffffffa0531e61>] ocfs2_readdir+0x41/0x1b0 [ocfs2]
> [<ffffffff8120d03c>] iterate_dir+0x9c/0x110
> [<ffffffff8120d453>] SyS_getdents+0x83/0xf0
> [<ffffffff815e126e>] entry_SYSCALL_64_fastpath+0x12/0x6d
> [<ffffffffffffffff>] 0xffffffffffffffff
> ```
>
> Do you think this issue can be fixed by your patches?

Looks not. Those two patches are to fix recursive locking deadlock. But
from above call trace, there is no recursive lock.

Thanks,
Junxiao.

>
> I will try your patches later, but I am little worried the possibility
> of reproduction may not be 100%.
> So ask you to confirm;-)
>
> Eric
Eric Ren
2016-Oct-11 03:30 UTC
[Ocfs2-devel] [Question] deadlock on chmod when running discontigous block group multiple node testing
Hi Junxiao,

On 10/11/2016 10:58 AM, Junxiao Bi wrote:
>> Do you think this issue can be fixed by your patches?
> Looks not. Those two patches are to fix recursive locking deadlock. But
> from above call trace, there is no recursive lock.

OK, thanks a lot!

Eric

>
> Thanks,
> Junxiao.
>> I will try your patches later, but I am little worried the possibility
>> of reproduction may not be 100%.
>> So ask you to confirm;-)
>>
>> Eric
>
Eric Ren
2016-Oct-12 01:23 UTC
[Ocfs2-devel] [Question] deadlock on chmod when running discontigous block group multiple node testing
Hi Junxiao,

> Hi Eric,
>
> On 10/11/2016 10:42 AM, Eric Ren wrote:
>> Hi Junxiao,
>>
>> As the subject, the testing hung there on a kernel without your patches:
>>
>> "ocfs2: revert using ocfs2_acl_chmod to avoid inode cluster lock hang"
>> and
>> "ocfs2: fix posix_acl_create deadlock"
>>
>> The stack trace is:
>> ```
>> ocfs2cts1:~ # pstree -pl 24133
>> discontig_runne(24133)───activate_discon(21156)───mpirun(15146)───fillup_contig_b(15149)───sudo(15231)───chmod(15232)
>>
>> ocfs2cts1:~ # pgrep -a chmod
>> 15232 /bin/chmod -R 777 /mnt/ocfs2
>>
>> ocfs2cts1:~ # cat /proc/15232/stack
>> [<ffffffffa05377ef>] __ocfs2_cluster_lock.isra.39+0x1bf/0x620 [ocfs2]
>> [<ffffffffa053856d>] ocfs2_inode_lock_full_nested+0x12d/0x840 [ocfs2]
>> [<ffffffffa0538dbb>] ocfs2_inode_lock_atime+0xcb/0x170 [ocfs2]
>> [<ffffffffa0531e61>] ocfs2_readdir+0x41/0x1b0 [ocfs2]
>> [<ffffffff8120d03c>] iterate_dir+0x9c/0x110
>> [<ffffffff8120d453>] SyS_getdents+0x83/0xf0
>> [<ffffffff815e126e>] entry_SYSCALL_64_fastpath+0x12/0x6d
>> [<ffffffffffffffff>] 0xffffffffffffffff
>> ```
>>
>> Do you think this issue can be fixed by your patches?
> Looks not. Those two patches are to fix recursive locking deadlock. But
> from above call trace, there is no recursive lock.

Sorry, the call trace on another node was missing. Here it is:

ocfs2cts2:~ # pstree -lp
sshd(4292)───sshd(4745)───sshd(4753)───bash(4754)───orted(4781)───fillup_contig_b(4782)───sudo(4864)───chmod(4865)

ocfs2cts2:~ # cat /proc/4865/stack
[<ffffffffa053e7ef>] __ocfs2_cluster_lock.isra.39+0x1bf/0x620 [ocfs2]
[<ffffffffa053f56d>] ocfs2_inode_lock_full_nested+0x12d/0x840 [ocfs2]
[<ffffffffa059c860>] ocfs2_iop_get_acl+0x40/0xf0 [ocfs2]
[<ffffffff812044e6>] generic_permission+0x166/0x1c0
[<ffffffffa0542aca>] ocfs2_permission+0xaa/0xd0 [ocfs2]
[<ffffffff81204596>] __inode_permission+0x56/0xb0
[<ffffffff812068fa>] link_path_walk+0x29a/0x560
[<ffffffff81206cbf>] path_lookupat+0x7f/0x110
[<ffffffff8120929c>] filename_lookup+0x9c/0x150
[<ffffffff811f96c3>] SyS_fchmodat+0x33/0x90
[<ffffffff815e126e>] entry_SYSCALL_64_fastpath+0x12/0x6d
[<ffffffffffffffff>] 0xffffffffffffffff

Thanks,
Eric

>
> Thanks,
> Junxiao.
>> I will try your patches later, but I am little worried the possibility
>> of reproduction may not be 100%.
>> So ask you to confirm;-)
>>
>> Eric
>
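
For anyone comparing the two traces side by side, here is a minimal sketch of a helper that parses `/proc/<pid>/stack` text and reports which innermost frames the two nodes share. The names `FRAME_RE`, `parse_stack` and `shared_top_frames` are made up for this illustration and are not part of ocfs2 or any existing tool. The trace strings are copied from the messages above. The script only lines the frames up; it does not decide whether a lock is taken recursively.

```python
import re

# One frame of /proc/<pid>/stack output, for example:
#   [<ffffffffa05377ef>] __ocfs2_cluster_lock.isra.39+0x1bf/0x620 [ocfs2]
# The "+offset/size" part and the trailing "[module]" are optional.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\][ \t]+"
    r"(?P<symbol>[^\s+]+)"
    r"(?:\+0x[0-9a-f]+/0x[0-9a-f]+)?"
    r"(?:[ \t]+\[(?P<module>\w+)\])?"
)


def parse_stack(text):
    """Return (symbol, module) tuples, innermost frame first."""
    frames = []
    for match in FRAME_RE.finditer(text):
        symbol = match.group("symbol")
        if symbol.startswith("0x"):
            continue  # skip the 0xffffffffffffffff end marker
        frames.append((symbol, match.group("module")))
    return frames


def shared_top_frames(frames_a, frames_b):
    """Return the leading frames that both traces have in common."""
    shared = []
    for frame_a, frame_b in zip(frames_a, frames_b):
        if frame_a != frame_b:
            break
        shared.append(frame_a)
    return shared


# Trace from ocfs2cts1 (pid 15232), as posted in the thread.
NODE1_STACK = """
[<ffffffffa05377ef>] __ocfs2_cluster_lock.isra.39+0x1bf/0x620 [ocfs2]
[<ffffffffa053856d>] ocfs2_inode_lock_full_nested+0x12d/0x840 [ocfs2]
[<ffffffffa0538dbb>] ocfs2_inode_lock_atime+0xcb/0x170 [ocfs2]
[<ffffffffa0531e61>] ocfs2_readdir+0x41/0x1b0 [ocfs2]
[<ffffffff8120d03c>] iterate_dir+0x9c/0x110
[<ffffffff8120d453>] SyS_getdents+0x83/0xf0
[<ffffffff815e126e>] entry_SYSCALL_64_fastpath+0x12/0x6d
[<ffffffffffffffff>] 0xffffffffffffffff
"""

# Trace from ocfs2cts2 (pid 4865), as posted in the thread.
NODE2_STACK = """
[<ffffffffa053e7ef>] __ocfs2_cluster_lock.isra.39+0x1bf/0x620 [ocfs2]
[<ffffffffa053f56d>] ocfs2_inode_lock_full_nested+0x12d/0x840 [ocfs2]
[<ffffffffa059c860>] ocfs2_iop_get_acl+0x40/0xf0 [ocfs2]
[<ffffffff812044e6>] generic_permission+0x166/0x1c0
[<ffffffffa0542aca>] ocfs2_permission+0xaa/0xd0 [ocfs2]
[<ffffffff81204596>] __inode_permission+0x56/0xb0
[<ffffffff812068fa>] link_path_walk+0x29a/0x560
[<ffffffff81206cbf>] path_lookupat+0x7f/0x110
[<ffffffff8120929c>] filename_lookup+0x9c/0x150
[<ffffffff811f96c3>] SyS_fchmodat+0x33/0x90
[<ffffffff815e126e>] entry_SYSCALL_64_fastpath+0x12/0x6d
[<ffffffffffffffff>] 0xffffffffffffffff
"""

if __name__ == "__main__":
    node1 = parse_stack(NODE1_STACK)
    node2 = parse_stack(NODE2_STACK)
    print("shared innermost frames:")
    for symbol, module in shared_top_frames(node1, node2):
        print("  ", symbol, "[%s]" % module if module else "")
    print("ocfs2 frames on node 2, innermost first:")
    for symbol, module in node2:
        if module == "ocfs2":
            print("  ", symbol)
```

Because the parser uses `finditer` over the whole text, it also works on a trace that has been pasted as a single line or that carries `>` quote markers.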