Gang He
2018-Jan-24 06:24 UTC
[Ocfs2-devel] OCFS2 encountered a hang issue when running the Multi-node Refcount Block Group Test
Hello guys,

I hit an ocfs2 hang issue while running the multi-node refcount block group test on a three-node cluster. The hang cannot be reproduced every time, but it looks like there is a bug in the dlmglue layer. The symptoms look as follows.

Node1:
ocfs2te+  2403  2372  2403  0    1 Jan18 ?        00:00:00 /usr/lib64/mpi/gcc/openmpi/bin/orted --hnp-topo-sig 0N:4S:0L3:4L2:4L1:4C:4H:x86_64 -mca ess env -mca orte_ess_jobid 174456832 -mca orte_es
ocfs2te+  2404  2403  2404  0    2 Jan18 ?        00:00:06 /usr/local/ocfs2-test/bin/multi_reflink_test -i 1 -p 1000 -l 10485760 -n 100 -w /mnt/shared/ocfs2-discontig-bg-test-multi-nodes -s   <<== hung process
ocfs2te+  2404  2403  2405  0    2 Jan18 ?        00:00:00 /usr/local/ocfs2-test/bin/multi_reflink_test -i 1 -p 1000 -l 10485760 -n 100 -w /mnt/shared/ocfs2-discontig-bg-test-multi-nodes -s
sle15b4-nd1:/ # cat /proc/2404/stack
[<ffffffffc06878e5>] __ocfs2_cluster_lock.isra.37+0x2a5/0x6d0 [ocfs2]
[<ffffffffc0688918>] ocfs2_inode_lock_full_nested+0x1d8/0x930 [ocfs2]
[<ffffffffc068912d>] ocfs2_inode_lock_tracker+0xbd/0x180 [ocfs2]
[<ffffffffc068bfcc>] ocfs2_permission+0x3c/0xc0 [ocfs2]
[<ffffffff8e23def9>] __inode_permission+0x69/0xc0
[<ffffffff8e23ff9c>] link_path_walk+0x26c/0x4e0
[<ffffffff8e241c93>] path_openat+0xd3/0x1570
[<ffffffff8e24429c>] do_filp_open+0x8c/0xf0
[<ffffffff8e231586>] do_sys_open+0x1a6/0x230
[<ffffffff8e6acfb7>] entry_SYSCALL_64_fastpath+0x1a/0xa5
[<ffffffffffffffff>] 0xffffffffffffffff

Node2:
ocfs2te+ 21312 28669 21312  0    2 Jan18 pts/2    00:00:00 /usr/lib64/mpi/gcc/openmpi/bin/mpirun -mca plm_rsh_agent ssh:rsh -mca btl tcp,self --host sle15b4-nd2,sle15b4-nd1,sle15b4-nd3 /usr/local/o
ocfs2te+ 21312 28669 21313  0    2 Jan18 pts/2    00:00:00 /usr/lib64/mpi/gcc/openmpi/bin/mpirun -mca plm_rsh_agent ssh:rsh -mca btl tcp,self --host sle15b4-nd2,sle15b4-nd1,sle15b4-nd3 /usr/local/o
ocfs2te+ 21314 21312 21314  0    1 Jan18 pts/2    00:00:00 /usr/bin/ssh -x sle15b4-nd1 PATH=/usr/lib64/mpi/gcc/openmpi/bin:$PATH ; export PATH ; LD_LIBRARY_PATH=/usr/lib64/mpi/gcc/openmpi/lib64
ocfs2te+ 21315 21312 21315  0    1 Jan18 pts/2    00:00:00 /usr/bin/ssh -x sle15b4-nd3 PATH=/usr/lib64/mpi/gcc/openmpi/bin:$PATH ; export PATH ; LD_LIBRARY_PATH=/usr/lib64/mpi/gcc/openmpi/lib64
ocfs2te+ 21316 21312 21316  0    2 Jan18 pts/2    00:00:07 /usr/local/ocfs2-test/bin/multi_reflink_test -i 1 -p 1000 -l 10485760 -n 100 -w /mnt/shared/ocfs2-discontig-bg-test-multi-nodes -s   <<== hung process
ocfs2te+ 21316 21312 21317  0    2 Jan18 pts/2    00:00:00 /usr/local/ocfs2-test/bin/multi_reflink_test -i 1 -p 1000 -l 10485760 -n 100 -w /mnt/shared/ocfs2-discontig-bg-test-multi-nodes -s
sle15b4-nd2:/ # cat /proc/21316/stack
[<ffffffffc06bc8e5>] __ocfs2_cluster_lock.isra.37+0x2a5/0x6d0 [ocfs2]
[<ffffffffc06bd918>] ocfs2_inode_lock_full_nested+0x1d8/0x930 [ocfs2]
[<ffffffffc06d6cd0>] ocfs2_mknod+0x1e0/0xeb0 [ocfs2]
[<ffffffffc06d7af5>] ocfs2_create+0x35/0x120 [ocfs2]
[<ffffffffa6242e41>] path_openat+0x1281/0x1570
[<ffffffffa624429c>] do_filp_open+0x8c/0xf0
[<ffffffffa6231586>] do_sys_open+0x1a6/0x230
[<ffffffffa66acfb7>] entry_SYSCALL_64_fastpath+0x1a/0xa5
[<ffffffffffffffff>] 0x

Node3:
ocfs2te+ 13108 13079 13108  0    1 Jan18 ?        00:00:00 /usr/lib64/mpi/gcc/openmpi/bin/orted --hnp-topo-sig 0N:4S:0L3:4L2:4L1:4C:4H:x86_64 -mca ess env -mca orte_ess_jobid 174456832 -mca orte_es
ocfs2te+ 13109 13108 13109  0    2 Jan18 ?        00:00:06 /usr/local/ocfs2-test/bin/multi_reflink_test -i 1 -p 1000 -l 10485760 -n 100 -w /mnt/shared/ocfs2-discontig-bg-test-multi-nodes -s   <<== hung process
ocfs2te+ 13109 13108 13110  0    2 Jan18 ?        00:00:00 /usr/local/ocfs2-test/bin/multi_reflink_test -i 1 -p 1000 -l 10485760 -n 100 -w /mnt/shared/ocfs2-discontig-bg-test-multi-nodes -s
sle15b4-nd3:~ # cat /proc/13109/stack
[<ffffffffc08b48e5>] __ocfs2_cluster_lock.isra.37+0x2a5/0x6d0 [ocfs2]
[<ffffffffc08b5918>] ocfs2_inode_lock_full_nested+0x1d8/0x930 [ocfs2]
[<ffffffffc08d2907>] ocfs2_mv_orphaned_inode_to_new+0x77/0x7b0 [ocfs2]
[<ffffffffc08dd019>] ocfs2_reflink+0x309/0x430 [ocfs2]
[<ffffffffc08dd866>] ocfs2_reflink_ioctl+0x2a6/0x330 [ocfs2]
[<ffffffffc08c3537>] ocfs2_ioctl+0x237/0x640 [ocfs2]
[<ffffffff81247b2d>] do_vfs_ioctl+0x8d/0x5d0
[<ffffffff812480e4>] SyS_ioctl+0x74/0x80
[<ffffffff816acfb7>] entry_SYSCALL_64_fastpath+0x1a/0xa5
[<ffffffffffffffff>] 0xffffffffffffffff

These three processes have been hung for more than one day. The processes on Node1 and Node2 are hung while acquiring the inode meta dlm lock (converting from NL to PR or EX) of inode 20122639 (dentry "ocfs2-discontig-bg-test-multi-nodes"). Node3 holds this inode meta dlm lock at EX level, but the process on Node3 is trying to take the meta dlm lock again and is hung waiting for the lockres BLOCKED flag to clear. However, this lockres is not queued on Node3; the memory dump looks like this:

ip_inode_lockres = {
  l_priv = 0xffff9edb6af75458,
  l_ops = 0xffffffffc0952160 <ocfs2_inode_inode_lops>,
  l_blocked_list = {
    next = 0xffff9edb6af75160,   <<== not on the super block's blocked lock list
    prev = 0xffff9edb6af75160
  },
  l_mask_waiters = {
    next = 0xffffb6e140453bd0,
    prev = 0xffffb6e141bc3d10
  },
  l_holders = {
    next = 0xffff9edb6af75180,
    prev = 0xffff9edb6af75180
  },
  l_flags = 69,   <<== no OCFS2_LOCK_QUEUED flag
  l_name = "M0000000000000001330c0f00000000",
  l_ro_holders = 0,
  l_ex_holders = 0,
  l_level = 5 '\005',
  l_requested = 0 '\000',
  l_blocking = 5 '\005',
  l_type = 0 '\000',
  l_action = 0 '\000',
  l_unlock_action = 0 '\000',
  l_pending_gen = 4997,

In fact, the osb on Node3 has an empty blocked lock list, which means this lockres will never be handled again:

  dc_work_sequence = 174228,
  blocked_lock_list = {   <<== empty list
    next = 0xffff9edb991e4660,
    prev = 0xffff9edb991e4660
  },
  blocked_lock_count = 0,
  dquot_drop_list = {
    first = 0x0
  },
  dquot_drop_work = {

So these processes will hang forever on each node; it looks like there is a bug in queuing the lockres.
Have you met this bug before? I will investigate it further.

Thanks a lot.
Gang
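For reference, here is a minimal user-space sketch that decodes the dumped l_flags value and lock levels. The OCFS2_LOCK_* bit values and the DLM mode numbering (NL = 0 ... EX = 5) are assumptions taken from the mainline kernel headers of that era (fs/ocfs2/ocfs2.h), not from the dump itself; it only confirms the reading above: 69 = 0x45 = ATTACHED | BLOCKED | INITIALIZED with QUEUED clear, while l_level = l_blocking = 5 (EX).

/* Hypothetical decoder, not kernel code: interpret the Node3 dump values.
 * The flag bits below are assumed to match OCFS2_LOCK_* in fs/ocfs2/ocfs2.h
 * of mainline kernels of that time; verify against the exact source tree. */
#include <stdio.h>

#define OCFS2_LOCK_ATTACHED    0x00000001  /* lvb initialized */
#define OCFS2_LOCK_BUSY        0x00000002  /* currently in dlm_lock */
#define OCFS2_LOCK_BLOCKED     0x00000004  /* blocked, waiting to downconvert */
#define OCFS2_LOCK_INITIALIZED 0x00000040  /* lockres initialized */
#define OCFS2_LOCK_QUEUED      0x00000100  /* queued for the downconvert thread */

int main(void)
{
	unsigned int l_flags = 69;           /* value from the Node3 dump (0x45) */
	int l_level = 5, l_blocking = 5;     /* DLM mode 5 == EX (assumed numbering) */

	printf("ATTACHED=%d BUSY=%d BLOCKED=%d INITIALIZED=%d QUEUED=%d\n",
	       !!(l_flags & OCFS2_LOCK_ATTACHED),
	       !!(l_flags & OCFS2_LOCK_BUSY),
	       !!(l_flags & OCFS2_LOCK_BLOCKED),
	       !!(l_flags & OCFS2_LOCK_INITIALIZED),
	       !!(l_flags & OCFS2_LOCK_QUEUED));
	printf("l_level=%d l_blocking=%d (EX held, EX blocking)\n",
	       l_level, l_blocking);
	return 0;
}

If those assumed bit values hold, the lockres is marked BLOCKED but not QUEUED, which is consistent with a downconvert that will never be scheduled given the empty osb->blocked_lock_list shown above.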
Changwei Ge
2018-Jan-24 06:43 UTC
[Ocfs2-devel] OCFS2 encountered a hang issue when running the Multi-node Refcount Block Group Test
Hi Gang,

I have never met such a lockup issue so far with the o2cb stack applied, and I don't know how to debug this with fs/dlm. What I currently suspect is that this lock resource was once QUEUED but failed to be re-queued because BLOCKED had already been marked.

Thanks,
Changwei

On 2018/1/24 14:30, Gang He wrote:
> So these processes will hang forever on each node; it looks like there is a bug in queuing the lockres.
> Have you met this bug before? I will investigate it further.
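To make that suspicion concrete, here is a deliberately simplified, hypothetical sketch of the kind of race described above. It is not the actual fs/ocfs2/dlmglue.c logic; the names (struct lockres, schedule_blocked(), process_blocked(), FLAG_*) are invented for illustration only. If the scheduling side skips the enqueue because BLOCKED is already set, while the downconvert side has just taken the entry off the list and cleared QUEUED without re-queuing it, the lockres ends up in exactly the state seen in the Node3 dump: BLOCKED set, QUEUED clear, and off the blocked list.

/* Hypothetical illustration of a flag-gated re-queue race -- NOT dlmglue code. */
#include <stdbool.h>
#include <stdio.h>

#define FLAG_BLOCKED 0x4
#define FLAG_QUEUED  0x100

struct lockres {
	unsigned int flags;
	bool on_blocked_list;
};

/* Scheduling side: skip the enqueue if the lockres is already marked BLOCKED. */
static void schedule_blocked(struct lockres *l)
{
	if (l->flags & FLAG_BLOCKED)
		return;                       /* already marked -> re-queue skipped */
	l->flags |= FLAG_BLOCKED | FLAG_QUEUED;
	l->on_blocked_list = true;
}

/* Downconvert side: take the lockres off the list, clear QUEUED, and (in this
 * scenario) decide not to re-queue it. */
static void process_blocked(struct lockres *l)
{
	l->on_blocked_list = false;
	l->flags &= ~FLAG_QUEUED;
}

int main(void)
{
	struct lockres l = { 0, false };

	schedule_blocked(&l);   /* first BAST: BLOCKED|QUEUED set, put on the list */
	process_blocked(&l);    /* worker dequeues, clears QUEUED, no re-queue */
	schedule_blocked(&l);   /* second BAST: skipped, BLOCKED already set */

	/* End state: BLOCKED set, QUEUED clear, off the list -- nobody will
	 * ever process this lockres again, matching the Node3 dump. */
	printf("flags=0x%x on_list=%d\n", l.flags, l.on_blocked_list);
	return 0;
}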