Zhou, Cynthia (NSB - CN/Hangzhou)
2019-Jan-29 06:41 UTC
[Gluster-users] query about glusterd epoll thread getting stuck
Hi,

We are using glusterfs version 3.12 with 3 bricks. I find that occasionally, after rebooting all 3 SN nodes simultaneously, the glusterd process on one SN node may get stuck: when you try to execute a gluster command it does not respond.

Following is the log from the stuck glusterd and the gdb info of the stuck glusterd process:

[2019-01-28 14:38:24.999329] I [rpc-clnt.c:1048:rpc_clnt_connection_init] 0-snapd: setting frame-timeout to 600
[2019-01-28 14:38:24.999450] I [rpc-clnt.c:1048:rpc_clnt_connection_init] 0-snapd: setting frame-timeout to 600
[2019-01-28 14:38:24.999597] I [MSGID: 106493] [glusterd-rpc-ops.c:701:__glusterd_friend_update_cbk] 0-management: Received ACC from uuid: a4315121-e127-42ca-9869-0fe451216a80
[2019-01-28 14:38:24.999692] I [MSGID: 106493] [glusterd-rpc-ops.c:486:__glusterd_friend_add_cbk] 0-glusterd: Received ACC from uuid: 7f694d7e-7613-4298-8da1-50cbb73ed47e, host: mn-0.local, port: 0
[2019-01-28 14:38:25.010624] W [socket.c:593:__socket_rwv] 0-management: readv on 192.168.1.14:24007 failed (No data available)
[2019-01-28 14:38:25.010774] I [MSGID: 106004] [glusterd-handler.c:6317:__glusterd_peer_rpc_notify] 0-management: Peer <mn-0.local> (<7f694d7e-7613-4298-8da1-50cbb73ed47e>), in state <Peer in Cluster>, has disconnected from glusterd.
[2019-01-28 14:38:25.010860] W [glusterd-locks.c:846:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.12.3/xlator/mgmt/glusterd.so(+0x1bba4) [0x7f1828aa0ba4] -->/usr/lib64/glusterfs/3.12.3/xlator/mgmt/glusterd.so(+0x2ed5d) [0x7f1828ab3d5d] -->/usr/lib64/glusterfs/3.12.3/xlator/mgmt/glusterd.so(+0xf7635) [0x7f1828b7c635] ) 0-management: Lock for vol ccs not held
[2019-01-28 14:38:25.010903] W [MSGID: 106118] [glusterd-handler.c:6342:__glusterd_peer_rpc_notify] 0-management: Lock not released for ccs
[2019-01-28 14:38:25.010931] W [glusterd-locks.c:846:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.12.3/xlator/mgmt/glusterd.so(+0x1bba4) [0x7f1828aa0ba4] -->/usr/lib64/glusterfs/3.12.3/xlator/mgmt/glusterd.so(+0x2ed5d) [0x7f1828ab3d5d] -->/usr/lib64/glusterfs/3.12.3/xlator/mgmt/glusterd.so(+0xf7635) [0x7f1828b7c635] ) 0-management: Lock for vol encryptfile not held
[2019-01-28 14:38:25.010945] W [MSGID: 106118] [glusterd-handler.c:6342:__glusterd_peer_rpc_notify] 0-management: Lock not released for encryptfile
[2019-01-28 14:38:25.010971] W [glusterd-locks.c:846:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.12.3/xlator/mgmt/glusterd.so(+0x1bba4) [0x7f1828aa0ba4] -->/usr/lib64/glusterfs/3.12.3/xlator/mgmt/glusterd.so(+0x2ed5d) [0x7f1828ab3d5d] -->/usr/lib64/glusterfs/3.12.3/xlator/mgmt/glusterd.so(+0xf7635) [0x7f1828b7c635] ) 0-management: Lock for vol export not held
[2019-01-28 14:38:25.010983] W [MSGID: 106118] [glusterd-handler.c:6342:__glusterd_peer_rpc_notify] 0-management: Lock not released for export
[2019-01-28 14:38:25.011006] W [glusterd-locks.c:846:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.12.3/xlator/mgmt/glusterd.so(+0x1bba4) [0x7f1828aa0ba4] -->/usr/lib64/glusterfs/3.12.3/xlator/mgmt/glusterd.so(+0x2ed5d) [0x7f1828ab3d5d] -->/usr/lib64/glusterfs/3.12.3/xlator/mgmt/glusterd.so(+0xf7635) [0x7f1828b7c635] ) 0-management: Lock for vol log not held
[2019-01-28 14:38:25.011046] W [MSGID: 106118] [glusterd-handler.c:6342:__glusterd_peer_rpc_notify] 0-management: Lock not released for log
[2019-01-28 14:38:25.011070] W [glusterd-locks.c:846:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.12.3/xlator/mgmt/glusterd.so(+0x1bba4) [0x7f1828aa0ba4] -->/usr/lib64/glusterfs/3.12.3/xlator/mgmt/glusterd.so(+0x2ed5d) [0x7f1828ab3d5d] -->/usr/lib64/glusterfs/3.12.3/xlator/mgmt/glusterd.so(+0xf7635) [0x7f1828b7c635] ) 0-management: Lock for vol mstate not held
[2019-01-28 14:38:25.011082] W [MSGID: 106118] [glusterd-handler.c:6342:__glusterd_peer_rpc_notify] 0-management: Lock not released for mstate
[2019-01-28 14:38:25.011104] W [glusterd-locks.c:846:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.12.3/xlator/mgmt/glusterd.so(+0x1bba4) [0x7f1828aa0ba4] -->/usr/lib64/glusterfs/3.12.3/xlator/mgmt/glusterd.so(+0x2ed5d) [0x7f1828ab3d5d] -->/usr/lib64/glusterfs/3.12.3/xlator/mgmt/glusterd.so(+0xf7635) [0x7f1828b7c635] ) 0-management: Lock for vol services not held
[2019-01-28 14:38:25.011115] W [MSGID: 106118] [glusterd-handler.c:6342:__glusterd_peer_rpc_notify] 0-management: Lock not released for services
[2019-01-28 14:38:25.011268] E [rpc-clnt.c:350:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x1e9)[0x7f182df24a99] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1f9)[0x7f182dce6f27] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0x1f)[0x7f182dce701a] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x11e)[0x7f182dce756e] (--> /lib64/libgfrpc.so.0(+0x12f9e)[0x7f182dce7f9e] ))))) 0-management: forced unwinding frame type(Peer mgmt) op(--(4)) called at 2019-01-28 14:38:24.999851 (xid=0x7)
[2019-01-28 14:38:25.011286] E [MSGID: 106158] [glusterd-rpc-ops.c:684:__glusterd_friend_update_cbk] 0-management: RPC Error
[2019-01-28 14:38:25.011301] I [MSGID: 106493] [glusterd-rpc-ops.c:701:__glusterd_friend_update_cbk] 0-management: Received RJT from uuid: 00000000-0000-0000-0000-000000000000
[2019-01-28 14:38:25.011424] I [MSGID: 106006] [glusterd-svc-mgmt.c:328:glusterd_svc_common_rpc_notify] 0-management: glustershd has connected with glusterd.
[2019-01-28 14:38:25.011599] I [MSGID: 106143] [glusterd-pmap.c:295:pmap_registry_bind] 0-pmap: adding brick /mnt/bricks/ccs/brick on port 53952
[2019-01-28 14:38:25.011862] I [MSGID: 106492] [glusterd-handler.c:2718:__glusterd_handle_friend_update] 0-glusterd: Received friend update from uuid: 7f694d7e-7613-4298-8da1-50cbb73ed47e
[2019-01-28 14:38:25.021058] I [MSGID: 106502] [glusterd-handler.c:2763:__glusterd_handle_friend_update] 0-management: Received my uuid as Friend
[2019-01-28 14:38:25.021090] I [socket.c:3704:socket_submit_reply] 0-socket.management: not connected (priv->connected = -1)
[2019-01-28 14:38:25.021099] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0xa, Program: GlusterD svc peer, ProgVers: 2, Proc: 4) to rpc-transport (socket.management)
[2019-01-28 14:38:25.021108] E [MSGID: 106430] [glusterd-utils.c:568:glusterd_submit_reply] 0-glusterd: Reply submission failed
[2019-01-28 14:38:25.021126] E [rpcsvc.c:559:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to complete successfully
[2019-01-28 14:38:25.021135] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0xa, Program: GlusterD svc peer, ProgVers: 2, Proc: 4) to rpc-transport (socket.management)
[2019-01-28 14:38:25.021147] W [rpcsvc.c:565:rpcsvc_check_and_reply_error] 0-rpcsvc: failed to queue error reply

Gdb info of the stuck threads of glusterd is:

(gdb) thread 8
[Switching to thread 8 (Thread 0x7f1826ec2700 (LWP 2418))]
#0 0x00007f182cce787c in __lll_lock_wait () from /lib64/libpthread.so.0
(gdb) bt
#0 0x00007f182cce787c in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00007f182ccea677 in __lll_lock_elision () from /lib64/libpthread.so.0
#2 0x00007f182df5cae6 in iobref_unref () from /lib64/libglusterfs.so.0
#3 0x00007f182dce2f29 in rpc_transport_pollin_destroy () from /lib64/libgfrpc.so.0
#4 0x00007f1827ccf319 in socket_event_poll_in () from /usr/lib64/glusterfs/3.12.3/rpc-transport/socket.so
#5 0x00007f1827ccf932 in socket_event_handler () from /usr/lib64/glusterfs/3.12.3/rpc-transport/socket.so
#6 0x00007f182df925d4 in event_dispatch_epoll_handler () from /lib64/libglusterfs.so.0
#7 0x00007f182df928ab in event_dispatch_epoll_worker () from /lib64/libglusterfs.so.0
#8 0x00007f182ccde5da in start_thread () from /lib64/libpthread.so.0
#9 0x00007f182c5b4e8f in clone () from /lib64/libc.so.6
(gdb) thread 9
[Switching to thread 9 (Thread 0x7f18266c1700 (LWP 2419))]
#0 0x00007f182cce787c in __lll_lock_wait () from /lib64/libpthread.so.0
(gdb) bt
#0 0x00007f182cce787c in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00007f182cce2b42 in __pthread_mutex_cond_lock () from /lib64/libpthread.so.0
#2 0x00007f182cce44c8 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#3 0x00007f1827ccadab in socket_event_poll_err () from /usr/lib64/glusterfs/3.12.3/rpc-transport/socket.so
#4 0x00007f1827ccf99c in socket_event_handler () from /usr/lib64/glusterfs/3.12.3/rpc-transport/socket.so
#5 0x00007f182df925d4 in event_dispatch_epoll_handler () from /lib64/libglusterfs.so.0
#6 0x00007f182df928ab in event_dispatch_epoll_worker () from /lib64/libglusterfs.so.0
#7 0x00007f182ccde5da in start_thread () from /lib64/libpthread.so.0
#8 0x00007f182c5b4e8f in clone () from /lib64/libc.so.6

Have you ever encountered this issue? From the gdb info it seems the epoll thread hit a deadlock.
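[Editor's note] To make frame #2 of thread 8 concrete, here is a minimal standalone C sketch (hypothetical names; not the actual glusterfs iobuf.c code) of a refcounted buffer list protected by its own mutex: the unref path has to take the object's lock first, so it sits in __lll_lock_wait for as long as any other thread holds that lock.

/* build: cc -pthread buf_ref_sketch.c */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

struct buf_ref {
    pthread_mutex_t lock;   /* protects ref */
    int             ref;
};

static struct buf_ref *
buf_ref_new(void)
{
    struct buf_ref *b = calloc(1, sizeof(*b));
    pthread_mutex_init(&b->lock, NULL);
    b->ref = 1;
    return b;
}

static void
buf_ref_unref(struct buf_ref *b)
{
    int ref;

    pthread_mutex_lock(&b->lock);   /* blocks here if another thread holds b->lock */
    ref = --b->ref;
    pthread_mutex_unlock(&b->lock);

    if (ref == 0) {
        pthread_mutex_destroy(&b->lock);
        free(b);
    }
}

int
main(void)
{
    struct buf_ref *b = buf_ref_new();
    buf_ref_unref(b);               /* last unref frees the object */
    printf("unref done\n");
    return 0;
}

As a follow-up to "who holds the lock": with glibc's NPTL, if the mutex address can be recovered in gdb (which generally needs the glusterfs debuginfo packages installed), printing the pthread_mutex_t's __data.__owner field shows the LWP of the thread currently holding a normally acquired (non-elided) mutex.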
Raghavendra Gowdappa
2019-Jan-29 11:32 UTC
[Gluster-users] query about glusterd epoll thread getting stuck
On Tue, Jan 29, 2019 at 12:11 PM Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com> wrote:

> [full glusterd log and gdb backtraces snipped; see the original message above]
>
> Have you ever encountered this issue? From the gdb info it seems the epoll thread hit a deadlock.
It's not a deadlock. Pollerr handling is serialized with any ongoing pollin/pollout request processing. The problem here is that iobref_unref is waiting on a lock, and hence pollin didn't complete. The question is why iobref_unref is stuck: who holds the lock on the iobref?
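[Editor's note] A minimal sketch of the serialization idea described above, using hypothetical names (sock_priv, handle_pollin, handle_pollerr) rather than the real rpc-transport/socket.c internals: the error path waits on a condition variable until any in-flight pollin/pollout handling completes. That is consistent with thread 9 parking in pthread_cond_wait inside socket_event_poll_err while thread 8's pollin never finishes because iobref_unref cannot acquire its lock.

/* build: cc -pthread pollerr_serialization_sketch.c */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

struct sock_priv {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            in_progress;   /* a pollin/pollout is currently being handled */
};

/* Runs in an epoll worker when the socket is readable. */
static void
handle_pollin(struct sock_priv *priv)
{
    pthread_mutex_lock(&priv->lock);
    priv->in_progress = true;
    pthread_mutex_unlock(&priv->lock);

    /* ... decode the message, hand it to the rpc layer, release buffers ...
     * If this step blocks forever (e.g. waiting for a buffer's lock, as
     * thread 8 does), in_progress is never cleared. */

    pthread_mutex_lock(&priv->lock);
    priv->in_progress = false;
    pthread_cond_broadcast(&priv->cond);
    pthread_mutex_unlock(&priv->lock);
}

/* Runs (possibly in another epoll worker) on POLLERR/disconnect. */
static void
handle_pollerr(struct sock_priv *priv)
{
    pthread_mutex_lock(&priv->lock);
    while (priv->in_progress)
        pthread_cond_wait(&priv->cond, &priv->lock);   /* where thread 9 would wait */
    pthread_mutex_unlock(&priv->lock);

    /* only now is it safe to tear the transport down */
}

int
main(void)
{
    struct sock_priv priv = {
        .lock = PTHREAD_MUTEX_INITIALIZER,
        .cond = PTHREAD_COND_INITIALIZER,
        .in_progress = false,
    };

    handle_pollin(&priv);   /* completes normally, so ... */
    handle_pollerr(&priv);  /* ... this returns immediately */
    printf("pollerr handled after pollin completed\n");
    return 0;
}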