01.10.2018 23:09, Danny Lee writes:
> Ran into this issue too with 4.1.5 with an arbiter setup. Also could
> not run a statedump due to "Segmentation fault".
>
> Tried with 3.12.13 and had issues with locked files as well. We were
> able to do a statedump and found that some of our files were "BLOCKED"
> (xlator.features.locks.vol-locks.inode). Attached part of statedump.
>
> Also tried clearing the locks using clear-locks, which did remove the
> lock, but as soon as I tried to cat the file, it got locked again and
> the cat process hung.

I created an issue in Bugzilla, though I can't find it now :-(
It looks like there has been no activity since I sent all the logs...

> On Wed, Aug 29, 2018, 3:13 AM Dmitry Melekhov <dm at belkam.com> wrote:
>
> 28.08.2018 10:43, Amar Tumballi writes:
>>
>> On Tue, Aug 28, 2018 at 11:24 AM, Dmitry Melekhov <dm at belkam.com> wrote:
>>
>> Hello!
>>
>> Yesterday we hit something like this on 4.1.2, CentOS 7.5.
>>
>> The volume is replicated - two bricks and one arbiter.
>>
>> We rebooted the arbiter, waited for the heal to end, and tried to
>> live-migrate a VM to another node (we run VMs on the gluster nodes):
>>
>> [2018-08-27 09:56:22.085411] I [MSGID: 115029] [server-handshake.c:763:server_setvolume] 0-pool-server: accepted client from CTX_ID:b55f4a90-e241-48ce-bd4d-268c8a956f4a-GRAPH_ID:0-PID:8887-HOST:son-PC_NAME:pool-client-6-RECON_NO:-0 (version: 4.1.2)
>> [2018-08-27 09:56:22.107609] I [MSGID: 115036] [server.c:483:server_rpc_notify] 0-pool-server: disconnecting connection from CTX_ID:b55f4a90-e241-48ce-bd4d-268c8a956f4a-GRAPH_ID:0-PID:8887-HOST:son-PC_NAME:pool-client-6-RECON_NO:-0
>> [2018-08-27 09:56:22.107747] I [MSGID: 101055] [client_t.c:444:gf_client_unref] 0-pool-server: Shutting down connection CTX_ID:b55f4a90-e241-48ce-bd4d-268c8a956f4a-GRAPH_ID:0-PID:8887-HOST:son-PC_NAME:pool-client-6-RECON_NO:-0
>> [2018-08-27 09:58:37.905829] I [MSGID: 115036] [server.c:483:server_rpc_notify] 0-pool-server: disconnecting connection from CTX_ID:c3eb6cfc-2ef9-470a-89d1-a87170d00da5-GRAPH_ID:0-PID:30292-HOST:father-PC_NAME:pool-client-6-RECON_NO:-0
>> [2018-08-27 09:58:37.905926] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=28c831d8bc550000}
>> [2018-08-27 09:58:37.905959] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=2870a7d6bc550000}
>> [2018-08-27 09:58:37.905979] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=2880a7d6bc550000}
>> [2018-08-27 09:58:37.905997] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=28f031d8bc550000}
>> [2018-08-27 09:58:37.906016] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=28b07dd5bc550000}
>> [2018-08-27 09:58:37.906034] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=28e0a7d6bc550000}
>> [2018-08-27 09:58:37.906056] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=28b845d8bc550000}
>> [2018-08-27 09:58:37.906079] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=2858a7d8bc550000}
>> [2018-08-27 09:58:37.906098] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=2868a8d7bc550000}
>> [2018-08-27 09:58:37.906121] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=28f80bd7bc550000}
>> ...
>> [2018-08-27 09:58:37.907375] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=28a8cdd6bc550000}
>> [2018-08-27 09:58:37.907393] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=2880cdd6bc550000}
>> [2018-08-27 09:58:37.907476] I [socket.c:3837:socket_submit_reply] 0-tcp.pool-server: not connected (priv->connected = -1)
>> [2018-08-27 09:58:37.907520] E [rpcsvc.c:1378:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0xcb88cb, Program: GlusterFS 4.x v1, ProgVers: 400, Proc: 30) to rpc-transport (tcp.pool-server)
>> [2018-08-27 09:58:37.910727] E [server.c:137:server_submit_reply] (-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084) [0x7ffb64379084] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605ba) [0x7ffb5fddf5ba] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce) [0x7ffb5fd89fce] ) 0-: Reply submission failed
>> [2018-08-27 09:58:37.910814] E [rpcsvc.c:1378:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0xcb88ce, Program: GlusterFS 4.x v1, ProgVers: 400, Proc: 30) to rpc-transport (tcp.pool-server)
>> [2018-08-27 09:58:37.910861] E [server.c:137:server_submit_reply] (-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084) [0x7ffb64379084] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605ba) [0x7ffb5fddf5ba] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce) [0x7ffb5fd89fce] ) 0-: Reply submission failed
>> [2018-08-27 09:58:37.910904] E [rpcsvc.c:1378:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0xcb88cf, Program: GlusterFS 4.x v1, ProgVers: 400, Proc: 30) to rpc-transport (tcp.pool-server)
>> [2018-08-27 09:58:37.910940] E [server.c:137:server_submit_reply] (-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084) [0x7ffb64379084] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605ba) [0x7ffb5fddf5ba] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce) [0x7ffb5fd89fce] ) 0-: Reply submission failed
>> [2018-08-27 09:58:37.910979] E [rpcsvc.c:1378:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0xcb88d1, Program: GlusterFS 4.x v1, ProgVers: 400, Proc: 30) to rpc-transport (tcp.pool-server)
>> [2018-08-27 09:58:37.911012] E [server.c:137:server_submit_reply] (-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084) [0x7ffb64379084] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605ba) [0x7ffb5fddf5ba] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce) [0x7ffb5fd89fce] ) 0-: Reply submission failed
>> [2018-08-27 09:58:37.911050] E [rpcsvc.c:1378:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0xcb88d8, Program: GlusterFS 4.x v1, ProgVers: 400, Proc: 30) to rpc-transport (tcp.pool-server)
>> [2018-08-27 09:58:37.911083] E [server.c:137:server_submit_reply] (-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084) [0x7ffb64379084] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605ba) [0x7ffb5fddf5ba] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce) [0x7ffb5fd89fce] ) 0-: Reply submission failed
>> [2018-08-27 09:58:37.916217] E [server.c:137:server_submit_reply] (-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084) [0x7ffb64379084] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605ba) [0x7ffb5fddf5ba] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce) [0x7ffb5fd89fce] ) 0-: Reply submission failed
>> [2018-08-27 09:58:37.916520] I [MSGID: 115013] [server-helpers.c:286:do_fd_cleanup] 0-pool-server: fd cleanup on /balamak.img
>>
>> After this, I/O on /balamak.img was blocked.
>>
>> The only solution we found was to reboot all 3 nodes.
>>
>> Is there any bug report in Bugzilla to which we can add logs?
>>
>> Not aware of such bugs!
>>
>> Is it possible to turn off these locks?
>>
>> Not sure, will get back on this one!
>
> btw, found this link:
> https://docs.gluster.org/en/v3/Troubleshooting/troubleshooting-filelocks/
>
> Tried it on another (test) cluster:
>
> [root at marduk ~]# gluster volume statedump pool
> Segmentation fault (core dumped)
>
> 4.1.2 too...
>
> Something is wrong here.
>
>> Thank you!
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>
>> --
>> Amar Tumballi (amarts)

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.gluster.org/pipermail/gluster-users/attachments/20181002/6eae77b9/attachment.html>
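For readers triaging a similar flood of `pl_inodelk_log_cleanup` warnings, the releases can be summarized per file and client with a short script. This is an editorial sketch, not part of gluster; the regex is derived only from the log excerpts above, so adjust it if your version's message format differs.

```python
import re
from collections import Counter

# Pattern for the pl_inodelk_log_cleanup warnings quoted above: it pulls
# out the gfid, client pid, and lk-owner of each lock being released.
PATTERN = re.compile(
    r"releasing lock on (?P<gfid>[0-9a-f-]+) held by "
    r"\{client=(?P<client>0x[0-9a-f]+), pid=(?P<pid>\d+) "
    r"lk-owner=(?P<owner>[0-9a-f]+)\}"
)

def summarize_lock_cleanup(log_text):
    """Count released locks per (gfid, pid) and collect distinct lk-owners.
    Many releases for a single gfid on one disconnect is the pattern
    seen in this thread."""
    counts = Counter()
    owners = set()
    for m in PATTERN.finditer(log_text):
        counts[(m.group("gfid"), m.group("pid"))] += 1
        owners.add(m.group("owner"))
    return counts, owners

# Two of the warning lines from the brick log above, as sample input.
sample = (
    "[2018-08-27 09:58:37.905926] W [inodelk.c:610:pl_inodelk_log_cleanup] "
    "0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 "
    "held by {client=0x7ffb58035bc0, pid=30292 lk-owner=28c831d8bc550000}\n"
    "[2018-08-27 09:58:37.905959] W [inodelk.c:610:pl_inodelk_log_cleanup] "
    "0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 "
    "held by {client=0x7ffb58035bc0, pid=30292 lk-owner=2870a7d6bc550000}\n"
)
counts, owners = summarize_lock_cleanup(sample)
print(counts)
```

Run against a full brick log, a high count for one gfid points at the file (here /balamak.img's gfid) whose locks were torn down on the disconnect.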
Recently, in one situation, we found that locks were not freed up because a
TCP timeout was never triggered. Can you try the option below and let us
know?

`gluster volume set $volname tcp-user-timeout 42`

(ref: https://review.gluster.org/21170/ )

Regards,
Amar

On Tue, Oct 2, 2018 at 10:40 AM Dmitry Melekhov <dm at belkam.com> wrote:
> [...]

--
Amar Tumballi (amarts)
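For context on Amar's suggestion: the gluster option maps to the kernel's TCP_USER_TIMEOUT socket option (RFC 5482), which bounds how long transmitted data may stay unacknowledged before the kernel drops the connection, so a dead peer's locks get cleaned up instead of lingering. A minimal sketch of the raw option, assuming Linux (the constant is Linux-specific, and the socket option takes milliseconds while the gluster option takes seconds):

```python
import socket

# TCP_USER_TIMEOUT is Linux-specific; fall back to its numeric value (18)
# for Python builds that don't expose the constant.
TCP_USER_TIMEOUT = getattr(socket, "TCP_USER_TIMEOUT", 18)

def apply_user_timeout(sock, seconds):
    """Ask the kernel to drop the connection if sent data stays
    unacknowledged for this long; returns the value read back."""
    sock.setsockopt(socket.IPPROTO_TCP, TCP_USER_TIMEOUT, int(seconds * 1000))
    return sock.getsockopt(socket.IPPROTO_TCP, TCP_USER_TIMEOUT)

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print(apply_user_timeout(s, 42))  # 42000 on Linux
s.close()
```

With the timeout unset, a peer that vanishes without a FIN/RST can keep the brick's connection (and its locks) alive until much longer default retransmission limits expire.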
Hi,

Sorry for the delay, I should have gotten to this earlier. We uncovered the
issue in our internal QE testing, and it is a regression. Details and a
patch are available in BZ 1637802. I'll back-port it to the release
branches once it gets merged. Shout-out to Pranith for helping with the
RCA!

Regards,
Ravi

On 10/02/2018 10:39 AM, Dmitry Melekhov wrote:
> [...]
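Once the statedump segfault is fixed, Danny's approach (dump, then look for blocked posix locks under `xlator.features.locks.<vol>-locks.inode`) can be scripted. The dump fragment below is hypothetical, modeled loosely on that section; only the `(BLOCKED)` marker is taken from Danny's report, so treat the field layout as an assumption that varies across gluster versions.

```python
def blocked_lock_lines(dump_text):
    """Return statedump lines describing BLOCKED lock requests.
    The '(BLOCKED)' marker comes from Danny's statedump excerpt; the
    surrounding field layout differs between gluster versions."""
    return [line.strip() for line in dump_text.splitlines()
            if "(BLOCKED)" in line]

# Hypothetical statedump fragment for illustration only.
sample = """\
[xlator.features.locks.pool-locks.inode]
path=/balamak.img
mandatory=0
inodelk-count=2
inodelk.inodelk[0](ACTIVE)=type=WRITE, pid=30292, owner=28c831d8bc550000
inodelk.inodelk[1](BLOCKED)=type=WRITE, pid=8887, owner=2870a7d6bc550000
"""
for line in blocked_lock_lines(sample):
    print(line)
```

The pid/owner in each BLOCKED entry identify the waiting client, which is the input `gluster volume clear-locks` needs; note, though, that in Danny's case clearing the lock only helped until the next access re-blocked.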