Amar Tumballi
2018-Sep-14 02:16 UTC
[Gluster-users] Gluster clients intermittently hang until first gluster server in a Replica 1 Arbiter 1 cluster is rebooted, server error: 0-management: Unlocking failed & client error: bailing out frame type(GlusterFS 4.x v1) op(INODELK(29)) xid = 0x2131 sent = <datestamp>. timeout = 1800
On Mon, Sep 3, 2018 at 3:41 PM, Sam McLeod <mailinglists at smcleod.net> wrote:

> I apologise for this being posted twice - I'm not sure if that was user
> error or a bug in the mailing list, but the list wasn't showing my post
> after quite some time so I sent a second email which near immediately
> showed up - that's mailing lists I guess...
>
> Anyway, if anyone has any input, advice or abuse, I welcome it!

We got a little late getting back on this, but after running tests
internally we found that a possibly missing volume option is the reason for
this. Try

    gluster volume set <volname> server.tcp-user-timeout 42

on your volume. Let us know if this helps.
(Ref: https://review.gluster.org/#/c/glusterfs/+/21170/)

> --
> Sam McLeod
> https://smcleod.net
> https://twitter.com/s_mcleod
>
> On 3 Sep 2018, at 1:20 pm, Sam McLeod <mailinglists at smcleod.net> wrote:
>
> We've got an odd problem where clients are blocked from writing to Gluster
> volumes until the first node of the Gluster cluster is rebooted.
>
> I suspect I've either configured something incorrectly with the arbiter /
> replica configuration of the volumes, or there is some sort of bug in the
> gluster client-server connection that we're triggering.
>
> I was wondering if anyone has seen this or could point me in the right
> direction?
>
>
> *Environment:*
>
> - Topology: 3 node cluster, replica 2, arbiter 1 (third node is metadata only).
> - Version: client and servers both running 4.1.3, both on CentOS 7,
>   kernel 4.18.x, (Xen) VMs with relatively fast networked SSD storage
>   backing them, XFS.
> - Client: native Gluster FUSE client mounting via the kubernetes provider.
>
>
> *Problem:*
>
> - Seemingly randomly, some clients will be blocked / are unable to write
>   to what should be a highly available gluster volume.
> - The client gluster logs show it failing to do new file operations
>   across various volumes and all three nodes of the gluster.
> - The server gluster (or OS) logs do not show any warnings or errors.
> - The client recovers and is able to write to volumes again after the
>   first node of the gluster cluster is rebooted.
> - Until the first node of the gluster cluster is rebooted, the client
>   fails to write to the volume that is (or should be) available on the
>   second node (a replica) and the third node (an arbiter-only node).
>
>
> *What 'fixes' the issue:*
>
> - Although the clients (kubernetes hosts) connect to all 3 nodes of the
>   Gluster cluster, restarting the first gluster node always unblocks the
>   IO and allows the client to continue writing.
> - Stopping and starting the glusterd service on the gluster server is not
>   enough to fix the issue, nor is restarting its networking.
> - This suggests to me that the volume is unavailable for writing for some
>   reason, and that restarting the first node in the cluster clears some
>   sort of TCP session either between client and server or between the
>   servers' replication.
>
>
> *Expected behaviour:*
>
> - If the first gluster node / server had failed or was blocked from
>   performing operations for some reason (which it doesn't seem to be),
>   I'd expect the clients to access data from the second gluster node and
>   write metadata to the third gluster node as well, as it's an arbiter /
>   metadata-only node.
> - If for some reason a gluster node was not able to serve connections to
>   clients, I'd expect to see errors in the volume, glusterd or brick log
>   files (there are none on the first gluster node).
> - If the first gluster node was for some reason blocking IO on a volume,
>   I'd expect that node either to show as unhealthy or unavailable in the
>   gluster peer status or gluster volume status.
>
>
> *Client gluster errors:*
>
> - staging_static in this example is a volume name.
> - You can see the client trying to connect to the second and third nodes
>   of the gluster cluster and failing (unsure as to why?).
> - The server-side logs on the first gluster node do not show any errors
>   or problems, but the second / third nodes show errors in glusterd.log
>   when trying to 'unlock' the 0-management volume on the first node.
>
>
> *On a gluster client* (a kubernetes host using the kubernetes connector,
> which uses the native fuse client) when it's blocked from writing but the
> gluster appears healthy (other than the errors mentioned later):
>
> [2018-09-02 15:33:22.750874] E [rpc-clnt.c:184:call_bail] 0-staging_static-client-2: bailing out frame type(GlusterFS 4.x v1) op(INODELK(29)) xid = 0x1cce sent = 2018-09-02 15:03:22.417773. timeout = 1800 for <ip of third gluster node>:49154
> [2018-09-02 15:33:22.750989] E [MSGID: 114031] [client-rpc-fops_v2.c:1306:client4_0_inodelk_cbk] 0-staging_static-client-2: remote operation failed [Transport endpoint is not connected]
> [2018-09-02 16:03:23.097905] E [rpc-clnt.c:184:call_bail] 0-staging_static-client-1: bailing out frame type(GlusterFS 4.x v1) op(INODELK(29)) xid = 0x2e21 sent = 2018-09-02 15:33:22.765751. timeout = 1800 for <ip of second gluster node>:49154
> [2018-09-02 16:03:23.097988] E [MSGID: 114031] [client-rpc-fops_v2.c:1306:client4_0_inodelk_cbk] 0-staging_static-client-1: remote operation failed [Transport endpoint is not connected]
> [2018-09-02 16:33:23.439172] E [rpc-clnt.c:184:call_bail] 0-staging_static-client-2: bailing out frame type(GlusterFS 4.x v1) op(INODELK(29)) xid = 0x1d4b sent = 2018-09-02 16:03:23.098133. timeout = 1800 for <ip of third gluster node>:49154
> [2018-09-02 16:33:23.439282] E [MSGID: 114031] [client-rpc-fops_v2.c:1306:client4_0_inodelk_cbk] 0-staging_static-client-2: remote operation failed [Transport endpoint is not connected]
> [2018-09-02 17:03:23.786858] E [rpc-clnt.c:184:call_bail] 0-staging_static-client-1: bailing out frame type(GlusterFS 4.x v1) op(INODELK(29)) xid = 0x2ee7 sent = 2018-09-02 16:33:23.455171. timeout = 1800 for <ip of second gluster node>:49154
> [2018-09-02 17:03:23.786971] E [MSGID: 114031] [client-rpc-fops_v2.c:1306:client4_0_inodelk_cbk] 0-staging_static-client-1: remote operation failed [Transport endpoint is not connected]
> [2018-09-02 17:33:24.160607] E [rpc-clnt.c:184:call_bail] 0-staging_static-client-2: bailing out frame type(GlusterFS 4.x v1) op(INODELK(29)) xid = 0x1dc8 sent = 2018-09-02 17:03:23.787120. timeout = 1800 for <ip of third gluster node>:49154
> [2018-09-02 17:33:24.160720] E [MSGID: 114031] [client-rpc-fops_v2.c:1306:client4_0_inodelk_cbk] 0-staging_static-client-2: remote operation failed [Transport endpoint is not connected]
> [2018-09-02 18:03:24.505092] E [rpc-clnt.c:184:call_bail] 0-staging_static-client-1: bailing out frame type(GlusterFS 4.x v1) op(INODELK(29)) xid = 0x2faf sent = 2018-09-02 17:33:24.173153.
> timeout = 1800 for <ip of second gluster node>:49154
> [2018-09-02 18:03:24.505185] E [MSGID: 114031] [client-rpc-fops_v2.c:1306:client4_0_inodelk_cbk] 0-staging_static-client-1: remote operation failed [Transport endpoint is not connected]
> [2018-09-02 18:33:24.841248] E [rpc-clnt.c:184:call_bail] 0-staging_static-client-2: bailing out frame type(GlusterFS 4.x v1) op(INODELK(29)) xid = 0x1e45 sent = 2018-09-02 18:03:24.505328. timeout = 1800 for <ip of third gluster node>:49154
> [2018-09-02 18:33:24.841311] E [MSGID: 114031] [client-rpc-fops_v2.c:1306:client4_0_inodelk_cbk] 0-staging_static-client-2: remote operation failed [Transport endpoint is not connected]
> [2018-09-02 19:03:25.204711] E [rpc-clnt.c:184:call_bail] 0-staging_static-client-1: bailing out frame type(GlusterFS 4.x v1) op(INODELK(29)) xid = 0x3074 sent = 2018-09-02 18:33:24.855372. timeout = 1800 for <ip of second gluster node>:49154
> [2018-09-02 19:03:25.204784] E [MSGID: 114031] [client-rpc-fops_v2.c:1306:client4_0_inodelk_cbk] 0-staging_static-client-1: remote operation failed [Transport endpoint is not connected]
> [2018-09-02 19:33:25.533545] E [rpc-clnt.c:184:call_bail] 0-staging_static-client-2: bailing out frame type(GlusterFS 4.x v1) op(INODELK(29)) xid = 0x1ec2 sent = 2018-09-02 19:03:25.204977. timeout = 1800 for <ip of third gluster node>:49154
> [2018-09-02 19:33:25.533611] E [MSGID: 114031] [client-rpc-fops_v2.c:1306:client4_0_inodelk_cbk] 0-staging_static-client-2: remote operation failed [Transport endpoint is not connected]
> [2018-09-02 20:03:25.877020] E [rpc-clnt.c:184:call_bail] 0-staging_static-client-1: bailing out frame type(GlusterFS 4.x v1) op(INODELK(29)) xid = 0x3138 sent = 2018-09-02 19:33:25.545921. timeout = 1800 for <ip of second gluster node>:49154
> [2018-09-02 20:03:25.877098] E [MSGID: 114031] [client-rpc-fops_v2.c:1306:client4_0_inodelk_cbk] 0-staging_static-client-1: remote operation failed [Transport endpoint is not connected]
> [2018-09-02 20:33:26.217858] E [rpc-clnt.c:184:call_bail] 0-staging_static-client-2: bailing out frame type(GlusterFS 4.x v1) op(INODELK(29)) xid = 0x1f3e sent = 2018-09-02 20:03:25.877264. timeout = 1800 for <ip of third gluster node>:49154
> [2018-09-02 20:33:26.217973] E [MSGID: 114031] [client-rpc-fops_v2.c:1306:client4_0_inodelk_cbk] 0-staging_static-client-2: remote operation failed [Transport endpoint is not connected]
> [2018-09-02 21:03:26.588237] E [rpc-clnt.c:184:call_bail] 0-staging_static-client-1: bailing out frame type(GlusterFS 4.x v1) op(INODELK(29)) xid = 0x31ff sent = 2018-09-02 20:33:26.233010. timeout = 1800 for <ip of second gluster node>:49154
> [2018-09-02 21:03:26.588316] E [MSGID: 114031] [client-rpc-fops_v2.c:1306:client4_0_inodelk_cbk] 0-staging_static-client-1: remote operation failed [Transport endpoint is not connected]
> [2018-09-02 21:33:26.912334] E [rpc-clnt.c:184:call_bail] 0-staging_static-client-2: bailing out frame type(GlusterFS 4.x v1) op(INODELK(29)) xid = 0x1fbb sent = 2018-09-02 21:03:26.588456. timeout = 1800 for <ip of third gluster node>:49154
> [2018-09-02 21:33:26.912449] E [MSGID: 114031] [client-rpc-fops_v2.c:1306:client4_0_inodelk_cbk] 0-staging_static-client-2: remote operation failed [Transport endpoint is not connected]
> [2018-09-02 22:03:37.258915] E [rpc-clnt.c:184:call_bail] 0-staging_static-client-1: bailing out frame type(GlusterFS 4.x v1) op(INODELK(29)) xid = 0x32c5 sent = 2018-09-02 21:33:32.091009.
> timeout = 1800 for <ip of second gluster node>:49154
> [2018-09-02 22:03:37.259000] E [MSGID: 114031] [client-rpc-fops_v2.c:1306:client4_0_inodelk_cbk] 0-staging_static-client-1: remote operation failed [Transport endpoint is not connected]
> [2018-09-02 22:33:37.615497] E [rpc-clnt.c:184:call_bail] 0-staging_static-client-2: bailing out frame type(GlusterFS 4.x v1) op(INODELK(29)) xid = 0x2039 sent = 2018-09-02 22:03:37.259147. timeout = 1800 for <ip of third gluster node>:49154
> [2018-09-02 22:33:37.615574] E [MSGID: 114031] [client-rpc-fops_v2.c:1306:client4_0_inodelk_cbk] 0-staging_static-client-2: remote operation failed [Transport endpoint is not connected]
> [2018-09-02 23:03:37.940969] E [rpc-clnt.c:184:call_bail] 0-staging_static-client-1: bailing out frame type(GlusterFS 4.x v1) op(INODELK(29)) xid = 0x3386 sent = 2018-09-02 22:33:37.629655. timeout = 1800 for <ip of second gluster node>:49154
> [2018-09-02 23:03:37.941049] E [MSGID: 114031] [client-rpc-fops_v2.c:1306:client4_0_inodelk_cbk] 0-staging_static-client-1: remote operation failed [Transport endpoint is not connected]
> [2018-09-02 23:33:38.270998] E [rpc-clnt.c:184:call_bail] 0-staging_static-client-2: bailing out frame type(GlusterFS 4.x v1) op(INODELK(29)) xid = 0x20b5 sent = 2018-09-02 23:03:37.941199. timeout = 1800 for <ip of third gluster node>:49154
> [2018-09-02 23:33:38.271078] E [MSGID: 114031] [client-rpc-fops_v2.c:1306:client4_0_inodelk_cbk] 0-staging_static-client-2: remote operation failed [Transport endpoint is not connected]
> [2018-09-03 00:03:38.607186] E [rpc-clnt.c:184:call_bail] 0-staging_static-client-1: bailing out frame type(GlusterFS 4.x v1) op(INODELK(29)) xid = 0x3447 sent = 2018-09-02 23:33:38.285899. timeout = 1800 for <ip of second gluster node>:49154
> [2018-09-03 00:03:38.607263] E [MSGID: 114031] [client-rpc-fops_v2.c:1306:client4_0_inodelk_cbk] 0-staging_static-client-1: remote operation failed [Transport endpoint is not connected]
> [2018-09-03 00:33:38.934385] E [rpc-clnt.c:184:call_bail] 0-staging_static-client-2: bailing out frame type(GlusterFS 4.x v1) op(INODELK(29)) xid = 0x2131 sent = 2018-09-03 00:03:38.607410. timeout = 1800 for <ip of third gluster node>:49154
> [2018-09-03 00:33:38.934479] E [MSGID: 114031] [client-rpc-fops_v2.c:1306:client4_0_inodelk_cbk] 0-staging_static-client-2: remote operation failed [Transport endpoint is not connected]
> [2018-09-03 01:03:39.256842] E [rpc-clnt.c:184:call_bail] 0-staging_static-client-1: bailing out frame type(GlusterFS 4.x v1) op(INODELK(29)) xid = 0x350c sent = 2018-09-03 00:33:38.948570. timeout = 1800 for <ip of second gluster node>:49154
> [2018-09-03 01:03:39.256972] E [MSGID: 114031] [client-rpc-fops_v2.c:1306:client4_0_inodelk_cbk] 0-staging_static-client-1: remote operation failed [Transport endpoint is not connected]
> [2018-09-03 01:33:39.614402] E [rpc-clnt.c:184:call_bail] 0-staging_static-client-2: bailing out frame type(GlusterFS 4.x v1) op(INODELK(29)) xid = 0x21ae sent = 2018-09-03 01:03:39.258166.
> timeout = 1800 for <ip of third gluster node>:49154
> [2018-09-03 01:33:39.614483] E [MSGID: 114031] [client-rpc-fops_v2.c:1306:client4_0_inodelk_cbk] 0-staging_static-client-2: remote operation failed [Transport endpoint is not connected]
>
>
> *On the second gluster server:*
>
> We are seeing the following error in the glusterd.log file when the
> client is blocked from writing to the volume. I think this is probably
> the most important information about the error; it suggests a problem
> with the first node but doesn't explain the client behaviour:
>
> [2018-09-02 08:31:03.902272] E [MSGID: 106115] [glusterd-mgmt.c:124:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on <FQDN of the first gluster node>. Please check log file for details.
> [2018-09-02 08:31:03.902477] E [MSGID: 106151] [glusterd-syncop.c:1640:gd_unlock_op_phase] 0-management: Failed to unlock on some peer(s)
>
> Note in the above error:
>
> 1. I'm not sure which log to check (there doesn't seem to be a management
>    brick / brick log)?
> 2. If there's a problem with the first node, why isn't it rejected from
>    the gluster / taken offline / the health of the peers or volume list
>    degraded?
> 3. Why does the client fail to write to the volume rather than (I'm
>    assuming) trying the second (or third I guess) node to write to the
>    volume?
>
> We are also seeing the following errors repeated a lot in the brick log
> (/var/log/glusterfs/bricks/mnt-gluster-storage-staging_static.log), both
> when the volumes are working and when there's an issue:
>
> [2018-09-03 01:58:35.128923] E [server.c:137:server_submit_reply] (-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x1fd14) [0x7f8470319d14] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x5f24a) [0x7f846bdde24a] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce) [0x7f846bd89fce] ) 0-: Reply submission failed
> [2018-09-03 01:58:35.128957] E [rpcsvc.c:1378:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x3d60, Program: GlusterFS 4.x v1, ProgVers: 400, Proc: 29) to rpc-transport (tcp.staging_static-server)
> [2018-09-03 01:58:35.128983] E [server.c:137:server_submit_reply] (-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x1fd14) [0x7f8470319d14] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x5f24a) [0x7f846bdde24a] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce) [0x7f846bd89fce] ) 0-: Reply submission failed
> [2018-09-03 01:58:35.129016] E [rpcsvc.c:1378:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x3e2a, Program: GlusterFS 4.x v1, ProgVers: 400, Proc: 29) to rpc-transport (tcp.staging_static-server)
> [2018-09-03 01:58:35.129042] E [server.c:137:server_submit_reply] (-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x1fd14) [0x7f8470319d14] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x5f24a) [0x7f846bdde24a] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce) [0x7f846bd89fce] ) 0-: Reply submission failed
> [2018-09-03 01:58:35.129077] E [rpcsvc.c:1378:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x3ef6, Program: GlusterFS 4.x v1, ProgVers: 400, Proc: 29) to rpc-transport (tcp.staging_static-server)
> [2018-09-03 01:58:35.129149] E [server.c:137:server_submit_reply] (-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x1fd14) [0x7f8470319d14] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x5f24a) [0x7f846bdde24a] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce) [0x7f846bd89fce] ) 0-: Reply submission failed
> [2018-09-03 01:58:35.129191] E [rpcsvc.c:1378:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x3fc6, Program: GlusterFS 4.x v1, ProgVers: 400, Proc: 29) to rpc-transport (tcp.staging_static-server)
> [2018-09-03 01:58:35.129219] E [server.c:137:server_submit_reply] (-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x1fd14) [0x7f8470319d14] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x5f24a) [0x7f846bdde24a] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce) [0x7f846bd89fce] ) 0-: Reply submission failed
>
>
> *Gluster volume information:*
>
> # gluster volume info staging_static
>
> Volume Name: staging_static
> Type: Replicate
> Volume ID: 7f3b8e91-afea-4fc6-be83-3399a089b6f3
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp
> Bricks:
> Brick1: <first gluster node.fqdn>:/mnt/gluster-storage/staging_static
> Brick2: <second gluster node.fqdn>:/mnt/gluster-storage/staging_static
> Brick3: <third gluster node.fqdn>:/mnt/gluster-storage/staging_static (arbiter)
> Options Reconfigured:
> storage.fips-mode-rchecksum: true
> cluster.self-heal-window-size: 16
> cluster.shd-wait-qlength: 4096
> cluster.shd-max-threads: 8
> performance.cache-min-file-size: 2KB
> performance.rda-cache-limit: 1GB
> network.inode-lru-limit: 50000
> server.outstanding-rpc-limit: 256
> transport.listen-backlog: 2048
> performance.write-behind-window-size: 512MB
> performance.stat-prefetch: true
> performance.io-thread-count: 16
> performance.client-io-threads: true
> performance.cache-size: 1GB
> performance.cache-refresh-timeout: 60
> performance.cache-invalidation: true
> cluster.use-compound-fops: true
> cluster.readdir-optimize: true
> cluster.lookup-optimize: true
> cluster.favorite-child-policy: size
> cluster.eager-lock: true
> client.event-threads: 4
> nfs.disable: on
> transport.address-family: inet
> diagnostics.brick-log-level: ERROR
> diagnostics.client-log-level: ERROR
> features.cache-invalidation-timeout: 300
> features.cache-invalidation: true
> network.ping-timeout: 15
> performance.cache-max-file-size: 3MB
> performance.md-cache-timeout: 300
> server.event-threads: 4
>
> Thanks in advance,
>
> --
> Sam McLeod (protoporpoise on IRC)
> https://smcleod.net
> https://twitter.com/s_mcleod
>
> Words are my own opinions and do not necessarily represent those of
> my employer or partners.

--
Amar Tumballi (amarts)
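For illustration only, a minimal sketch of applying and checking the option
suggested above; it assumes the volume name staging_static from this thread,
so substitute your own volume name:

    # set the suggested 42-second TCP user timeout on the volume's server side
    gluster volume set staging_static server.tcp-user-timeout 42

    # confirm the value the volume is now using
    gluster volume get staging_static server.tcp-user-timeout

    # revert to the default later if it makes no difference
    gluster volume reset staging_static server.tcp-user-timeout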
Hoggins!
2018-Oct-23 16:31 UTC
[Gluster-users] Gluster clients intermittently hang until first gluster server in a Replica 1 Arbiter 1 cluster is rebooted, server error: 0-management: Unlocking failed & client error: bailing out frame type(GlusterFS 4.x v1) op(INODELK(29)) xid = 0x2131 sent = <datestamp>. timeout = 1800
Hello there,

I'm stumbling upon the *exact same issue*, and unfortunately setting
server.tcp-user-timeout to 42 does not help. Any other suggestion?

I'm running a replica 3 arbiter 1 GlusterFS cluster, all nodes running
version 4.1.5 (Fedora 28), and /sometimes/ the workaround suggested by Sam
(rebooting a node) works, but it often doesn't.

You may ask how I got into this, well it's simple: I needed to replace my
brick 1 and brick 2 with two brand new machines, so here's what I did:

    - add brick 3 and brick 4 into the cluster (gluster peer probe, gluster
      volume add-brick, etc., with the issue regarding the arbiter node
      that has to be removed from the cluster first before being able to
      add bricks 3 and 4)
    - wait for all the files on my volumes to heal. It took a few days.
    - remove bricks 1 and 2
    - after having "reset" the arbiter, re-add the arbiter into the cluster

And now it's intermittently hanging on writes *to existing files*. There is
*no problem writing new files* to the volumes.

I'm lost here, thanks for your inputs!

    Hoggins!

On 14/09/2018 at 04:16, Amar Tumballi wrote:

> We got a little late getting back on this, but after running tests
> internally we found that a possibly missing volume option is the reason
> for this. Try
>
>     gluster volume set <volname> server.tcp-user-timeout 42
>
> on your volume. Let us know if this helps.
> (Ref: https://review.gluster.org/#/c/glusterfs/+/21170/)
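For reference, the health checks mentioned in this thread (peer state, brick
status, pending heals) map to standard gluster CLI calls; a minimal sketch,
assuming a volume named staging_static, so substitute your own volume name:

    # every peer should report 'Peer in Cluster (Connected)'
    gluster peer status

    # every brick and self-heal daemon should be listed as online
    gluster volume status staging_static

    # entries listed here are still pending heal, e.g. after an
    # add-brick / remove-brick cycle like the one described above
    gluster volume heal staging_static info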