Gambit15
2017-May-22 20:50 UTC
[Gluster-users] Quorum lost with single failed peer in rep 3...?
Hey guys,

I use a replica 3 arbiter 1 setup for hosting VMs, and have just had an issue where taking one of the non-arbiter peers offline caused gluster to complain of lost quorum & pause the volumes. The two "full" peers host the VMs and data, and the arbiter is a VM on a neighbouring cluster. Before taking the peer offline, I migrated all VMs from it, verified there were no running heal processes, and that all peers were connected.

Quorum is configured with the following...

cluster.server-quorum-type: server
cluster.quorum-type: auto

As I understand it, quorum auto means 51%, so quorum should be maintained if any one of the peers fails. There have been a couple of occasions where the arbiter went offline, but quorum was maintained as expected & everything continued to function.

When the volumes were paused, I connected to the remaining node to see what was going on. "gluster peer status" reported the offline node as disconnected & the arbiter as connected, as expected. All "gluster volume" commands hung. When the offline node was rebooted, quorum returned & all services resumed.

From the logs (pasted below), it appears the primary node & the arbiter disconnected from each other around the time the secondary node went offline, although that's contrary to what was reported by "gluster peer status".

s0, 10.123.123.10: Full peer
s1, 10.123.123.11: Full peer (taken offline)
s2, 10.123.123.12: Arbiter
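For reference, these are roughly the checks I ran before/during the outage, shown here against the "data" volume as an example (the "engine" and "iso" volumes get the same treatment) - nothing exotic, just the standard CLI:

# all peers connected?
gluster peer status

# any pending heals before taking the brick offline?
gluster volume heal data info

# quorum options in effect for the volume
gluster volume get data cluster.quorum-type
gluster volume get data cluster.server-quorum-type

# brick/process state while the problem is occurring
gluster volume status data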
========= s0 ========
[2017-05-22 18:24:20.854775] I [MSGID: 106163] [glusterd-handshake.c:1271:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30800
[2017-05-22 18:31:30.398272] E [rpc-clnt.c:200:call_bail] 0-management: bailing out frame type(Peer mgmt) op(--(2)) xid = 0x7ab6 sent = 2017-05-22 18:21:20.549877. timeout = 600 for 10.123.123.12:24007
[2017-05-22 18:35:20.420878] E [rpc-clnt.c:200:call_bail] 0-management: bailing out frame type(glusterd mgmt) op(--(3)) xid = 0x7ab7 sent = 2017-05-22 18:25:11.187323. timeout = 600 for 10.123.123.12:24007
[2017-05-22 18:35:20.420943] E [MSGID: 106153] [glusterd-syncop.c:113:gd_collate_errors] 0-glusterd: Staging failed on s2. Please check log file for details.
[2017-05-22 18:35:20.421103] I [socket.c:3465:socket_submit_reply] 0-socket.management: not connected (priv->connected = -1)
[2017-05-22 18:35:20.421126] E [rpcsvc.c:1325:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x1, Program: GlusterD svc cli, ProgVers: 2, Proc: 27) to rpc-transport (socket.management)
[2017-05-22 18:35:20.421145] E [MSGID: 106430] [glusterd-utils.c:470:glusterd_submit_reply] 0-glusterd: Reply submission failed
[2017-05-22 18:36:24.732098] W [socket.c:590:__socket_rwv] 0-management: readv on 10.123.123.11:24007 failed (Não há dados disponíveis)
[2017-05-22 18:36:24.732214] I [MSGID: 106004] [glusterd-handler.c:5219:__glusterd_peer_rpc_notify] 0-management: Peer <s1> (<b3e4cf8e-3acd-412c-bfdb-a39a122a61b6>), in state <Peer in Cluster>, has disconnected from glusterd.
[2017-05-22 18:36:24.732293] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x1de5c) [0x7f17ae400e5c] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x27a08) [0x7f17ae40aa08] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0xd07fa) [0x7f17ae4b37fa] ) 0-management: Lock for vol data not held
[2017-05-22 18:36:24.732303] W [MSGID: 106118] [glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not released for data
[2017-05-22 18:36:24.732323] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x1de5c) [0x7f17ae400e5c] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x27a08) [0x7f17ae40aa08] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0xd07fa) [0x7f17ae4b37fa] ) 0-management: Lock for vol engine not held
[2017-05-22 18:36:24.732330] W [MSGID: 106118] [glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not released for engine
[2017-05-22 18:36:24.732350] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x1de5c) [0x7f17ae400e5c] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x27a08) [0x7f17ae40aa08] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0xd07fa) [0x7f17ae4b37fa] ) 0-management: Lock for vol iso not held
[2017-05-22 18:36:24.732382] W [MSGID: 106118] [glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not released for iso
[2017-05-22 18:36:24.732405] C [MSGID: 106002] [glusterd-server-quorum.c:346:glusterd_do_volume_quorum_action] 0-management: Server quorum lost for volume data. Stopping local bricks.
[2017-05-22 18:36:24.740516] C [MSGID: 106002] [glusterd-server-quorum.c:346:glusterd_do_volume_quorum_action] 0-management: Server quorum lost for volume engine. Stopping local bricks.
[2017-05-22 18:36:24.742215] C [MSGID: 106002] [glusterd-server-quorum.c:346:glusterd_do_volume_quorum_action] 0-management: Server quorum lost for volume iso. Stopping local bricks.
[2017-05-22 18:36:24.744460] I [MSGID: 101053] [mem-pool.c:617:mem_pool_destroy] 0-management: size=588 max=0 total=0
[2017-05-22 18:36:24.744481] I [MSGID: 101053] [mem-pool.c:617:mem_pool_destroy] 0-management: size=124 max=0 total=0
[2017-05-22 18:36:24.744561] I [MSGID: 106144] [glusterd-pmap.c:272:pmap_registry_remove] 0-pmap: removing brick /gluster/data/brick on port 49155
[2017-05-22 18:36:24.746219] I [MSGID: 101053] [mem-pool.c:617:mem_pool_destroy] 0-management: size=588 max=0 total=0
[2017-05-22 18:36:24.746238] I [MSGID: 101053] [mem-pool.c:617:mem_pool_destroy] 0-management: size=124 max=0 total=0
[2017-05-22 18:36:24.747833] I [MSGID: 101053] [mem-pool.c:617:mem_pool_destroy] 0-management: size=588 max=0 total=0
[2017-05-22 18:36:24.747852] I [MSGID: 101053] [mem-pool.c:617:mem_pool_destroy] 0-management: size=124 max=0 total=0
[2017-05-22 18:36:24.747895] I [MSGID: 106144] [glusterd-pmap.c:272:pmap_registry_remove] 0-pmap: removing brick /gluster/engine/brick on port 49154
[2017-05-22 18:36:24.747955] I [MSGID: 106144] [glusterd-pmap.c:272:pmap_registry_remove] 0-pmap: removing brick /gluster/iso/brick on port 49156
[2017-05-22 18:36:35.431035] E [socket.c:2309:socket_connect_finish] 0-management: connection to 10.123.123.11:24007 failed (Conexão recusada)
[2017-05-22 18:36:53.977055] E [rpc-clnt.c:365:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7f17b9c83002] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7f17b9a4a84e] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f17b9a4a95e] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x84)[0x7f17b9a4c0b4] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x120)[0x7f17b9a4c990] ))))) 0-management: forced unwinding frame type(glusterd mgmt) op(--(3)) called at 2017-05-22 18:28:12.298772 (xid=0x7ab8)
[2017-05-22 18:36:53.977758] E [rpc-clnt.c:365:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7f17b9c83002] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7f17b9a4a84e] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f17b9a4a95e] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x84)[0x7f17b9a4c0b4] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x120)[0x7f17b9a4c990] ))))) 0-management: forced unwinding frame type(glusterd mgmt) op(--(3)) called at 2017-05-22 18:31:13.395988 (xid=0x7ab9)
[2017-05-22 18:36:53.977849] I [socket.c:3465:socket_submit_reply] 0-socket.management: not connected (priv->connected = -1)
[2017-05-22 18:36:53.977864] E [rpcsvc.c:1325:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x1, Program: GlusterD svc cli, ProgVers: 2, Proc: 27) to rpc-transport (socket.management)
The message "E [MSGID: 106153] [glusterd-syncop.c:113:gd_collate_errors] 0-glusterd: Staging failed on s2. Please check log file for details." repeated 2 times between [2017-05-22 18:35:20.420943] and [2017-05-22 18:36:53.977999]
[2017-05-22 18:36:53.978018] I [MSGID: 106004] [glusterd-handler.c:5219:__glusterd_peer_rpc_notify] 0-management: Peer <s2> (<5e8a70d1-e305-4eeb-a557-a6152f4ba901>), in state <Peer in Cluster>, has disconnected from glusterd.
[2017-05-22 18:36:53.978080] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x1de5c) [0x7f17ae400e5c] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x27a08) [0x7f17ae40aa08] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0xd07fa) [0x7f17ae4b37fa] ) 0-management: Lock for vol data not held
[2017-05-22 18:36:53.978111] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x1de5c) [0x7f17ae400e5c] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x27a08) [0x7f17ae40aa08] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0xd07fa) [0x7f17ae4b37fa] ) 0-management: Lock for vol engine not held
[2017-05-22 18:36:53.978089] W [MSGID: 106118] [glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not released for data
[2017-05-22 18:36:53.978117] W [MSGID: 106118] [glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not released for engine
[2017-05-22 18:36:53.978139] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x1de5c) [0x7f17ae400e5c] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x27a08) [0x7f17ae40aa08] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0xd07fa) [0x7f17ae4b37fa] ) 0-management: Lock for vol iso not held
[2017-05-22 18:36:53.978145] W [MSGID: 106118] [glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not released for iso
[2017-05-22 18:36:53.978443] I [socket.c:3465:socket_submit_reply] 0-socket.management: not connected (priv->connected = -1)
[2017-05-22 18:36:53.978461] E [rpcsvc.c:1325:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x1, Program: GlusterD svc cli, ProgVers: 2, Proc: 27) to rpc-transport (socket.management)
The message "E [MSGID: 106430] [glusterd-utils.c:470:glusterd_submit_reply] 0-glusterd: Reply submission failed" repeated 2 times between [2017-05-22 18:35:20.421145] and [2017-05-22 18:36:53.978473]
[2017-05-22 18:40:00.130129] I [MSGID: 106163] [glusterd-handshake.c:1271:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30800
[2017-05-22 18:47:14.988082] E [rpc-clnt.c:200:call_bail] 0-management: bailing out frame type(Peer mgmt) op(--(2)) xid = 0x7abd sent = 2017-05-22 18:37:04.526550. timeout = 600 for 10.123.123.12:24007
[2017-05-22 18:52:37.592505] I [MSGID: 106004] [glusterd-handler.c:5219:__glusterd_peer_rpc_notify] 0-management: Peer <s2> (<5e8a70d1-e305-4eeb-a557-a6152f4ba901>), in state <Peer in Cluster>, has disconnected from glusterd.
[2017-05-22 18:52:37.592601] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x1de5c) [0x7f17ae400e5c] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x27a08) [0x7f17ae40aa08] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0xd07fa) [0x7f17ae4b37fa] ) 0-management: Lock for vol data not held
[2017-05-22 18:52:37.592619] W [MSGID: 106118] [glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not released for data
[2017-05-22 18:52:37.592636] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x1de5c) [0x7f17ae400e5c] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x27a08) [0x7f17ae40aa08] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0xd07fa) [0x7f17ae4b37fa] ) 0-management: Lock for vol engine not held
[2017-05-22 18:52:37.592643] W [MSGID: 106118] [glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not released for engine
[2017-05-22 18:52:37.592659] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x1de5c) [0x7f17ae400e5c] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x27a08) [0x7f17ae40aa08] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0xd07fa) [0x7f17ae4b37fa] ) 0-management: Lock for vol iso not held
[2017-05-22 18:52:37.592665] W [MSGID: 106118] [glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not released for iso
[2017-05-22 18:55:38.761423] I [MSGID: 106163] [glusterd-handshake.c:1271:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30800
[2017-05-22 19:02:58.838267] E [rpc-clnt.c:200:call_bail] 0-management: bailing out frame type(Peer mgmt) op(--(2)) xid = 0x7ac1 sent = 2017-05-22 18:52:48.591391. timeout = 600 for 10.123.123.12:24007
[2017-05-22 19:03:54.628819] I [MSGID: 106487] [glusterd-handler.c:1474:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
[2017-05-22 19:04:04.548779] I [MSGID: 106499] [glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management: Received status volume req for volume data
[2017-05-22 19:08:21.720512] W [socket.c:590:__socket_rwv] 0-management: readv on 10.123.123.12:24007 failed (Tempo esgotado para conexão)
[2017-05-22 19:08:21.721297] E [rpc-clnt.c:365:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7f17b9c83002] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7f17b9a4a84e] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f17b9a4a95e] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x84)[0x7f17b9a4c0b4] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x120)[0x7f17b9a4c990] ))))) 0-management: forced unwinding frame type(glusterd mgmt) op(--(3)) called at 2017-05-22 19:03:09.392577 (xid=0x7ac2)
[2017-05-22 19:08:21.721342] E [MSGID: 106153] [glusterd-syncop.c:113:gd_collate_errors] 0-glusterd: Staging failed on s2. Please check log file for details.
[2017-05-22 19:08:21.721518] E [rpc-clnt.c:365:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7f17b9c83002] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7f17b9a4a84e] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f17b9a4a95e] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x84)[0x7f17b9a4c0b4] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x120)[0x7f17b9a4c990] ))))) 0-management: forced unwinding frame type(glusterd mgmt v3) op(--(1)) called at 2017-05-22 19:04:04.552716 (xid=0x7ac3)
[2017-05-22 19:08:21.721595] I [socket.c:3465:socket_submit_reply] 0-socket.management: not connected (priv->connected = -1)
[2017-05-22 19:08:21.721607] E [rpcsvc.c:1325:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x1, Program: GlusterD svc cli, ProgVers: 2, Proc: 27) to rpc-transport (socket.management)
[2017-05-22 19:08:21.721625] E [MSGID: 106430] [glusterd-utils.c:470:glusterd_submit_reply] 0-glusterd: Reply submission failed
[2017-05-22 19:08:21.721746] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Locking failed on s2. Please check log file for details.
[2017-05-22 19:08:21.721780] I [MSGID: 106004] [glusterd-handler.c:5219:__glusterd_peer_rpc_notify] 0-management: Peer <s2> (<5e8a70d1-e305-4eeb-a557-a6152f4ba901>), in state <Peer in Cluster>, has disconnected from glusterd.
[2017-05-22 19:08:21.721818] W [glusterd-locks.c:686:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x1de5c) [0x7f17ae400e5c] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x27a08) [0x7f17ae40aa08] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0xd05ea) [0x7f17ae4b35ea] ) 0-management: Lock owner mismatch. Lock for vol data held by ae81e207-f383-4b07-a52c-c4e08b0d3361
[2017-05-22 19:08:21.721831] W [MSGID: 106118] [glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not released for data
[2017-05-22 19:08:21.721848] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x1de5c) [0x7f17ae400e5c] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x27a08) [0x7f17ae40aa08] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0xd07fa) [0x7f17ae4b37fa] ) 0-management: Lock for vol engine not held
[2017-05-22 19:08:21.721860] W [MSGID: 106118] [glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not released for engine
[2017-05-22 19:08:21.721882] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x1de5c) [0x7f17ae400e5c] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x27a08) [0x7f17ae40aa08] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0xd07fa) [0x7f17ae4b37fa] ) 0-management: Lock for vol iso not held
[2017-05-22 19:08:21.721898] W [MSGID: 106118] [glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not released for iso
[2017-05-22 19:08:21.722117] E [MSGID: 106151] [glusterd-syncop.c:1884:gd_sync_task_begin] 0-management: Locking Peers Failed.
[2017-05-22 19:08:21.722169] I [socket.c:3465:socket_submit_reply] 0-socket.management: not connected (priv->connected = -1)
[2017-05-22 19:08:21.722176] E [rpcsvc.c:1325:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x1, Program: GlusterD svc cli, ProgVers: 2, Proc: 27) to rpc-transport (socket.management)
[2017-05-22 19:08:21.722185] E [MSGID: 106430] [glusterd-utils.c:470:glusterd_submit_reply] 0-glusterd: Reply submission failed
[2017-05-22 19:11:17.417438] I [MSGID: 106163] [glusterd-handshake.c:1271:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30800
[2017-05-22 19:15:16.108096] I [MSGID: 106163] [glusterd-handshake.c:1271:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30800
[2017-05-22 19:15:16.110580] I [MSGID: 106490] [glusterd-handler.c:2608:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: b3e4cf8e-3acd-412c-bfdb-a39a122a61b6
[2017-05-22 19:15:16.507814] I [MSGID: 106493] [glusterd-handler.c:3852:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to s1 (0), ret: 0, op_ret: 0
[2017-05-22 19:15:16.510688] C [MSGID: 106003] [glusterd-server-quorum.c:341:glusterd_do_volume_quorum_action] 0-management: Server quorum regained for volume data. Starting local bricks.
[2017-05-22 19:15:16.519121] I [rpc-clnt.c:1033:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-05-22 19:15:16.548446] C [MSGID: 106003] [glusterd-server-quorum.c:341:glusterd_do_volume_quorum_action] 0-management: Server quorum regained for volume engine. Starting local bricks.
[2017-05-22 19:15:16.554878] I [rpc-clnt.c:1033:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-05-22 19:15:16.555051] C [MSGID: 106003] [glusterd-server-quorum.c:341:glusterd_do_volume_quorum_action] 0-management: Server quorum regained for volume iso. Starting local bricks.
[2017-05-22 19:15:16.561907] I [rpc-clnt.c:1033:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600

========= s2 ========
[2017-05-22 18:24:10.637350] I [MSGID: 106004] [glusterd-handler.c:5219:__glusterd_peer_rpc_notify] 0-management: Peer <s0> (<ae81e207-f383-4b07-a52c-c4e08b0d3361>), in state <Peer in Cluster>, has disconnected from glusterd.
[2017-05-22 18:24:10.637465] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x1de5c) [0x7f03d1f28e5c] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x27a08) [0x7f03d1f32a08] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0xd07fa) [0x7f03d1fdb7fa] ) 0-management: Lock for vol data not held
[2017-05-22 18:24:10.637483] W [MSGID: 106118] [glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not released for data
[2017-05-22 18:24:10.637512] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x1de5c) [0x7f03d1f28e5c] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x27a08) [0x7f03d1f32a08] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0xd07fa) [0x7f03d1fdb7fa] ) 0-management: Lock for vol engine not held
[2017-05-22 18:24:10.637525] W [MSGID: 106118] [glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not released for engine
[2017-05-22 18:24:10.637553] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x1de5c) [0x7f03d1f28e5c] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x27a08) [0x7f03d1f32a08] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0xd07fa) [0x7f03d1fdb7fa] ) 0-management: Lock for vol iso not held
[2017-05-22 18:24:10.637566] W [MSGID: 106118] [glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not released for iso
[2017-05-22 18:24:46.820233] E [rpc-clnt.c:200:call_bail] 0-management: bailing out frame type(Peer mgmt) op(--(2)) xid = 0x2f2e sent = 2017-05-22 18:14:36.744402. timeout = 600 for 10.123.123.11:24007
[2017-05-22 18:30:04.941304] I [MSGID: 106004] [glusterd-handler.c:5219:__glusterd_peer_rpc_notify] 0-management: Peer <s1> (<b3e4cf8e-3acd-412c-bfdb-a39a122a61b6>), in state <Peer in Cluster>, has disconnected from glusterd.
[2017-05-22 18:30:04.941375] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x1de5c) [0x7f03d1f28e5c] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x27a08) [0x7f03d1f32a08] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0xd07fa) [0x7f03d1fdb7fa] ) 0-management: Lock for vol data not held
[2017-05-22 18:30:04.941383] W [MSGID: 106118] [glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not released for data
[2017-05-22 18:30:04.941397] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x1de5c) [0x7f03d1f28e5c] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x27a08) [0x7f03d1f32a08] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0xd07fa) [0x7f03d1fdb7fa] ) 0-management: Lock for vol engine not held
[2017-05-22 18:30:04.941402] W [MSGID: 106118] [glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not released for engine
[2017-05-22 18:30:04.941415] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x1de5c) [0x7f03d1f28e5c] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x27a08) [0x7f03d1f32a08] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0xd07fa) [0x7f03d1fdb7fa] ) 0-management: Lock for vol iso not held
[2017-05-22 18:30:04.941420] W [MSGID: 106118] [glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not released for iso
[2017-05-22 18:33:41.086536] I [MSGID: 106163] [glusterd-handshake.c:1271:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30800
[2017-05-22 18:34:30.901078] E [rpc-clnt.c:200:call_bail] 0-management: bailing out frame type(Peer mgmt) op(--(2)) xid = 0x2f32 sent = 2017-05-22 18:24:20.820320. timeout = 600 for 10.123.123.10:24007
[2017-05-22 18:36:24.695019] E [rpc-clnt.c:365:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7f03dd7ab002] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7f03dd57284e] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f03dd57295e] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x84)[0x7f03dd5740b4] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x120)[0x7f03dd574990] ))))) 0-management: forced unwinding frame type(Peer mgmt) op(--(2)) called at 2017-05-22 18:30:15.865813 (xid=0x2f32)
[2017-05-22 18:36:24.695063] I [MSGID: 106004] [glusterd-handler.c:5219:__glusterd_peer_rpc_notify] 0-management: Peer <s1> (<b3e4cf8e-3acd-412c-bfdb-a39a122a61b6>), in state <Peer in Cluster>, has disconnected from glusterd.
[2017-05-22 18:36:24.695094] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x1de5c) [0x7f03d1f28e5c] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x27a08) [0x7f03d1f32a08] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0xd07fa) [0x7f03d1fdb7fa] ) 0-management: Lock for vol data not held
[2017-05-22 18:36:24.695101] W [MSGID: 106118] [glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not released for data
[2017-05-22 18:36:24.695114] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x1de5c) [0x7f03d1f28e5c] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x27a08) [0x7f03d1f32a08] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0xd07fa) [0x7f03d1fdb7fa] ) 0-management: Lock for vol engine not held
[2017-05-22 18:36:24.695119] W [MSGID: 106118] [glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not released for engine
[2017-05-22 18:36:24.695131] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x1de5c) [0x7f03d1f28e5c] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x27a08) [0x7f03d1f32a08] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0xd07fa) [0x7f03d1fdb7fa] ) 0-management: Lock for vol iso not held
[2017-05-22 18:36:24.695168] W [MSGID: 106118] [glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not released for iso
[2017-05-22 18:36:34.920338] E [socket.c:2309:socket_connect_finish] 0-management: connection to 10.123.123.11:24007 failed (Conexão recusada)
[2017-05-22 18:37:04.421376] I [MSGID: 106163] [glusterd-handshake.c:1271:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30800
[2017-05-22 18:39:49.133296] I [MSGID: 106004] [glusterd-handler.c:5219:__glusterd_peer_rpc_notify] 0-management: Peer <s0> (<ae81e207-f383-4b07-a52c-c4e08b0d3361>), in state <Peer in Cluster>, has disconnected from glusterd.
[2017-05-22 18:39:49.133386] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x1de5c) [0x7f03d1f28e5c] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x27a08) [0x7f03d1f32a08] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0xd07fa) [0x7f03d1fdb7fa] ) 0-management: Lock for vol data not held
[2017-05-22 18:39:49.133399] W [MSGID: 106118] [glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not released for data
[2017-05-22 18:39:49.133420] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x1de5c) [0x7f03d1f28e5c] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x27a08) [0x7f03d1f32a08] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0xd07fa) [0x7f03d1fdb7fa] ) 0-management: Lock for vol engine not held
[2017-05-22 18:39:49.133429] W [MSGID: 106118] [glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not released for engine
[2017-05-22 18:39:49.133450] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x1de5c) [0x7f03d1f28e5c] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x27a08) [0x7f03d1f32a08] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0xd07fa) [0x7f03d1fdb7fa] ) 0-management: Lock for vol iso not held
[2017-05-22 18:39:49.133476] W [MSGID: 106118] [glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not released for iso
[2017-05-22 18:50:10.474716] E [rpc-clnt.c:200:call_bail] 0-management: bailing out frame type(Peer mgmt) op(--(2)) xid = 0x2f36 sent = 2017-05-22 18:40:00.093742. timeout = 600 for 10.123.123.10:24007
[2017-05-22 18:52:48.256080] I [MSGID: 106163] [glusterd-handshake.c:1271:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30800
[2017-05-22 18:55:28.141245] I [MSGID: 106004] [glusterd-handler.c:5219:__glusterd_peer_rpc_notify] 0-management: Peer <s0> (<ae81e207-f383-4b07-a52c-c4e08b0d3361>), in state <Peer in Cluster>, has disconnected from glusterd.
[2017-05-22 18:55:28.141354] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x1de5c) [0x7f03d1f28e5c] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x27a08) [0x7f03d1f32a08] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0xd07fa) [0x7f03d1fdb7fa] ) 0-management: Lock for vol data not held
[2017-05-22 18:55:28.141368] W [MSGID: 106118] [glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not released for data
[2017-05-22 18:55:28.141384] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x1de5c) [0x7f03d1f28e5c] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x27a08) [0x7f03d1f32a08] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0xd07fa) [0x7f03d1fdb7fa] ) 0-management: Lock for vol engine not held
[2017-05-22 18:55:28.141390] W [MSGID: 106118] [glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not released for engine
[2017-05-22 18:55:28.141404] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x1de5c) [0x7f03d1f28e5c] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x27a08) [0x7f03d1f32a08] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0xd07fa) [0x7f03d1fdb7fa] ) 0-management: Lock for vol iso not held
[2017-05-22 18:55:28.141410] W [MSGID: 106118] [glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not released for iso
[2017-05-22 19:05:39.136721] E [rpc-clnt.c:200:call_bail] 0-management: bailing out frame type(Peer mgmt) op(--(2)) xid = 0x2f3a sent = 2017-05-22 18:55:38.721768. timeout = 600 for 10.123.123.10:24007
[2017-05-22 19:08:32.089775] I [MSGID: 106163] [glusterd-handshake.c:1271:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30800
[2017-05-22 19:11:07.149344] I [MSGID: 106004] [glusterd-handler.c:5219:__glusterd_peer_rpc_notify] 0-management: Peer <s0> (<ae81e207-f383-4b07-a52c-c4e08b0d3361>), in state <Peer in Cluster>, has disconnected from glusterd.
[2017-05-22 19:11:07.149420] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x1de5c) [0x7f03d1f28e5c] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x27a08) [0x7f03d1f32a08] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0xd07fa) [0x7f03d1fdb7fa] ) 0-management: Lock for vol data not held
[2017-05-22 19:11:07.149428] W [MSGID: 106118] [glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not released for data
[2017-05-22 19:11:07.149442] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x1de5c) [0x7f03d1f28e5c] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x27a08) [0x7f03d1f32a08] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0xd07fa) [0x7f03d1fdb7fa] ) 0-management: Lock for vol engine not held
[2017-05-22 19:11:07.149448] W [MSGID: 106118] [glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not released for engine
[2017-05-22 19:11:07.149479] W [glusterd-locks.c:675:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x1de5c) [0x7f03d1f28e5c] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0x27a08) [0x7f03d1f32a08] -->/usr/lib64/glusterfs/3.8.5/xlator/mgmt/glusterd.so(+0xd07fa) [0x7f03d1fdb7fa] ) 0-management: Lock for vol iso not held
[2017-05-22 19:11:07.149485] W [MSGID: 106118] [glusterd-handler.c:5241:__glusterd_peer_rpc_notify] 0-management: Lock not released for iso
[2017-05-22 19:15:16.067325] I [MSGID: 106163] [glusterd-handshake.c:1271:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30800
[2017-05-22 19:21:27.593908] E [rpc-clnt.c:200:call_bail] 0-management: bailing out frame type(Peer mgmt) op(--(2)) xid = 0x2f3e sent = 2017-05-22 19:11:17.378388. timeout = 600 for 10.123.123.10:24007
[2017-05-22 19:24:15.517638] I [MSGID: 106163] [glusterd-handshake.c:1271:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30800
[2017-05-22 19:25:25.623628] E [rpc-clnt.c:200:call_bail] 0-management: bailing out frame type(Peer mgmt) op(--(2)) xid = 0x2f36 sent = 2017-05-22 19:15:16.138881. timeout = 600 for 10.123.123.11:24007
[2017-05-22 19:26:45.645812] I [MSGID: 106004] [glusterd-handler.c:5219:__glusterd_peer_rpc_notify] 0-management: Peer <s0> (<ae81e207-f383-4b07-a52c-c4e08b0d3361>), in state <Peer in Cluster>, has disconnected from glusterd.
====================================

Something odd definitely seems to be going on with the arbiter (s2)...

Many thanks in advance!