Gmail
2016-Dec-16 08:24 UTC
[Gluster-users] Quorum sets node RO even after restoring the heartbeat
Hi All,

I've a three-node replica 3 cluster. A network split happened, during which the other two nodes marked one of the three nodes as offline, and that node set itself read-only (RO).

After the network split was fixed, the whole cluster became healthy again, and the peer status shows all three peers as connected on all three nodes. But the node where the network split happened is still RO; it didn't return to RW.

On the node where the network split happened, I see the following in /var/log/messages:

Dec 15 06:37:32 SN02 nfs[19840]: [2016-12-15 06:37:32.626927] C [rpc-clnt-ping.c:165:rpc_clnt_ping_timer_expired] 0-gvol001-client-0: server SN01:49152 has not responded in the last 42 seconds, disconnecting.
Dec 15 06:40:27 SN02 nfs[19840]: [2016-12-15 06:40:27.645780] C [rpc-clnt-ping.c:165:rpc_clnt_ping_timer_expired] 0-gvol001-client-2: server SN03:49152 has not responded in the last 42 seconds, disconnecting.
Dec 15 06:42:05 SN02 nfs[19840]: [2016-12-15 06:42:05.656223] C [rpc-clnt-ping.c:165:rpc_clnt_ping_timer_expired] 0-gvol001-client-2: server SN03:49152 has not responded in the last 42 seconds, disconnecting.

Log from /var/log/glusterfs/nfs.log:

[2016-12-15 06:40:41.222456] I [MSGID: 108002] [afr-common.c:4086:afr_notify] 0-gvol001-replicate-0: Client-quorum is met
[2016-12-15 06:40:41.222604] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-gvol001-client-2: Server lk version = 1
[2016-12-15 06:40:41.225404] W [MSGID: 114031] [client-rpc-fops.c:2974:client3_3_lookup_cbk] 0-gvol001-client-1: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid argument]
[2016-12-15 06:40:41.230934] W [MSGID: 112199] [nfs3-helpers.c:3582:nfs3_log_commit_res] 0-nfs-nfsv3: /test2/tools/statistics/user_connections/last.summary => (XID: 65af65c, COMMIT: NFS: 30(Read-only file system), POSIX: 30(Read-only file system)), wverf: 1477951811
[2016-12-15 06:40:41.231068] W [MSGID: 112199] [nfs3-helpers.c:3498:nfs3_log_write_res] 0-nfs-nfsv3: /test/datdiffs/diffs.client12.1481782734/5.base.client11 => (XID: 17e4dc6a, WRITE: NFS: 5(I/O error), POSIX: 107(Transport endpoint is not connected)), count: 0, UNSTABLE,wverf: 1477951811

Bishoy
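For anyone trying to line up the disconnects, the quorum message and the read-only errors quoted in this post, here is a rough sketch of a log filter. It is not part of Gluster. The regular expression is only inferred from the log lines in this message, and the names LOG_RE, parse_line and interesting_events are made up for illustration.

```python
import re

# Pattern inferred from the log lines quoted in this post (an assumption,
# not an official Gluster log grammar). The MSGID field is optional.
LOG_RE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\]\s+"
    r"(?P<level>[A-Z])\s+"
    r"(?:\[MSGID: (?P<msgid>\d+)\]\s+)?"
    r"\[(?P<src>[^\]]+)\]\s+"
    r"(?P<xlator>\S+):\s+"
    r"(?P<msg>.*)"
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    # search() rather than match() so a syslog prefix before the entry is skipped.
    m = LOG_RE.search(line)
    return m.groupdict() if m else None


def interesting_events(lines):
    """Return sorted (timestamp, translator, kind) tuples for selected messages."""
    events = []
    for line in lines:
        rec = parse_line(line)
        if rec is None:
            continue
        msg = rec["msg"]
        if "disconnecting" in msg:
            kind = "disconnect"
        elif "quorum" in msg.lower():
            kind = "quorum"
        elif "Read-only file system" in msg:
            kind = "read-only"
        else:
            continue
        events.append((rec["ts"], rec["xlator"], kind))
    # Timestamps are fixed-width, so string order is chronological order.
    return sorted(events)


# Three of the lines quoted in this post, used as sample input.
SAMPLE = [
    "Dec 15 06:37:32 SN02 nfs[19840]: [2016-12-15 06:37:32.626927] C "
    "[rpc-clnt-ping.c:165:rpc_clnt_ping_timer_expired] 0-gvol001-client-0: "
    "server SN01:49152 has not responded in the last 42 seconds, disconnecting.",
    "[2016-12-15 06:40:41.222456] I [MSGID: 108002] "
    "[afr-common.c:4086:afr_notify] 0-gvol001-replicate-0: Client-quorum is met",
    "[2016-12-15 06:40:41.230934] W [MSGID: 112199] "
    "[nfs3-helpers.c:3582:nfs3_log_commit_res] 0-nfs-nfsv3: "
    "/test2/tools/statistics/user_connections/last.summary => (XID: 65af65c, "
    "COMMIT: NFS: 30(Read-only file system), POSIX: 30(Read-only file system)), "
    "wverf: 1477951811",
]

if __name__ == "__main__":
    for ts, xlator, kind in interesting_events(SAMPLE):
        print(ts, xlator, kind)
```

Feeding it the lines of /var/log/glusterfs/nfs.log instead of SAMPLE should give one timeline of when each client translator disconnected, when client quorum was reported as met, and whether read-only errors still appear after that point.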