Ziemowit Pierzycki
2017-Nov-30 20:25 UTC
[Gluster-users] Problems joining new gluster 3.10 nodes to existing 3.8
Hi,

I have a problem joining four Gluster 3.10 nodes to existing Gluster 3.8 nodes. My understanding is that this should work and not be too much of a problem.

Peer probe is successful but the node is rejected:

gluster> peer detach elkpinfglt07
peer detach: success
gluster> peer probe elkpinfglt07
peer probe: success.
gluster> peer status
Number of Peers: 6

Hostname: elkpinfglt02
Uuid: 926e9b8a-94ff-4924-b133-a30f2dd48054
State: Peer in Cluster (Connected)

Hostname: elkpinfglt03
Uuid: 34d1a409-acc8-41f6-9b11-938317ad3421
State: Peer in Cluster (Connected)

Hostname: elkpinfglt04
Uuid: 93255842-e190-4e67-ae8b-917583917855
State: Peer in Cluster (Connected)

Hostname: elkpinfglt05
Uuid: 263f8d43-d83e-4465-9de3-e6a285072b02
State: Peer in Cluster (Connected)

Hostname: elkpinfglt06
Uuid: aeaa998a-e8e7-405e-bf21-f25de8d82c25
State: Peer in Cluster (Connected)

Hostname: elkpinfglt07
Uuid: 4baff5cf-6e81-4b2e-b31f-be725b2da4b3
State: Peer Rejected (Connected)

The node I'm probing from complains that it is unable to find information on elkpinfglt07, but the peer is then found anyway, and the checksums on the data0 volume aren't the same:

[2017-11-30 20:12:24.278996] I [MSGID: 106487] [glusterd-handler.c:1241:__glusterd_handle_cli_probe] 0-glusterd: Received CLI probe req elkpinfglt07 24007
[2017-11-30 20:12:24.279999] I [MSGID: 106129] [glusterd-handler.c:3670:glusterd_probe_begin] 0-glusterd: Unable to find peerinfo for host: elkpinfglt07 (24007)
[2017-11-30 20:12:24.281020] I [rpc-clnt.c:1046:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-11-30 20:12:24.288605] I [MSGID: 106498] [glusterd-handler.c:3598:glusterd_friend_add] 0-management: connect returned 0
[2017-11-30 20:12:24.301962] I [MSGID: 106511] [glusterd-rpc-ops.c:252:__glusterd_probe_cbk] 0-management: Received probe resp from uuid: 4baff5cf-6e81-4b2e-b31f-be725b2da4b3, host: elkpinfglt07
[2017-11-30 20:12:24.301989] I [MSGID: 106511] [glusterd-rpc-ops.c:412:__glusterd_probe_cbk] 0-glusterd: Received resp to probe req
[2017-11-30 20:12:25.425294] I [MSGID: 106493] [glusterd-rpc-ops.c:476:__glusterd_friend_add_cbk] 0-glusterd: Received ACC from uuid: 4baff5cf-6e81-4b2e-b31f-be725b2da4b3, host: elkpinfglt07, port: 0
[2017-11-30 20:12:25.429679] I [MSGID: 106163] [glusterd-handshake.c:1271:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30800
[2017-11-30 20:12:25.432426] I [MSGID: 106490] [glusterd-handler.c:2954:__glusterd_handle_probe_query] 0-glusterd: Received probe from uuid: 4baff5cf-6e81-4b2e-b31f-be725b2da4b3
[2017-11-30 20:12:25.432490] I [MSGID: 106493] [glusterd-handler.c:3017:__glusterd_handle_probe_query] 0-glusterd: Responded to elkpinfglt07, op_ret: 0, op_errno: 0, ret: 0
[2017-11-30 20:12:25.436435] I [MSGID: 106490] [glusterd-handler.c:2608:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: 4baff5cf-6e81-4b2e-b31f-be725b2da4b3
[2017-11-30 20:12:25.436683] E [MSGID: 106010] [glusterd-utils.c:2938:glusterd_compare_friend_volume] 0-management: Version of Cksums data0 differ. local cksum = 3011020419, remote cksum = 729330920 on peer elkpinfglt07
[2017-11-30 20:12:25.436716] I [MSGID: 106493] [glusterd-handler.c:3852:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to elkpinfglt07 (0), ret: 0, op_ret: -1
[2017-11-30 20:12:31.494646] I [MSGID: 106487] [glusterd-handler.c:1474:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
[2017-11-30 20:14:06.174548] I [MSGID: 106487] [glusterd-handler.c:1474:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
[2017-11-30 20:14:21.518765] I [MSGID: 106487] [glusterd-handler.c:1474:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req

On the new node the log shows this:

[2017-11-30 20:12:25.196229] I [MSGID: 106163] [glusterd-handshake.c:1316:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30800
[2017-11-30 20:12:25.198228] I [MSGID: 106490] [glusterd-handler.c:2957:__glusterd_handle_probe_query] 0-glusterd: Received probe from uuid: f614c686-52c9-4d2c-92e2-7ea6cdcfba61
[2017-11-30 20:12:25.198447] I [MSGID: 106129] [glusterd-handler.c:2992:__glusterd_handle_probe_query] 0-glusterd: Unable to find peerinfo for host: elkpinfglt01 (24007)
[2017-11-30 20:12:25.200587] W [MSGID: 106062] [glusterd-handler.c:3466:glusterd_transport_inet_options_build] 0-glusterd: Failed to get tcp-user-timeout
[2017-11-30 20:12:25.200649] I [rpc-clnt.c:1059:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-11-30 20:12:25.208147] I [MSGID: 106498] [glusterd-handler.c:3616:glusterd_friend_add] 0-management: connect returned 0
[2017-11-30 20:12:25.208318] I [MSGID: 106493] [glusterd-handler.c:3020:__glusterd_handle_probe_query] 0-glusterd: Responded to elkpinfglt01, op_ret: 0, op_errno: 0, ret: 0
[2017-11-30 20:12:25.209824] I [MSGID: 106490] [glusterd-handler.c:2606:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: f614c686-52c9-4d2c-92e2-7ea6cdcfba61
[2017-11-30 20:12:25.325953] I [rpc-clnt.c:1059:rpc_clnt_connection_init] 0-nfs: setting frame-timeout to 600
[2017-11-30 20:12:25.326055] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: nfs already stopped
[2017-11-30 20:12:25.326069] I [MSGID: 106568] [glusterd-svc-mgmt.c:229:glusterd_svc_stop] 0-management: nfs service is stopped
[2017-11-30 20:12:25.327527] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: glustershd already stopped
[2017-11-30 20:12:25.327540] I [MSGID: 106568] [glusterd-svc-mgmt.c:229:glusterd_svc_stop] 0-management: glustershd service is stopped
[2017-11-30 20:12:25.327559] I [MSGID: 106567] [glusterd-svc-mgmt.c:197:glusterd_svc_start] 0-management: Starting glustershd service
[2017-11-30 20:12:26.329457] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: quotad already stopped
[2017-11-30 20:12:26.329558] I [MSGID: 106568] [glusterd-svc-mgmt.c:229:glusterd_svc_stop] 0-management: quotad service is stopped
[2017-11-30 20:12:26.329850] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: bitd already stopped
[2017-11-30 20:12:26.329879] I [MSGID: 106568] [glusterd-svc-mgmt.c:229:glusterd_svc_stop] 0-management: bitd service is stopped
[2017-11-30 20:12:26.330202] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: scrub already stopped
[2017-11-30 20:12:26.330240] I [MSGID: 106568] [glusterd-svc-mgmt.c:229:glusterd_svc_stop] 0-management: scrub service is stopped
[2017-11-30 20:12:26.331621] I [MSGID: 106493] [glusterd-handler.c:3866:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to elkpinfglt01 (0), ret: 0, op_ret: 0
[2017-11-30 20:12:26.340265] I [MSGID: 106511] [glusterd-rpc-ops.c:261:__glusterd_probe_cbk] 0-management: Received probe resp from uuid: f614c686-52c9-4d2c-92e2-7ea6cdcfba61, host: elkpinfglt01
[2017-11-30 20:12:26.340331] I [MSGID: 106511] [glusterd-rpc-ops.c:421:__glusterd_probe_cbk] 0-glusterd: Received resp to probe req
[2017-11-30 20:12:26.344327] I [MSGID: 106493] [glusterd-rpc-ops.c:485:__glusterd_friend_add_cbk] 0-glusterd: Received RJT from uuid: f614c686-52c9-4d2c-92e2-7ea6cdcfba61, host: elkpinfglt01, port: 0

Would the checksums cause the peer to be rejected?
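With this much informational output, the one error entry is easy to miss. Below is a minimal Python sketch for pulling only the error (E) and warning (W) entries out of a glusterd log. It assumes each entry sits on a single line in the layout seen in this thread (timestamp, severity letter, optional MSGID, source location, domain, message); the names `parse_entry` and `problems` are made up for illustration and are not part of Gluster.

```python
import re

# Layout assumed from the entries quoted in this thread:
# [timestamp] LEVEL [MSGID: n] [file:line:function] domain: message
# The MSGID field is optional (the rpc-clnt.c entries do not carry one).
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"(?:\[MSGID: (?P<msgid>\d+)\] )?"
    r"\[(?P<source>[^\]]+)\] (?P<domain>\S+): (?P<message>.*)$"
)


def parse_entry(line):
    """Return the fields of one log entry as a dict, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def problems(lines):
    """Keep only the entries whose severity is E (error) or W (warning)."""
    found = []
    for line in lines:
        entry = parse_entry(line)
        if entry and entry["level"] in ("E", "W"):
            found.append(entry)
    return found


if __name__ == "__main__":
    # Three entries copied from the logs above.
    sample = [
        "[2017-11-30 20:12:24.281020] I [rpc-clnt.c:1046:rpc_clnt_connection_init] "
        "0-management: setting frame-timeout to 600",
        "[2017-11-30 20:12:25.436683] E [MSGID: 106010] "
        "[glusterd-utils.c:2938:glusterd_compare_friend_volume] 0-management: "
        "Version of Cksums data0 differ. local cksum = 3011020419, "
        "remote cksum = 729330920 on peer elkpinfglt07",
        "[2017-11-30 20:12:25.200587] W [MSGID: 106062] "
        "[glusterd-handler.c:3466:glusterd_transport_inet_options_build] "
        "0-glusterd: Failed to get tcp-user-timeout",
    ]
    for entry in problems(sample):
        print(entry["level"], entry["msgid"], entry["message"])
```

Run against the probing node's log, a filter like this would leave the MSGID 106010 checksum entry as the only error in the excerpt above.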
Atin Mukherjee
2017-Dec-01 05:25 UTC
[Gluster-users] Problems joining new gluster 3.10 nodes to existing 3.8
On Fri, Dec 1, 2017 at 1:55 AM, Ziemowit Pierzycki <ziemowit at pierzycki.com> wrote:

> Would the checksums cause the peer to be rejected?

Yes, that's the cause. It means that the info file of the volume data0 differs between the node elkpinfglt07 and the node from which you executed the peer probe. Can you please find out how the /var/lib/glusterd/vols/data0/info file differs between these two nodes?
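For anyone following along, the comparison asked for above can be sketched in Python. This is only a sketch, assuming the info file is a flat list of key=value lines and that a copy of the file from each node has been placed on one machine. The function names and the sample contents are made up for illustration; they are not taken from the nodes in this thread.

```python
def parse_info(text):
    """Parse key=value lines into a dict, skipping blank or malformed lines."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, _, value = line.partition("=")
        entries[key] = value
    return entries


def diff_info(local_text, remote_text):
    """Map each differing key to a (local, remote) pair; None marks a missing key."""
    local = parse_info(local_text)
    remote = parse_info(remote_text)
    delta = {}
    for key in sorted(set(local) | set(remote)):
        if local.get(key) != remote.get(key):
            delta[key] = (local.get(key), remote.get(key))
    return delta


if __name__ == "__main__":
    # Invented sample contents, NOT the real data0 info files.
    local_copy = "type=2\ncount=4\nop-version=30800\n"
    remote_copy = "type=2\ncount=4\nop-version=30800\nexample-new-key=0\n"
    for key, (ours, theirs) in diff_info(local_copy, remote_copy).items():
        print(f"{key}: local={ours!r} remote={theirs!r}")
```

Comparing by key rather than line by line means a mere difference in line order does not show up as a change, while a key that exists on only one node does.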
Ziemowit Pierzycki
2017-Dec-06 19:01 UTC
[Gluster-users] Problems joining new gluster 3.10 nodes to existing 3.8
The changes between the configuration files are significant! It appears the configuration has been rewritten for 3.10. In addition, I noticed that there are a lot of .rpmsave files on the 3.8 nodes. They are most likely left over from upgrades done on the 3.8 nodes in the past.

I pretty much gave up on making 3.8 work with 3.10. Instead, I'll use 3.8 on the new nodes and eventually upgrade to 3.10 across the whole cluster using the upgrade procedure... hopefully it won't suffer from the same issues.

On Thu, Nov 30, 2017 at 11:25 PM, Atin Mukherjee <amukherj at redhat.com> wrote:

> Yes that's the cause and it means that there is a delta between the info
> file of the volume data0 between the node elkpinfglt07 & the node from where
> you executed peer probe. Can you please find out the difference of
> /var/lib/glusterd/vols/data0/info file between these two nodes?