riccardo.murri at gmail.com
2019-Mar-27 09:39 UTC
[Gluster-users] cannot add server back to cluster after reinstallation
Hello,

a couple of days ago, the OS disk of one of the servers of a local GlusterFS cluster suffered a bad crash, and I had to reinstall everything from scratch.

However, when I restart the GlusterFS service on the reinstalled server, I see that it sends back a "RJT" response to the other servers of the cluster, which then list it as "State: Peer Rejected (Connected)"; the reinstalled server instead shows "Number of peers: 0".

The DEBUG-level log on the reinstalled machine shows these lines after the peer probe from another server in the cluster:

    I [MSGID: 106490] [glusterd-handler.c:2540:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: 9a5763d2-1941-4e5d-8d33-8d6756f7f318
    D [MSGID: 0] [glusterd-peer-utils.c:208:glusterd_peerinfo_find_by_uuid] 0-management: Friend with uuid: 9a5763d2-1941-4e5d-8d33-8d6756f7f318, not found
    D [MSGID: 0] [glusterd-peer-utils.c:234:glusterd_peerinfo_find] 0-management: Unable to find peer by uuid: 9a5763d2-1941-4e5d-8d33-8d6756f7f318
    D [MSGID: 0] [glusterd-peer-utils.c:132:glusterd_peerinfo_find_by_hostname] 0-management: Unable to find friend: glusterfs-server-004
    D [MSGID: 0] [glusterd-peer-utils.c:246:glusterd_peerinfo_find] 0-management: Unable to find hostname: glusterfs-server-004
    I [MSGID: 106493] [glusterd-handler.c:3800:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to glusterfs-server-004 (24007), ret: 0, op_ret: -1

What can I do to re-add the reinstalled server to the cluster? Is it safe (= keeps data) to "peer detach" it and then "peer probe" it again?

Additional info:

* The actual GlusterFS brick data was on a different disk, so it is safe and mounted back in the original location.
* I copied back `/etc/glusterfs/glusterd.vol` from the other servers in the cluster and restored the UUID into `/var/lib/glusterfs/glusterd.info`
* I have checked that `max.op-version` is the same on all servers of the cluster, including the reinstalled one.
* All servers run Ubuntu 16.04

Thanks for any suggestion!

Riccardo
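One way to double-check the restored UUID mentioned in the list above is to read it back from the info file and compare it with the UUID the other servers report for this host. The following is a minimal sketch, assuming the file contains a line of the form `UUID=<uuid>`; the helper names `glusterd_uuid` and `uuid_matches` are made up for illustration, and the path is passed as an argument because it can differ between installations (a default install usually keeps the file under `/var/lib/glusterd/`).

```shell
#!/bin/sh
# Hypothetical helpers, not part of GlusterFS.

# Print the UUID stored in a glusterd.info-style file.
# Assumes the file has a line like "UUID=<uuid>".
glusterd_uuid() {
    sed -n 's/^UUID=//p' "$1"
}

# Succeed only if the UUID in file $1 equals the expected UUID $2.
uuid_matches() {
    [ "$(glusterd_uuid "$1")" = "$2" ]
}
```

For example, `uuid_matches /var/lib/glusterfs/glusterd.info "$expected"` could be run on the reinstalled server, with `$expected` set to the UUID that `gluster peer status` on another server shows for it.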
Riccardo Murri
2019-Mar-27 09:53 UTC
[Gluster-users] cannot add server back to cluster after reinstallation
I managed to put the reinstalled server back into connected state with this procedure:

1. Run `for other_server in ...; do gluster peer probe $other_server; done` on the reinstalled server.
2. Now all the peers on the reinstalled server show up as "Accepted Peer Request", which I fixed with the procedure outlined in the last paragraph of https://docs.gluster.org/en/v3/Troubleshooting/troubleshooting-glusterd/#debugging-glusterd

Can anyone confirm that this is a good way to proceed and that I won't be heading quickly towards corrupting volume data?

Thanks,
Riccardo
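The probe loop from step 1 can be sketched as a small script with a dry-run switch, so the commands can be reviewed before anything touches the cluster. This is only an illustration: the hostnames in `OTHER_SERVERS` are placeholders (the thread elides the real list), and `run`, `probe_peers`, and `DRY_RUN` are hypothetical names. It covers step 1 only, not the "Accepted Peer Request" fix from the linked documentation.

```shell
#!/bin/sh
# Placeholder hostnames; replace with the other servers of the cluster.
OTHER_SERVERS="${OTHER_SERVERS:-glusterfs-server-001 glusterfs-server-002}"
# Dry run by default: print each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Probe every other server from the reinstalled one, then show the peer list.
probe_peers() {
    for other_server in $OTHER_SERVERS; do
        run gluster peer probe "$other_server"
    done
    run gluster peer status
}
```

Calling `probe_peers` with the defaults only prints the `gluster` commands; setting `DRY_RUN=0` executes them, which should be done on the reinstalled server only.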