riccardo.murri at gmail.com
2019-Mar-27  09:39 UTC
[Gluster-users] cannot add server back to cluster after reinstallation
Hello,
a couple of days ago, the OS disk of one of the servers of a local GlusterFS
cluster suffered a bad crash, and I had to reinstall everything from
scratch.
However, when I restart the GlusterFS service on the server that has
been reinstalled, I see that it sends back a "RJT" response to other
servers of the cluster, which then list it as "State: Peer Rejected
(Connected)"; the reinstalled server instead shows "Number of peers:
0".
The DEBUG level log on the reinstalled machine shows these lines after
the peer probe from another server in the cluster:
    I [MSGID: 106490] [glusterd-handler.c:2540:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: 9a5763d2-1941-4e5d-8d33-8d6756f7f318
    D [MSGID: 0] [glusterd-peer-utils.c:208:glusterd_peerinfo_find_by_uuid] 0-management: Friend with uuid: 9a5763d2-1941-4e5d-8d33-8d6756f7f318, not found
    D [MSGID: 0] [glusterd-peer-utils.c:234:glusterd_peerinfo_find] 0-management: Unable to find peer by uuid: 9a5763d2-1941-4e5d-8d33-8d6756f7f318
    D [MSGID: 0] [glusterd-peer-utils.c:132:glusterd_peerinfo_find_by_hostname] 0-management: Unable to find friend: glusterfs-server-004
    D [MSGID: 0] [glusterd-peer-utils.c:246:glusterd_peerinfo_find] 0-management: Unable to find hostname: glusterfs-server-004
    I [MSGID: 106493] [glusterd-handler.c:3800:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to glusterfs-server-004 (24007), ret: 0, op_ret: -1
What can I do to re-add the reinstalled server into the cluster?  Is it
safe (= keeps data) to "peer detach" it and then "peer
probe" again?
Additional info:
* The actual GlusterFS brick data was on a different disk and so is safe
  and mounted back in the original location.
* I copied back the `/etc/glusterfs/glusterd.vol` from the other servers
  in the cluster and restored the UUID into
  `/var/lib/glusterfs/glusterd.info`
* I have checked that `max.op-version` is the same on all servers of the
  cluster, including the reinstalled one.
* All servers run Ubuntu 16.04
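The UUID check mentioned in the list above could be sketched roughly as follows. This is only an illustration: the helper names `glusterd_uuid` and `uuid_matches` are made up for this example, and it assumes the info file consists of plain `KEY=VALUE` lines with a `UUID=` entry. The file path is passed in as an argument rather than hard-coded, since the location of the glusterd working directory can differ between installations.

```shell
#!/bin/sh
# Sketch only: helpers for double-checking a restored glusterd identity.
# Both function names are hypothetical, not part of GlusterFS.

# Print the UUID stored in a glusterd.info-style file (KEY=VALUE lines).
# $1: path to the info file
glusterd_uuid() {
    sed -n 's/^UUID=//p' "$1"
}

# Succeed only if the file holds the UUID the rest of the cluster expects.
# $1: path to the info file, $2: expected UUID
uuid_matches() {
    [ "$(glusterd_uuid "$1")" = "$2" ]
}
```

The expected UUID would be the one that the other servers list for the reinstalled machine in their `gluster peer status` output. For the op-version comparison, `gluster volume get all cluster.max-op-version` can be run on each server and the values compared by hand.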
Thanks for any suggestion!
Riccardo
Riccardo Murri
2019-Mar-27  09:53 UTC
[Gluster-users] cannot add server back to cluster after reinstallation
I managed to put the reinstalled server back into connected state with this procedure:
1. Run `for other_server in ...; do gluster peer probe $other_server; done` on the reinstalled server
2. Now all the peers on the reinstalled server show up as "Accepted Peer Request", which I fixed with the procedure outlined in the last paragraph of https://docs.gluster.org/en/v3/Troubleshooting/troubleshooting-glusterd/#debugging-glusterd
Can anyone confirm that this is a good way to proceed and I won't be heading quickly towards corrupting volume data?
Thanks,
Riccardo
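For reference, the probe loop from step 1 could be wrapped in a small function with a dry-run switch, so the commands can be reviewed before anything touches the cluster. This is a sketch under assumptions: the function name `probe_peers` and the `DRY_RUN` variable are invented for this example, and the peer names must be supplied as arguments because the post does not list them. Only `gluster peer probe` and `gluster peer status` are actual GlusterFS commands here.

```shell
#!/bin/sh
# Sketch only: re-probe the rest of the cluster from the reinstalled server.
# probe_peers and DRY_RUN are hypothetical names used for illustration.
# DRY_RUN defaults to 1, so commands are only printed unless DRY_RUN=0 is set.

probe_peers() {
    for other_server in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "gluster peer probe $other_server"
        else
            gluster peer probe "$other_server"
        fi
    done
    # Show the resulting peer states (e.g. "Accepted Peer Request").
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "gluster peer status"
    else
        gluster peer status
    fi
}
```

Calling `probe_peers glusterfs-server-004` prints the commands that would run; `DRY_RUN=0 probe_peers glusterfs-server-004` executes them. This covers step 1 only; the fix for "Accepted Peer Request" in step 2 is the one described in the linked documentation and is not reproduced here.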