I always used IP addresses, not names, when I added a peer. The output of
gluster peer status does show IPs:
[root@DC-MTL-NAS-01 ~]# gluster peer status
Number of Peers: 2
Hostname: XXX.XXX.XXX.12
Uuid: ec1e10c1-0e38-4d2a-ab51-50fb0c67b6ee
State: Peer in Cluster (Connected)
Hostname: XXX.XXX.XXX.13
Uuid: eef75e55-170a-4621-9d6e-3b5c3a6e5561
State: Accepted peer request (Disconnected)
I can ping those IPs from any server.
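
Pinging an IP address does not exercise the resolver, whereas the glusterd
errors in this thread come from getaddrinfo. A minimal sketch of a check that
is closer to that code path, assuming Python 3 is available on the node (the
function name check_resolution is made up for illustration and is not part of
Gluster):

```python
import socket
import sys


def check_resolution(names):
    """Map each name to its resolved addresses, or to the resolver error."""
    results = {}
    for name in names:
        try:
            # Same libc call family that glusterd reports as failing.
            infos = socket.getaddrinfo(name, None)
            results[name] = sorted({info[4][0] for info in infos})
        except socket.gaierror as exc:
            results[name] = f"FAILED: {exc}"
    return results


if __name__ == "__main__":
    # Names to test are passed on the command line; none are assumed here.
    for name, outcome in check_resolution(sys.argv[1:]).items():
        print(name, "->", outcome)
```

Run it on each node with the exact names that appear in the logs and in
gluster peer status, for example: python3 check.py server1.domain.lan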
From the Server 3 Gluster logs, I can see this:
[2017-10-24 12:31:33.012446] I [MSGID: 100030] [glusterfsd.c:2503:main]
0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.10.6
(args: /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO)
[2017-10-24 12:31:33.020739] I [MSGID: 106478] [glusterd.c:1449:init]
0-management: Maximum allowed open file descriptors set to 65536
[2017-10-24 12:31:33.020796] I [MSGID: 106479] [glusterd.c:1496:init]
0-management: Using /var/lib/glusterd as working directory
[2017-10-24 12:31:33.029673] E [rpc-transport.c:283:rpc_transport_load]
0-rpc-transport: /usr/lib64/glusterfs/3.10.6/rpc-transport/rdma.so: cannot
open shared object file: No such file or directory
[2017-10-24 12:31:33.029702] W [rpc-transport.c:287:rpc_transport_load]
0-rpc-transport: volume 'rdma.management': transport-type 'rdma' is not
valid or not found on this machine
[2017-10-24 12:31:33.029715] W [rpcsvc.c:1661:rpcsvc_create_listener]
0-rpc-service: cannot create listener, initing the transport failed
[2017-10-24 12:31:33.029731] E [MSGID: 106243] [glusterd.c:1720:init]
0-management: creation of 1 listeners failed, continuing with succeeded
transport
[2017-10-24 12:31:33.032226] I [MSGID: 106228]
[glusterd.c:500:glusterd_check_gsync_present] 0-glusterd: geo-replication
module not installed in the system [No such file or directory]
[2017-10-24 12:31:33.032816] I [MSGID: 106513]
[glusterd-store.c:2201:glusterd_restore_op_version] 0-glusterd: retrieved
op-version: 31000
[2017-10-24 12:31:33.042393] I [MSGID: 106498]
[glusterd-handler.c:3669:glusterd_friend_add_from_peerinfo] 0-management:
connect returned 0
[2017-10-24 12:31:33.042474] W [MSGID: 106062]
[glusterd-handler.c:3466:glusterd_transport_inet_options_build] 0-glusterd:
Failed to get tcp-user-timeout
[2017-10-24 12:31:33.042501] I [rpc-clnt.c:1059:rpc_clnt_connection_init]
0-management: setting frame-timeout to 600
[2017-10-24 12:31:33.082295] E [MSGID: 101075]
[common-utils.c:307:gf_resolve_ip6] 0-resolver: getaddrinfo failed (Name or
service not known)
[2017-10-24 12:31:33.082331] E
[name.c:262:af_inet_client_get_remote_sockaddr] 0-management: DNS
resolution failed on host dc-mtl-nas-01.elemenai.lan
[2017-10-24 12:31:33.082563] I [MSGID: 106544]
[glusterd.c:158:glusterd_uuid_init] 0-management: retrieved UUID:
eef75e55-170a-4621-9d6e-3b5c3a6e5561
[2017-10-24 12:31:33.082589] I [MSGID: 106004]
[glusterd-handler.c:5888:__glusterd_peer_rpc_notify] 0-management: Peer
<server1.domain.lan> (<3e190322-78f1-4ef6-80f7-8f48d51c2263>), in state
<Accepted peer request>, has disconnected from glusterd.
[2017-10-24 12:31:33.117581] E [MSGID: 106187]
[glusterd-store.c:4566:glusterd_resolve_all_bricks] 0-glusterd: resolve
brick failed in restore
[2017-10-24 12:31:33.117658] E [MSGID: 101019] [xlator.c:503:xlator_init]
0-management: Initialization of volume 'management' failed, review your
volfile again
[2017-10-24 12:31:33.117678] E [MSGID: 101066]
[graph.c:325:glusterfs_graph_init] 0-management: initializing translator
failed
[2017-10-24 12:31:33.117696] E [MSGID: 101176]
[graph.c:681:glusterfs_graph_activate] 0-graph: init failed
[2017-10-24 12:31:33.118208] W [glusterfsd.c:1360:cleanup_and_exit]
(-->/usr/sbin/glusterd(glusterfs_volumes_init+0xfd) [0x7f1a34ba1bcd]
-->/usr/sbin/glusterd(glusterfs_process_volfp+0x1b1) [0x7f1a34ba1a71]
-->/usr/sbin/glusterd(cleanup_and_exit+0x6b) [0x7f1a34ba0f5b] ) 0-:
received signum (1), shutting down
server1.domain.lan is the FQDN of server 1 (not its IP address).
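
To see exactly which names glusterd is trying to resolve, the failing host
names can be pulled out of the log text. This is only a sketch: the helper
name failed_hosts is hypothetical, and the pattern is based solely on the
"DNS resolution failed on host" message shown above.

```python
import re
import sys
from collections import Counter

# Tolerates the message being wrapped across lines, as in a pasted email.
PATTERN = re.compile(r"DNS\s+resolution\s+failed\s+on\s+host\s+(\S+)")


def failed_hosts(log_text):
    """Count how often each host name appears in a DNS failure entry."""
    return Counter(PATTERN.findall(log_text))


if __name__ == "__main__":
    # The log file path is passed as an argument; no default path is assumed.
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as handle:
            for host, count in failed_hosts(handle.read()).most_common():
                print(f"{count:6d}  {host}")
```

Comparing the exact spelling of each reported name with the names that do
answer to ping can show whether glusterd stored a different name than
expected.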
Ludwig
On Tue, Oct 24, 2017 at 2:16 AM, Bartosz Zięba <kontakt at avatat.pl> wrote:
> Are you sure that every node can resolve the names of all the other
> nodes?
> You need to use the names already used in Gluster. Check them with "gluster
> peer status" or "gluster pool list".
>
> Regards,
> Bartosz
>
>
> On 24.10.2017, at 03:13, Ludwig Gamache <ludwig at elementai.com> wrote:
>
> All,
>
> I am trying to add a third peer to my gluster install. The first 2 nodes
> have been running for many months on gluster 3.10.3-1.
>
> I recently installed the 3rd node with gluster 3.10.6-1. I was able to
> start the gluster daemon on it. Afterwards, I tried to add the peer from
> one of the 2 existing servers (gluster peer probe IPADDRESS).
>
> That first peer started communicating with the 3rd peer. At that
> point, the peer status became inconsistent. Server 1 saw both other servers
> as connected. Server 2 only saw server 1 as connected and did not have
> server 3 as a peer. Server 3 only had server 1 as a peer and saw it as
> disconnected.
>
> I also found errors in the gluster logs of server 3:
>
> [2017-10-24 00:15:20.090462] E [name.c:262:af_inet_client_get_remote_sockaddr]
> 0-management: DNS resolution failed on host HOST3.DOMAIN.lan
>
> I rebooted node 3 and now gluster does not even start on that node. It
> keeps reporting name resolution problems. The 2 other nodes are active.
>
> However, I can ping all 3 servers from one another using their DNS
> names.
>
> Any idea about what to look at?
--
Ludwig Gamache
IT Director - Element AI
4200 St-Laurent, suite 1200
514-704-0564