Guido De Rosa
2013-Apr-16 14:51 UTC
[Gluster-users] Glusterd refuses to start if a peer doesn't DNS-resolve
Hi,

I'm using gluster 3.4.0~qa9realyalpha2-1 (Debian experimental package): 1 volume, 2 hosts, "replica 2".

If I use names instead of IP addresses and a peer name does not resolve, glusterd refuses to start and exits. By contrast, if DNS resolution works but the peer is down, or if I use IP addresses (and the peer is down), glusterd starts normally!

Is this difference in behavior normal? Is it a bug? Shouldn't a failed DNS resolution (for a peer in a replica set) be treated just like the event of that peer being down?

Here's an excerpt from the glusterd --debug output (gluster1 is the name of the remote peer):

[2013-04-16 14:46:30.561798] E [glusterd-utils.c:361:glusterd_unlock] 0-management: Cluster lock not held!
[2013-04-16 14:46:30.561806] D [glusterd-handler.c:2360:glusterd_rpc_create] 0-management: returning 0
[2013-04-16 14:46:30.561815] D [glusterd-store.c:3078:glusterd_store_retrieve_peers] 0-: Returning with 0
[2013-04-16 14:46:30.641643] E [glusterd-utils.c:4725:glusterd_friend_find_by_hostname] 0-management: error in getaddrinfo: Name or service not known
[2013-04-16 14:46:30.641674] D [glusterd-utils.c:4764:glusterd_friend_find_by_hostname] 0-management: Unable to find friend: gluster1
[2013-04-16 14:46:30.721970] E [glusterd-utils.c:283:glusterd_is_local_addr] 0-management: error in getaddrinfo: Name or service not known
[2013-04-16 14:46:30.722008] D [glusterd-utils.c:302:glusterd_is_local_addr] 0-management: gluster1 is not local
[2013-04-16 14:46:30.722016] D [glusterd-utils.c:4799:glusterd_hostname_to_uuid] 0-management: returning -1
[2013-04-16 14:46:30.722023] D [glusterd-utils.c:792:glusterd_resolve_brick] 0-management: Returning -1
[2013-04-16 14:46:30.722030] E [glusterd-store.c:3101:glusterd_resolve_all_bricks] 0-glusterd: resolve brick failed in restore
[2013-04-16 14:46:30.722038] D [glusterd-store.c:3108:glusterd_resolve_all_bricks] 0-: Returning with -1
[2013-04-16 14:46:30.722044] D [glusterd-store.c:3141:glusterd_restore] 0-: Returning -1
[2013-04-16 14:46:30.722058] E [xlator.c:408:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again
[2013-04-16 14:46:30.722068] E [graph.c:292:glusterfs_graph_init] 0-management: initializing translator failed
[2013-04-16 14:46:30.722076] E [graph.c:479:glusterfs_graph_activate] 0-graph: init failed
[2013-04-16 14:46:30.722261] W [glusterfsd.c:970:cleanup_and_exit] (-->glusterd(main+0x3df) [0x7f233c323f6f] (-->glusterd(glusterfs_volumes_init+0xb0) [0x7f233c326c30] (-->glusterd(glusterfs_process_volfp+0x103) [0x7f233c326b43]))) 0-: received signum (0), shutting down
[2013-04-16 14:46:30.722282] D [glusterfsd-mgmt.c:2254:glusterfs_mgmt_pmap_signout] 0-fsd-mgmt: portmapper signout arguments not given

Thanks,
Guido
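
The "error in getaddrinfo: Name or service not known" lines in the log above can be reproduced outside glusterd. What follows is only a minimal sketch, not part of gluster: the helper name resolves() is made up for illustration, and it simply asks the system resolver whether a peer name can be looked up before glusterd is started.

```python
import socket


def resolves(hostname: str) -> bool:
    """Return True if the system resolver can look up `hostname`.

    Hypothetical helper, not a gluster API. It relies on the same
    getaddrinfo mechanism that appears in the glusterd log.
    """
    try:
        socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        # Raised for failures such as "Name or service not known".
        return False
    return True


if __name__ == "__main__":
    # "gluster1" is the remote peer name used in this thread.
    for name in ("gluster1", "127.0.0.1"):
        state = "resolves" if resolves(name) else "does NOT resolve"
        print(f"{name}: {state}")
```

A numeric address such as 127.0.0.1 needs no DNS lookup, so it always passes this check, which fits the observation that glusterd starts normally when IP addresses are used.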
Guido De Rosa
2013-Apr-16 16:43 UTC
[Gluster-users] Glusterd refuses to start if a peer doesn't DNS-resolve
Well, apparently this is the result of (or is exposed by) a subtle misconfiguration.

Assume gluster1 is the name of the remote peer, and 192.168.98.11 is its IP address. I did:

gluster peer probe 192.168.98.11

but then:

gluster volume create gv0 replica 2 [...] gluster1:/export/brick1

So I used the IP address to add the peer, but the host name to add a brick to the volume.

If I instead consistently use the name for both the peer and the volume brick definition, glusterd starts normally even if DNS resolution fails (in which case the remote peer is correctly treated as a disconnected node). Of course, this assumes that at least the local peer resolves correctly, or that an IP address is used for it.

Thanks,
Guido
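
The mismatch described above (peer probed by IP address, brick defined by host name) can be spotted with a simple textual comparison. This is a rough sketch under stated assumptions, not a gluster tool: the function names brick_host() and mismatched_bricks() are invented here, the peer identifiers and brick specifications are passed in by hand, and nothing is read from glusterd's own state.

```python
from typing import Iterable, List


def brick_host(brick: str) -> str:
    """Return the host part of a 'host:/path' brick specification."""
    host, sep, _path = brick.partition(":/")
    if not sep or not host:
        raise ValueError(f"not a host:/path brick specification: {brick!r}")
    return host


def mismatched_bricks(peers: Iterable[str], bricks: Iterable[str]) -> List[str]:
    """List bricks whose host string was not used verbatim for a peer.

    Hypothetical check: the comparison is purely textual, so an IP
    address and a host name for the same machine count as a mismatch,
    which is exactly the situation described in this message.
    """
    known = set(peers)
    return [b for b in bricks if brick_host(b) not in known]


if __name__ == "__main__":
    # Peer probed by IP address, brick defined by name: flagged.
    print(mismatched_bricks(["192.168.98.11"], ["gluster1:/export/brick1"]))
    # Name used for both: nothing flagged.
    print(mismatched_bricks(["gluster1"], ["gluster1:/export/brick1"]))
```

For a real check the local host's own identifier would also have to be included in the peers list, since bricks on the local machine are not probed as peers.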