Erik Jacobson
2021-Sep-20 15:59 UTC
[Gluster-users] gluster update question regarding new DNS resolution requirement
Hello all! I hope you are well.

We are starting a new software release cycle and I am trying to find a way to upgrade customers from our build of gluster 7.9 to our build of gluster 9.3.

When we deploy gluster, we forcibly remove all references to any host names and use only IP addresses. This is because, if a DNS server is unreachable for any reason, glusterd is unable to reach peers properly, even if the peer files have both IPs and DNS names. We can't really rely on /etc/hosts either, because customers take artistic license with their /etc/hosts files and don't realize the problems that can cause.

So our deployed peer files look something like this:

uuid=46a4b506-029d-4750-acfb-894501a88977
state=3
hostname1=172.23.0.16

That is, with full intention, we avoid host names.

When we upgrade to gluster 9.3, we fall over with these errors. Gluster is now partitioned and the updated gluster servers can't reach anybody:

[2021-09-20 15:50:41.731543 +0000] E [name.c:265:af_inet_client_get_remote_sockaddr] 0-management: DNS resolution failed on host 172.23.0.16

As you can see, we have purposely defined everything using IPs, but in 9.3 it appears this method fails. Are there any suggestions short of putting real host names in peer files?

FYI:

This supercomputer will be using gluster for part of its system management. It is how we deploy the Image Objects (squashfs images), hosted on NFS today and served by gluster leader nodes, and how we store system logs, console logs, and other data.

https://www.olcf.ornl.gov/frontier/

Erik
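For readers who want to confirm that a set of peer files really contains only IP literals, as described above, here is a minimal Python sketch. It assumes only the key=value layout shown in the message; the function name non_ip_peer_hosts and the sample host name in the usage note are hypothetical, and nothing here is part of gluster itself.

```python
import ipaddress


def non_ip_peer_hosts(peer_text):
    """Return the hostnameN values in a peer file that are not IP literals.

    Assumes the simple key=value layout shown in the message above.
    """
    bad = []
    for line in peer_text.splitlines():
        key, sep, value = line.strip().partition("=")
        # Skip anything that is not a hostnameN=... entry.
        if not sep or not key.startswith("hostname"):
            continue
        try:
            ipaddress.ip_address(value)  # accepts IPv4 and IPv6 literals
        except ValueError:
            bad.append(value)
    return bad


# The peer file quoted in the message.
sample = """uuid=46a4b506-029d-4750-acfb-894501a88977
state=3
hostname1=172.23.0.16
"""

print(non_ip_peer_hosts(sample))  # prints []
```

An entry such as hostname1=leader1.example (a made-up name) would be reported in the returned list, which makes the check usable as a pre-upgrade sanity test.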
Erik Jacobson
2021-Sep-20 16:35 UTC
[Gluster-users] gluster update question regarding new DNS resolution requirement
I missed the other important log snippet:

The message "E [MSGID: 101075] [common-utils.c:520:gf_resolve_ip6] 0-resolver: error in getaddrinfo [{family=10}, {ret=Address family for hostname not supported}]" repeated 620 times between [2021-09-20 15:49:23.720633 +0000] and [2021-09-20 15:50:41.731542 +0000]

So I will dig into the code some here.

On Mon, Sep 20, 2021 at 10:59:30AM -0500, Erik Jacobson wrote:
> Hello all! I hope you are well.
>
> We are starting a new software release cycle and I am trying to find a
> way to upgrade customers from our build of gluster 7.9 to our build of
> gluster 9.3
>
> When we deploy gluster, we forcibly remove all references to any host
> names and use only IP addresses. This is because, if for any reason a
> DNS server is unreachable, even if the peer files have IPs and DNS, it
> causes glusterd to be unable to reach peers properly. We can't really
> rely on /etc/hosts either because customers take artistic license with
> their /etc/hosts files and don't realize the problems that can cause.
>
> So our deployed peer files look something like this:
>
> uuid=46a4b506-029d-4750-acfb-894501a88977
> state=3
> hostname1=172.23.0.16
>
> That is, with full intention, we avoid host names.
>
> When we upgrade to gluster 9.3, we fall over with these errors and
> gluster is now partitioned and the updated gluster servers can't reach
> anybody:
>
> [2021-09-20 15:50:41.731543 +0000] E [name.c:265:af_inet_client_get_remote_sockaddr] 0-management: DNS resolution failed on host 172.23.0.16
>
> As you can see, we have defined on purpose everything using IPs but in
> 9.3 it appears this method fails. Are there any suggestions short of
> putting real host names in peer files?
>
> FYI
>
> This supercomputer will be using gluster for part of its system
> management. It is how we deploy the Image Objects (squashfs images)
> hosted on NFS today and served by gluster leader nodes and also store
> system logs, console logs, and other data.
>
> https://www.olcf.ornl.gov/frontier/
>
> Erik
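The getaddrinfo error in the follow-up reports family=10, which is AF_INET6 on Linux, and "Address family for hostname not supported" is the glibc text for EAI_ADDRFAMILY. One plausible reading, not confirmed anywhere in this thread, is that the resolver was asked for an IPv6 result while the peer entry is an IPv4 literal. The Python sketch below only illustrates that general getaddrinfo behaviour; it is not gluster code, and numeric_lookup is a hypothetical helper name.

```python
import socket


def numeric_lookup(host, family):
    """Look up a numeric address for one address family, without using DNS.

    Returns (address, None) on success or (None, error) on failure.
    """
    try:
        infos = socket.getaddrinfo(
            host, None, family, socket.SOCK_STREAM, 0,
            socket.AI_NUMERICHOST,  # never consult a DNS server
        )
    except socket.gaierror as exc:
        return None, exc
    # Each entry is (family, type, proto, canonname, sockaddr).
    return infos[0][4][0], None


peer = "172.23.0.16"  # the peer address from the log lines above

for name, family in (("AF_INET", socket.AF_INET), ("AF_INET6", socket.AF_INET6)):
    address, error = numeric_lookup(peer, family)
    if error is None:
        print(f"{name}: ok, {address}")
    else:
        # The exact error code and text depend on the platform's C library.
        print(f"{name}: failed, {error}")
```

Requesting AF_INET for the same literal succeeds while AF_INET6 fails, so if the 9.3 build really does force the IPv6 family, that alone could account for a "DNS resolution failed" message on a plain IPv4 address. That is a hypothesis to check against the source, not a diagnosis.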