Claudio Kuenzler
2015-Mar-03 14:53 UTC
[Gluster-users] /etc/hosts entry requires for gluster servers?
Probably a stupid workaround, but what if you just add a "sleep n" line to the init script? GlusterFS works fine when it is launched manually, did I understand that right? It's only the automatic startup at boot that causes the lookup failure?

On Tue, Mar 3, 2015 at 2:54 PM, ML mail <mlnospam at yahoo.com> wrote:
> Thanks for the tip, but Debian wheezy does not use systemd at all; it still uses old SysV-style init scripts.
>
> On Tuesday, March 3, 2015 2:43 PM, Jeremy Young <jrm16020 at gmail.com> wrote:
>> I know that you mention the script in init.d, but I want to ask to make sure that you're not using systemd. Are you still using SysVinit with wheezy? I've had this exact problem with RHEL 7, and creating the file /etc/systemd/system/glusterd.service.d/restart.conf, containing the information below, resolved my issue.
>>
>> [Service]
>> Restart=on-failure
>> StartLimitInterval=5s
>> StartLimitBurst=5
>>
>> On Tue, Mar 3, 2015 at 7:04 AM, Claudio Kuenzler <ck at claudiokuenzler.com> wrote:
>>> So that would sound like the boot order is messed up. Sounds a bit like this:
>>> http://joejulian.name/blog/glusterfs-volumes-not-mounting-in-debian-squeeze-at-boot-time/
>>>
>>> As I said before, I also run glusterfs 3.5.2 on Debian Wheezy and I don't have issues after a reboot. But unfortunately I don't remember whether I had to manually adapt something (like the boot order of the init scripts).
>>>
>>> You can try to make your gluster scripts run at the very end of the boot process and see if that helps.
>>>
>>> On Tue, Mar 3, 2015 at 1:56 PM, ML mail <mlnospam at yahoo.com> wrote:
>>>> Yes, dig and ping work fine. I first used the short hostname gfs1 and then I also tried gfs1.intra.domain.com. That did not change anything.
>>>>
>>>> Currently, for testing, I only have a single-node setup, so my "gluster peer status" output would be empty.
>>>>
>>>> Now I had a close look at the brick logfile and it looks like my network is not ready at the time glusterfsd kicks in. I can see the following entry:
>>>>
>>>> [2015-03-03 13:45:48.647981] E [name.c:249:af_inet_client_get_remote_sockaddr] 0-glusterfs: DNS resolution failed on host gfs1.intra.domain.com
>>>>
>>>> happening at exactly 13:45:48, but on my Cisco switch I can see the following log entry:
>>>>
>>>> Mar 3 13:45:51: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/9, changed state to up
>>>>
>>>> So as you see, my switch's port is only ready 3 seconds later, at 13:45:51, which is too late for glusterfsd. Strangely enough, I use "spanning-tree portfast" on all my ports, and as such the switch ports are ready as soon as possible. I have the feeling that there is an issue with the /etc/init.d/glusterfs-server script on Debian wheezy from the glusterfs-server package. Can anyone confirm and maybe fix this for the future?
>>>>
>>>> On Tuesday, March 3, 2015 12:57 PM, Claudio Kuenzler <ck at claudiokuenzler.com> wrote:
>>>>> Can you resolve the other gluster peers with "dig"?
>>>>> Are you able to "ping" the other peers, too?
>>>>>
>>>>> On Tue, Mar 3, 2015 at 12:38 PM, ML mail <mlnospam at yahoo.com> wrote:
>>>>>> Well, the weird thing is that my DNS resolver servers are configured correctly and working fine. Here below is the exact error message from the brick log file:
>>>>>>
>>>>>> [2015-03-03 11:34:21.731639] E [common-utils.c:223:gf_resolve_ip6] 0-resolver: getaddrinfo failed (Name or service not known)
>>>>>> [2015-03-03 11:34:21.731654] E [name.c:249:af_inet_client_get_remote_sockaddr] 0-glusterfs: DNS resolution failed on host gfs1.intra.domain.com
>>>>>> [2015-03-03 11:34:21.731707] E [glusterfsd-mgmt.c:1601:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: gfs1.intra.domain.com (Success)
>>>>>> [2015-03-03 11:34:21.731724] I [glusterfsd-mgmt.c:1607:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
>>>>>>
>>>>>> I checked my /etc/hosts file and I have the following entry:
>>>>>>
>>>>>> 127.0.1.1 gfs1.intra.domain.com gfs1a
>>>>>>
>>>>>> This is Debian's default; I did not touch the hosts file. I also tried removing this 127.0.1.1 entry, but nothing changed.
>>>>>>
>>>>>> On Tuesday, March 3, 2015 10:33 AM, Claudio Kuenzler <ck at claudiokuenzler.com> wrote:
>>>>>>> Hi ML,
>>>>>>>
>>>>>>> Here's what I noted down in my personal documentation when I installed GlusterFS for the first time in 2013 (also on Debian Wheezy with 3.5.2):
>>>>>>>
>>>>>>> "All cluster nodes MUST resolve each other through DNS (preferred) or /etc/hosts."
>>>>>>>
>>>>>>> An entry in /etc/hosts is probably even safer because you don't depend on external DNS resolvers.
>>>>>>>
>>>>>>> cheers, ck
>>>>>>>
>>>>>>> On Tue, Mar 3, 2015 at 8:43 AM, ML mail <mlnospam at yahoo.com> wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> Is it required to have the GlusterFS servers in /etc/hosts on the gluster servers themselves? I have read many tutorials where people always add an entry to their /etc/hosts file.
>>>>>>>>
>>>>>>>> I am asking because my issue is that my volumes, or more precisely glusterfsd, are not starting at system boot. The brick log shows that the hostname of the server could not be resolved, but I have an entry in my DNS server and my /etc/resolv.conf is configured correctly.
>>>>>>>>
>>>>>>>> I am using Debian Wheezy with GlusterFS 3.5.2.
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>> ML
>>>>>>>> _______________________________________________
>>>>>>>> Gluster-users mailing list
>>>>>>>> Gluster-users at gluster.org
>>>>>>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>
>> --
>> Jeremy Young, M.S., RHCSA
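The boot-order question raised in the quoted thread can be checked against the init script's LSB header. The following is only a minimal sketch, assuming the script carries a standard "### BEGIN INIT INFO" block. The helper names required_start and declares_facility are hypothetical, and the thread does not show what the packaged /etc/init.d/glusterfs-server header actually declares.

```shell
#!/bin/sh
# required_start FILE
# Print the facilities listed on the first "# Required-Start:" line
# of an LSB-style init script header (hypothetical helper).
required_start() {
    sed -n 's/^# Required-Start:[[:space:]]*//p' "$1" | head -n 1
}

# declares_facility FILE FACILITY
# Return 0 if FACILITY (for example '$network' or '$named') appears
# in the Required-Start line of FILE, 1 otherwise (hypothetical helper).
declares_facility() {
    for facility in $(required_start "$1"); do
        if [ "$facility" = "$2" ]; then
            return 0
        fi
    done
    return 1
}

# Possible use (path taken from the thread, result not known here):
#   declares_facility /etc/init.d/glusterfs-server '$named' \
#       || echo "init script does not wait for name resolution" >&2
```

Listing a facility such as $network or $named only orders the init scripts relative to each other. As the switch log in the thread suggests, the link itself may still come up a few seconds later, so this check may narrow the problem down without necessarily resolving it.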
ML mail
2015-Mar-03 15:12 UTC
[Gluster-users] /etc/hosts entry requires for gluster servers?
Yes, so I added a "sleep 5" to the init script right before the startup of the daemon, here:

start-stop-daemon --start --quiet --oknodo --pidfile $PIDFILE --startas $DAEMON -- -p $PIDFILE $GLUSTERD_OPTS

and yes, it works. So this is a workaround, but IMHO this should be fixed in the Debian package. Anyone from Gluster reading this? I've cc'ed the init script's authors just in case.

On Tuesday, March 3, 2015 3:53 PM, Claudio Kuenzler <ck at claudiokuenzler.com> wrote:
> Probably a stupid workaround but what if you just add a "sleep n" line into the init script? Glusterfs is working fine if it was launched manually, did I understand that right? It's only the automatic startup at boot which causes the lookup failure?
>
> On Tue, Mar 3, 2015 at 2:54 PM, ML mail <mlnospam at yahoo.com> wrote:
>> Thanks for the tip but Debian wheezy does not use systemd at all, it's still old sysV style init scripts.
>>
>> On Tuesday, March 3, 2015 2:43 PM, Jeremy Young <jrm16020 at gmail.com> wrote:
>>> I know that you mention the script in init.d but want to ask to make sure that you're not using systemd. Are you still using SysVinit with wheezy? I've had this exact problem with RHEL 7, and creating the file /etc/systemd/system/glusterd.service.d/restart.conf, containing the information below, resolved my issue.
>>>
>>> [Service]
>>> Restart=on-failure
>>> StartLimitInterval=5s
>>> StartLimitBurst=5
>>>
>>> On Tue, Mar 3, 2015 at 7:04 AM, Claudio Kuenzler <ck at claudiokuenzler.com> wrote:
>>>> So that would sound like the boot order is messed up. Sounds a bit like this:
>>>> http://joejulian.name/blog/glusterfs-volumes-not-mounting-in-debian-squeeze-at-boot-time/
>>>>
>>>> As I said before, I also run glusterfs 3.5.2 on Debian Wheezy and I don't have issues after a reboot. But unfortunately I don't remember if I had to manually adapt something (like the boot order of the init scripts).
>>>>
>>>> You can try and make your gluster scripts run at the very end of the boot process and see if that helps.
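A fixed "sleep 5" works here because the switch port came up about 3 seconds late, but a bounded polling loop may be more robust than a hard-coded delay. The following is only a sketch, not the Debian package's actual fix. The function name wait_until is hypothetical, and using "getent hosts" as the readiness check is an assumption that the thread does not confirm.

```shell
#!/bin/sh
# wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Run COMMAND once per second until it succeeds or TIMEOUT_SECONDS
# have elapsed. Returns 0 on success, 1 on timeout (hypothetical helper).
wait_until() {
    timeout=$1
    shift
    elapsed=0
    until "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Possible use in the init script in place of the fixed sleep, placed
# right before the start-stop-daemon line (hostname from the thread):
#   wait_until 30 getent hosts gfs1.intra.domain.com \
#       || echo "name resolution still failing, starting anyway" >&2
```

The loop returns as soon as the check passes, so a fast boot is not delayed by the full timeout, while a slow link gets more than 5 seconds if it needs them. Whether name resolution through the system resolver is the right condition to wait for depends on how glusterfsd performs its lookup, so the check command is kept as a parameter.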