Stefan Becker
2011-Feb-23 13:31 UTC
[Gluster-users] make gluster run on internal interfaces only
Hi together,

I still have not managed to get gluster running on internal interfaces only. With the public interface I had it running in about 30 minutes and it worked like a charm. Then I changed the bind-address for glusterd, and now I just cannot probe the other server. I did a Debian dist-upgrade today and removed all gluster packages (I wanted to start from scratch). Then I reinstalled a brand-new .deb downloaded today from gluster.org.

Here is my setup:

+++ Node 1 +++

s1-new:~# cat /etc/hostname
s1-new
s1-new:~#
s1-new:~# cat /etc/hosts
127.0.0.1      localhost
10.10.100.31   s1-new
10.10.100.223  s2-new

# The following lines are desirable for IPv6 capable hosts
::1     localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
s1-new:~#
s1-new:~# cat /etc/glusterfs/glusterd.vol
volume management
    type mgmt/glusterd
    option working-directory /etc/glusterd
    option transport-type tcp
    option transport.socket.bind-address 10.10.100.31
    option transport.socket.keepalive-time 10
    option transport.socket.keepalive-interval 2
end-volume

s1-new:~#
s1-new:~# netstat -lpn | grep gluster
tcp        0      0 10.10.100.31:24007      0.0.0.0:*      LISTEN      2336/glusterd

+++ Node 2 +++

s2-new:~# cat /etc/hostname
s2-new
s2-new:~#
s2-new:~# cat /etc/hosts
127.0.0.1      localhost
10.10.100.31   s1-new
10.10.100.223  s2-new

# The following lines are desirable for IPv6 capable hosts
::1     localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
s2-new:~#
s2-new:~# cat /etc/glusterfs/glusterd.vol
volume management
    type mgmt/glusterd
    option working-directory /etc/glusterd
    option transport-type tcp
    option transport.socket.bind-address 10.10.100.223
    option transport.socket.keepalive-time 10
    option transport.socket.keepalive-interval 2
end-volume

s2-new:~#
s2-new:~# netstat -lpn | grep gluster
tcp        0      0 10.10.100.223:24007     0.0.0.0:*      LISTEN      1850/glusterd

Both nodes/daemons are reachable from each other:

s1-new:~# ping s2-new
PING s2-new (10.10.100.223) 56(84) bytes of data.
64 bytes from s2-new (10.10.100.223): icmp_req=1 ttl=64 time=0.099 ms
64 bytes from s2-new (10.10.100.223): icmp_req=2 ttl=64 time=0.093 ms
^C
--- s2-new ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.093/0.096/0.099/0.003 ms
s1-new:~#
s1-new:~# telnet s2-new 24007
Trying 10.10.100.223...
Connected to s2-new.
Escape character is '^]'.
Connection closed by foreign host.

s2-new:~# ping s1-new
PING s1-new (10.10.100.31) 56(84) bytes of data.
64 bytes from s1-new (10.10.100.31): icmp_req=1 ttl=64 time=0.096 ms
64 bytes from s1-new (10.10.100.31): icmp_req=2 ttl=64 time=0.092 ms
s2-new:~#
s2-new:~# telnet s1-new 24007
Trying 10.10.100.31...
Connected to s1-new.
Escape character is '^]'.
Connection closed by foreign host.

But probing just does not work:

s1-new:~# gluster peer probe s2-new
Connection failed. Please check if gluster daemon is operational.
s1-new:~#

And vice versa:

s2-new:~# gluster peer probe s1-new
Connection failed. Please check if gluster daemon is operational.
s2-new:~#

Next I straced those calls to see what they do:

s1-new:~# cat strace.out | grep 'connect('
connect(4, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(4, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(4, {sa_family=AF_INET, sin_port=htons(24007), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(4, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(4, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(4, {sa_family=AF_INET, sin_port=htons(24007), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
s1-new:~#

So I think it is not working because the probe process is first trying to connect to the local glusterd, which is not listening on 127.0.0.1?

I have no idea, please help me. My two servers are completely bare; if you need more information, just ask. Thank you!

Bye,
Stefan
Raghavendra G
2011-Feb-25 03:51 UTC
[Gluster-users] make gluster run on internal interfaces only
----- Original Message -----
> From: "Stefan Becker" <sbecker at rapidsoft.de>
> To: gluster-users at gluster.org
> Sent: Wednesday, February 23, 2011 5:31:16 PM
> Subject: [Gluster-users] make gluster run on internal interfaces only
>
> Hi together,
>
> I still have not managed to get gluster running on internal interfaces
> only. With the public interface I had it running in about 30 minutes and
> it worked like a charm. Then I changed the bind-address for glusterd,
> and now I just cannot probe the other server.
>
> [...]
>
> But probing just does not work:
>
> s1-new:~# gluster peer probe s2-new
> Connection failed. Please check if gluster daemon is operational.
> s1-new:~#
>
> And vice versa:
>
> s2-new:~# gluster peer probe s1-new
> Connection failed. Please check if gluster daemon is operational.
> s2-new:~#
>
> Next I straced those calls to see what they do:
>
> s1-new:~# cat strace.out | grep 'connect('
> connect(4, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
> connect(4, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
> connect(4, {sa_family=AF_INET, sin_port=htons(24007), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
> connect(4, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
> connect(4, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
> connect(4, {sa_family=AF_INET, sin_port=htons(24007), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)

Since glusterfs uses non-blocking sockets, there is nothing wrong with the error value EINPROGRESS. The connection gets established some time later, and we handle it using a poll/epoll interface.

Is there any firewall running on s1-new or s2-new?

> s1-new:~#
>
> So I think it is not working because the probe process is first trying
> to connect to the local glusterd, which is not listening on 127.0.0.1?
>
> I have no idea, please help me. My two servers are completely bare; if
> you need more information, just ask. Thank you!
>
> Bye,
> Stefan

regards,
Raghavendra
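For reference, here is a minimal standalone sketch of that pattern. It is not GlusterFS source; the peer address 10.10.100.223 and port 24007 are borrowed from the setup above purely for illustration. A non-blocking connect() returns EINPROGRESS while the TCP handshake is still in flight, and the real outcome is read back once poll() reports the socket writable.

/* nbconnect.c -- minimal sketch (not GlusterFS source) of a non-blocking
 * connect followed by poll(), matching the pattern visible in the strace. */
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) { perror("socket"); return 1; }

    /* Switch the socket to non-blocking mode. */
    fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(24007);                         /* glusterd management port */
    inet_pton(AF_INET, "10.10.100.223", &addr.sin_addr);  /* example peer address */

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        if (errno != EINPROGRESS) { perror("connect"); close(fd); return 1; }
        /* EINPROGRESS is expected: the handshake continues in the background. */
    }

    /* Once the socket becomes writable, SO_ERROR tells us whether it worked. */
    struct pollfd pfd = { .fd = fd, .events = POLLOUT };
    if (poll(&pfd, 1, 5000) == 1) {
        int err = 0;
        socklen_t len = sizeof(err);
        getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len);
        printf("connect result: %s\n", err ? strerror(err) : "connected");
    } else {
        printf("connect timed out or poll failed\n");
    }

    close(fd);
    return 0;
}

So the EINPROGRESS seen in the strace output only means the handshake had not finished by the time connect() returned; it does not by itself indicate a failure.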
Amar Tumballi
2011-Feb-25 04:29 UTC
[Gluster-users] make gluster run on internal interfaces only
Hi Stefan,

> connect(4, {sa_family=AF_INET, sin_port=htons(24007),
> sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in
> progress)
> s1-new:~#
>
> So I think it is not working because the probe process is first trying to
> connect to the local glusterd, which is not listening on 127.0.0.1?

This is because the 'gluster' CLI currently performs its operations only on management nodes, and it assumes that all of its communication happens with the local 'glusterd'. All peer-related operations are driven through the 'glusterd' process, so the 'gluster' CLI defaults its peer IP to '127.0.0.1'.

Try 'gluster' with the '--remote-host=<IP/hostname of the local host>' option and see if it works.

Regards,
Amar
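For example, assuming the probe is issued from s1-new, whose glusterd is bound to 10.10.100.31 as shown above, that would look something like:

s1-new:~# gluster --remote-host=10.10.100.31 peer probe s2-new

i.e. the CLI is pointed at the address the local glusterd actually listens on, instead of its default of 127.0.0.1.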