Stefan Becker
2011-Feb-23 13:31 UTC
[Gluster-users] make gluster run on internal interfaces only
Hi all,
I still have not managed to get gluster to run on internal interfaces only. With
the public interface I had it running in about 30 minutes and it worked like a charm.
Then I changed the bind-address for glusterd, and now I just cannot probe the other
server. I did a Debian dist-upgrade today and removed all gluster packages (I wanted
to start from scratch). Then I reinstalled a brand-new .deb downloaded today from
gluster.org.
Here is my setup:
+++ Node 1 +++
s1-new:~# cat /etc/hostname
s1-new
s1-new:~#
s1-new:~# cat /etc/hosts
127.0.0.1 localhost
10.10.100.31 s1-new
10.10.100.223 s2-new
# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
s1-new:~#
s1-new:~# cat /etc/glusterfs/glusterd.vol
volume management
type mgmt/glusterd
option working-directory /etc/glusterd
option transport-type tcp
option transport.socket.bind-address 10.10.100.31
option transport.socket.keepalive-time 10
option transport.socket.keepalive-interval 2
end-volume
s1-new:~#
s1-new:~# netstat -lpn | grep gluster
tcp 0 0 10.10.100.31:24007 0.0.0.0:* LISTEN 2336/glusterd
+++ Node 2 +++
s2-new:~# cat /etc/hostname
s2-new
s2-new:~#
s2-new:~# cat /etc/hosts
127.0.0.1 localhost
10.10.100.31 s1-new
10.10.100.223 s2-new
# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
s2-new:~#
s2-new:~# cat /etc/glusterfs/glusterd.vol
volume management
type mgmt/glusterd
option working-directory /etc/glusterd
option transport-type tcp
option transport.socket.bind-address 10.10.100.223
option transport.socket.keepalive-time 10
option transport.socket.keepalive-interval 2
end-volume
s2-new:~#
s2-new:~# netstat -lpn | grep gluster
tcp 0 0 10.10.100.223:24007 0.0.0.0:* LISTEN 1850/glusterd
Both nodes/daemons are reachable from each other:
s1-new:~# ping s2-new
PING s2-new (10.10.100.223) 56(84) bytes of data.
64 bytes from s2-new (10.10.100.223): icmp_req=1 ttl=64 time=0.099 ms
64 bytes from s2-new (10.10.100.223): icmp_req=2 ttl=64 time=0.093 ms
^C
--- s2-new ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.093/0.096/0.099/0.003 ms
s1-new:~#
s1-new:~# telnet s2-new 24007
Trying 10.10.100.223...
Connected to s2-new.
Escape character is '^]'.
Connection closed by foreign host.
s2-new:~# ping s1-new
PING s1-new (10.10.100.31) 56(84) bytes of data.
64 bytes from s1-new (10.10.100.31): icmp_req=1 ttl=64 time=0.096 ms
64 bytes from s1-new (10.10.100.31): icmp_req=2 ttl=64 time=0.092 ms
s2-new:~#
s2-new:~# telnet s1-new 24007
Trying 10.10.100.31...
Connected to s1-new.
Escape character is '^]'.
Connection closed by foreign host.
But probing just does not work:
s1-new:~# gluster peer probe s2-new
Connection failed. Please check if gluster daemon is operational.
s1-new:~#
And vice versa:
s2-new:~# gluster peer probe s1-new
Connection failed. Please check if gluster daemon is operational.
s2-new:~#
Next I straced those calls to see what they do:
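Roughly like this, for example (the exact invocation below is only a sketch):
s1-new:~# strace -f -o strace.out gluster peer probe s2-new    # sketch; exact options may have differed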
s1-new:~# cat strace.out | grep 'connect('
connect(4, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(4, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(4, {sa_family=AF_INET, sin_port=htons(24007), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(4, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(4, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(4, {sa_family=AF_INET, sin_port=htons(24007), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
s1-new:~#
So I think it is not working because the probe process first tries to connect
to the local glusterd, which is not listening on 127.0.0.1?
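One way to double-check that assumption would be to point telnet at the loopback
address as well (an extra check, not part of my session above); given the netstat
output it should be refused:
s1-new:~# telnet 127.0.0.1 24007    # expectation: connection refused, since glusterd binds to 10.10.100.31 only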
I have no idea; please help me. My two servers are completely bare, so if you need
more information, just ask. Thank you!
Bye,
Stefan
Raghavendra G
2011-Feb-25 03:51 UTC
[Gluster-users] make gluster run on internal interfaces only
----- Original Message -----
From: "Stefan Becker" <sbecker at rapidsoft.de>
To: gluster-users at gluster.org
Sent: Wednesday, February 23, 2011 5:31:16 PM
Subject: [Gluster-users] make gluster run on internal interfaces only

> [setup details and probe output snipped -- see the original message above]
>
> connect(4, {sa_family=AF_INET, sin_port=htons(24007), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)

Since glusterfs uses non-blocking sockets, there is nothing wrong with the error
value EINPROGRESS. The connection gets established some time later and we handle
that through a poll/epoll interface.

Is there any firewall running on s1-new or s2-new?

regards,
Raghavendra
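For example, a quick way to check for a local packet filter on both nodes (a
sketch only; this assumes iptables is the firewall in use) would be:
s1-new:~# iptables -L -n -v    # assumes iptables; empty chains with policy ACCEPT mean no local firewall
s2-new:~# iptables -L -n -v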
Amar Tumballi
2011-Feb-25 04:29 UTC
[Gluster-users] make gluster run on internal interfaces only
Hi Stefan,

> connect(4, {sa_family=AF_INET, sin_port=htons(24007), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
> s1-new:~#
>
> So I think it is not working because the probe process first tries to connect
> to the local glusterd, which is not listening on 127.0.0.1?

This is because the 'gluster' CLI currently performs operations only on management
nodes and assumes that all its communication happens with the local 'glusterd'.
All peer-related operations are driven through the 'glusterd' process, so the
'gluster' CLI defaults its peer IP to '127.0.0.1'.

Try 'gluster' with the '--remote-host=<IP/hostname of the local host>' option and
see if it works.

Regards,
Amar
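For example (a sketch only; substitute the bind-address of your local glusterd
from glusterd.vol):
s1-new:~# gluster --remote-host=10.10.100.31 peer probe s2-new    # 10.10.100.31 is s1-new's bind-address in this setup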