Scot Kreienkamp
2012-Apr-27 13:53 UTC
[Gluster-users] unable to get Geo-replication working
Hey everyone,
I'm trying to get geo-replication working from a two-brick replicated volume
to a single directory on a remote host. I can ssh as either root or georep-user
to the destination as either georep-user or root with no password, using the
default ssh command given by the config command: ssh
-oPasswordAuthentication=no -oStrictHostKeyChecking=no -i
/etc/glusterd/geo-replication/secret.pem. All the glusterfs RPMs are installed
on the remote host. There are no firewalls running on any of the hosts and none
between them. The remote_gsyncd command is correct: I can copy and paste it to
the command line and run it on both source hosts and the destination host. I'm
running the current production version, glusterfs 3.2.6, with the rsync 3.0.9
and fuse-2.8.3 RPMs installed, plus OpenSSH 5.3 and Python 2.6.6, on RHEL 6.2.
The remote directory is set to 777, world read-write, so there are no
permission errors.
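For reference, a connectivity check along these lines succeeds from both source
nodes in both directions (hptv3130 is the slave; the trailing "true" just tests
for a clean, passwordless login):

    ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no \
        -i /etc/glusterd/geo-replication/secret.pem root@hptv3130 true && echo OK
    ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no \
        -i /etc/glusterd/geo-replication/secret.pem georep-user@hptv3130 true && echo OK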
I'm using this command to start replication:

    gluster volume geo-replication RMSNFSMOUNT hptv3130:/nfs start
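The matching status and config checks for the same master/slave pair look
roughly like this (both are geo-replication subcommands in the 3.2 CLI):

    gluster volume geo-replication RMSNFSMOUNT hptv3130:/nfs status
    gluster volume geo-replication RMSNFSMOUNT hptv3130:/nfs config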
Whenever I try to initiate geo-replication, the status goes to "starting" for
about 30 seconds, then goes to "faulty". On the slave I get these messages
repeating in the geo-replication-slaves log:
[2012-04-27 09:37:59.485424] I [resource(slave):201:service_loop] FILE: slave
listening
[2012-04-27 09:38:05.413768] I [repce(slave):60:service_loop] RepceServer:
terminating on reaching EOF.
[2012-04-27 09:38:15.35907] I [resource(slave):207:service_loop] FILE:
connection inactive for 120 seconds, stopping
[2012-04-27 09:38:15.36382] I [gsyncd(slave):302:main_i] <top>: exiting.
[2012-04-27 09:38:19.952683] I [gsyncd(slave):290:main_i] <top>: syncing:
file:///nfs
[2012-04-27 09:38:19.955024] I [resource(slave):201:service_loop] FILE: slave
listening
I get these messages in etc-glusterfs-glusterd.vol.log on the slave:
[2012-04-27 09:39:23.667930] W [socket.c:1494:__socket_proto_state_machine]
0-socket.management: reading from socket failed. Error (Transport endpoint is
not connected), peer (127.0.0.1:1021)
[2012-04-27 09:39:43.736138] I [glusterd-handler.c:3226:glusterd_handle_getwd]
0-glusterd: Received getwd req
[2012-04-27 09:39:43.740749] W [socket.c:1494:__socket_proto_state_machine]
0-socket.management: reading from socket failed. Error (Transport endpoint is
not connected), peer (127.0.0.1:1023)
As I understand it from searching the list, though, that message is benign and
can be ignored.
Here are tails of all the logs on one of the sources:
[root@retv3130 RMSNFSMOUNT]# tail
ssh%3A%2F%2Fgeorep-user%4010.2.1.60%3Afile%3A%2F%2F%2Fnfs.gluster.log
+------------------------------------------------------------------------------+
[2012-04-26 16:16:40.804047] E [socket.c:1685:socket_connect_finish]
0-RMSNFSMOUNT-client-1: connection to failed (Connection refused)
[2012-04-26 16:16:40.804852] I [rpc-clnt.c:1536:rpc_clnt_reconfig]
0-RMSNFSMOUNT-client-0: changing port to 24009 (from 0)
[2012-04-26 16:16:44.779451] I [rpc-clnt.c:1536:rpc_clnt_reconfig]
0-RMSNFSMOUNT-client-1: changing port to 24010 (from 0)
[2012-04-26 16:16:44.855903] I
[client-handshake.c:1090:select_server_supported_programs]
0-RMSNFSMOUNT-client-0: Using Program GlusterFS 3.2.6, Num (1298437), Version
(310)
[2012-04-26 16:16:44.856893] I [client-handshake.c:913:client_setvolume_cbk]
0-RMSNFSMOUNT-client-0: Connected to 10.170.1.222:24009, attached to remote
volume '/nfs'.
[2012-04-26 16:16:44.856943] I [afr-common.c:3141:afr_notify]
0-RMSNFSMOUNT-replicate-0: Subvolume 'RMSNFSMOUNT-client-0' came back
up; going online.
[2012-04-26 16:16:44.866734] I [fuse-bridge.c:3339:fuse_graph_setup] 0-fuse:
switched to graph 0
[2012-04-26 16:16:44.867391] I [fuse-bridge.c:3241:fuse_thread_proc] 0-fuse:
unmounting /tmp/gsyncd-aux-mount-8zMs0J
[2012-04-26 16:16:44.868538] W [glusterfsd.c:727:cleanup_and_exit]
(-->/lib64/libc.so.6(clone+0x6d) [0x31494e5ccd]
(-->/lib64/libpthread.so.0() [0x3149c077f1]
(-->/opt/glusterfs/3.2.6/sbin/glusterfs(glusterfs_sigwaiter+0x17c)
[0x40477c]))) 0-: received signum (15), shutting down
[root@retv3130 RMSNFSMOUNT]# tail
ssh%3A%2F%2Fgeorep-user%4010.2.1.60%3Afile%3A%2F%2F%2Fnfs.log
[2012-04-26 16:16:39.263871] I [gsyncd:290:main_i] <top>: syncing:
gluster://localhost:RMSNFSMOUNT -> ssh://georep-user at hptv3130:/nfs
[2012-04-26 16:16:41.332690] E [syncdutils:133:log_raise_exception] <top>:
FAIL:
Traceback (most recent call last):
File
"/opt/glusterfs/3.2.6/local/libexec/glusterfs/python/syncdaemon/syncdutils.py",
line 154, in twrap
tf(*aa)
File
"/opt/glusterfs/3.2.6/local/libexec/glusterfs/python/syncdaemon/repce.py",
line 117, in listen
rid, exc, res = recv(self.inf)
File
"/opt/glusterfs/3.2.6/local/libexec/glusterfs/python/syncdaemon/repce.py",
line 41, in recv
return pickle.load(inf)
EOFError
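The EOFError above is repce on the master hitting end-of-file while unpickling
a response from the slave over the ssh pipe; in other words, the remote gsyncd
exited. A rough way to poke at the slave side by hand, assuming the
remote-gsyncd config key holds the configured command (this is roughly how I
verified the remote_gsyncd command earlier):

    REMOTE_CMD=$(gluster volume geo-replication RMSNFSMOUNT hptv3130:/nfs \
        config remote-gsyncd)
    ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no \
        -i /etc/glusterd/geo-replication/secret.pem root@hptv3130 "$REMOTE_CMD"

This only proves the slave can execute the configured gsyncd at all; the real
invocation adds its own arguments.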
[root@retv3130 RMSNFSMOUNT]# tail
ssh%3A%2F%2Froot%4010.2.1.60%3Afile%3A%2F%2F%2Fnfs.gluster.log
[2012-04-27 09:48:42.892842] I [rpc-clnt.c:1536:rpc_clnt_reconfig]
0-RMSNFSMOUNT-client-1: changing port to 24010 (from 0)
[2012-04-27 09:48:43.120749] I
[client-handshake.c:1090:select_server_supported_programs]
0-RMSNFSMOUNT-client-0: Using Program GlusterFS 3.2.6, Num (1298437), Version
(310)
[2012-04-27 09:48:43.121489] I [client-handshake.c:913:client_setvolume_cbk]
0-RMSNFSMOUNT-client-0: Connected to 10.170.1.222:24009, attached to remote
volume '/nfs'.
[2012-04-27 09:48:43.121515] I [afr-common.c:3141:afr_notify]
0-RMSNFSMOUNT-replicate-0: Subvolume 'RMSNFSMOUNT-client-0' came back
up; going online.
[2012-04-27 09:48:43.132904] I [fuse-bridge.c:3339:fuse_graph_setup] 0-fuse:
switched to graph 0
[2012-04-27 09:48:43.133704] I [fuse-bridge.c:2927:fuse_init] 0-glusterfs-fuse:
FUSE inited with protocol versions: glusterfs 7.13 kernel 7.13
[2012-04-27 09:48:43.135797] I
[afr-common.c:1520:afr_set_root_inode_on_first_lookup]
0-RMSNFSMOUNT-replicate-0: added root inode
[2012-04-27 09:48:44.533289] W [fuse-bridge.c:2517:fuse_xattr_cbk]
0-glusterfs-fuse: 8:
GETXATTR(trusted.glusterfs.9de3c1c8-a753-45a1-8042-b6a4872c5c3c.xtime) / =>
-1 (Transport endpoint is not connected)
[2012-04-27 09:48:44.544934] I [fuse-bridge.c:3241:fuse_thread_proc] 0-fuse:
unmounting /tmp/gsyncd-aux-mount-uXCybC
[2012-04-27 09:48:44.545879] W [glusterfsd.c:727:cleanup_and_exit]
(-->/lib64/libc.so.6(clone+0x6d) [0x31494e5ccd]
(-->/lib64/libpthread.so.0() [0x3149c077f1]
(-->/opt/glusterfs/3.2.6/sbin/glusterfs(glusterfs_sigwaiter+0x17c)
[0x40477c]))) 0-: received signum (15), shutting down
[root@retv3130 RMSNFSMOUNT]# tail
ssh%3A%2F%2Froot%4010.2.1.60%3Afile%3A%2F%2F%2Fnfs.log
File
"/opt/glusterfs/3.2.6/local/libexec/glusterfs/python/syncdaemon/libcxattr.py",
line 34, in lgetxattr
return cls._query_xattr( path, siz, 'lgetxattr', attr)
File
"/opt/glusterfs/3.2.6/local/libexec/glusterfs/python/syncdaemon/libcxattr.py",
line 26, in _query_xattr
cls.raise_oserr()
File
"/opt/glusterfs/3.2.6/local/libexec/glusterfs/python/syncdaemon/libcxattr.py",
line 16, in raise_oserr
raise OSError(errn, os.strerror(errn))
OSError: [Errno 107] Transport endpoint is not connected
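That lgetxattr call is gsyncd asking for the marker xtime attribute through its
auxiliary FUSE mount of the master volume; ENOTCONN means the mount had lost
its bricks, consistent with the "Transport endpoint is not connected" GETXATTR
warning in the gluster log above. A manual reproduction would be roughly the
following, using an illustrative mount point:

    mount -t glusterfs retv3130:/RMSNFSMOUNT /mnt/georep-test  # hypothetical path
    getfattr -n trusted.glusterfs.9de3c1c8-a753-45a1-8042-b6a4872c5c3c.xtime \
        -e hex /mnt/georep-test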
[2012-04-27 09:49:14.846837] I [monitor(monitor):59:monitor] Monitor:
------------------------------------------------------------
[2012-04-27 09:49:14.847898] I [monitor(monitor):60:monitor] Monitor: starting
gsyncd worker
[2012-04-27 09:49:14.930681] I [gsyncd:290:main_i] <top>: syncing:
gluster://localhost:RMSNFSMOUNT -> ssh://hptv3130:/nfs
I'm out of ideas. I've satisfied all the requirements I can find, and
I'm not seeing anything in the logs that makes any sense to me as an error
that I can fix. Can anyone help?
Thanks!
Scot Kreienkamp
skreien@la-z-boy.com
This message is intended only for the individual or entity to which it is
addressed. It may contain privileged, confidential information which is exempt
from disclosure under applicable laws. If you are not the intended recipient,
please note that you are strictly prohibited from disseminating or distributing
this information (other than to the intended recipient) or copying this
information. If you have received this communication in error, please notify us
immediately by e-mail or by telephone at the above number. Thank you.
Scot Kreienkamp
2012-Apr-27 15:18 UTC
[Gluster-users] unable to get Geo-replication working
Sure....
[root@retv3130 RMSNFSMOUNT]# gluster peer status
Number of Peers: 1
Hostname: retv3131
Uuid: 450cc731-60be-47be-a42d-d856a03dac01
State: Peer in Cluster (Connected)
[root@hptv3130 ~]# gluster peer status
No peers present
[root@retv3130 ~]# gluster volume geo-replication RMSNFSMOUNT root@hptv3130:/nfs status
MASTER               SLAVE                           STATUS
--------------------------------------------------------------------------------
RMSNFSMOUNT          root@hptv3130:/nfs              faulty
Scot Kreienkamp
Senior Systems Engineer
skreien@la-z-boy.com
From: Mohit Anchlia <mohitanchlia@gmail.com>
Sent: Friday, April 27, 2012 10:58 AM
To: Scot Kreienkamp
Subject: Re: [Gluster-users] unable to get Geo-replication working
Can you look at the status of "gluster geo-replication MASTER SLAVE
status"? Also, do gluster peer status on both MASTER and SLAVE? Paste the
results here.
On Fri, Apr 27, 2012 at 6:53 AM, Scot Kreienkamp <SKreien@la-z-boy.com> wrote:
> [original message quoted in full above; snipped]
Hello Scot,

On 04/27/2012 03:53 PM, Scot Kreienkamp wrote:
> I'm trying to get geo-replication working from a two brick replicated
> volume to a single directory on a remote host.
> [snip]
> Whenever I try to initiate geo-replication the status goes to starting
> for about 30 seconds, then goes to faulty. On the slave I get these
> messages repeating in the geo-replication-slaves log:

The filesystem on the slave has xattrs enabled? Is there any improvement if
glusterd is stopped on the slave?

Regards,
Andreas
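A quick probe for Andreas's xattr question would be to set and read back a
trusted-namespace attribute on the slave's /nfs filesystem, run as root on
hptv3130 (the file name is arbitrary):

    touch /nfs/.xattr-probe
    setfattr -n trusted.glusterfs.test -v probe /nfs/.xattr-probe
    getfattr -n trusted.glusterfs.test /nfs/.xattr-probe
    rm -f /nfs/.xattr-probe

If the setfattr fails with "Operation not supported", the slave filesystem
lacks the extended attribute support geo-replication needs.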