A Ghoshal
2015-Jan-20 12:04 UTC
[Gluster-users] In a replica 2 server, file-updates on one server missing on the other server
Hello,

I am using the following replicated volume:

root@serv0:~> gluster v info replicated_vol

Volume Name: replicated_vol
Type: Replicate
Volume ID: 26d111e3-7e4c-479e-9355-91635ab7f1c2
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: serv0:/mnt/bricks/replicated_vol/brick
Brick2: serv1:/mnt/bricks/replicated_vol/brick
Options Reconfigured:
diagnostics.client-log-level: INFO
network.ping-timeout: 10
nfs.enable-ino32: on
cluster.self-heal-daemon: on
nfs.disable: off

replicated_vol is mounted at /mnt/replicated_vol on both serv0 and serv1. If I do the following on serv0:

root@serv0:~> echo "cranberries" > /mnt/replicated_vol/testfile
root@serv0:~> echo "tangerines" >> /mnt/replicated_vol/testfile

and then check the state of the replicas in the bricks, I find:

root@serv0:~> cat /mnt/bricks/replicated_vol/brick/testfile
cranberries
tangerines
root@serv0:~>

root@serv1:~> cat /mnt/bricks/replicated_vol/brick/testfile
root@serv1:~>

As may be seen, the replica on serv1 is blank when I write into testfile from serv0 (even though the file is created on both bricks). Interestingly, if I write something to the file at serv1, the two replicas become identical.
root@serv1:~> echo "artichokes" >> /mnt/replicated_vol/testfile

root@serv1:~> cat /mnt/bricks/replicated_vol/brick/testfile
cranberries
tangerines
artichokes
root@serv1:~>

root@serv0:~> cat /mnt/bricks/replicated_vol/brick/testfile
cranberries
tangerines
artichokes
root@serv0:~>

So I dug into the logs a little, after upping the diagnostic level, and this is what I saw:

When I write on serv0 (bad case):

[2015-01-20 09:21:52.197704] T [fuse-bridge.c:546:fuse_lookup_resume] 0-glusterfs-fuse: 53027: LOOKUP /testfl(f0a76987-8a42-47a2-b027-a823254b736b)
[2015-01-20 09:21:52.197959] D [afr-common.c:131:afr_lookup_xattr_req_prepare] 0-replicated_vol-replicate-0: /testfl: failed to get the gfid from dict
[2015-01-20 09:21:52.198006] T [rpc-clnt.c:1302:rpc_clnt_record] 0-replicated_vol-client-0: Auth Info: pid: 28151, uid: 0, gid: 0, owner: 0000000000000000
[2015-01-20 09:21:52.198024] T [rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen 456, payload: 360, rpc hdr: 96
[2015-01-20 09:21:52.198108] T [rpc-clnt.c:1499:rpc_clnt_submit] 0-rpc-clnt: submitted request (XID: 0x78163x Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-0)
[2015-01-20 09:21:52.198565] T [rpc-clnt.c:669:rpc_clnt_reply_init] 0-replicated_vol-client-0: received rpc message (RPC XID: 0x78163x Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport (replicated_vol-client-0)
[2015-01-20 09:21:52.198640] D [afr-self-heal-common.c:138:afr_sh_print_pending_matrix] 0-replicated_vol-replicate-0: pending_matrix: [ 0 3 ]
[2015-01-20 09:21:52.198669] D [afr-self-heal-common.c:138:afr_sh_print_pending_matrix] 0-replicated_vol-replicate-0: pending_matrix: [ 0 0 ]
[2015-01-20 09:21:52.198681] D [afr-self-heal-common.c:887:afr_mark_sources] 0-replicated_vol-replicate-0: Number of sources: 1
[2015-01-20 09:21:52.198694] D [afr-self-heal-data.c:825:afr_lookup_select_read_child_by_txn_type] 0-replicated_vol-replicate-0: returning read_child: 0
[2015-01-20 09:21:52.198705] D [afr-common.c:1380:afr_lookup_select_read_child] 0-replicated_vol-replicate-0: Source selected as 0 for /testfl
[2015-01-20 09:21:52.198720] D [afr-common.c:1117:afr_lookup_build_response_params] 0-replicated_vol-replicate-0: Building lookup response from 0
[2015-01-20 09:21:52.198732] D [afr-common.c:1732:afr_lookup_perform_self_heal] 0-replicated_vol-replicate-0: Only 1 child up - do not attempt to detect self heal

When I write on serv1 (good case):

[2015-01-20 09:37:49.151506] T [fuse-bridge.c:546:fuse_lookup_resume] 0-glusterfs-fuse: 31212: LOOKUP /testfl(f0a76987-8a42-47a2-b027-a823254b736b)
[2015-01-20 09:37:49.151683] D [afr-common.c:131:afr_lookup_xattr_req_prepare] 0-replicated_vol-replicate-0: /testfl: failed to get the gfid from dict
[2015-01-20 09:37:49.151726] T [rpc-clnt.c:1302:rpc_clnt_record] 0-replicated_vol-client-0: Auth Info: pid: 7599, uid: 0, gid: 0, owner: 0000000000000000
[2015-01-20 09:37:49.151744] T [rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen 456, payload: 360, rpc hdr: 96
[2015-01-20 09:37:49.151780] T [rpc-clnt.c:1499:rpc_clnt_submit] 0-rpc-clnt: submitted request (XID: 0x39620x Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-0)
[2015-01-20 09:37:49.151810] T [rpc-clnt.c:1302:rpc_clnt_record] 0-replicated_vol-client-1: Auth Info: pid: 7599, uid: 0, gid: 0, owner: 0000000000000000
[2015-01-20 09:37:49.151824] T [rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen 456, payload: 360, rpc hdr: 96
[2015-01-20 09:37:49.151889] T [rpc-clnt.c:1499:rpc_clnt_submit] 0-rpc-clnt: submitted request (XID: 0x39563x Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-1)
[2015-01-20 09:37:49.152239] T [rpc-clnt.c:669:rpc_clnt_reply_init] 0-replicated_vol-client-1: received rpc message (RPC XID: 0x39563x Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport (replicated_vol-client-1)
[2015-01-20 09:37:49.152484] T [rpc-clnt.c:669:rpc_clnt_reply_init] 0-replicated_vol-client-0: received rpc message (RPC XID: 0x39620x Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport (replicated_vol-client-0)
[2015-01-20 09:37:49.152582] D [afr-self-heal-common.c:138:afr_sh_print_pending_matrix] 0-replicated_vol-replicate-0: pending_matrix: [ 0 3 ]
[2015-01-20 09:37:49.152596] D [afr-self-heal-common.c:138:afr_sh_print_pending_matrix] 0-replicated_vol-replicate-0: pending_matrix: [ 0 0 ]
[2015-01-20 09:37:49.152621] D [afr-self-heal-common.c:887:afr_mark_sources] 0-replicated_vol-replicate-0: Number of sources: 1
[2015-01-20 09:37:49.152633] D [afr-self-heal-data.c:825:afr_lookup_select_read_child_by_txn_type] 0-replicated_vol-replicate-0: returning read_child: 0
[2015-01-20 09:37:49.152644] D [afr-common.c:1380:afr_lookup_select_read_child] 0-replicated_vol-replicate-0: Source selected as 0 for /testfl
[2015-01-20 09:37:49.152657] D [afr-common.c:1117:afr_lookup_build_response_params] 0-replicated_vol-replicate-0: Building lookup response from 0

We see that when we write on serv1, the RPC request is sent to both replicated_vol-client-0 and replicated_vol-client-1, whereas when we write on serv0 the request is sent only to replicated_vol-client-0; in the latter case the FUSE client is unaware of the presence of client-1.

I checked a bit more in the logs.
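As an aside for readers puzzling over the pending_matrix lines in these traces: row i of the matrix holds brick i's count of pending (un-acknowledged) operations against each brick, and a brick that no other brick accuses is picked as a source. The sketch below mirrors the idea behind afr_mark_sources(), not its exact code:

```python
def mark_sources(pending):
    """Return the source bricks for an AFR-style pending matrix.

    pending[i][j] = number of operations brick i has recorded as
    still pending on brick j.  A brick whose column is all zero is
    accused by nobody, so it is a source; every other brick is a
    sink awaiting self-heal.  Simplified illustration only.
    """
    n = len(pending)
    return [j for j in range(n)
            if all(pending[i][j] == 0 for i in range(n))]

# pending_matrix from the trace: [ 0 3 ] / [ 0 0 ]
# Brick 0 accuses brick 1 of 3 pending ops; nobody accuses brick 0,
# so brick 0 is the single source, matching "Number of sources: 1"
# and "Source selected as 0" in the log.
```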
When I turned on trace, I found many instances of these logs on serv0 but NOT on serv1:

[2015-01-20 09:21:15.520784] T [fuse-bridge.c:681:fuse_attr_cbk] 0-glusterfs-fuse: 53011: LOOKUP() / => 1
[2015-01-20 09:21:17.683088] T [rpc-clnt.c:422:rpc_clnt_reconnect] 0-replicated_vol-client-1: attempting reconnect
[2015-01-20 09:21:17.683159] D [name.c:155:client_fill_address_family] 0-replicated_vol-client-1: address-family not specified, guessing it to be inet from (remote-host: serv1)
[2015-01-20 09:21:17.683178] T [name.c:225:af_inet_client_get_remote_sockaddr] 0-replicated_vol-client-1: option remote-port missing in volume replicated_vol-client-1. Defaulting to 24007
[2015-01-20 09:21:17.683191] T [common-utils.c:188:gf_resolve_ip6] 0-resolver: flushing DNS cache
[2015-01-20 09:21:17.683202] T [common-utils.c:195:gf_resolve_ip6] 0-resolver: DNS cache not present, freshly probing hostname: serv1
[2015-01-20 09:21:17.683814] D [common-utils.c:237:gf_resolve_ip6] 0-resolver: returning ip-192.168.24.81 (port-24007) for hostname: serv1 and port: 24007
[2015-01-20 09:21:17.684139] D [common-utils.c:257:gf_resolve_ip6] 0-resolver: next DNS query will return: ip-192.168.24.81 port-24007
[2015-01-20 09:21:17.684164] T [socket.c:731:__socket_nodelay] 0-replicated_vol-client-1: NODELAY enabled for socket 10
[2015-01-20 09:21:17.684177] T [socket.c:790:__socket_keepalive] 0-replicated_vol-client-1: Keep-alive enabled for socket 10, interval 2, idle: 20
[2015-01-20 09:21:17.684236] W [common-utils.c:2247:gf_get_reserved_ports] 0-glusterfs: could not open the file /proc/sys/net/ipv4/ip_local_reserved_ports for getting reserved ports info (No such file or directory)
[2015-01-20 09:21:17.684253] W [common-utils.c:2280:gf_process_reserved_ports] 0-glusterfs: Not able to get reserved ports, hence there is a possibility that glusterfs may consume reserved port
[2015-01-20 09:21:17.684660] D [socket.c:605:__socket_shutdown] 0-replicated_vol-client-1: shutdown() returned -1. Transport endpoint is not connected
[2015-01-20 09:21:17.684699] T [rpc-clnt.c:519:rpc_clnt_connection_cleanup] 0-replicated_vol-client-1: cleaning up state in transport object 0x68a630
[2015-01-20 09:21:17.684731] D [socket.c:486:__socket_rwv] 0-replicated_vol-client-1: EOF on socket
[2015-01-20 09:21:17.684750] W [socket.c:514:__socket_rwv] 0-replicated_vol-client-1: readv failed (No data available)
[2015-01-20 09:21:17.684766] D [socket.c:1962:__socket_proto_state_machine] 0-replicated_vol-client-1: reading from socket failed. Error (No data available), peer (192.168.24.81:49198)

I could not find a 'remote-port' option in /var/lib/glusterd on either peer. Could somebody tell me where this configuration is looked up from? Also, some time later I rebooted serv0, and that seemed to solve the problem. However, stop+start of replicated_vol and a restart of /etc/init.d/glusterd did NOT solve the problem.

Any help on this matter will be greatly appreciated, as I need to provide robustness assurances for our setup.

Thanks a lot,
Anirban

P.S. Additional details:
glusterfs version: 3.4.2
Linux kernel version: 2.6.34
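The by-hand replica check in this report (cat'ing testfile on each brick) generalizes to any file and any number of bricks. The sketch below is illustrative only, not a Gluster tool; it compares local directories, whereas the real bricks in the post live on two hosts, so in practice you would gather the contents over ssh or similar:

```python
from pathlib import Path

def bricks_diverged(brick_roots, relpath):
    """Return True if the copies of `relpath` differ across brick roots.

    brick_roots: brick directories to compare, e.g. the two
    /mnt/bricks/replicated_vol/brick paths from the post (local
    stand-ins here for illustration).
    """
    contents = [(Path(root) / relpath).read_bytes() for root in brick_roots]
    return any(c != contents[0] for c in contents[1:])
```

With the symptoms above, the serv0 copy holds both lines while the serv1 copy is empty, so such a check would report divergence.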
Lindsay Mathieson
2015-Jan-20 12:16 UTC
[Gluster-users] In a replica 2 server, file-updates on one server missing on the other server
> replicated_vol is mounted at /mnt/replicated_vol on both serv0 and serv1.

The mounts: are these the base disk mounts? To access the replicated file system you need to mount the gluster volume itself:

mount -t glusterfs HOSTNAME-OR-IPADDRESS:/VOLNAME MOUNTDIR

e.g.

mount -t glusterfs server1:/test-volume /mnt/glusterfs

Good blog post for Ubuntu here: http://www.jamescoyle.net/how-to/439-mount-a-glusterfs-volume
Pranith Kumar Karampuri
2015-Jan-21 18:39 UTC
[Gluster-users] In a replica 2 server, file-updates on one server missing on the other server
hi,
Responses inline.
PS: You are chalkogen_oxygen?

Pranith

On 01/20/2015 05:34 PM, A Ghoshal wrote:
> [snip: volume info, reproduction steps, and trace logs quoted in full from the original message above, ending with:]
> [2015-01-20 09:21:17.684236] W [common-utils.c:2247:gf_get_reserved_ports] 0-glusterfs: could not open the file /proc/sys/net/ipv4/ip_local_reserved_ports for getting reserved ports info (No such file or directory)
> [2015-01-20 09:21:17.684253] W [common-utils.c:2280:gf_process_reserved_ports] 0-glusterfs: Not able to get reserved ports, hence there is a possibility that glusterfs may consume reserved port

Logs above suggest that the mount process couldn't assign a reserved port because it couldn't find the file /proc/sys/net/ipv4/ip_local_reserved_ports. I guess the reboot of the machine fixed it. Wonder why it was not found in the first place.

Pranith

> I could not find a 'remote-port' option in /var/lib/glusterd on either peer. Could somebody tell me where this configuration is looked up from? Also, sometime later, I rebooted serv0 and that seemed to solve the problem. However, stop+start of replicated_vol and restart of /etc/init.d/glusterd did NOT solve the problem.

Ignore that log. If no port is given in that volfile, it picks 24007 as the port, which is the default port where glusterd 'listens'.

_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users
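Pranith's reserved-ports remark can be illustrated with a short sketch. /proc/sys/net/ipv4/ip_local_reserved_ports holds comma-separated ports and port ranges; when the file is absent (the situation serv0's W-level trace lines complain about), there is no reserved list to honor, so any port may be picked. This sketches the failure mode only; it is not Gluster's gf_get_reserved_ports() code:

```python
def read_reserved_ports(path="/proc/sys/net/ipv4/ip_local_reserved_ports"):
    """Parse a reserved-ports file of the form "8080,49000-49002".

    Returns an empty set when the file is missing, which is exactly
    when a process can no longer avoid reserved ports.
    """
    try:
        text = open(path).read().strip()
    except OSError:
        # File absent (as on serv0): no reserved list is available.
        return set()
    ports = set()
    for part in text.split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            ports.update(range(int(lo), int(hi) + 1))
        else:
            ports.add(int(part))
    return ports
```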