A Ghoshal
2015-Jan-20 12:04 UTC
[Gluster-users] In a replica 2 server, file-updates on one server missing on the other server
Hello,

I am using the following replicated volume:

root@serv0:~> gluster v info replicated_vol

Volume Name: replicated_vol
Type: Replicate
Volume ID: 26d111e3-7e4c-479e-9355-91635ab7f1c2
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: serv0:/mnt/bricks/replicated_vol/brick
Brick2: serv1:/mnt/bricks/replicated_vol/brick
Options Reconfigured:
diagnostics.client-log-level: INFO
network.ping-timeout: 10
nfs.enable-ino32: on
cluster.self-heal-daemon: on
nfs.disable: off

replicated_vol is mounted at /mnt/replicated_vol on both serv0 and serv1. If I do the following on serv0:

root@serv0:~> echo "cranberries" > /mnt/replicated_vol/testfile
root@serv0:~> echo "tangerines" >> /mnt/replicated_vol/testfile

and then check the state of the replicas in the bricks, I find:

root@serv0:~> cat /mnt/bricks/replicated_vol/brick/testfile
cranberries
tangerines
root@serv0:~>

root@serv1:~> cat /mnt/bricks/replicated_vol/brick/testfile
root@serv1:~>

As may be seen, the replica on serv1 is blank when I write into testfile from serv0 (even though the file is created on both bricks). Interestingly, if I write something to the file at serv1, the two replicas become identical.
root@serv1:~> echo "artichokes" >> /mnt/replicated_vol/testfile

root@serv1:~> cat /mnt/bricks/replicated_vol/brick/testfile
cranberries
tangerines
artichokes
root@serv1:~>

root@serv0:~> cat /mnt/bricks/replicated_vol/brick/testfile
cranberries
tangerines
artichokes
root@serv0:~>

So I dug into the logs a little, after upping the diagnostic level, and this is what I saw:

When I write on serv0 (bad case):

[2015-01-20 09:21:52.197704] T [fuse-bridge.c:546:fuse_lookup_resume] 0-glusterfs-fuse: 53027: LOOKUP /testfl(f0a76987-8a42-47a2-b027-a823254b736b)
[2015-01-20 09:21:52.197959] D [afr-common.c:131:afr_lookup_xattr_req_prepare] 0-replicated_vol-replicate-0: /testfl: failed to get the gfid from dict
[2015-01-20 09:21:52.198006] T [rpc-clnt.c:1302:rpc_clnt_record] 0-replicated_vol-client-0: Auth Info: pid: 28151, uid: 0, gid: 0, owner: 0000000000000000
[2015-01-20 09:21:52.198024] T [rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen 456, payload: 360, rpc hdr: 96
[2015-01-20 09:21:52.198108] T [rpc-clnt.c:1499:rpc_clnt_submit] 0-rpc-clnt: submitted request (XID: 0x78163x Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-0)
[2015-01-20 09:21:52.198565] T [rpc-clnt.c:669:rpc_clnt_reply_init] 0-replicated_vol-client-0: received rpc message (RPC XID: 0x78163x Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport (replicated_vol-client-0)
[2015-01-20 09:21:52.198640] D [afr-self-heal-common.c:138:afr_sh_print_pending_matrix] 0-replicated_vol-replicate-0: pending_matrix: [ 0 3 ]
[2015-01-20 09:21:52.198669] D [afr-self-heal-common.c:138:afr_sh_print_pending_matrix] 0-replicated_vol-replicate-0: pending_matrix: [ 0 0 ]
[2015-01-20 09:21:52.198681] D [afr-self-heal-common.c:887:afr_mark_sources] 0-replicated_vol-replicate-0: Number of sources: 1
[2015-01-20 09:21:52.198694] D [afr-self-heal-data.c:825:afr_lookup_select_read_child_by_txn_type] 0-replicated_vol-replicate-0: returning read_child: 0
[2015-01-20 09:21:52.198705] D [afr-common.c:1380:afr_lookup_select_read_child] 0-replicated_vol-replicate-0: Source selected as 0 for /testfl
[2015-01-20 09:21:52.198720] D [afr-common.c:1117:afr_lookup_build_response_params] 0-replicated_vol-replicate-0: Building lookup response from 0
[2015-01-20 09:21:52.198732] D [afr-common.c:1732:afr_lookup_perform_self_heal] 0-replicated_vol-replicate-0: Only 1 child up - do not attempt to detect self heal

When I write on serv1 (good case):

[2015-01-20 09:37:49.151506] T [fuse-bridge.c:546:fuse_lookup_resume] 0-glusterfs-fuse: 31212: LOOKUP /testfl(f0a76987-8a42-47a2-b027-a823254b736b)
[2015-01-20 09:37:49.151683] D [afr-common.c:131:afr_lookup_xattr_req_prepare] 0-replicated_vol-replicate-0: /testfl: failed to get the gfid from dict
[2015-01-20 09:37:49.151726] T [rpc-clnt.c:1302:rpc_clnt_record] 0-replicated_vol-client-0: Auth Info: pid: 7599, uid: 0, gid: 0, owner: 0000000000000000
[2015-01-20 09:37:49.151744] T [rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen 456, payload: 360, rpc hdr: 96
[2015-01-20 09:37:49.151780] T [rpc-clnt.c:1499:rpc_clnt_submit] 0-rpc-clnt: submitted request (XID: 0x39620x Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-0)
[2015-01-20 09:37:49.151810] T [rpc-clnt.c:1302:rpc_clnt_record] 0-replicated_vol-client-1: Auth Info: pid: 7599, uid: 0, gid: 0, owner: 0000000000000000
[2015-01-20 09:37:49.151824] T [rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen 456, payload: 360, rpc hdr: 96
[2015-01-20 09:37:49.151889] T [rpc-clnt.c:1499:rpc_clnt_submit] 0-rpc-clnt: submitted request (XID: 0x39563x Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-1)
[2015-01-20 09:37:49.152239] T [rpc-clnt.c:669:rpc_clnt_reply_init] 0-replicated_vol-client-1: received rpc message (RPC XID: 0x39563x Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport (replicated_vol-client-1)
[2015-01-20 09:37:49.152484] T [rpc-clnt.c:669:rpc_clnt_reply_init] 0-replicated_vol-client-0: received rpc message (RPC XID: 0x39620x Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport (replicated_vol-client-0)
[2015-01-20 09:37:49.152582] D [afr-self-heal-common.c:138:afr_sh_print_pending_matrix] 0-replicated_vol-replicate-0: pending_matrix: [ 0 3 ]
[2015-01-20 09:37:49.152596] D [afr-self-heal-common.c:138:afr_sh_print_pending_matrix] 0-replicated_vol-replicate-0: pending_matrix: [ 0 0 ]
[2015-01-20 09:37:49.152621] D [afr-self-heal-common.c:887:afr_mark_sources] 0-replicated_vol-replicate-0: Number of sources: 1
[2015-01-20 09:37:49.152633] D [afr-self-heal-data.c:825:afr_lookup_select_read_child_by_txn_type] 0-replicated_vol-replicate-0: returning read_child: 0
[2015-01-20 09:37:49.152644] D [afr-common.c:1380:afr_lookup_select_read_child] 0-replicated_vol-replicate-0: Source selected as 0 for /testfl
[2015-01-20 09:37:49.152657] D [afr-common.c:1117:afr_lookup_build_response_params] 0-replicated_vol-replicate-0: Building lookup response from 0

We see that when we write on serv1, the RPC request is sent to both replicated_vol-client-0 and replicated_vol-client-1, whereas when we write on serv0 the request is sent only to replicated_vol-client-0; in the latter case the FUSE client is unaware of the presence of client-1.

I checked a bit more in the logs.
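As an aside for readers puzzling over the pending_matrix lines in these traces: row i of the matrix holds brick i's count of pending (un-acknowledged) operations against each brick, and a brick that no other brick accuses is picked as a source. The sketch below mirrors the idea behind afr_mark_sources(), not its exact code:

```python
def mark_sources(pending):
    """Return the source bricks for an AFR-style pending matrix.

    pending[i][j] = number of operations brick i has recorded as
    still pending on brick j.  A brick whose column is all zero is
    accused by nobody, so it is a source; every other brick is a
    sink awaiting self-heal.  Simplified illustration only.
    """
    n = len(pending)
    return [j for j in range(n)
            if all(pending[i][j] == 0 for i in range(n))]

# pending_matrix from the trace: [ 0 3 ] / [ 0 0 ]
# Brick 0 accuses brick 1 of 3 pending ops; nobody accuses brick 0,
# so brick 0 is the single source, matching "Number of sources: 1"
# and "Source selected as 0" in the log.
```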
When I turned on trace, I found many instances of these logs on serv0 but NOT on serv1:

[2015-01-20 09:21:15.520784] T [fuse-bridge.c:681:fuse_attr_cbk] 0-glusterfs-fuse: 53011: LOOKUP() / => 1
[2015-01-20 09:21:17.683088] T [rpc-clnt.c:422:rpc_clnt_reconnect] 0-replicated_vol-client-1: attempting reconnect
[2015-01-20 09:21:17.683159] D [name.c:155:client_fill_address_family] 0-replicated_vol-client-1: address-family not specified, guessing it to be inet from (remote-host: serv1)
[2015-01-20 09:21:17.683178] T [name.c:225:af_inet_client_get_remote_sockaddr] 0-replicated_vol-client-1: option remote-port missing in volume replicated_vol-client-1. Defaulting to 24007
[2015-01-20 09:21:17.683191] T [common-utils.c:188:gf_resolve_ip6] 0-resolver: flushing DNS cache
[2015-01-20 09:21:17.683202] T [common-utils.c:195:gf_resolve_ip6] 0-resolver: DNS cache not present, freshly probing hostname: serv1
[2015-01-20 09:21:17.683814] D [common-utils.c:237:gf_resolve_ip6] 0-resolver: returning ip-192.168.24.81 (port-24007) for hostname: serv1 and port: 24007
[2015-01-20 09:21:17.684139] D [common-utils.c:257:gf_resolve_ip6] 0-resolver: next DNS query will return: ip-192.168.24.81 port-24007
[2015-01-20 09:21:17.684164] T [socket.c:731:__socket_nodelay] 0-replicated_vol-client-1: NODELAY enabled for socket 10
[2015-01-20 09:21:17.684177] T [socket.c:790:__socket_keepalive] 0-replicated_vol-client-1: Keep-alive enabled for socket 10, interval 2, idle: 20
[2015-01-20 09:21:17.684236] W [common-utils.c:2247:gf_get_reserved_ports] 0-glusterfs: could not open the file /proc/sys/net/ipv4/ip_local_reserved_ports for getting reserved ports info (No such file or directory)
[2015-01-20 09:21:17.684253] W [common-utils.c:2280:gf_process_reserved_ports] 0-glusterfs: Not able to get reserved ports, hence there is a possibility that glusterfs may consume reserved port
[2015-01-20 09:21:17.684660] D [socket.c:605:__socket_shutdown] 0-replicated_vol-client-1: shutdown() returned -1. Transport endpoint is not connected
[2015-01-20 09:21:17.684699] T [rpc-clnt.c:519:rpc_clnt_connection_cleanup] 0-replicated_vol-client-1: cleaning up state in transport object 0x68a630
[2015-01-20 09:21:17.684731] D [socket.c:486:__socket_rwv] 0-replicated_vol-client-1: EOF on socket
[2015-01-20 09:21:17.684750] W [socket.c:514:__socket_rwv] 0-replicated_vol-client-1: readv failed (No data available)
[2015-01-20 09:21:17.684766] D [socket.c:1962:__socket_proto_state_machine] 0-replicated_vol-client-1: reading from socket failed. Error (No data available), peer (192.168.24.81:49198)

I could not find a 'remote-port' option in /var/lib/glusterd on either peer. Could somebody tell me where this configuration is looked up from? Also, some time later I rebooted serv0, and that seemed to solve the problem. However, stop+start of replicated_vol and a restart of /etc/init.d/glusterd did NOT solve the problem.

Any help on this matter will be greatly appreciated, as I need to provide robustness assurances for our setup.

Thanks a lot,
Anirban

P.S. Additional details:
glusterfs version: 3.4.2
Linux kernel version: 2.6.34
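The by-hand replica check in this report (cat'ing testfile on each brick) generalizes to any file and any number of bricks. The sketch below is illustrative only, not a Gluster tool; it compares local directories, whereas the real bricks in the post live on two hosts, so in practice you would gather the contents over ssh or similar:

```python
from pathlib import Path

def bricks_diverged(brick_roots, relpath):
    """Return True if the copies of `relpath` differ across brick roots.

    brick_roots: brick directories to compare, e.g. the two
    /mnt/bricks/replicated_vol/brick paths from the post (local
    stand-ins here for illustration).
    """
    contents = [(Path(root) / relpath).read_bytes() for root in brick_roots]
    return any(c != contents[0] for c in contents[1:])
```

With the symptoms above, the serv0 copy holds both lines while the serv1 copy is empty, so such a check would report divergence.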
Lindsay Mathieson
2015-Jan-20 12:16 UTC
[Gluster-users] In a replica 2 server, file-updates on one server missing on the other server
> replicated_vol is mounted at /mnt/replicated_vol on both serv0 and serv1.

The mounts: are these the base disk mounts? To access the replicated file system you need to mount the gluster volume itself:

mount -t glusterfs HOSTNAME-OR-IPADDRESS:/VOLNAME MOUNTDIR

e.g.

mount -t glusterfs server1:/test-volume /mnt/glusterfs

Good blog post for Ubuntu here: http://www.jamescoyle.net/how-to/439-mount-a-glusterfs-volume
Pranith Kumar Karampuri
2015-Jan-21 18:39 UTC
[Gluster-users] In a replica 2 server, file-updates on one server missing on the other server
hi,
Responses inline.
PS: You are chalkogen_oxygen?

Pranith

On 01/20/2015 05:34 PM, A Ghoshal wrote:
> [snip: volume info, reproduction steps, and trace logs quoted in full from the original message above, ending with:]
> [2015-01-20 09:21:17.684236] W [common-utils.c:2247:gf_get_reserved_ports] 0-glusterfs: could not open the file /proc/sys/net/ipv4/ip_local_reserved_ports for getting reserved ports info (No such file or directory)
> [2015-01-20 09:21:17.684253] W [common-utils.c:2280:gf_process_reserved_ports] 0-glusterfs: Not able to get reserved ports, hence there is a possibility that glusterfs may consume reserved port

Logs above suggest that the mount process couldn't assign a reserved port because it couldn't find the file /proc/sys/net/ipv4/ip_local_reserved_ports. I guess the reboot of the machine fixed it. Wonder why it was not found in the first place.

Pranith

> I could not find a 'remote-port' option in /var/lib/glusterd on either peer. Could somebody tell me where this configuration is looked up from? Also, sometime later, I rebooted serv0 and that seemed to solve the problem. However, stop+start of replicated_vol and restart of /etc/init.d/glusterd did NOT solve the problem.

Ignore that log. If no port is given in that volfile, it picks 24007 as the port, which is the default port where glusterd 'listens'.

_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users
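Pranith's reserved-ports remark can be illustrated with a short sketch. /proc/sys/net/ipv4/ip_local_reserved_ports holds comma-separated ports and port ranges; when the file is absent (the situation serv0's W-level trace lines complain about), there is no reserved list to honor, so any port may be picked. This sketches the failure mode only; it is not Gluster's gf_get_reserved_ports() code:

```python
def read_reserved_ports(path="/proc/sys/net/ipv4/ip_local_reserved_ports"):
    """Parse a reserved-ports file of the form "8080,49000-49002".

    Returns an empty set when the file is missing, which is exactly
    when a process can no longer avoid reserved ports.
    """
    try:
        text = open(path).read().strip()
    except OSError:
        # File absent (as on serv0): no reserved list is available.
        return set()
    ports = set()
    for part in text.split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            ports.update(range(int(lo), int(hi) + 1))
        else:
            ports.add(int(part))
    return ports
```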