Hi All,

I am having a few problems with a gluster configuration I'm using. The issues are:

1) Sometimes the gluster client running on ServerA stops serving files. Doing an "ls" on the mount point returns an empty directory. All the other clients seem fine when this happens. Unmounting and remounting the gluster directory temporarily "fixes" the problem. Sometimes it fixes it for a few minutes, sometimes for a day.

2) The log files in /var/log/glusterfs are not being rotated on ServerA. They are being rotated on ServerB.

3) On ServerB I have both /etc/glusterd and /etc/glusterfs. ServerA and the pure clients have only /etc/glusterfs.

Here is some info on my setup; if there is any info missing, please let me know and I'll provide it.

Gluster version: 3.3.0
OS: Ubuntu 12.04 (running on EC2)

On ServerA the following is filling up the /var/log/glusterfs/glustershd.log file:

[2012-12-18 19:20:46.819623] I [afr-common.c:1340:afr_launch_self_heal] 0-default-replicate-0: background entry self-heal triggered. path: <gfid:d88ad693-86fd-49eb-9360-7fe89d0e6cf6>, reason: lookup detected pending operations
[2012-12-18 19:20:46.831481] E [afr-self-heal-common.c:1087:afr_sh_common_lookup_resp_handler] 0-default-replicate-0: path <gfid:d88ad693-86fd-49eb-9360-7fe89d0e6cf6>/test_quote.pdf on subvolume default-client-1 => -1 (No such file or directory)
[2012-12-18 19:20:46.831512] I [afr-self-heal-entry.c:1904:afr_sh_entry_common_lookup_done] 0-default-replicate-0: <gfid:d88ad693-86fd-49eb-9360-7fe89d0e6cf6>/test_quote.pdf: Skipping entry self-heal because of gfid absence
[2012-12-18 19:20:46.833554] E [afr-self-heal-common.c:2156:afr_self_heal_completion_cbk] 0-default-replicate-0: background entry self-heal failed on <gfid:d88ad693-86fd-49eb-9360-7fe89d0e6cf6>

I have a single replicated volume called "default". There are two servers, each with one brick.

gluster> volume info

Volume Name: default
Type: Replicate
Volume ID: cb46f3ac-2ae1-4c9d-a2af-0df242b2acd3
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: ServerA:/ebs/gluster/default
Brick2: ServerB:/ebs/gluster/default

gluster> volume status all
Status of volume: default
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick ServerA:/ebs/gluster/default                      24009   Y       3575
Brick ServerB:/ebs/gluster/default                      24009   Y       2241
NFS Server on localhost                                 38467   Y       3581
Self-heal Daemon on localhost                           N/A     Y       3587
NFS Server on ServerB                                   38467   Y       2247
Self-heal Daemon on ServerB                             N/A     Y       2253

In addition to ServerA and ServerB (which are also running the gluster client) there are about 10 other systems acting as pure clients.

Does anybody have any ideas what might be causing my problems? Or additional things to check?

Thanks in advance!

- chris
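
P.S. A couple of extra details in case they help. The "fix" for problem 1 is just an unmount/remount of the client mount on ServerA, roughly like this (the mount point shown is representative, not my real path):

    # remount the gluster client mount on ServerA (example path)
    umount /mnt/gluster
    mount -t glusterfs ServerA:/default /mnt/gluster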
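
The clients all use the native glusterfs mount; the fstab entries look something like this (again, the mount point here is representative):

    # example fstab entry for the "default" volume
    ServerA:/default  /mnt/gluster  glusterfs  defaults,_netdev  0  0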
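
And for problem 2: would a generic logrotate stanza like the one below be a reasonable stopgap on ServerA? I'm not sure what the Ubuntu package is supposed to install, so treat this as a sketch; copytruncate is there so the gluster daemons don't need to be signalled.

    # /etc/logrotate.d/glusterfs (sketch, not the packaged config)
    /var/log/glusterfs/*.log {
        weekly
        rotate 4
        missingok
        notifempty
        compress
        copytruncate
    }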