Hi All,

I am having a few problems with a gluster configuration I'm using. The issues are:

1) Sometimes the gluster client running on ServerA stops serving files. Doing an "ls" on the mount point returns an empty directory. All the other clients seem fine when this happens. Unmounting and remounting the gluster directory temporarily "fixes" the problem. Sometimes it fixes it for a few minutes, sometimes for a day.

2) The log files in /var/log/glusterfs are not being rotated on ServerA. They are being rotated on ServerB.

3) On ServerB I have both /etc/glusterd and /etc/glusterfs. ServerA and the pure clients have only /etc/glusterfs.

Here is some info on my setup; if there is any info missing, please let me know and I'll provide it.

Gluster version: 3.3.0
OS: Ubuntu 12.04 (running on EC2)

On ServerA the following is filling up the /var/log/glusterfs/glustershd.log file:

[2012-12-18 19:20:46.819623] I [afr-common.c:1340:afr_launch_self_heal] 0-default-replicate-0: background entry self-heal triggered. path: <gfid:d88ad693-86fd-49eb-9360-7fe89d0e6cf6>, reason: lookup detected pending operations
[2012-12-18 19:20:46.831481] E [afr-self-heal-common.c:1087:afr_sh_common_lookup_resp_handler] 0-default-replicate-0: path <gfid:d88ad693-86fd-49eb-9360-7fe89d0e6cf6>/test_quote.pdf on subvolume default-client-1 => -1 (No such file or directory)
[2012-12-18 19:20:46.831512] I [afr-self-heal-entry.c:1904:afr_sh_entry_common_lookup_done] 0-default-replicate-0: <gfid:d88ad693-86fd-49eb-9360-7fe89d0e6cf6>/test_quote.pdf: Skipping entry self-heal because of gfid absence
[2012-12-18 19:20:46.833554] E [afr-self-heal-common.c:2156:afr_self_heal_completion_cbk] 0-default-replicate-0: background entry self-heal failed on <gfid:d88ad693-86fd-49eb-9360-7fe89d0e6cf6>

I have a single replicated volume called "default". There are two servers, each with one brick.

gluster> volume info

Volume Name: default
Type: Replicate
Volume ID: cb46f3ac-2ae1-4c9d-a2af-0df242b2acd3
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: ServerA:/ebs/gluster/default
Brick2: ServerB:/ebs/gluster/default

gluster> volume status all
Status of volume: default
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick ServerA:/ebs/gluster/default                      24009   Y       3575
Brick ServerB:/ebs/gluster/default                      24009   Y       2241
NFS Server on localhost                                 38467   Y       3581
Self-heal Daemon on localhost                           N/A     Y       3587
NFS Server on ServerB                                   38467   Y       2247
Self-heal Daemon on ServerB                             N/A     Y       2253

In addition to ServerA and ServerB (which are also running the gluster client) there are about 10 other systems acting as pure clients.

Does anybody have any ideas what might be causing my problems? Or additional things to check?

Thanks in advance!

- chris
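
P.S. A couple of extra details in case they help. The "fix" for problem 1 is just an unmount/remount of the client mount on ServerA, roughly like this (the mount point shown is representative, not my real path):

    # remount the gluster client mount on ServerA (example path)
    umount /mnt/gluster
    mount -t glusterfs ServerA:/default /mnt/gluster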
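
The clients all use the native glusterfs mount; the fstab entries look something like this (again, the mount point here is representative):

    # example fstab entry for the "default" volume
    ServerA:/default  /mnt/gluster  glusterfs  defaults,_netdev  0  0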
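
And for problem 2: would a generic logrotate stanza like the one below be a reasonable stopgap on ServerA? I'm not sure what the Ubuntu package is supposed to install, so treat this as a sketch; copytruncate is there so the gluster daemons don't need to be signalled.

    # /etc/logrotate.d/glusterfs (sketch, not the packaged config)
    /var/log/glusterfs/*.log {
        weekly
        rotate 4
        missingok
        notifempty
        compress
        copytruncate
    }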