gluster1206 at akxnet.de
2011-Aug-30 20:29 UTC
[Gluster-users] Gluster 3.2.1 : Mounted volumes "vanishes" on client side
Hi!
I am using Gluster 3.2.1 on a cluster of two/three openSUSE 11.3/11.4
servers, where the Gluster nodes act as both server and client.
While merging the cluster onto servers with higher performance, I also
tried the Gluster 3.3 beta.
Both versions show the same problem:
A single volume (holding the mail base, which is accessed by the POP3,
IMAP and SMTP servers) reports an "Input/output error" shortly after
being mounted and becomes inaccessible. The same volume, mounted on
another, idle server, still works.
ls /var/vmail
ls: cannot access /var/vmail: Input/output error
lsof /var/vmail
lsof: WARNING: can't stat() fuse.glusterfs file system /var/vmail
Output information may be incomplete.
lsof: status error on /var/vmail: Input/output error
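More detail shows up in the FUSE client log while reproducing this;
assuming the default log location (the mount point with slashes
replaced by dashes), that would be:

tail -f /var/log/glusterfs/var-vmail.log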
After unmounting and remounting the volume, the same thing happens.
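The remount is a plain GlusterFS FUSE mount, roughly (a sketch; mx01 is
just one of the brick hosts):

umount /var/vmail
mount -t glusterfs mx01.akxnet.de:/vmail /var/vmail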
I tried to recreate the volume, but this does not help.
Although the volume was just created, the log is full of "self-heal"
entries (but these should not cause the volume to disappear, right?).
I initially tried it with three bricks (one of which I had to remove)
and the following parameters:
Volume Name: vmail
Type: Replicate
Status: Started
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: mx00.akxnet.de:/data/vmail
Brick2: mx02.akxnet.de:/data/vmail
Brick3: mx01.akxnet.de:/data/vmail
Options Reconfigured:
network.ping-timeout: 15
performance.write-behind-window-size: 2097152
auth.allow: xx.xx.xx.xx,yy.yy.yy.yy,zz.zz.zz.zz,127.0.0.1
performance.io-thread-count: 64
performance.io-cache: on
performance.stat-prefetch: on
performance.quick-read: off
nfs.disable: on
performance.cache-size: 32MB (I also tried 64MB)
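For reference, the reconfigured options above were applied with
"gluster volume set", roughly like this (a sketch; values as listed
above):

gluster volume set vmail network.ping-timeout 15
gluster volume set vmail performance.write-behind-window-size 2097152
gluster volume set vmail auth.allow xx.xx.xx.xx,yy.yy.yy.yy,zz.zz.zz.zz,127.0.0.1
gluster volume set vmail performance.io-thread-count 64
gluster volume set vmail performance.io-cache on
gluster volume set vmail performance.stat-prefetch on
gluster volume set vmail performance.quick-read off
gluster volume set vmail nfs.disable on
gluster volume set vmail performance.cache-size 32MB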
and, after the delete/create, with two bricks and the following
parameters:
Volume Name: vmail
Type: Replicate
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: mx02.akxnet.de:/data/vmail
Brick2: mx01.akxnet.de:/data/vmail
Options Reconfigured:
performance.quick-read: off
nfs.disable: on
auth.allow: xx.xx.xx.xx,yy.yy.yy.yy,zz.zz.zz.zz,127.0.0.1
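The delete/create itself was the standard sequence with the two
remaining bricks, roughly (a sketch):

gluster volume stop vmail
gluster volume delete vmail
gluster volume create vmail replica 2 transport tcp mx02.akxnet.de:/data/vmail mx01.akxnet.de:/data/vmail
gluster volume start vmail
gluster volume set vmail performance.quick-read off
gluster volume set vmail nfs.disable on
gluster volume set vmail auth.allow xx.xx.xx.xx,yy.yy.yy.yy,zz.zz.zz.zz,127.0.0.1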
But the result is always the same.
The log entries:
[2011-08-30 22:10:45.376568] I
[afr-self-heal-common.c:1557:afr_self_heal_completion_cbk]
0-vmail-replicate-0: background data data self-heal completed on
/xxxxx.de/yyyyyyyyyy/.Tauchen/courierimapuiddb
[2011-08-30 22:10:45.385541] I [afr-common.c:801:afr_lookup_done]
0-vmail-replicate-0: background meta-data self-heal triggered. path:
/xxxxx.de/yyyyyyyyy/.Tauchen/courierimapkeywords
The volume is presently unusable. Any hints?
Pranith Kumar K
2011-Aug-31 03:05 UTC
[Gluster-users] Gluster 3.2.1 : Mounted volumes "vanishes" on client side
Hi,
This can happen if there is a split-brain on that directory. Could you
post the output of "getfattr -d -m . /data/vmail/var/vmail" on all the
bricks, so that we can confirm whether that is the case?
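For example, on each brick's top-level directory (adding -e hex makes
the AFR changelog values readable; the output below is only
illustrative):

getfattr -d -m . -e hex /data/vmail

# illustrative output
# trusted.afr.vmail-client-0=0x000000000000000000000000
# trusted.afr.vmail-client-1=0x000000000000000200000000
# trusted.gfid=0x...

A split-brain on the directory would show non-zero
trusted.afr.vmail-client-* pending counters on both bricks, each
blaming the other.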
Pranith.
On 08/31/2011 01:59 AM, gluster1206 at akxnet.de wrote:
> [...]