gluster1206 at akxnet.de
2011-Aug-30 20:29 UTC
[Gluster-users] Gluster 3.2.1 : Mounted volumes "vanishes" on client side
Hi!
I am using Gluster 3.2.1 on a cluster of two/three openSUSE 11.3/11.4
servers, where the Gluster nodes act as both server and client.
While merging the cluster onto servers with higher performance, I also
tried the Gluster 3.3 beta.
Both versions show the same problem:
A single volume (holding the mail base, which is accessed by the POP3,
IMAP and SMTP servers) reports an "Input/output error" shortly after
being mounted and becomes inaccessible. The same volume, mounted on
another, idle server, still works.
ls /var/vmail
ls: cannot access /var/vmail: Input/output error
lsof /var/vmail
lsof: WARNING: can't stat() fuse.glusterfs file system /var/vmail
Output information may be incomplete.
lsof: status error on /var/vmail: Input/output error
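More detail shows up in the FUSE client log while reproducing this;
assuming the default log location (the mount point with slashes
replaced by dashes), that would be:

tail -f /var/log/glusterfs/var-vmail.log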
After unmounting and remounting the volume, the same thing happens.
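The remount is a plain GlusterFS FUSE mount, roughly (a sketch; mx01 is
just one of the brick hosts):

umount /var/vmail
mount -t glusterfs mx01.akxnet.de:/vmail /var/vmail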
I tried to recreate the volume, but this does not help.
Although the volume was just created, the log is full of "self-heal"
entries (but these should not cause the volume to disappear, right?).
I initially tried it with three bricks (one of which I had to remove)
and the following parameters:
Volume Name: vmail
Type: Replicate
Status: Started
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: mx00.akxnet.de:/data/vmail
Brick2: mx02.akxnet.de:/data/vmail
Brick3: mx01.akxnet.de:/data/vmail
Options Reconfigured:
network.ping-timeout: 15
performance.write-behind-window-size: 2097152
auth.allow: xx.xx.xx.xx,yy.yy.yy.yy,zz.zz.zz.zz,127.0.0.1
performance.io-thread-count: 64
performance.io-cache: on
performance.stat-prefetch: on
performance.quick-read: off
nfs.disable: on
performance.cache-size: 32MB (I also tried 64MB)
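For reference, the reconfigured options above were applied with
"gluster volume set", roughly like this (a sketch; values as listed
above):

gluster volume set vmail network.ping-timeout 15
gluster volume set vmail performance.write-behind-window-size 2097152
gluster volume set vmail auth.allow xx.xx.xx.xx,yy.yy.yy.yy,zz.zz.zz.zz,127.0.0.1
gluster volume set vmail performance.io-thread-count 64
gluster volume set vmail performance.io-cache on
gluster volume set vmail performance.stat-prefetch on
gluster volume set vmail performance.quick-read off
gluster volume set vmail nfs.disable on
gluster volume set vmail performance.cache-size 32MB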
and, after the delete/create, with two bricks and the following
parameters:
Volume Name: vmail
Type: Replicate
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: mx02.akxnet.de:/data/vmail
Brick2: mx01.akxnet.de:/data/vmail
Options Reconfigured:
performance.quick-read: off
nfs.disable: on
auth.allow: xx.xx.xx.xx,yy.yy.yy.yy,zz.zz.zz.zz,127.0.0.1
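The delete/create itself was the standard sequence with the two
remaining bricks, roughly (a sketch):

gluster volume stop vmail
gluster volume delete vmail
gluster volume create vmail replica 2 transport tcp mx02.akxnet.de:/data/vmail mx01.akxnet.de:/data/vmail
gluster volume start vmail
gluster volume set vmail performance.quick-read off
gluster volume set vmail nfs.disable on
gluster volume set vmail auth.allow xx.xx.xx.xx,yy.yy.yy.yy,zz.zz.zz.zz,127.0.0.1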
But the result is always the same.
The log entries:
[2011-08-30 22:10:45.376568] I
[afr-self-heal-common.c:1557:afr_self_heal_completion_cbk]
0-vmail-replicate-0: background data data self-heal completed on
/xxxxx.de/yyyyyyyyyy/.Tauchen/courierimapuiddb
[2011-08-30 22:10:45.385541] I [afr-common.c:801:afr_lookup_done]
0-vmail-replicate-0: background meta-data self-heal triggered. path:
/xxxxx.de/yyyyyyyyy/.Tauchen/courierimapkeywords
The volume is presently unusable. Any hints?
Pranith Kumar K
2011-Aug-31 03:05 UTC
[Gluster-users] Gluster 3.2.1 : Mounted volumes "vanishes" on client side
Hi,
This can happen if there is a split-brain on that directory. Could you
post the output of "getfattr -d -m . /data/vmail/var/vmail" on all the
bricks, so that we can confirm whether that is the case?
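For example, on each brick's top-level directory (adding -e hex makes
the AFR changelog values readable; the output below is only
illustrative):

getfattr -d -m . -e hex /data/vmail

# illustrative output
# trusted.afr.vmail-client-0=0x000000000000000000000000
# trusted.afr.vmail-client-1=0x000000000000000200000000
# trusted.gfid=0x...

A split-brain on the directory would show non-zero
trusted.afr.vmail-client-* pending counters on both bricks, each
blaming the other.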
Pranith.
On 08/31/2011 01:59 AM, gluster1206 at akxnet.de wrote:
> [...]