thr3ads.net - Gluster users - [Gluster-users] nfs problems [Feb 2011]


David Lloyd

2011-Feb-16 16:09 UTC

[Gluster-users] nfs problems

We are getting lots of stale NFS file handle errors.

We have 4 nodes in our cluster; clients NFS-mount the volume from any node
in round-robin fashion.

It appears that one node has gone bad. The clients mounting that node can't
see the files that the others can see, ls -l gives rubbish for the metadata,
and we get lots of these lines in nfs.log:

[2011-02-16 15:33:32.538756] I [dht-layout.c:588:dht_layout_normalize] glustervol1-dht: found anomalies in /production/people.1. holes=2 overlaps=0
[2011-02-16 15:33:32.540759] I [dht-layout.c:588:dht_layout_normalize] glustervol1-dht: found anomalies in /production/people.nano. holes=2 overlaps=0
[2011-02-16 15:33:32.543682] I [dht-layout.c:588:dht_layout_normalize] glustervol1-dht: found anomalies in /production/people.2. holes=2 overlaps=0
[2011-02-16 15:33:32.507428] I [dht-layout.c:588:dht_layout_normalize] glustervol1-dht: found anomalies in /production/skeleton. holes=2 overlaps=0
[2011-02-16 15:33:32.509440] I [dht-layout.c:588:dht_layout_normalize] glustervol1-dht: found anomalies in /production/svn. holes=2 overlaps=0
[2011-02-16 15:33:32.511275] I [dht-layout.c:588:dht_layout_normalize] glustervol1-dht: found anomalies in /production/tempo. holes=2 overlaps=0

Any ideas?

Thanks
David


gluster 3.1.2
g3:/var/log/glusterfs # gluster volume info

Volume Name: glustervol1
Type: Distributed-Replicate
Status: Started
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: g1:/mnt/glus1
Brick2: g2:/mnt/glus1
Brick3: g3:/mnt/glus1
Brick4: g4:/mnt/glus1
Brick5: g1:/mnt/glus2
Brick6: g2:/mnt/glus2
Brick7: g3:/mnt/glus2
Brick8: g4:/mnt/glus2
Options Reconfigured:
diagnostics.dump-fd-stats: on
diagnostics.latency-measurement: off
network.ping-timeout: 20
performance.write-behind-window-size: 1mb
performance.cache-size: 1gb
performance.stat-prefetch: 1



-- 
David Lloyd
V Consultants
www.v-consultants.co.uk

Dan Bretherton

2011-Feb-25 17:55 UTC


[Gluster-users] nfs problems

> David Lloyd (david.lloyd at v-consultants.co.uk) wrote on
> Wed Feb 16 08:09:36 PST 2011:
>
> getting lots of stale nfs filehandle errors
>
> we have 4 nodes in our cluster, clients nfs mount the volume from any node
> in a round-robin
>
> it appears that one node has gone bad. the clients mounting that node can't
> see the files that the others can see. ls -l gives rubbish for the metadata,
> and get lots of these lines in the nfs.log:
>
> [2011-02-16 15:33:32.538756] I [dht-layout.c:588:dht_layout_normalize] glustervol1-dht: found anomalies in /production/people.1. holes=2 overlaps=0
> [2011-02-16 15:33:32.540759] I [dht-layout.c:588:dht_layout_normalize] glustervol1-dht: found anomalies in /production/people.nano. holes=2 overlaps=0
> [2011-02-16 15:33:32.543682] I [dht-layout.c:588:dht_layout_normalize] glustervol1-dht: found anomalies in /production/people.2. holes=2 overlaps=0
> [2011-02-16 15:33:32.507428] I [dht-layout.c:588:dht_layout_normalize] glustervol1-dht: found anomalies in /production/skeleton. holes=2 overlaps=0
> [2011-02-16 15:33:32.509440] I [dht-layout.c:588:dht_layout_normalize] glustervol1-dht: found anomalies in /production/svn. holes=2 overlaps=0
> [2011-02-16 15:33:32.511275] I [dht-layout.c:588:dht_layout_normalize] glustervol1-dht: found anomalies in /production/tempo. holes=2 overlaps=0
>
> Any ideas?
>
> Thanks
> David

David,
I recently had a problem that resulted in error messages referring to
"anomalies", "holes" and "overlaps" as above, and quite frequently to
"mismatching layouts" as well.  In my case I had caused the problem
myself, by mounting some of the backend filesystems incorrectly (without
extended attribute support explicitly enabled).  That's obviously not the
case here, but as nobody else has replied to this posting I thought I
would briefly explain how I got rid of the errors in my volume.  Once the
backend filesystems were mounted correctly (with the "user_xattr" mount
option) I performed the following procedure to "sanitize" them, following
advice from Gluster.  Note that I was advised to take a full backup before
attempting this procedure.

1) Stop the volume with "gluster volume stop ..."
2) Run a script called "backend-xattr-sanitize.sh" on each of the 
backend filesystems.  The script can be downloaded from here: 
https://github.com/gluster/glusterfs/blob/master/extras/backend-xattr-sanitize.sh
3) Start the volume and mount it
4) Run "find . | xargs stat >>/dev/null 2>&1" in the mount point
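
For anyone repeating step 4 on a volume that contains file names with
spaces, here is a minimal sketch of the same walk using null-delimited
names.  The function name stat_walk and the mount point path are my own
placeholders, and -print0 / -0 are GNU/BSD extensions rather than strict
POSIX, so treat this as an illustration rather than the exact command
Gluster recommended.

```shell
# stat_walk DIR
# Walk DIR and stat every entry, so that each file and directory is
# looked up through the client mount at least once.
# (stat_walk is a hypothetical helper name, not part of GlusterFS.)
stat_walk() {
    dir=$1
    # -print0 / -0 keep names containing spaces or newlines intact.
    # The stat output is discarded; only the lookups matter.
    find "$dir" -print0 | xargs -0 stat >/dev/null 2>&1
}
```

It would be called as "stat_walk /path/to/mountpoint" once the volume has
been started and mounted again.  The return status is that of xargs, so a
non-zero result means at least one stat call failed.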

It would probably be wise for you to take independent advice before 
using the "sanitize" script yourself, but it certainly worked for me.
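
Since my own problem came from backend filesystems mounted without
"user_xattr", it may be worth checking the mount options before touching
anything else.  The sketch below is only an assumption about how one might
do that: has_mount_option is a made-up helper, it reads a file laid out
like /proc/mounts, and it assumes each brick directory (/mnt/glus1 and
/mnt/glus2 in the volume info above) is itself a mount point.  On
filesystems where user extended attributes are enabled by default the
option may not be listed at all, so "not listed" is a hint, not proof.

```shell
# has_mount_option MOUNTS_FILE MOUNT_POINT OPTION
# Succeeds if OPTION appears in the option list for MOUNT_POINT in a
# file laid out like /proc/mounts (device, mount point, type, options).
# (has_mount_option is a hypothetical helper, not a Gluster tool.)
has_mount_option() {
    awk -v mp="$2" -v opt="$3" '
        $2 == mp {
            # Options are a comma-separated list in the fourth field.
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == opt) found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "$1"
}

# Example: check the two brick directories named in the volume info.
for brick in /mnt/glus1 /mnt/glus2; do
    if has_mount_option /proc/mounts "$brick" user_xattr; then
        echo "$brick: user_xattr set"
    else
        echo "$brick: user_xattr not listed"
    fi
done
```

Matching whole options after splitting on commas avoids false positives
from options that merely contain the string, such as "nouser_xattr".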

-Dan.
