Matthias Munnich
2010-Jul-06  15:19 UTC
[Gluster-users] trouble combining nufa, distribute and replicate
Hi!
I am trying to combine nufa, distribute and replicate but am running into
messages like
ls: cannot open directory .: Stale NFS file handle
when I try to list the mounted directory. I don't use NFS at all and am
puzzled as to what is going on. Attached you will find my client config file.
The comments marked "ok" are setups that work. However, more than one
disk is local, which led me to use three layers:
1: replicate, 2: distribute, 3: nufa
but somehow this is not working. Does anybody spot what is wrong?
Any help is appreciated. 
... Matt
-------------- next part --------------
## file auto generated by /opt/bin/glusterfs-volgen (mount.vol)
# Cmd line:
# $ /opt/bin/glusterfs-volgen -r 1 -n nufa-rep localhost:/bricks/sdc1 mentha:/bricks/sdc1 dahlia:/work/2/.glusterfs salvia:/work/2/.glusterfs
#
# and later hand edited for nufa/replicate
# RAID 1
# TRANSPORT-TYPE tcp
volume dahlia-1
    type protocol/client
    option transport-type tcp
    option remote-host dahlia
    option transport.socket.nodelay on
    option transport.remote-port 6996
    option remote-subvolume brick3
end-volume
volume salvia-1
    type protocol/client
    option transport-type tcp
    option remote-host salvia
    option transport.socket.nodelay on
    option transport.remote-port 6996
    option remote-subvolume brick3
end-volume
volume mentha-1
    type protocol/client
    option transport-type tcp
    option remote-host mentha
    option transport.socket.nodelay on
    option transport.remote-port 6996
    option remote-subvolume brick1
end-volume
volume localhost-1
    type protocol/client
    option transport-type tcp
    option remote-host localhost
    option transport.socket.nodelay on
    option transport.remote-port 6996
    option remote-subvolume brick1
end-volume
volume mentha-2
    type protocol/client
    option transport-type tcp
    option remote-host mentha
    option transport.socket.nodelay on
    option transport.remote-port 6996
    option remote-subvolume brick2
end-volume
volume localhost-2
    type protocol/client
    option transport-type tcp
    option remote-host localhost
    option transport.socket.nodelay on
    option transport.remote-port 6996
    option remote-subvolume brick2
end-volume
volume mirror-0
    type cluster/replicate
    subvolumes dahlia-1 salvia-1
end-volume
volume mirror-1
    type cluster/replicate
    subvolumes localhost-1 mentha-1
end-volume
volume mirror-2
    type cluster/replicate
    subvolumes localhost-2 mentha-2
end-volume
volume distribute-1
    type cluster/distribute
    subvolumes mirror-1 mirror-2
end-volume
volume nufa
    type cluster/nufa
    option local-volume-name distribute-1
    subvolumes mirror-0 distribute-1 
    #ok option local-volume-name mirror-1
    #ok subvolumes mirror-0 mirror-1 mirror-2
end-volume
volume readahead
    type performance/read-ahead
    option page-count 4
    subvolumes nufa
end-volume
#ok volume distribute
    #ok type cluster/distribute
    #ok subvolumes mirror-0 mirror-1
#ok end-volume
#ok volume readahead
    #ok type performance/read-ahead
    #ok option page-count 4
    #ok subvolumes distribute
#ok end-volume
volume iocache
    type performance/io-cache
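    # NOTE: glusterfs does not expand backticks in volfiles itself; the
    # expression below (one fifth of MemTotal, expressed in MB) is
    # presumably substituted by a wrapper before the file is read.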
    option cache-size `echo $(( $(grep 'MemTotal' /proc/meminfo | sed 's/[^0-9]//g') / 5120 ))`MB
    option cache-timeout 1
    subvolumes readahead
end-volume
volume quickread
    type performance/quick-read
    option cache-timeout 1
    option max-file-size 64kB
    subvolumes iocache
end-volume
volume writebehind
    type performance/write-behind
    option cache-size 4MB
    subvolumes quickread
end-volume
volume statprefetch
    type performance/stat-prefetch
    subvolumes writebehind
end-volume
Jeff Darcy
2010-Jul-06  16:30 UTC
[Gluster-users] trouble combining nufa, distribute and replicate
On 07/06/2010 11:19 AM, Matthias Munnich wrote:
> I am trying to combine nufa, distribute and replicate but am running into
> messages like
>
>     ls: cannot open directory .: Stale NFS file handle
>
> [...] but somehow this is not working. Does anybody spot what is wrong?

First, you can pretty much ignore the reference to NFS. It's just a bad
errno-to-string conversion.

Second, it seems like there are several places where we treat ESTALE
specially, but only one in the I/O path where we generate it. That one is in
dht-common.c, which is shared between distribute and nufa. The function is
dht_revalidate_cbk, and the ESTALE comes from detecting that the dht "layout"
structure is inconsistent. This leads me to wonder whether the problem has to
do with the fact that distribute/nufa both use this code and the same set of
extended attributes, and might be stepping on each other. In general, having
explored in some depth how these translators work, the idea of stacking
nufa/distribute on top of one another (or themselves) makes me a bit queasy.

From your volfile, it looks like you want to create files on one of two
filesystems replicated between localhost and mentha, and look for files
created elsewhere on dahlia and salvia. Assuming the four nodes are similar,
you might want to consider using nufa with local-volume-name set to one of
the two replicated subvolumes, and let mentha use the other replicated
subvolume for the other direction.

Also, you should be able to use the localhost filesystems with just
storage/posix instead of protocol/client (I assume you must have a separate
glusterfsd running for this setup to work), which would eliminate some
context switches and another layer of translator hierarchy. See
http://www.gluster.com/community/documentation/index.php/NUFA_with_single_process
for further examples and explanation, and good luck.
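To make that suggestion concrete, a per-node client volfile along the lines
Jeff describes might look like the sketch below: a single nufa layer directly
over the three mirrors (no extra distribute translator), local-volume-name
pointing at the mirror whose brick lives on this host, and the local disks
accessed through storage/posix rather than protocol/client. This is an
untested illustration, not something from the thread; the brick2 path is a
placeholder, since only /bricks/sdc1 appears in the original command line.

## Untested sketch for the host that owns brick1 and brick2; mentha's
## copy would differ only in the local-volume-name setting.

# Local disks served in-process via storage/posix instead of through
# a local glusterfsd over protocol/client
volume local-1
    type storage/posix
    option directory /bricks/sdc1
end-volume
volume local-2
    type storage/posix
    option directory /bricks/brick2   # placeholder path
end-volume
# Remote replica partners, unchanged from the original volfile
volume mentha-1
    type protocol/client
    option transport-type tcp
    option remote-host mentha
    option transport.remote-port 6996
    option remote-subvolume brick1
end-volume
volume mentha-2
    type protocol/client
    option transport-type tcp
    option remote-host mentha
    option transport.remote-port 6996
    option remote-subvolume brick2
end-volume
volume dahlia-1
    type protocol/client
    option transport-type tcp
    option remote-host dahlia
    option transport.remote-port 6996
    option remote-subvolume brick3
end-volume
volume salvia-1
    type protocol/client
    option transport-type tcp
    option remote-host salvia
    option transport.remote-port 6996
    option remote-subvolume brick3
end-volume
volume mirror-0
    type cluster/replicate
    subvolumes dahlia-1 salvia-1
end-volume
volume mirror-1
    type cluster/replicate
    subvolumes local-1 mentha-1
end-volume
volume mirror-2
    type cluster/replicate
    subvolumes local-2 mentha-2
end-volume
# One nufa layer over the three mirrors -- no separate distribute
# translator, so only one translator manages the dht layout xattrs
volume nufa
    type cluster/nufa
    option local-volume-name mirror-1   # mentha would use mirror-2 here
    subvolumes mirror-0 mirror-1 mirror-2
end-volume

The performance translators from the original file (read-ahead, io-cache,
quick-read, write-behind, stat-prefetch) would then stack on top of nufa
unchanged.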