Matthias Munnich
2010-Jul-06  15:19 UTC
[Gluster-users] trouble combining nufa, distribute and replicate
Hi!
I am trying to combine nufa, distribute and replicate but am running into
messages like
ls: cannot open directory .: Stale NFS file handle
when I try to list the mounted directory. I don't use NFS at all and am
puzzled as to what is going on. Attached you will find my client config file.
The comments marked "ok" are setups that work. However, more than one
disk is local, which led me to use three layers:
1: replicate, 2: distribute, 3: nufa
but somehow this is not working. Does anybody spot what is wrong?
Any help is appreciated. 
... Matt
-------------- next part --------------
## file auto generated by /opt/bin/glusterfs-volgen (mount.vol)
# Cmd line:
# $ /opt/bin/glusterfs-volgen -r 1 -n nufa-rep localhost:/bricks/sdc1 mentha:/bricks/sdc1 dahlia:/work/2/.glusterfs salvia:/work/2/.glusterfs
#
# and later hand edited for nufa/replicate
# RAID 1
# TRANSPORT-TYPE tcp
volume dahlia-1
    type protocol/client
    option transport-type tcp
    option remote-host dahlia
    option transport.socket.nodelay on
    option transport.remote-port 6996
    option remote-subvolume brick3
end-volume
volume salvia-1
    type protocol/client
    option transport-type tcp
    option remote-host salvia
    option transport.socket.nodelay on
    option transport.remote-port 6996
    option remote-subvolume brick3
end-volume
volume mentha-1
    type protocol/client
    option transport-type tcp
    option remote-host mentha
    option transport.socket.nodelay on
    option transport.remote-port 6996
    option remote-subvolume brick1
end-volume
volume localhost-1
    type protocol/client
    option transport-type tcp
    option remote-host localhost
    option transport.socket.nodelay on
    option transport.remote-port 6996
    option remote-subvolume brick1
end-volume
volume mentha-2
    type protocol/client
    option transport-type tcp
    option remote-host mentha
    option transport.socket.nodelay on
    option transport.remote-port 6996
    option remote-subvolume brick2
end-volume
volume localhost-2
    type protocol/client
    option transport-type tcp
    option remote-host localhost
    option transport.socket.nodelay on
    option transport.remote-port 6996
    option remote-subvolume brick2
end-volume
volume mirror-0
    type cluster/replicate
    subvolumes dahlia-1 salvia-1
end-volume
volume mirror-1
    type cluster/replicate
    subvolumes localhost-1 mentha-1
end-volume
volume mirror-2
    type cluster/replicate
    subvolumes localhost-2 mentha-2
end-volume
volume distribute-1
    type cluster/distribute
    subvolumes mirror-1 mirror-2
end-volume
volume nufa
    type cluster/nufa
    option local-volume-name distribute-1
    subvolumes mirror-0 distribute-1 
    #ok option local-volume-name mirror-1
    #ok subvolumes mirror-0 mirror-1 mirror-2
end-volume
volume readahead
    type performance/read-ahead
    option page-count 4
    subvolumes nufa
end-volume
#ok volume distribute
    #ok type cluster/distribute
    #ok subvolumes mirror-0 mirror-1
#ok end-volume
#ok volume readahead
    #ok type performance/read-ahead
    #ok option page-count 4
    #ok subvolumes distribute
#ok end-volume
volume iocache
    type performance/io-cache
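    # NOTE: glusterfs does not expand backticks in volfiles itself; the
    # expression below (one fifth of MemTotal, expressed in MB) is
    # presumably substituted by a wrapper before the file is read.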
    option cache-size `echo $(( $(grep 'MemTotal' /proc/meminfo | sed 's/[^0-9]//g') / 5120 ))`MB
    option cache-timeout 1
    subvolumes readahead
end-volume
volume quickread
    type performance/quick-read
    option cache-timeout 1
    option max-file-size 64kB
    subvolumes iocache
end-volume
volume writebehind
    type performance/write-behind
    option cache-size 4MB
    subvolumes quickread
end-volume
volume statprefetch
    type performance/stat-prefetch
    subvolumes writebehind
end-volume
Jeff Darcy
2010-Jul-06  16:30 UTC
[Gluster-users] trouble combining nufa, distribute and replicate
On 07/06/2010 11:19 AM, Matthias Munnich wrote:
> I am trying to combine nufa, distribute and replicate but am running into
> messages like
>
>     ls: cannot open directory .: Stale NFS file handle
>
> [...] but somehow this is not working. Does anybody spot what is wrong?

First, you can pretty much ignore the reference to NFS. It's just a bad
errno-to-string conversion.

Second, it seems like there are several places where we treat ESTALE
specially, but only one in the I/O path where we generate it. That one is in
dht-common.c, which is shared between distribute and nufa. The function is
dht_revalidate_cbk, and the ESTALE comes from detecting that the dht "layout"
structure is inconsistent. This leads me to wonder whether the problem has to
do with the fact that distribute/nufa both use this code and the same set of
extended attributes, and might be stepping on each other. In general, having
explored in some depth how these translators work, the idea of stacking
nufa/distribute on top of one another (or themselves) makes me a bit queasy.

From your volfile, it looks like you want to create files on one of two
filesystems replicated between localhost and mentha, and look for files
created elsewhere on dahlia and salvia. Assuming the four nodes are similar,
you might want to consider using nufa with local-volume-name set to one of
the two replicated subvolumes, and let mentha use the other replicated
subvolume for the other direction.

Also, you should be able to use the localhost filesystems with just
storage/posix instead of protocol/client (I assume you must have a separate
glusterfsd running for this setup to work), which would eliminate some
context switches and another layer of translator hierarchy. See
http://www.gluster.com/community/documentation/index.php/NUFA_with_single_process
for further examples and explanation, and good luck.
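To make that suggestion concrete, a per-node client volfile along the lines
Jeff describes might look like the sketch below: a single nufa layer directly
over the three mirrors (no extra distribute translator), local-volume-name
pointing at the mirror whose brick lives on this host, and the local disks
accessed through storage/posix rather than protocol/client. This is an
untested illustration, not something from the thread; the brick2 path is a
placeholder, since only /bricks/sdc1 appears in the original command line.

## Untested sketch for the host that owns brick1 and brick2; mentha's
## copy would differ only in the local-volume-name setting.

# Local disks served in-process via storage/posix instead of through
# a local glusterfsd over protocol/client
volume local-1
    type storage/posix
    option directory /bricks/sdc1
end-volume
volume local-2
    type storage/posix
    option directory /bricks/brick2   # placeholder path
end-volume
# Remote replica partners, unchanged from the original volfile
volume mentha-1
    type protocol/client
    option transport-type tcp
    option remote-host mentha
    option transport.remote-port 6996
    option remote-subvolume brick1
end-volume
volume mentha-2
    type protocol/client
    option transport-type tcp
    option remote-host mentha
    option transport.remote-port 6996
    option remote-subvolume brick2
end-volume
volume dahlia-1
    type protocol/client
    option transport-type tcp
    option remote-host dahlia
    option transport.remote-port 6996
    option remote-subvolume brick3
end-volume
volume salvia-1
    type protocol/client
    option transport-type tcp
    option remote-host salvia
    option transport.remote-port 6996
    option remote-subvolume brick3
end-volume
volume mirror-0
    type cluster/replicate
    subvolumes dahlia-1 salvia-1
end-volume
volume mirror-1
    type cluster/replicate
    subvolumes local-1 mentha-1
end-volume
volume mirror-2
    type cluster/replicate
    subvolumes local-2 mentha-2
end-volume
# One nufa layer over the three mirrors -- no separate distribute
# translator, so only one translator manages the dht layout xattrs
volume nufa
    type cluster/nufa
    option local-volume-name mirror-1   # mentha would use mirror-2 here
    subvolumes mirror-0 mirror-1 mirror-2
end-volume

The performance translators from the original file (read-ahead, io-cache,
quick-read, write-behind, stat-prefetch) would then stack on top of nufa
unchanged.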