Matthias Munnich
2010-Jul-06 15:19 UTC
[Gluster-users] trouble combining nufa, distribute and replicate
Hi!
I am trying to combine nufa, distribute and replicate but am running into
messages like

ls: cannot open directory .: Stale NFS file handle

when I try to list the mounted directory. I don't use NFS at all and am
puzzled as to what is going on. Attached is my client config file.
The comments marked "ok" are setups which work. However, more than
one disk is local, which led me to use 3 layers:
1: replicate, 2: distribute, 3: nufa
but somehow this is not working. Does anybody spot what is wrong?
Any help is appreciated.
... Matt
-------------- next part --------------
## file auto generated by /opt/bin/glusterfs-volgen (mount.vol)
# Cmd line:
# $ /opt/bin/glusterfs-volgen -r 1 -n nufa-rep localhost:/bricks/sdc1 \
#     mentha:/bricks/sdc1 dahlia:/work/2/.glusterfs salvia:/work/2/.glusterfs
#
# and later hand edited for nufa/replicate
# RAID 1
# TRANSPORT-TYPE tcp
volume dahlia-1
  type protocol/client
  option transport-type tcp
  option remote-host dahlia
  option transport.socket.nodelay on
  option transport.remote-port 6996
  option remote-subvolume brick3
end-volume

volume salvia-1
  type protocol/client
  option transport-type tcp
  option remote-host salvia
  option transport.socket.nodelay on
  option transport.remote-port 6996
  option remote-subvolume brick3
end-volume

volume mentha-1
  type protocol/client
  option transport-type tcp
  option remote-host mentha
  option transport.socket.nodelay on
  option transport.remote-port 6996
  option remote-subvolume brick1
end-volume

volume localhost-1
  type protocol/client
  option transport-type tcp
  option remote-host localhost
  option transport.socket.nodelay on
  option transport.remote-port 6996
  option remote-subvolume brick1
end-volume

volume mentha-2
  type protocol/client
  option transport-type tcp
  option remote-host mentha
  option transport.socket.nodelay on
  option transport.remote-port 6996
  option remote-subvolume brick2
end-volume

volume localhost-2
  type protocol/client
  option transport-type tcp
  option remote-host localhost
  option transport.socket.nodelay on
  option transport.remote-port 6996
  option remote-subvolume brick2
end-volume
volume mirror-0
  type cluster/replicate
  subvolumes dahlia-1 salvia-1
end-volume

volume mirror-1
  type cluster/replicate
  subvolumes localhost-1 mentha-1
end-volume

volume mirror-2
  type cluster/replicate
  subvolumes localhost-2 mentha-2
end-volume

volume distribute-1
  type cluster/distribute
  subvolumes mirror-1 mirror-2
end-volume

volume nufa
  type cluster/nufa
  option local-volume-name distribute-1
  subvolumes mirror-0 distribute-1
  #ok option local-volume-name mirror-1
  #ok subvolumes mirror-0 mirror-1 mirror-2
end-volume
volume readahead
  type performance/read-ahead
  option page-count 4
  subvolumes nufa
end-volume

#ok volume distribute
#ok   type cluster/distribute
#ok   subvolumes mirror-0 mirror-1
#ok end-volume

#ok volume readahead
#ok   type performance/read-ahead
#ok   option page-count 4
#ok   subvolumes distribute
#ok end-volume

volume iocache
  type performance/io-cache
  option cache-size `echo $(( $(grep 'MemTotal' /proc/meminfo | sed 's/[^0-9]//g') / 5120 ))`MB
  option cache-timeout 1
  subvolumes readahead
end-volume

volume quickread
  type performance/quick-read
  option cache-timeout 1
  option max-file-size 64kB
  subvolumes iocache
end-volume

volume writebehind
  type performance/write-behind
  option cache-size 4MB
  subvolumes quickread
end-volume

volume statprefetch
  type performance/stat-prefetch
  subvolumes writebehind
end-volume
Jeff Darcy
2010-Jul-06 16:30 UTC
[Gluster-users] trouble combining nufa, distribute and replicate
On 07/06/2010 11:19 AM, Matthias Munnich wrote:
> Hi!
> I am trying to combine nufa, distribute and replicate but am running into
> messages like
>
> ls: cannot open directory .: Stale NFS file handle
>
> when I try to list the mounted directory. I don't use NFS at all and am
> puzzled as to what is going on. Attached is my client config file.
> The comments marked "ok" are setups which work. However, more than
> one disk is local, which led me to use 3 layers:
> 1: replicate, 2: distribute, 3: nufa
> but somehow this is not working. Does anybody spot what is wrong?
> Any help is appreciated.

First, you can pretty much ignore the reference to NFS. It's just a bad
errno-to-string conversion. Second, it seems like there are several places
where we treat ESTALE specially, but only one in the I/O path where we
generate it. That one is in dht-common.c, which is shared between distribute
and nufa. The function is dht_revalidate_cbk, and the ESTALE comes from
detecting that the dht "layout" structure is inconsistent. This leads me to
wonder whether the problem has to do with the fact that distribute/nufa both
use this code and the same set of extended attributes, and might be stepping
on each other. In general, having explored in some depth how these
translators work, the idea of stacking nufa/distribute on top of one another
(or themselves) makes me a bit queasy.

From your volfile, it looks like you want to create files on one of two
filesystems replicated between localhost and mentha, and look for files
created elsewhere on dahlia and salvia. Assuming the four nodes are similar,
you might want to consider using nufa with local-volume-name set to one of
the two replicated subvolumes, and let mentha use the other replicated
subvolume for the other direction. Also, you should be able to use the
localhost filesystems with just storage/posix instead of protocol/client (I
assume you must have a separate glusterfsd running for this setup to work),
which would eliminate some context switches and another layer of translator
hierarchy. See
http://www.gluster.com/community/documentation/index.php/NUFA_with_single_process
for further examples and explanation, and good luck.
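[Editor's note: a minimal sketch of the layout Jeff suggests, reusing the host
and brick names from the attached volfile. The directory for the second local
disk and the trimmed option set are illustrative guesses, not a tested
configuration:

volume local-brick-1
  type storage/posix
  option directory /bricks/sdc1        # local disk previously exported as brick1
end-volume

volume local-brick-2
  type storage/posix
  option directory /bricks/sdd1        # hypothetical path for the second local disk
end-volume

volume mentha-1
  type protocol/client
  option transport-type tcp
  option remote-host mentha
  option remote-subvolume brick1
end-volume

volume mentha-2
  type protocol/client
  option transport-type tcp
  option remote-host mentha
  option remote-subvolume brick2
end-volume

# mirror-0 (dahlia-1/salvia-1) stays exactly as in the attached volfile

volume mirror-1
  type cluster/replicate
  subvolumes local-brick-1 mentha-1
end-volume

volume mirror-2
  type cluster/replicate
  subvolumes local-brick-2 mentha-2
end-volume

volume nufa
  type cluster/nufa
  option local-volume-name mirror-1    # mentha's volfile would name mirror-2 here
  subvolumes mirror-0 mirror-1 mirror-2
end-volume

This keeps a single dht-style translator (nufa) over three replica sets, so the
layout xattrs are managed in one place, and serves the local disks in-process
instead of looping back through a local glusterfsd; the performance translators
stack on top of nufa exactly as before.]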