Fabricio Cannini
2011-Jan-11 14:01 UTC
[Gluster-users] Frequent "stale nfs file handle" error
Hi all.

I've been having this error very frequently, at least once a week. Whenever this happens, restarting all the gluster daemons makes things work again.

This is the hardware I'm using:

22 nodes
2x Intel Xeon 5420 2.5GHz, 16GB DDR2 ECC, 1 SATA2 HD of 750GB, of which ~600GB is a partition ( /glstfs ) dedicated to gluster. Each node has 1 Mellanox MT25204 [InfiniHost III Lx] InfiniBand DDR HCA used by gluster through the 'verbs' interface. The switch is a Voltaire ISR 9024S/D.
Each node is also a client of the gluster volume, which is accessed through the '/scratch' mount-point.
The machine itself is a scientific cluster, with all nodes and the head running Debian Squeeze amd64, with stock 3.0.5 packages.

These are the server and client configs:

Client config
http://pastebin.com/6d4BjQwd

Server config
http://pastebin.com/4ZmX9ir1

And here are some of the messages in the head node log:
http://pastebin.com/gkf3CmK9

If anybody can make sense of why this is happening, I'd be really, really thankful.
Łukasz Jagiełło
2011-Jan-12 09:53 UTC
[Gluster-users] Frequent "stale nfs file handle" error
2011/1/11 Fabricio Cannini <fcannini at gmail.com>:
> Hi all.
>
> I've been having this error very frequently, at least once in a week.
> Whenever this happens, restarting all the gluster daemons makes things work
> again.
>
> This is the hardware i'm using:
>
> 22 nodes
> 2x Intel xeon 5420 2.5GHz , 16GB ddr2 ECC , 1 sata2 hd of 750GB.
> Of which ~600GB is a partition ( /glstfs ) dedicated to gluster. Each node
> have 1 Mellanox MT25204 [InfiniHost III Lx] Inifiniband DDR HCA used by
> gluster through the 'verbs' interface. The switch is a Voltaire ISR 9024S/D.
> Each node also is a client of the gluster volume, that is accessed through the
> '/scratch' mount-point.
> The machine itself is a scientific cluster, with all nodes and the head running
> Debian Squeeze amd64, with stock 3.0.5 packages.
>
> These are the server and client configs:
>
> Client config
> http://pastebin.com/6d4BjQwd
>
> Server config
> http://pastebin.com/4ZmX9ir1
>
> And here are some of the messages in the head node log:
> http://pastebin.com/gkf3CmK9
>
> If anybody can make a sense of why is it happening, i'd be really really
> thankful.

Got same problem at 3.1.1 - 3.1.2qa4

--
Łukasz Jagiełło
lukasz<at>jagiello<dot>org