We have about 800 CentOS 5.2 servers at our university. NFS is served from more than 10 NetApp frames, and we use autofs to mount our partitions. At times we can't cd into a directory: the error says the directory does not exist. It works on some servers but not on others. Typically we restart amd and autofs to resolve this, but sometimes even that does not work, and only a reboot fixes the problem. Does anyone know any tricks to check whether autofs is working properly on a system, and what I should do in a situation like this? TIA
On Tue, Dec 29, 2009 at 10:04 PM, Mag Gam <magawake at gmail.com> wrote:
> We have about 800 CentOS 5.2 servers and our university. We use NFS
> being served from over 10 NetApp frames. We use autofs for to mount up
> our partitions. There have been times where we can't cd into the
> directory. It says the directory does not exist. On some servers it
> works but on others it does not. Typically we restart amd and autofs
> to resolve this issue. But sometimes it does not even work. A reboot
> fixes the problem. I am wondering if anyone knows any tricks to check
> if autofs is working properly on the system and also what I should do
> in a situation like this.

I use a frontend to autofs called autohome. One thing to check is the nesting level of directories in the autohome configuration. If your nesting is too flat, you can end up with hundreds of separate mounts in a single directory, which can lead to some performance issues (not sure why). I don't see this unless I start hitting about 100 mounts inside a directory, though.
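To see how close a directory is to that rough figure of 100 mounts, one option is to group the mount points listed in /proc/mounts by parent directory. The following is only a sketch under assumptions: the function name mounts_per_parent is made up for illustration, it assumes a Python 3 interpreter is available, and it counts only entries whose filesystem type is nfs or nfs4.

```python
import os
from collections import Counter


def mounts_per_parent(mounts_text, fstypes=("nfs", "nfs4")):
    """Count mount points of the given filesystem types under each parent directory.

    mounts_text is expected to be in /proc/mounts format:
    device, mount point, fstype, options, dump, pass (whitespace separated).
    """
    counts = Counter()
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fstype = fields[1], fields[2]
        if fstype in fstypes:
            parent = os.path.dirname(mount_point.rstrip("/")) or "/"
            counts[parent] += 1
    return counts


def read_proc_mounts(path="/proc/mounts"):
    """Return the current mount table as text (Linux only)."""
    with open(path) as handle:
        return handle.read()
```

On a Linux host, mounts_per_parent(read_proc_mounts()).most_common(10) would list the ten parent directories holding the most NFS mounts.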
On Tue, Dec 29, 2009 at 5:04 PM, Mag Gam <magawake at gmail.com> wrote:
> There have been times where we can't cd into the
> directory. It says the directory does not exist. On some servers it
> works but on others it does not. Typically we restart amd and autofs
> to resolve this issue. But sometimes it does not even work. A reboot
> fixes the problem.

Sometimes I see "stale NFS handle". In that case, restarting autofs doesn't help. I use fuser or lsof to figure out which processes are accessing the dead mount, then kill those processes. Restarting autofs after killing all such processes seems more likely to work. I'm not sure the issue is completely resolved in my case, and YMMV.

Dave
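The check described above could be scripted roughly as follows. This is a sketch, not a tested procedure for these servers: the names classify_path and pids_holding are hypothetical, it assumes Python 3.7 or later and that lsof is installed, and os.stat can block for a long time on a hung hard mount, so it may need a timeout wrapper in practice.

```python
import errno
import os
import subprocess


def classify_path(path):
    """Return 'ok', 'stale', 'missing', or 'error:<NAME>' for a path."""
    try:
        os.stat(path)
    except OSError as exc:
        if exc.errno == errno.ESTALE:
            return "stale"  # the "stale NFS handle" case
        if exc.errno == errno.ENOENT:
            return "missing"  # the "directory does not exist" case
        return "error:" + errno.errorcode.get(exc.errno, str(exc.errno))
    return "ok"


def pids_holding(mount_point):
    """Return PIDs with files open on the filesystem mounted at mount_point.

    Uses `lsof -t`, which prints bare process IDs, one per line.
    """
    result = subprocess.run(
        ["lsof", "-t", mount_point],
        capture_output=True,
        text=True,
    )
    return [int(tok) for tok in result.stdout.split() if tok.isdigit()]
```

If classify_path reports "stale" for a mount point, pids_holding gives the processes to look at before restarting autofs. Whether to kill them is still a judgment call for the admin.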