On Mon, Dec 22, 2014 at 10:04:48AM -0500, John Baldwin wrote:
> On Sunday, December 21, 2014 5:27:46 am Richard Perini wrote:
> >
> > We're struggling with an NFS negative name caching issue that results in
> > a file created by an NFS client 'A' being invisible on client 'B' for up
> > to client A's negnametimeo value. In our scenario, a process on client
> > A creates a file, and passes a message to another process which may
> > run on client B. The second process expects the file created by A to
> > be available.
>
> Which NFS server are you using? If it is a FreeBSD NFS server, try changing
> vfs.timestamp_precision to 2 (or 3) and seeing if that reduces the amount of
> time you have to wait until the directory's ac timeout.
Yes, we are running FreeBSD on the server machines. Unfortunately, our
process really can't tolerate a delay of any length - either the file
is present or it's not.
> Another possible fix is to be careful not to open the file until you know
> it exists, if you still want to keep the reduced LOOKUP RPC load from caching
> negative lookups.
We have coded around the most common failure points with retry logic,
but this is a hack, and there are some third party libraries involved
that are not practical to fix in this manner.
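For illustration, a minimal sketch of the kind of retry wrapper meant
here. The helper name open_retry_enoent and its parameters are
hypothetical, not our actual code, and the retry budget is an assumption
that would have to cover the negative cache lifetime to be of any use:

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper: retry open(2) while it fails with ENOENT,
 * sleeping delay_ms milliseconds between attempts. Any other result
 * (success or a different error) is returned immediately.
 */
static int
open_retry_enoent(const char *path, int flags, int max_tries, long delay_ms)
{
	struct timespec delay;
	int fd, i;

	if (max_tries < 1)
		max_tries = 1;
	delay.tv_sec = delay_ms / 1000;
	delay.tv_nsec = (delay_ms % 1000) * 1000000L;

	for (i = 0; i < max_tries; i++) {
		fd = open(path, flags);
		if (fd >= 0 || errno != ENOENT)
			return (fd);
		/* Do not sleep after the final attempt. */
		if (i + 1 < max_tries)
			nanosleep(&delay, NULL);
	}
	return (-1);	/* errno is ENOENT from the last open(2) */
}
```

This only papers over the problem in code we control, which is why it
does not help with the third party libraries.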
> > We're running a mix of 9-stable and 10-stable machines, and the problem is
> > common to both.
> >
> > The obvious fix is to set the nfs mount option 'negnametimeo' to 0, but
> > unfortunately we also have 'amd' in the picture (which we also need in our
> > environment). Amd doesn't understand negnametimeo and ignores it, leaving
> > it set to the system default of 60 seconds (as shown by nfsstat -m).
>
> Have you tried autofs for 10-stable? Is it able to pass this option to NFS
> if you use it? If that works, I would prefer that to be the long term
> solution for this. I'm not a huge fan of adding kernel options to override
> each NFS default mount option if we can help it.
I just ran up autofs and automountd on 10-stable, set the negnametimeo
option in auto_master and it works a treat. However it will be quite
some time before we're able to shift off 9 which leaves us with the
kernel option as the easiest path.
I'd point out that the nfs client code in
/usr/src/sys/fs/nfsclient/nfsmount.h is already coded to allow override:
#ifndef NFS_DEFAULT_NEGNAMETIMEO
#define NFS_DEFAULT_NEGNAMETIMEO 60
#endif
so all that is required is the entry in the "options" file. Naturally
we can add that ourselves (the beauty of open source :-) but it would
be the only change to the native FreeBSD code for us, so of course
we'd prefer to see it in the tree.
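To make the override mechanism concrete, here is a standalone sketch of
the same #ifndef pattern outside the kernel. The accessor
default_negnametimeo is hypothetical and exists only so the value can be
inspected. The assumption is that a kernel config line of the form
"options NFS_DEFAULT_NEGNAMETIMEO=0" would play the role that -D plays
here, once the option is known to the build:

```c
/*
 * Standalone illustration of the override pattern; this is not the
 * kernel header itself. Building with
 *     cc -DNFS_DEFAULT_NEGNAMETIMEO=0 ...
 * defines the macro first, so the fallback below is skipped.
 */
#ifndef NFS_DEFAULT_NEGNAMETIMEO
#define NFS_DEFAULT_NEGNAMETIMEO 60	/* fallback default, in seconds */
#endif

/* Hypothetical accessor: returns whichever value was compiled in. */
static int
default_negnametimeo(void)
{
	return (NFS_DEFAULT_NEGNAMETIMEO);
}
```

Built without any -D flag this yields 60; built with
-DNFS_DEFAULT_NEGNAMETIMEO=0 it yields 0, with no source change.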
Regards, and compliments of the season.
--R