Sunil Mushran
2008-Jun-20 20:57 UTC
[Ocfs2-announce] Heads up regarding using nfs with ocfs2
All,

This heads up is only for users exporting OCFS2 volumes as NFS mounts. If that is not you, please disregard this email.

A bugzilla was recently filed reporting file system lockups when OCFS2 exported volumes were accessed from FreeBSD NFS clients. The reporter did not observe the lockups with Linux NFS clients.

http://oss.oracle.com/bugzilla/show_bug.cgi?id=977

Analysis showed that the lockup was caused by a deadlock triggered during the READDIRPLUS RPC call available in NFS3. The bugzilla has the kernel stack of the deadlocked process. Disabling READDIRPLUS in the NFS client stopped the deadlocks. The user was then able to run his regression successfully on a mixture of Linux 2.6, FreeBSD6 and FreeBSD7 NFS clients.

Even though the user originally reported that the problem did not reproduce with Linux NFS clients, that was probably only because the Linux NFS client appears to be more judicious in its use of that RPC call than FreeBSD. For example, it does not use it when accessing large directories, as was the case here. This means that Linux NFS clients can also experience the same process deadlock, though probably not as frequently as other NFS clients.

If you are using OCFS2 in such an environment and have experienced such lockups, look into disabling READDIRPLUS in the NFS clients. Refer to the NFS man page in your distribution for instructions. On Linux, it is typically done by mounting with the mount option "nordirplus".

Also, it would be helpful if all users using OCFS2 with NFS could let us know whether or not they have experienced the issue. Please include the number of nodes in the cluster, the kernel version, the OCFS2 version, the distribution running the NFS client, and whether it supports disabling READDIRPLUS.

While the actual problem will likely be addressed in NFS eventually, we are trying to decide whether we should work around the issue internally in OCFS2. The above information will help us decide.
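For Linux clients, one way to find mounts that still lack the option is to scan the client's mount table. The following is a minimal Python sketch and not part of OCFS2 or any NFS tooling. The function name is hypothetical, and the sketch assumes that /proc/mounts reports NFSv3 mounts with the file system type "nfs" and lists "nordirplus" among the options when it is in effect. Please verify that on your kernel before relying on it.

```python
def nfs_mounts_without_nordirplus(mounts_text):
    """Return the mount points of NFS mounts whose options lack 'nordirplus'.

    mounts_text is expected in /proc/mounts format:
        device mount_point fs_type options dump pass
    Assumption: NFSv3 mounts appear with fs_type "nfs".
    """
    flagged = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, fs_type, options = fields[1], fields[2], fields[3]
        if fs_type != "nfs":
            continue  # only NFS client mounts are of interest here
        if "nordirplus" not in options.split(","):
            flagged.append(mount_point)
    return flagged
```

On a client, the function could be fed the contents of /proc/mounts, for example nfs_mounts_without_nordirplus(open("/proc/mounts").read()). Any mount point it returns is a candidate for remounting with "nordirplus".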
Lastly, we would like to thank Sérgio Surkamp for helping us to resolve this issue.

Thank you
OCFS2 Team