Adam Randall
2013-Jul-17 00:07 UTC
[Ocfs2-users] OCFS2, NFS and random Stale NFS file handles
Please forgive my lack of experience, but I've just recently started working deeply with ocfs2 and am not familiar with all its caveats.

We've just deployed two servers that have SAN arrays attached to them. These arrays are synchronized with DRBD in master/master mode, with ocfs2 configured on top of that. In all my testing everything worked well, except for an issue with symbolic links throwing an exception in the kernel (this was fixed by applying a patch I found here: comments.gmane.org/gmane.comp.file-systems.ocfs2.devel/8008). Of these machines, one is designated the master and the other is its backup.

The host is Gentoo Linux running kernel 3.8.13.

I have four other machines that connect to the master ocfs2 partition using NFS. The problem I'm having is that on these machines, I'm randomly getting read errors while trying to enter directories over NFS. In all of these cases except one, the directories are unavailable immediately after they are created. The error that comes back is always something like this:

ls: cannot access /mnt/storage/documents/818/8189794/: Stale NFS file handle

The mount point is /mnt/storage. Other directories on the mount are available, and on other servers the same directory can be accessed perfectly fine.

I haven't been able to reproduce this issue in isolated testing.

The four machines that connect via NFS are doing one of two things:

1) processing e-mail through a PHP-driven daemon (read and write, creating directories)
2) serving report files in PDF format over the web via a PHP web application (read only)

I believe the ocfs2 version is 1.5. I found this in the kernel source itself, but haven't figured out how to determine it from the shell. ocfs2-tools is version 1.8.2, which is what ocfs2 wanted (maybe this is ocfs2 1.8 then?).

The only other path I can think to take is to abandon OCFS2 and use DRBD in master/slave mode with ext4 on top of that.
This would still provide me with the redundancy I want, but at the cost of not being able to use both machines simultaneously.

If anyone has any advice, I'd love to hear it.

Thanks in advance,

Adam.

--
Adam Randall
http://www.xaren.net
AIM: blitz574
Twitter: @randalla0622

"To err is human... to really foul up requires the root password."

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://oss.oracle.com/pipermail/ocfs2-users/attachments/20130716/17839c02/attachment.html
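Editorial note on the question above about determining the ocfs2 version from the shell: a minimal sketch that probes a couple of places where the version may be exposed. Both locations are assumptions and vary by kernel and build; the function falls back to "unknown" rather than guessing.

```shell
# Sketch: probe places where the running ocfs2 version may be exposed.
# Both locations are assumptions; output format varies by kernel.
get_ocfs2_version() {
  if [ -r /proc/fs/ocfs2/version ]; then
    # Some kernels expose the version string here.
    cat /proc/fs/ocfs2/version
  elif modinfo ocfs2 2>/dev/null | grep -qi '^version:'; then
    # If ocfs2 is built as a module, modinfo reports its version.
    modinfo ocfs2 2>/dev/null | awk 'tolower($1) == "version:" { print $2; exit }'
  else
    echo "unknown"
  fi
}

get_ocfs2_version
```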
Patrick J. LoPresti
2013-Jul-17 00:15 UTC
[Ocfs2-users] OCFS2, NFS and random Stale NFS file handles
What version is the NFS mount? ("cat /proc/mounts" on the NFS client)

NFSv2 only allowed 64 bits in the file handle. With the "subtree_check" option on the NFS server, 32 of those bits are used for the subtree check, leaving only 32 for the inode. (This is from memory; I may have the exact numbers wrong, but the principle applies.)

See <https://oss.oracle.com/projects/ocfs2/dist/documentation/v1.2/ocfs2_faq.html#NFS>

If you run "ls -lid <directory>" for directories that work and those that fail, and you find that the failing directories all have huge inode numbers, that will help confirm that this is the problem.

Also, if you are using NFSv2 and switching to v3, or setting the "no_subtree_check" option, fixes the problem, that will also help confirm that this is the problem. :-)

- Pat

On Tue, Jul 16, 2013 at 5:07 PM, Adam Randall <randalla at gmail.com> wrote:
> [...]
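Editorial note: the "ls -lid" check suggested above can be scripted. A small sketch that reports whether a directory's inode number fits in 32 bits; the 32-bit cutoff follows Pat's from-memory description of the handle layout, so treat the threshold as illustrative rather than authoritative.

```shell
# Sketch: does a directory's inode number fit in 32 bits? The cutoff
# follows Pat's (hedged) description of the NFS file-handle layout.
check_inode_width() {
  ino=$(stat -c %i "$1") || return 1   # GNU stat: print inode number
  if [ "$ino" -gt 4294967295 ]; then   # 2^32 - 1
    echo "$1: inode $ino does NOT fit in 32 bits"
  else
    echo "$1: inode $ino fits in 32 bits"
  fi
}

# Replace with a failing directory, e.g. /mnt/storage/documents/818/8189794
check_inode_width /tmp
```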
Adam Randall
2013-Jul-17 17:10 UTC
[Ocfs2-users] OCFS2, NFS and random Stale NFS file handles
The problem I have with NFSv3 is that it's difficult to make it work with iptables. I'll give it a go, however, and see how it affects things.

Also, should I be considering iSCSI instead of NFS?

Adam.

On Wed, Jul 17, 2013 at 7:51 AM, Patrick J. LoPresti <patl at patl.com> wrote:
> I would seriously try "nfsvers=3" in those mount options.
>
> In my experience, Linux NFS features take around 10 years before the bugs are shaken out. And NFSv4 is much, much more complicated than most. (They added a "generation number" to the file handle, but if the underlying file system does not implement generation numbers, I have no idea what will happen...)
>
> - Pat
>
> On Wed, Jul 17, 2013 at 7:47 AM, Adam Randall <randalla at gmail.com> wrote:
> > My changes to exports had no effect, it seems. I awoke to four errors from my processing engine. All of them came from the same server, which makes me curious. I've turned that one off and will see what happens.
> >
> > On Tue, Jul 16, 2013 at 11:22 PM, Adam Randall <randalla at gmail.com> wrote:
> >> I've been doing more digging, and I've changed some of the configuration:
> >>
> >> 1) I've changed my nfs mount options to this:
> >>
> >> 192.168.0.160:/mnt/storage /mnt/i2xstorage nfs defaults,nosuid,noexec,noatime,nodiratime 0 0
> >>
> >> 2) I've changed /etc/exports for /mnt/storage to this:
> >>
> >> /mnt/storage -rw,sync,subtree_check,no_root_squash @trusted
> >>
> >> In #1, I've removed nodev, which I think I accidentally copied over from a tmpfs mount point above it when I originally set up the nfs mount point so long ago. Additionally, I added nodiratime. In #2, it used to be -rw,async,no_subtree_check,no_root_squash. I think the async may be causing what I'm seeing, and subtree_check should be okay for testing.
> >>
> >> Hopefully, this will have an effect.
> >>
> >> Adam.
> >> On Tue, Jul 16, 2013 at 9:44 PM, Adam Randall <randalla at gmail.com> wrote:
> >>> Here's various outputs:
> >>>
> >>> # grep nfs /etc/mtab
> >>> rpc_pipefs /var/lib/nfs/rpc_pipefs rpc_pipefs rw 0 0
> >>> 192.168.0.160:/var/log/dms /mnt/dmslogs nfs rw,noexec,nosuid,nodev,noatime,vers=4,addr=192.168.0.160,clientaddr=192.168.0.150 0 0
> >>> 192.168.0.160:/mnt/storage /mnt/storage nfs rw,noexec,nosuid,nodev,noatime,vers=4,addr=192.168.0.160,clientaddr=192.168.0.150 0 0
> >>>
> >>> # grep nfs /proc/mounts
> >>> rpc_pipefs /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
> >>> 192.168.0.160:/var/log/dms /mnt/dmslogs nfs4 rw,nosuid,nodev,noexec,noatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.0.150,local_lock=none,addr=192.168.0.160 0 0
> >>> 192.168.0.160:/mnt/storage /mnt/storage nfs4 rw,nosuid,nodev,noexec,noatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.0.150,local_lock=none,addr=192.168.0.160 0 0
> >>>
> >>> Also, the output of df -hT | grep nfs:
> >>> 192.168.0.160:/var/log/dms nfs 273G 5.6G 253G 3% /mnt/dmslogs
> >>> 192.168.0.160:/mnt/storage nfs 2.8T 1.8T 986G 65% /mnt/storage
> >>>
> >>> From the looks of it, it appears to be NFS version 4 (though I thought that I was running version 3, hrm...).
> >>> With regards to the ls -lid output, one of the directories that wasn't altered, but for whatever reason was not accessible due to the stale handle, is this:
> >>>
> >>> # ls -lid /mnt/storage/reports/5306
> >>> 185862043 drwxrwxrwx 4 1095 users 45056 Jul 15 21:37 /mnt/storage/reports/5306
> >>>
> >>> In the directory where we create new documents, which creates a folder for each document (legacy decision), it looks something like this:
> >>>
> >>> # ls -lid /mnt/storage/dms/documents/819/* | head -n 10
> >>> 290518712 drwxrwxrwx 2 nobody nobody 3896 Jul 16 18:39 /mnt/storage/dms/documents/819/8191174
> >>> 290518714 drwxrwxrwx 2 nobody nobody 3896 Jul 16 18:39 /mnt/storage/dms/documents/819/8191175
> >>> 290518716 drwxrwxrwx 2 nobody nobody 3896 Jul 16 18:39 /mnt/storage/dms/documents/819/8191176
> >>> 290518718 drwxrwxrwx 2 nobody nobody 3896 Jul 16 18:39 /mnt/storage/dms/documents/819/8191177
> >>> 290518720 drwxrwxrwx 2 nobody nobody 3896 Jul 16 18:39 /mnt/storage/dms/documents/819/8191178
> >>> 290518722 drwxrwxrwx 2 nobody nobody 3896 Jul 16 18:40 /mnt/storage/dms/documents/819/8191179
> >>> 290518724 drwxrwxrwx 2 nobody nobody 3896 Jul 16 18:40 /mnt/storage/dms/documents/819/8191180
> >>> 290518726 drwxrwxrwx 2 nobody nobody 3896 Jul 16 18:47 /mnt/storage/dms/documents/819/8191181
> >>> 290518728 drwxrwxrwx 2 nobody nobody 3896 Jul 16 18:50 /mnt/storage/dms/documents/819/8191182
> >>> 290518730 drwxrwxrwx 2 nobody nobody 3896 Jul 16 18:52 /mnt/storage/dms/documents/819/8191183
> >>>
> >>> The stale handles seem to appear more when there's load on the system, but that's not strictly true.
> >>> I received notice of two failures (both from the same server) tonight, as seen here:
> >>>
> >>> Jul 16 19:27:40 imaging4 php: Output of: ls -l /mnt/storage/dms/documents/819/8191226/ 2>&1:
> >>> Jul 16 19:27:40 imaging4 php: ls: cannot access /mnt/storage/dms/documents/819/8191226/: Stale NFS file handle
> >>> Jul 16 19:44:15 imaging4 php: Output of: ls -l /mnt/storage/dms/documents/819/8191228/ 2>&1:
> >>> Jul 16 19:44:15 imaging4 php: ls: cannot access /mnt/storage/dms/documents/819/8191228/: Stale NFS file handle
> >>>
> >>> The above is logged by my e-mail collecting daemon, which is written in PHP. When it can't access the directory that was just created, it uses syslog() to write the above information out.
> >>>
> >>> From the same server, doing ls -lid I get these for those two directories:
> >>>
> >>> 290518819 drwxrwxrwx 2 nobody nobody 3896 Jul 16 19:44 /mnt/storage/dms/documents/819/8191228
> >>> 290518816 drwxrwxrwx 2 nobody nobody 3896 Jul 16 19:27 /mnt/storage/dms/documents/819/8191226
> >>>
> >>> Stating the directories showed that the modified times correspond to the logs above:
> >>>
> >>> Modify: 2013-07-16 19:27:40.786142391 -0700
> >>> Modify: 2013-07-16 19:44:15.458250738 -0700
> >>>
> >>> Between the time it happened and the time I got back to it, the stale handle had cleared itself.
> >>>
> >>> If it's at all relevant, this is the fstab:
> >>>
> >>> 192.168.0.160:/var/log/dms /mnt/dmslogs nfs defaults,nodev,nosuid,noexec,noatime 0 0
> >>> 192.168.0.160:/mnt/storage /mnt/storage nfs defaults,nodev,nosuid,noexec,noatime 0 0
> >>>
> >>> Lastly, in a fit of grasping at straws, I unmounted the ocfs2 partition on the secondary server and stopped the ocfs2 service. I was thinking that maybe having it in master/master mode could cause what I was seeing. Alas, that's not the case, as the above errors came after I did that.
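Editorial aside: since the stale handle reportedly clears on its own, one stopgap (a sketch, not something proposed in the thread; the timing and the demo path are arbitrary) would be for the daemon to retry briefly before reporting a freshly created directory as inaccessible.

```shell
# Workaround sketch: poll a freshly created directory until it becomes
# listable, instead of failing on the first stale-handle error.
wait_for_dir() {
  dir=$1
  tries=${2:-5}   # number of one-second attempts (arbitrary)
  i=0
  while [ "$i" -lt "$tries" ]; do
    if ls "$dir" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# In the daemon this would be the just-created document directory.
wait_for_dir /tmp 3 && echo "accessible"
```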
> >>> Is there anything else that I can provide that might be of help?
> >>>
> >>> Adam.
> >>>
> >>> On Tue, Jul 16, 2013 at 5:15 PM, Patrick J. LoPresti <lopresti at gmail.com> wrote:
> >>>> [...]

--
Adam Randall
http://www.xaren.net
AIM: blitz574
Twitter: @randalla0622

"To err is human... to really foul up requires the root password."
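Editorial note on the iptables difficulty raised above: the usual approach is to pin the NFSv3 helper daemons (mountd, statd, lockd) to fixed ports so the firewall can allow them. A hedged sketch in the style of Gentoo's /etc/conf.d/nfs; the option names, port numbers, and file paths are assumptions that vary with the distribution and nfs-utils version.

```shell
# /etc/conf.d/nfs (Gentoo-style) -- pin NFSv3 helpers to fixed ports.
# Option names and port numbers are illustrative; check your nfs-utils.
OPTS_RPC_MOUNTD="-p 32767"
OPTS_RPC_STATD="--port 32765 --outgoing-port 32766"

# lockd ports are kernel module parameters, e.g. /etc/modprobe.d/lockd.conf:
#   options lockd nlm_tcpport=32768 nlm_udpport=32768

# Then allow rpcbind (111), nfs (2049), and the pinned ports, e.g.:
#   iptables -A INPUT -s 192.168.0.0/24 -p tcp -m multiport \
#     --dports 111,2049,32765:32768 -j ACCEPT
```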