Niels de Vos
2016-Mar-10 08:55 UTC
[Gluster-users] Gluster (3.6.3) NFS READDIR failing intermittently from Finder on Mac OS X (10.10 and 10.11)
On Thu, Mar 10, 2016 at 06:18:44PM +1100, Brett Randall wrote:
> Hi all
>
> I have a problem which is doing my head in.
>
> We are running Gluster 3.6.3 with the in-built NFS server, across 8 servers.
> We share our volume out with SMB, AFP and Gluster's NFS server.
>
> In most cases, NFS works fine. Everything is visible and accessible from the
> terminal. But from Finder on our Macs, we are having a consistent problem.
>
> Firstly, we are mounting the share from the command line:
>
> $ mount -t nfs -o rw,intr,nolock,tcp 10.0.19.31:/glusvol ./glusvol
>
> We then open Finder and traverse to the folder in question (about 7 levels
> deep). I see about 20-30 items, but I know there are 100+ items in there.
> This is the case on multiple folders. If I open a terminal, go to that
> folder, and create a new empty file, the folder refreshes in Finder and I
> can see everything. However, dismount and remount and everything is gone
> again (although sometimes it displays all files for a few seconds before
> most of them disappear). I've repeated this on three different Macs of
> varying origin and OS version.
>
> I've started Wireshark on my Mac and monitored what is happening. It appears
> that there is an initial NFS READDIR Call to the NFS server with cookie set
> to 0. The READDIR Reply contains the filename of every file in the folder.
> Then there is another READDIR call with cookie set to 4096, which happens to
> be the last cookie listed in the previous reply. Curiously, the reply to
> this call lists all the files that I *cannot* see in Finder. But doesn't
> include the ones I can see. Then there are a whole lot of LOOKUP Calls while
> it looks at all the files that I *can* see. Then it stops at the 24th file,
> the last file I can see in Finder. It then issues another READDIR Call with
> a Cookie of 680. The Reply is "NFS3ERR_BAD_COOKIE". Looking through the
> previous replies, the only time that cookie was issued was in the FIRST
> reply. And again, the file in question with that cookie number is the LAST
> file that I can see in Finder.
>
> Surely, Finder cannot be THIS broken? I can see all files in that folder
> fine when I mount via AFP or SMB but not via NFS. But it all works fine from
> Terminal. We're experimenting with updating Gluster to 3.7.8 and moving to
> NFS Ganesha in the hope that moving to NFSv4 fixes it, but does anyone have
> any idea what's happening? I'm happy to send the .pcapng file to someone if
> it's helpful. I also have a .pcapng of when we create a file in the folder
> and Finder refreshes to show everything in there. The only interesting thing
> that I noticed in that file is that the cookie number at the end of the
> READDIR is much larger than anything I was seeing in the failed listings
> (17179869176). I tried forcing 32-bit inode sizes in Gluster NFS options
> (the closest thing I could find to NFS's native 32-bit cookie size
> restriction) with no joy, just in case that was part of it, which wouldn't
> make sense but tried anyway and no difference.

It is possible that Finder does not follow the NFSv3 specification
correctly. I have seen that some other OS's expect the cookie or inode to be
32-bit. This is the case for most filesystems, but Gluster uses 64-bit
values. A subsequent READDIR(P) would use a partial cookie for continuation,
and that can result in very strange behaviour.

Only exposing 32-bit inodes over Gluster/NFS might be the solution for you.
You can enable this with

    # gluster volume set ${VOLUME} nfs.enable-ino32 on

Unmount and re-mount the NFS-export after changing this option.

It is possible that the NFSv4 client on Mac OS X handles things better, but
it could have the same issues too.

HTH,
Niels
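To illustrate the hypothesis above (it is not a confirmed diagnosis of Finder's behaviour): if a client kept only the low 32 bits of a 64-bit READDIR cookie, the value it sent back would no longer match anything the server handed out. A minimal shell sketch using the large cookie value reported in the original message; the masking step is an assumption about how such a client might behave, and the variable names are illustrative.

```shell
#!/usr/bin/env bash
# Cookie value reported in the packet capture from the original message.
cookie=17179869176

# Assumption: a client with a 32-bit cookie field keeps only the low 32 bits.
low32=$(( cookie & 0xFFFFFFFF ))

# The bits above bit 31 that such a client would silently drop.
high=$(( cookie >> 32 ))

printf 'full cookie  : %d (0x%X)\n' "$cookie" "$cookie"
printf 'low 32 bits  : %d (0x%X)\n' "$low32" "$low32"
printf 'dropped high : %d\n' "$high"
```

For the reported cookie 17179869176 (0x3FFFFFFF8), the low 32 bits come out as 4294967288 (0xFFFFFFF8), so a continuation request built from the truncated value would carry a cookie the server never issued.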
Brett Randall
2016-Mar-10 09:31 UTC
Niels de Vos wrote:
> On Thu, Mar 10, 2016 at 06:18:44PM +1100, Brett Randall wrote:
> > Surely, Finder cannot be THIS broken? I can see all files in that
> > folder fine when I mount via AFP or SMB but not via NFS. But it all
> > works fine from Terminal. We're experimenting with updating Gluster to
> > 3.7.8 and moving to NFS Ganesha in the hope that moving to NFSv4 fixes
> > it, but does anyone have any idea what's happening? I'm happy to send
> > the .pcapng file to someone if it's helpful. I also have a .pcapng of
> > when we create a file in the folder and Finder refreshes to show
> > everything in there. The only interesting thing that I noticed in that
> > file is that the cookie number at the end of the READDIR is much
> > larger than anything I was seeing in the failed listings
> > (17179869176). I tried forcing 32-bit inode sizes in Gluster NFS
> > options (the closest thing I could find to NFS's native 32-bit cookie
> > size restriction) with no joy, just in case that was part of it, which
> > wouldn't make sense but tried anyway and no difference.
>
> It is possible that Finder does not follow the NFSv3 specification
> correctly. I have seen that some other OS's expect the cookie or inode to
> be 32-bit. This is the case for most filesystems, but Gluster uses 64-bit
> values. A subsequent READDIR(P) would use a partial cookie for
> continuation, and that can result in very strange behaviour.
>
> Only exposing 32-bit inodes over Gluster/NFS might be the solution for
> you. You can enable this with
>
>     # gluster volume set ${VOLUME} nfs.enable-ino32 on
>
> Unmount and re-mount the NFS-export after changing this option.
>
> It is possible that the NFSv4 client on Mac OS X handles things better,
> but it could have the same issues too.

Hi Niels

Thanks for the reply. I had already tried that but just tried again and
still no joy. In fact, it actually fixes some folders but breaks other
folders, so it's probably just alternating how Mac OS X is handling the
cookie.

I'm quite surprised no one else running Gluster has come across this
problem, especially seeing as it is so easily repeatable on our end! If
anyone has any other ideas then I'm open, but otherwise we'll just keep
working on this test upgrade to 3.7 and then the Ganesha mod.

Brett.
Brett Randall
2016-Mar-14 02:54 UTC
Just an FYI for all, we found that by adding "rdirplus" on our Macs as an
option on the NFS mount, the problem went away. Hope this is helpful to
someone in the future!

Should really be a "Mac" wiki page on Gluster to summarise how best to get
Macs to work with Gluster :)

Brett.
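For reference, the fix described above amounts to appending rdirplus to the option list of the mount command from the original report. The sketch below only assembles and prints that command rather than running it. The server address, export and mount point are the ones quoted in the thread, the variable names are illustrative, and the comment on what rdirplus does is my reading of the macOS mount_nfs option, not something stated in the thread.

```shell
#!/usr/bin/env bash
# Values taken from the original report; adjust for your environment.
server="10.0.19.31"
export_path="/glusvol"
mount_point="./glusvol"

# Original options plus rdirplus, which (as I understand the macOS client)
# asks it to read directories with READDIRPLUS instead of plain READDIR.
opts="rw,intr,nolock,tcp,rdirplus"

# Assemble the command as a string so it can be inspected before use.
mount_cmd="mount -t nfs -o ${opts} ${server}:${export_path} ${mount_point}"
echo "$mount_cmd"

# To actually mount on the Mac (may need root): sudo $mount_cmd
```

Unmount any existing mount of the share first so that the new option takes effect on a fresh mount.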