Conor Armstrong
2023-Feb-16 23:40 UTC
[Samba] Missing Files/Missing Folders from an NFS Share
Ok, starting to make sense now. In order to fix it....

I note that the lower level calls are wrapped in dir.c - e.g. dptr_SeekDir(...) wraps SeekDir(...). This might allow for some code to use array indexes instead of NFS cookies, as noted by Chris Chilvers here for the 3.10 kernel:

https://lore.kernel.org/all/CAAmbk-e-YQAPo6QyNB0aJyc9qzUShmEC+x5eTR7wqp1ABWADsg at mail.gmail.com/T/

> On the older 3.10 kernel, this was not an issue as the 3.10 kernel uses
> the array index position for the offset value instead of the NFS cookie.

I am not yet 100% familiar with all the code, but perhaps we could store an array of all the directory entries when we first open the directory and then use the array index instead of the cookie. The problem I see with this is the increase in memory consumption. Perhaps there is a way to check the underlying filesystem type and, if it is NFS, use this approach as a fallback?

On Thu, 16 Feb 2023 at 21:24, Jeremy Allison <jra at samba.org> wrote:

> On Thu, Feb 16, 2023 at 12:20:27PM -0800, Jeremy Allison via samba wrote:
> >
> >doesn't return us to the same position in the directory
> >listing, then we are broken and will not return the complete
> >list of files.
> >
> >This is what Ralph is saying when he says that NFS is
> >broken w.r.t listing directories. What NFS server are
> >you using ?
>
> Oh I just remembered, you're using the Amazon NFS
> server.
>
> >This looks relevant:
> >
> >https://lore.kernel.org/all/CAAmbk-e-YQAPo6QyNB0aJyc9qzUShmEC+x5eTR7wqp1ABWADsg at mail.gmail.com/T/
>
> Indeed. Note the comment from Trond (NFS client
> maintainer in the Linux kernel).
>
> ---------------------------------------------------------------
> On Thu, 2022-09-08 at 10:45 +0100, Chris Chilvers wrote:
> > I should have flagged this as a bug on the first email.
> >
> > I've run into an interesting issue using VAST data, their NFS
> > implementation makes use of the full 64 bit unsigned range allowed
> > by NFS cookie for READDIR and READDIRPLUS.
> >
> > The issue is that this 64 unsigned value is used for the directory's
> > file position (d_off), which is a 64 bit signed value. This can cause
> > readdir and telldir to return negative values.
>
> Known issue that we had to deal with 25 years ago for 32 bit systems
> when glibc first decided to make lseek() return signed values (struct
> old_linux_direct still has it as an unsigned value).
>
> So VAST have had a long time to learn not to do this...
>
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust at hammerspace.com
> ---------------------------------------------------------------
>
> Looks like Amazon may have the same bug.
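The array-index idea above could look roughly like the following. This is only a minimal sketch, not Samba code: struct dir_snapshot and the snapshot_* functions are hypothetical names invented for illustration, and it assumes the whole directory is read once at open time so that a plain index can stand in for the telldir() cookie.

```c
#include <assert.h>
#include <dirent.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical snapshot of a directory: names copied once at open time. */
struct dir_snapshot {
    char **names;  /* heap-allocated copies of d_name */
    size_t count;  /* number of entries stored */
    size_t pos;    /* current index; plays the role of the telldir() offset */
};

/* Free everything owned by the snapshot. */
void snapshot_free(struct dir_snapshot *s)
{
    if (s == NULL) return;
    for (size_t i = 0; i < s->count; i++) free(s->names[i]);
    free(s->names);
    free(s);
}

/* Read the whole directory once; returns NULL on any failure. */
struct dir_snapshot *snapshot_open(const char *path)
{
    DIR *d = opendir(path);
    if (d == NULL) return NULL;

    struct dir_snapshot *s = calloc(1, sizeof(*s));
    if (s == NULL) { closedir(d); return NULL; }

    size_t cap = 0;
    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        if (s->count == cap) {           /* grow the pointer array */
            size_t ncap = cap ? cap * 2 : 64;
            char **tmp = realloc(s->names, ncap * sizeof(*tmp));
            if (tmp == NULL) goto fail;
            s->names = tmp;
            cap = ncap;
        }
        size_t len = strlen(e->d_name) + 1;
        char *copy = malloc(len);
        if (copy == NULL) goto fail;
        memcpy(copy, e->d_name, len);
        s->names[s->count++] = copy;
    }
    closedir(d);
    return s;
fail:
    closedir(d);
    snapshot_free(s);
    return NULL;
}

/* Index-based stand-in for telldir(): never negative, never a cookie. */
long snapshot_tell(const struct dir_snapshot *s)
{
    assert(s != NULL);
    return (long)s->pos;
}

/* Index-based stand-in for seekdir(); out-of-range values are clamped. */
void snapshot_seek(struct dir_snapshot *s, long idx)
{
    assert(s != NULL);
    if (idx < 0) idx = 0;
    if ((size_t)idx > s->count) idx = (long)s->count;
    s->pos = (size_t)idx;
}

/* Index-based stand-in for readdir(); returns NULL at the end. */
const char *snapshot_read(struct dir_snapshot *s)
{
    assert(s != NULL);
    return s->pos < s->count ? s->names[s->pos++] : NULL;
}
```

The memory cost mentioned above is visible here: every name is held for the lifetime of the open directory, and entries created after snapshot_open() are not seen until the directory is reopened.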
Conor Armstrong
2023-Feb-16 23:57 UTC
[Samba] Missing Files/Missing Folders from an NFS Share
Alternatively, we could modify the dptr_TellDir function to set a flag somewhere if it ever returns a negative offset. Any call to dptr_SeekDir would then check whether the flag is set and, if so, take a slower path: a RewindDir followed by repeated ReadDir and TellDir calls until we reach the matching offset. If the flag is not set, it would go with the normal SeekDir call?
Jeremy Allison
2023-Feb-17 01:41 UTC
[Samba] Missing Files/Missing Folders from an NFS Share
On Fri, Feb 17, 2023 at 12:40:25AM +0100, Conor Armstrong wrote:
> Ok, starting to make sense now. In order to fix it....
> I note that the lower level calls are wrapped in dir.c - eg
> dptr_SeekDir(...) wraps SeekDir(...)
> this might allow for some code to use array indexes instead of NFS cookies
> as noted by Chris Chilvers here for the 3.10 kernel:
> https://lore.kernel.org/all/CAAmbk-e-YQAPo6QyNB0aJyc9qzUShmEC+x5eTR7wqp1ABWADsg at mail.gmail.com/T/

First I think we need to understand how broken telldir()/seekdir() is over NFS in this case. Do they work at all?

An alternate fix would be to change Samba to store the previously read filename on the STATUS_MORE_ENTRIES case (we've run out of return space) and then, when the enumeration starts again, use that stored filename rather than doing the SeekDir() to move back to the just-read entry preparing for the next round.
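One way to read the stored-filename suggestion is as a one-entry hold-back buffer: the name that did not fit in the reply is kept and returned first in the next round, so no seek is needed. The sketch below is an assumption about how that might look, not Samba's implementation. struct dir_iter, fill_batch() and NAME_BUF are hypothetical, and the "return space" is modelled as a simple byte buffer of NUL-terminated names.

```c
#include <assert.h>
#include <dirent.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define NAME_BUF 256  /* assumed large enough for one name plus NUL */

/* Hypothetical enumeration state kept between rounds. */
struct dir_iter {
    DIR *dir;
    bool have_pending;       /* true if an entry was read but not returned */
    char pending[NAME_BUF];  /* the held-back name */
};

/*
 * Pack NUL-terminated names into buf until the next one would not fit.
 * Returns the number of names packed. *more is set when a name was held
 * back (the STATUS_MORE_ENTRIES situation). buflen should be at least
 * NAME_BUF, otherwise a long name can never be returned.
 */
size_t fill_batch(struct dir_iter *it, char *buf, size_t buflen, bool *more)
{
    assert(it != NULL && it->dir != NULL && buf != NULL && more != NULL);
    size_t used = 0, count = 0;
    *more = false;

    for (;;) {
        if (!it->have_pending) {
            struct dirent *e = readdir(it->dir);
            if (e == NULL) break;  /* end of directory */
            snprintf(it->pending, NAME_BUF, "%s", e->d_name);
            it->have_pending = true;
        }
        size_t need = strlen(it->pending) + 1;
        if (need > buflen - used) {
            *more = true;          /* out of space: keep the name, no seek */
            break;
        }
        memcpy(buf + used, it->pending, need);
        used += need;
        count++;
        it->have_pending = false;
    }
    return count;
}
```

Because the directory stream is only ever read forwards, this sketch never calls telldir() or seekdir(), so it does not depend on the NFS cookie values at all. It does rely on the stream staying open between rounds.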