Jeremy Allison
2005-Feb-03 01:38 UTC
[Samba] People with applications needing directories containing large numbers of files.
I've been working (inspired by James Peach of SGI) on the
problem of using Samba3 with applications that need large
numbers of files (100,000 or more) per directory.
I think the current code in SVN in the SAMBA_3_0 branch
may hold the fix for this problem, so I'd like to request
people who need this functionality to give it a try.
The key was fixing the directory handling to read only
the current list requested instead of the old (up to 3.0.11)
behaviour of reading the entire directory into memory before
doling out names. Normally this would have broken OS/2
applications which have *very* strange delete semantics :-),
but by stealing logic from Samba4 (thanks tridge) I think
the current code in SVN handles this correctly.
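To illustrate the difference between the two behaviours described above, here is a minimal Python sketch. It is not Samba's code (smbd is written in C), and the function names `read_whole_directory`, `iter_directory` and `next_batch` are hypothetical; it only contrasts loading every name up front with handing out names in the batch sizes a client asks for.

```python
import os
from itertools import islice
from typing import Iterator, List


def read_whole_directory(path: str) -> List[str]:
    """Old-style approach: load every name into memory before returning any."""
    return os.listdir(path)


def iter_directory(path: str) -> Iterator[str]:
    """Yield names one at a time, so memory use does not grow with directory size."""
    with os.scandir(path) as entries:
        for entry in entries:
            yield entry.name


def next_batch(names: Iterator[str], count: int) -> List[str]:
    """Return up to `count` further names, reading no more than was asked for."""
    return list(islice(names, count))
```

A caller would create the iterator once per directory listing and call `next_batch` for each request, so a directory with 100,000 entries is never held in memory as a whole.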
So here's how to set up an application that needs a large
number of files per directory in a way that doesn't damage
performance.
Firstly, you need to canonicalize the names of all the files
in the directory to have one case, upper or lower - take your
pick (I chose upper as all my files already had upper
case names). Then set up a new custom share for the
application as follows:
[bigshare]
	path = /home/jeremy/tmp/manyfilesdir
	read only = no
	default case = upper
	preserve case = no
	short preserve case = no
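For the canonicalization step mentioned before the share definition, a minimal Python sketch could look like the following. It is not part of Samba; the function name `canonicalize_case` and its `dry_run` parameter are hypothetical. It handles a single directory only (it does not descend into subdirectories) and refuses to proceed if two names would collide once upper-cased.

```python
import os
from typing import List, Tuple


def canonicalize_case(directory: str, dry_run: bool = True) -> List[Tuple[str, str]]:
    """Rename every entry in `directory` to upper case; return (old, new) pairs."""
    names = os.listdir(directory)
    existing = set(names)
    planned = []
    for name in names:
        target = name.upper()
        if target == name:
            continue  # already upper case, nothing to do
        # Refuse to overwrite: two names differing only by case would collide.
        if target in existing:
            raise FileExistsError(f"{name!r} would collide with {target!r}")
        existing.add(target)
        planned.append((name, target))
    if not dry_run:
        for old, new in planned:
            os.rename(os.path.join(directory, old), os.path.join(directory, new))
    return planned
```

Running it once with `dry_run=True` lists the planned renames without touching anything; running it again with `dry_run=False` performs them. It is best run before clients start using the share.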
Of course, use your own path and settings, but set the
case options to match the case of all the files in your
directory. The path should point at the large directory
needed for the application - any new files created in
there and in any paths under it will be forced by smbd
into upper case - but smbd will no longer have to scan
the directory for names - it knows that if a file doesn't
exist in upper case then it doesn't exist at all.
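The reasoning in the last sentence can be sketched in Python as follows. This is only an illustration of the idea, not how smbd is implemented, and the names `lookup_with_scan` and `lookup_canonical` are hypothetical. With mixed-case names, a case-insensitive lookup may have to compare against every entry; once all names are known to be upper case, a single existence check is enough.

```python
import os
from typing import Optional


def lookup_with_scan(directory: str, requested: str) -> Optional[str]:
    """Case-insensitive lookup in a mixed-case directory: may scan every entry."""
    wanted = requested.lower()
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.name.lower() == wanted:
                return entry.name
    return None


def lookup_canonical(directory: str, requested: str) -> Optional[str]:
    """Lookup when every name is known to be upper case: one existence check."""
    candidate = requested.upper()
    if os.path.lexists(os.path.join(directory, candidate)):
        return candidate
    # No upper-case match means the file does not exist at all.
    return None
```

For a name that is missing, `lookup_with_scan` has to walk the whole directory before giving up, while `lookup_canonical` costs the same however many files the directory holds.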
So please give this a test if you have problems with
Samba and large directories. Remember this is in SVN code
only; it isn't in the 3.0.11 prereleases or release candidates,
as we need to ensure this new code is correct. If you
can help me test it, it'll be in 3.0.12 (security problems
notwithstanding :-).
Cheers,
Jeremy.
Jeremy Allison
2005-Feb-03 01:49 UTC
[Samba] Re: People with applications needing directories containing large numbers of files.
On Wed, Feb 02, 2005 at 05:38:19PM -0800, Jeremy Allison wrote:
>
> So please give this a test if you have problems with
> Samba and large sized directories. Remember this is in SVN code
> only, it isn't in the 3.0.11 pre releases or rc candidates,
> as we need to ensure this new code is correct. If you
> can help me test it it'll be in 3.0.12 (security problems
> notwithstanding :-).
Ok, I'm sorry - I spoke too soon :-(. I have one more fix
to do before this works.... Sorry for being stupid :-(.
Please ignore the earlier message :-(.
Jeremy.
John Caldwell
2005-Feb-03 14:29 UTC
[Samba] Re: People with applications needing directories containing large numbers of files.
Hello guys/gals,
Could you guys please remove this email address from your list? Even
though I am a Linux junkie myself and the topics are interesting, this is my
new work email address and I am being forwarded email from the account of
the previous Network Admin, who is no longer here.
Thanks