In our environment, we use PCs (NT) running a JVM that produces unique file names, with data content of about 100K or less. We only write to these files. The JVM (the application) spawns many threads, so the net effect can be many simultaneous file-creation requests from one PC.

All the files are written to a samba-mounted directory, and the directory *file* itself gets somewhat large. The directory name is MM-DD-YY, so it changes every 24 hours.

Here's the problem: as the directory grows, at about 400K file creation slows down. It slows down, it appears, because samba loops calling getdents() hundreds of times without doing anything else, as if it needs to read every entry in the directory for every file 'open' requested.

AND each PC seems to be bound to exactly one Samba server. The Samba 'listener' never forks more than one server per PC, and I guess that is expected.

So the bottleneck seems to be this single client connection and all those getdents() calls.

The server is Solaris 7 with 4 CPUs, and we're not out of memory, but with 4 PCs each generating at most 8 files simultaneously, when the target directory reaches 400K the servers are using 100% CPU.
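For concreteness, the write pattern looks roughly like this. This is a sketch, not the actual application code; the mount point, naming scheme, and class name are invented for illustration:

    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.text.SimpleDateFormat;
    import java.util.Date;

    // Sketch of the workload: several threads, each creating a uniquely
    // named ~100K file in the per-day directory on the samba mount.
    public class WriterSketch {
        static final String MOUNT = "Z:\\";  // assumed drive letter for the mount

        public static void main(String[] args) {
            // Per-day target directory named MM-DD-YY, as described above.
            final String dir =
                MOUNT + new SimpleDateFormat("MM-dd-yy").format(new Date());
            new File(dir).mkdirs();  // ensure the day directory exists
            // "at most ... 8 files simultaneously" per PC
            for (int t = 0; t < 8; t++) {
                final int id = t;
                new Thread(new Runnable() {
                    public void run() { writeOne(dir, id); }
                }).start();
            }
        }

        static void writeOne(String dir, int id) {
            // Unique name; every open makes the server look the name up
            // in the (eventually very large) directory.
            String name = dir + "\\" + System.currentTimeMillis() + "-" + id + ".dat";
            byte[] payload = new byte[100 * 1024];  // data content about 100K
            FileOutputStream out = null;
            try {
                out = new FileOutputStream(name);
                out.write(payload);
            } catch (IOException e) {
                e.printStackTrace();
            } finally {
                if (out != null) {
                    try { out.close(); } catch (IOException e) { /* ignore */ }
                }
            }
        }
    }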
On Thu, Dec 13, 2001 at 01:58:45PM -0500, Benson, Paul wrote:
> Here's the problem: as the directory grows, at about 400K file creation
> slows down. It slows down, it appears, because samba loops calling
> getdents() hundreds of times without doing anything else, as if it needs
> to read every entry in the directory for every file 'open' requested.

Yeah, this is a known problem with very large directories. The way to fix it is to have a shared tdb cache per directory between smbds, an enhancement probably planned for the Samba 3.0.x series.

Jeremy.
Benson, Paul wrote:
> All the files are written to a samba mounted directory, and the directory
> *file* itself does get somewhat large. The directory name is MM-DD-YY so
> it changes every 24 hours.

As a workaround, could you create subdirectories based on some portion of the name, e.g. the first 2 letters? That way you end up with far fewer files per directory (a rough sketch follows below). I did this once, a long time ago, to solve a similar problem. Of course, you do have to have access to the application's source code to do this.

Francis

--
Francis Turner, CIO
Juelich Enzyme Products GmbH
http://www.juelich-enzyme.com/
+49-173-291-7278

If you're not part of the solution, you're part of the precipitate.
    -- Henry J. Tillman
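A minimal sketch of this bucketing workaround in Java; the day-directory path and method names are illustrative, not from the thread:

    import java.io.File;

    // Sketch of the workaround: bucket each file into a subdirectory
    // named after the first two characters of the file name, so no
    // single directory ever gets huge.
    public class BucketByPrefix {
        static File bucketFor(File dayDir, String fileName) {
            // e.g. "a3f9...dat" ends up in MM-DD-YY/a3/a3f9...dat
            String prefix = fileName.substring(0, Math.min(2, fileName.length()));
            File sub = new File(dayDir, prefix);
            sub.mkdirs();  // create the bucket on first use
            return new File(sub, fileName);
        }

        public static void main(String[] args) {
            // Usage: pick the target path through the bucket helper
            // instead of writing straight into the day directory.
            File target = bucketFor(new File("Z:\\12-13-01"), "a3f9example.dat");
            System.out.println(target.getPath());  // Z:\12-13-01\a3\a3f9example.dat
        }
    }

If the file names are roughly uniform over their first two characters, a day's files spread across many small subdirectories, so each server-side directory scan stays short.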