In our environment, we use PCs (NT) running a JVM that produces unique file names, with data content of about 100K or less. We only write to these files. The JVM (the application) spawns many threads, so the net effect can be many simultaneous file-creation requests from one PC.

All the files are written to a samba-mounted directory, and the directory *file* itself gets somewhat large. The directory name is MM-DD-YY, so it changes every 24 hours.

Here's the problem: as the directory grows, at about 400K file creation slows down. It slows down, it appears, because samba loops calling getdents() hundreds of times without doing anything else, as if it needs to read every entry in the directory for every file 'open' requested.

AND each PC seems to be bound to exactly one Samba server. The Samba 'listener' never forks more than one server per PC, and I guess that is expected.

So the bottleneck seems to be this single client connection and all those getdents() calls.

The server is Solaris 7 with 4 CPUs, and we're not out of memory, but with 4 PCs each generating at most 8 files simultaneously, when the target directory reaches 400K the servers are using 100% CPU.
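For concreteness, the write pattern looks roughly like this. This is a sketch, not the actual application code; the mount point, naming scheme, and class name are invented for illustration:

    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.text.SimpleDateFormat;
    import java.util.Date;

    // Sketch of the workload: several threads, each creating a uniquely
    // named ~100K file in the per-day directory on the samba mount.
    public class WriterSketch {
        static final String MOUNT = "Z:\\";  // assumed drive letter for the mount

        public static void main(String[] args) {
            // Per-day target directory named MM-DD-YY, as described above.
            final String dir =
                MOUNT + new SimpleDateFormat("MM-dd-yy").format(new Date());
            new File(dir).mkdirs();  // ensure the day directory exists
            // "at most ... 8 files simultaneously" per PC
            for (int t = 0; t < 8; t++) {
                final int id = t;
                new Thread(new Runnable() {
                    public void run() { writeOne(dir, id); }
                }).start();
            }
        }

        static void writeOne(String dir, int id) {
            // Unique name; every open makes the server look the name up
            // in the (eventually very large) directory.
            String name = dir + "\\" + System.currentTimeMillis() + "-" + id + ".dat";
            byte[] payload = new byte[100 * 1024];  // data content about 100K
            FileOutputStream out = null;
            try {
                out = new FileOutputStream(name);
                out.write(payload);
            } catch (IOException e) {
                e.printStackTrace();
            } finally {
                if (out != null) {
                    try { out.close(); } catch (IOException e) { /* ignore */ }
                }
            }
        }
    }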
On Thu, Dec 13, 2001 at 01:58:45PM -0500, Benson, Paul wrote:
> Here's the problem: as the directory grows, at about 400K file creation
> slows down. It slows down, it appears, because samba loops calling
> getdents() hundreds of times without doing anything else, as if it needs
> to read every entry in the directory for every file 'open' requested.

Yeah, this is a known problem with very large directories. The way to fix it is to have a shared tdb cache per directory between smbds, an enhancement probably planned for the Samba 3.0.x series.

Jeremy.
Benson, Paul wrote:
> All the files are written to a samba mounted directory, and the directory
> *file* itself does get somewhat large. The directory name is MM-DD-YY so
> it changes every 24 hours.

As a workaround, could you create subdirectories based on some portion of the name, e.g. the first 2 letters? That way you end up with far fewer files per directory (a rough sketch follows below). I did this once, a long time ago, to solve a similar problem. Of course, you do have to have access to the application's source code to do this.

Francis

--
Francis Turner, CIO
Juelich Enzyme Products GmbH
http://www.juelich-enzyme.com/
+49-173-291-7278

If you're not part of the solution, you're part of the precipitate.
    -- Henry J. Tillman
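A minimal sketch of this bucketing workaround in Java; the day-directory path and method names are illustrative, not from the thread:

    import java.io.File;

    // Sketch of the workaround: bucket each file into a subdirectory
    // named after the first two characters of the file name, so no
    // single directory ever gets huge.
    public class BucketByPrefix {
        static File bucketFor(File dayDir, String fileName) {
            // e.g. "a3f9...dat" ends up in MM-DD-YY/a3/a3f9...dat
            String prefix = fileName.substring(0, Math.min(2, fileName.length()));
            File sub = new File(dayDir, prefix);
            sub.mkdirs();  // create the bucket on first use
            return new File(sub, fileName);
        }

        public static void main(String[] args) {
            // Usage: pick the target path through the bucket helper
            // instead of writing straight into the day directory.
            File target = bucketFor(new File("Z:\\12-13-01"), "a3f9example.dat");
            System.out.println(target.getPath());  // Z:\12-13-01\a3\a3f9example.dat
        }
    }

If the file names are roughly uniform over their first two characters, a day's files spread across many small subdirectories, so each server-side directory scan stays short.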