Hi All,

We have a problem running a scientific application, dCache, on ZFS. dCache is
Java-based software that stores huge datasets in pools. One dCache pool
consists of two directories, pool/data and pool/control. The real data goes
into pool/data/. For each file in pool/data/, the pool/control/ directory
contains two small files, one of 23 bytes and one of 989 bytes. When a dCache
pool starts, it sequentially reads all the files in the control/ directory.
We run a pool on ZFS.

When we have approx 300,000 files in control/, the pool startup time is about
12-15 minutes. When we have approx 350,000 files in control/, the startup
time increases to 70 minutes. If we set up a new ZFS pool with the smallest
possible blocksize and move control/ there, the startup time decreases to 40
minutes (in the case of 350,000 files). But if we run the same pool on XFS,
the startup time is only 15 minutes. Could you suggest how to reconfigure ZFS
to decrease the startup time?

When we have approx 400,000 files in control/, we were not able to start the
pool within 24 hours. UFS did not work either in this case, but XFS worked.

What could be the problem? Thank you,

--
Best Regards,
Sergey Chechelnitskiy (chech at sfu.ca)
WestGrid/SFU
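For context, the startup pattern described above boils down to one directory
listing of a single flat directory followed by an open/read/close of every
entry. A minimal Java sketch of that access pattern (illustrative only; the
path is made up and this is not dCache's actual code):

    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;

    // Sketch of the pool startup pattern: list the flat control/ directory,
    // then open and read each small file in turn.  With 300,000-400,000
    // entries this means one huge directory listing plus hundreds of
    // thousands of open/read/close calls, so per-file metadata cost dominates.
    public class ControlScan {
        public static void main(String[] args) throws IOException {
            File control = new File("/pool/control");  // illustrative path
            File[] entries = control.listFiles();      // one stat per entry
            byte[] buf = new byte[1024];               // files are 23 or 989 bytes
            for (int i = 0; i < entries.length; i++) {
                FileInputStream in = new FileInputStream(entries[i]);
                try {
                    in.read(buf);                      // one small read per file
                } finally {
                    in.close();
                }
            }
        }
    }

The interesting question in this thread is why this pattern stays roughly
linear on XFS but degrades sharply on ZFS past about 350,000 entries.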
Sergey Chechelnitskiy <chech at sfu.ca> writes:
> We have a problem running a scientific application, dCache, on ZFS.
> [...]
> But if we run the same pool on XFS, the startup time is only 15
> minutes. Could you suggest how to reconfigure ZFS to decrease the
> startup time?
>
> When we have approx 400,000 files in control/, we were not able to
> start the pool within 24 hours. UFS did not work either in this case,
> but XFS worked.
>
> What could be the problem? Thank you,

I'm not sure I understand what you're comparing. Is there an XFS
implementation for Solaris that I don't know about?

Are you comparing ZFS on Solaris vs. XFS on Linux? If that's the case, it
seems there is much more that's different than just the filesystem.

Or, alternatively, are you comparing ZFS (FUSE) on Linux with XFS on Linux?
That doesn't seem to make sense, since the userspace implementation will
always suffer.

Someone has just mentioned that all of UFS, ZFS and XFS are available on
FreeBSD. Are you using that platform? That information would be useful too.

Boyd
Boyd Adamson <boyd-adamson at usa.net> wrote:
> Or, alternatively, are you comparing ZFS (FUSE) on Linux with XFS on
> Linux? That doesn't seem to make sense, since the userspace
> implementation will always suffer.
>
> Someone has just mentioned that all of UFS, ZFS and XFS are available
> on FreeBSD. Are you using that platform? That information would be
> useful too.

FreeBSD does not use what Solaris calls UFS.

Both Solaris and FreeBSD started with the same filesystem code, but Sun began
enhancing UFS in the late 1980s while BSD did not take over those changes.
Later, BSD started a fork of the filesystem code. Filesystem performance thus
cannot be compared.

Jörg

--
EMail: joerg at schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       js at cs.tu-berlin.de (uni)
       schilling at fokus.fraunhofer.de (work)
Blog: http://schily.blogspot.com/
URL: http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily
On 01/08/2007, at 7:50 PM, Joerg Schilling wrote:
> FreeBSD does not use what Solaris calls UFS.
>
> Both Solaris and FreeBSD started with the same filesystem code, but
> Sun began enhancing UFS in the late 1980s while BSD did not take over
> those changes. Later, BSD started a fork of the filesystem code.
> Filesystem performance thus cannot be compared.

I'm aware of that, but they still call it UFS. I'm trying to determine what
the OP is asking.
> On 01/08/2007, at 7:50 PM, Joerg Schilling wrote:
> > Both Solaris and FreeBSD started with the same filesystem code, but
> > Sun began enhancing UFS in the late 1980s while BSD did not take
> > over those changes. Later, BSD started a fork of the filesystem
> > code. Filesystem performance thus cannot be compared.
>
> I'm aware of that, but they still call it UFS. I'm trying to
> determine what the OP is asking.

I seem to remember many daemons that used large groupings of files such as
this changing to a split-out directory tree starting in the late '80s to
avoid slow stat issues. Is this type of design (tossing 300k+ files into one
flat directory) becoming more acceptable again?

-Wade
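The "split-out tree" Wade mentions is usually a hash of the filename into
fixed-size buckets. A rough Java sketch of the idea (illustrative only; the
path and the two-level 256x256 layout are assumptions, and dCache itself
would have to be taught to read such a layout):

    import java.io.File;

    // Hash each flat filename into a two-level bucket tree, e.g.
    // control/a7/3f/<name>, so no single directory grows beyond a few
    // thousand entries and per-directory lookup/stat stays cheap.
    public class SplitTree {
        static File bucketFor(File root, String name) {
            int h = name.hashCode();
            String l1 = String.format("%02x", (h >>> 8) & 0xff); // 256 buckets
            String l2 = String.format("%02x", h & 0xff);         // x 256 each
            File dir = new File(new File(root, l1), l2);
            dir.mkdirs();                          // create the bucket lazily
            return new File(dir, name);
        }

        public static void main(String[] args) {
            File root = new File("/pool/control"); // illustrative path
            File[] entries = root.listFiles();     // snapshot of the flat dir
            for (int i = 0; i < entries.length; i++) {
                if (entries[i].isFile()) {
                    entries[i].renameTo(bucketFor(root, entries[i].getName()));
                }
            }
        }
    }

With 400,000 files this leaves on the order of six files per bucket; a single
256-way level would also be plenty.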
Hi All,

Thank you for the answers. I am not really comparing anything.

I have a flat directory with a lot of small files inside, and I have a Java
application that reads all these files when it starts. If this directory is
located on ZFS, the application starts fast (15 mins) when the number of
files is around 300,000 and starts very slowly (more than 24 hours) when the
number of files is around 400,000. The question is why? Let's set aside the
question of why this application is designed this way.

I still needed to run this application, so I installed a Linux box with XFS,
mounted this XFS directory on the Solaris box, and moved my flat directory
there. Then my application started fast (< 30 mins) even when the number of
files (in the Linux-hosted XFS directory mounted via NFS on the Solaris box)
was 400,000 or more.

Basically, what I want to do is run this application on a Solaris box. Right
now I cannot.

Thanks,
Sergey

On August 1, 2007 08:15 am, Wade.Stuart at fallon.com wrote:
> I seem to remember many daemons that used large groupings of files
> such as this changing to a split-out directory tree starting in the
> late '80s to avoid slow stat issues. Is this type of design (tossing
> 300k+ files into one flat directory) becoming more acceptable again?
>
> -Wade
I think I am having the same problem using a different application
(Windchill). ZFS is consuming huge amounts of memory, and the system (a
T2000) is performing poorly. Occasionally it will take a long time (several
hours) to take a snapshot; normally a snapshot takes a second or two.

The application will allow me to break the one directory, which has almost
600,000 files, into several directories, and I am in the process of doing
this now. I never thought it was a good idea to have that many files in one
directory.
On Wed, Aug 01, 2007 at 09:49:26AM -0700, Sergey Chechelnitskiy wrote:

Hi Sergey,

> I have a flat directory with a lot of small files inside, and I have
> a Java application that reads all these files when it starts. If this
> directory is located on ZFS, the application starts fast (15 mins)
> when the number of files is around 300,000 and starts very slowly
> (more than 24 hours) when the number of files is around 400,000.
> [...]
> Basically, what I want to do is run this application on a Solaris
> box. Right now I cannot.

Just a rough guess - this might be a Solaris threading problem. See
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6518490

So perhaps starting the app with -XX:-UseThreadPriorities may help ...

Regards,
jel.
--
Otto-von-Guericke University        http://www.cs.uni-magdeburg.de/
Department of Computer Science      Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany            Tel: +49 391 67 12768
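That flag goes on the command line of the JVM that launches the pool, for
example (a hypothetical invocation; in practice the flag would be added to
whatever script actually starts the pool JVM):

    java -XX:-UseThreadPriorities <usual dCache pool options>

Disabling the flag stops the JVM from mapping Java thread priorities onto
native Solaris scheduling priorities, which is the area the linked bug
report concerns.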
We have the same issue (using dCache on Thumpers, data on ZFS). A workaround
has been to move the directory to a local UFS filesystem created with a low
nbpi parameter. However, this is not a solution.

It doesn't look like a threading problem - thanks anyway, Jens!