oooooooooooo ooooooooooooo
2009-Jul-08 06:27 UTC
[CentOS] Question about optimal filesystem with many small files.
Hi,

I have a program that writes lots of files to a directory tree (around 15 million files), and a node can have up to 400000 files (and I don't have any way to split this amount into smaller ones). As the number of files grows, my application gets slower and slower (the app works something like a cache for another app and I can't redesign the way it distributes files onto disk due to the other app's requirements).

The filesystem I use is ext3 with the following options enabled:

Filesystem features: has_journal resize_inode dir_index filetype needs_recovery sparse_super large_file

Is there any way to improve performance in ext3? Would you suggest another FS for this situation (this is a production server, so I need a stable one)?

Thanks in advance (and please excuse my bad English).
Niki Kovacs
2009-Jul-08 06:41 UTC
[CentOS] Question about optimal filesystem with many small files.
oooooooooooo ooooooooooooo wrote:
> Hi,
>
> I have a program that writes lots of files to a directory tree

Did that program also write your address header? :o)
Per Qvindesland
2009-Jul-08 06:43 UTC
[CentOS] Question about optimal filesystem with many small files.
Perhaps think about running tune2fs; maybe also consider adding noatime.

Regards
Per

E-mail: per at norhex.com
http://www.linkedin.com/in/perqvindesland
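[Editor's note: a concrete example of that suggestion; the device name and mount point below are assumptions, so substitute the real cache filesystem.]

    # show which ext3 features are enabled (dir_index should appear in the list)
    tune2fs -l /dev/sdb1 | grep 'Filesystem features'

    # /etc/fstab entry for the cache filesystem, with noatime so reads stop
    # rewriting inode access times
    /dev/sdb1   /cache   ext3   defaults,noatime   1 2

    # apply without a reboot
    mount -o remount,noatime /cache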
Les Mikesell
2009-Jul-08 15:56 UTC
[CentOS] Question about optimal filesystem with many small files.
oooooooooooo ooooooooooooo wrote:
> Hi,
>
> I have a program that writes lots of files to a directory tree (around 15
> million files), and a node can have up to 400000 files [...]

I haven't done, or even seen, any recent benchmarks, but I'd expect reiserfs to still be the best at that sort of thing. However, even if you can improve things slightly, do not let whoever is responsible for that application ignore the fact that it is a horrible design that ignores a very well known problem that has easy solutions. And don't ever do business with someone who would write a program like that again.

Any way you approach it, when you want to write a file the system must check to see if the name already exists, and if not, create it in an empty space that it must also find - and this must be done atomically, so the directory must be locked against other concurrent operations until the update is complete. If you don't index the contents, the lookup is a slow linear scan; if you do, you then have to rewrite the index on every change, so you can't win. Sensible programs that expect to access a lot of files will build a tree structure to break up the number that land in any single directory (see squid for an example). Even more sensible programs would re-use some existing caching mechanism like squid or memcached instead of writing a new one badly.

-- 
Les Mikesell
lesmikesell at gmail.com
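[Editor's note: for reference, the squid-style tree Les points to just spreads file names over a fixed set of nested subdirectories. A minimal sketch in Python; the 16x256 split matches squid's default cache_dir layout, and using MD5 to pick the bucket is only an illustration, not anything the original application does.]

    import hashlib
    import os

    def squid_style_path(root, name, l1=16, l2=256):
        """Place 'name' under root/<d1>/<d2>/ so no single directory gets huge."""
        h = int(hashlib.md5(name.encode()).hexdigest(), 16)
        d1 = "%02X" % (h % l1)
        d2 = "%02X" % ((h // l1) % l2)
        subdir = os.path.join(root, d1, d2)
        if not os.path.isdir(subdir):
            os.makedirs(subdir)
        return os.path.join(subdir, name)

    # With 16 * 256 = 4096 buckets, 15 million files average roughly 3700 per
    # directory instead of 400000 in one place.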
Kwan Lowe
2009-Jul-08 16:23 UTC
[CentOS] Question about optimal filesystem with many small files.
On Wed, Jul 8, 2009 at 2:27 AM, oooooooooooo ooooooooooooo <hhh735 at hotmail.com> wrote:
> Hi,
>
> I have a program that writes lots of files to a directory tree (around 15
> million files), and a node can have up to 400000 files [...]

I saw this article some time back:

http://www.linux.com/archive/feature/127055

I've not implemented it, but from past experience you may lose some performance initially, though the database-backed filesystem's performance might be more consistent as the number of files grows.
oooooooooooo ooooooooooooo
2009-Jul-08 21:59 UTC
[CentOS] Question about optimal filesystem with many small files.
> Perhaps think about running tune2fs; maybe also consider adding noatime

Yes, I added it and got a performance increase; anyway, as the number of files grows the speed keeps dropping below an acceptable level.

> I saw this article some time back:
> http://www.linux.com/archive/feature/127055

Good idea. I already use MySQL for indexing the files, so every time I need to make a lookup I don't need to read the entire dir to get the file; but my requirement is to keep the files themselves on disk.

> The only way to deal with it (especially if the application adds and removes
> these files regularly) is to every once in a while copy the files to another
> directory, nuke the directory and restore from the copy.

Thanks, but there will not be many file updates once the cache is built, so recreating directories would not be very helpful here. The issue is that as the number of files grows, both reads of existing files and new insertions get slower and slower.

> I haven't done, or even seen, any recent benchmarks but I'd expect
> reiserfs to still be the best at that sort of thing.

I've been looking at some benchmarks and reiser seems a bit faster in my scenario; however, my problem appears when I have a large number of files, so from what I have seen I'm not sure reiser would be a fix...

> However even if you can improve things slightly, do not let whoever is
> responsible for that application ignore the fact that it is a horrible
> design that ignores a very well known problem that has easy solutions.

My original idea was to store each file under a hash of its name, and keep a hash -> real filename mapping in MySQL. That way I have direct access to the file and I can make a directory hierarchy from the first characters of the hash (/c/0/2/a), so I would have 16^4 = 65536 leaf directories, the files would be evenly distributed, and there would be around 200 files per dir (which should not give any performance issues). But the requirement is to use the real file name for the directory tree, which is what causes the issue.

> Did that program also write your address header?

:) Thanks for the help.
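[Editor's note: a minimal sketch of the hash-based layout described above, assuming a hypothetical /cache mount point; this restates the poster's idea, it is not the application's actual code.]

    import hashlib
    import os

    CACHE_ROOT = "/cache"   # hypothetical mount point for the cache tree

    def hashed_path(real_name):
        """Map a filename to a /c/0/2/a/<md5> style path: four one-hex-char
        levels gives 16^4 = 65536 leaf directories."""
        digest = hashlib.md5(real_name.encode()).hexdigest()
        leaf = os.path.join(CACHE_ROOT, digest[0], digest[1], digest[2], digest[3])
        if not os.path.isdir(leaf):
            os.makedirs(leaf)
        return os.path.join(leaf, digest)

    # 15M files / 65536 leaves is roughly 230 files per directory; the
    # digest -> real filename mapping would live in MySQL, so lookups never
    # have to scan a directory.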
James A. Peltier
2009-Jul-09 03:13 UTC
[CentOS] Question about optimal filesystem with many small files.
On Wed, 8 Jul 2009, oooooooooooo ooooooooooooo wrote:
> Hi,
>
> I have a program that writes lots of files to a directory tree (around 15
> million files), and a node can have up to 400000 files [...]

There isn't a good file system for this type of thing. Filesystems with many very small files are always slow. Ext3, XFS, JFS are all terrible for this type of thing. Rethink how you're writing files or you'll be in a world of hurt.

-- 
James A. Peltier
Systems Analyst (FASNet), VIVARIUM Technical Director
HPC Coordinator
Simon Fraser University - Burnaby Campus
Phone   : 778-782-6573     Fax : 778-782-3045
E-Mail  : jpeltier at sfu.ca
Website : http://www.fas.sfu.ca | http://vivarium.cs.sfu.ca
          http://blogs.sfu.ca/people/jpeltier
MSN     : subatomic_spam at hotmail.com

The point of the HPC scheduler is to keep everyone equally unhappy.
James A. Peltier
2009-Jul-09 03:14 UTC
[CentOS] Question about optimal filesystem with many small files.
On Wed, 8 Jul 2009, oooooooooooo ooooooooooooo wrote:
> Hi,
>
> I have a program that writes lots of files to a directory tree (around 15
> million files), and a node can have up to 400000 files [...]

BTW, you can pretty much say goodbye to any backup solution for this type of project as well. They'll all die dealing with a file system structure like this.

-- 
James A. Peltier
jpeltier at sfu.ca
oooooooooooo ooooooooooooo
2009-Jul-13 05:49 UTC
[CentOS] Question about optimal filesystem with many small files.
> How many files per directory do you have?

I have 4 directory levels, 65536 leaf directories and around 200 files per dir (15M in total).

> Something is wrong. Got to figure this out. Where did this RAM go?

Thanks. I reduced the memory usage of MySQL and of my app, and I got around a 15% performance increase. Now my atop looks like this (currently reading only cached files from disk):

PRC | sys   0.51s | user  9.29s | #proc    114 | #zombie    0 | #exit      0 |
CPU | sys      4% | user    93% | irq       1% | idle    208% | wait     94% |
cpu | sys      2% | user    48% | irq       1% | idle     21% | cpu001 w 28% |
cpu | sys      1% | user    17% | irq       0% | idle     41% | cpu000 w 40% |
cpu | sys      1% | user    14% | irq       0% | idle     74% | cpu003 w 12% |
cpu | sys      1% | user    13% | irq       0% | idle     72% | cpu002 w 14% |
CPL | avg1   3.45 | avg5   7.42 | avg15  10.76 | csw    15891 | intr   11695 |
MEM | tot    2.0G | free  51.2M | cache 587.8M | buff    1.0M | slab  281.2M |
SWP | tot    1.9G | free   1.9G |              | vmcom   1.6G | vmlim   2.9G |
PAG | scan   3072 | stall     0 |              | swin       0 | swout      0 |
DSK |         sdb | busy    89% | read    1451 | write      0 | avio    6 ms |
DSK |         sda | busy     6% | read     178 | write     54 | avio    2 ms |
NET | transport   | tcpi   3631 | tcpo    3629 | udpi       0 | udpo       0 |
NET | network     | ipi    3632 | ipo     3630 | ipfrw      0 | deliv   3632 |
NET | eth0     0% | pcki      5 | pcko       3 | si    0 Kbps | so    1 Kbps |
NET | lo     ---- | pcki   3627 | pcko    3627 | si  775 Kbps | so  775 Kbps |

> It is 1024 chars long. Which still won't help.

I'm using MyISAM, and according to
http://dev.mysql.com/doc/refman/5.1/en/myisam-storage-engine.html
"The maximum key length is 1000 bytes. This can also be changed by changing the source and recompiling. For the case of a key longer than 250 bytes, a larger key block size than the default of 1024 bytes is used."

> I would not store images in either one as your SELECT LIKE and Random will kill it.

Well, I think that can be avoided; using only searches on the key fields should not give those issues. Does somebody have experience storing a large amount of medium (1KB-150KB) BLOB objects in MySQL?

> However I have not a clue that this is even doable in MySQL.

In MySQL there is already an MD5 function:
http://dev.mysql.com/doc/refman/5.1/en/encryption-functions.html#function_md5

Thanks for the help.
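[Editor's note: a rough sketch of the MD5-keyed lookup the last two quotes point at; the table and column names are made up for illustration, and the blob column is optional. A CHAR(32) MD5 key stays far below MyISAM's 1000-byte key-length limit while the long real name lives in an ordinary column.]

    import hashlib

    # Illustrative DDL; nothing here is the poster's actual schema.
    DDL = """
    CREATE TABLE file_cache (
        name_hash CHAR(32)      NOT NULL PRIMARY KEY,  -- MD5(real_name), or MySQL's MD5()
        real_name VARCHAR(1000) NOT NULL,
        body      MEDIUMBLOB    NULL                   -- only if the blobs move into MySQL
    ) ENGINE=MyISAM;
    """

    def key_for(real_name):
        """Client-side equivalent of MySQL's MD5() on the same byte string."""
        return hashlib.md5(real_name.encode()).hexdigest()

    # Lookups then hit the primary key directly instead of LIKE / linear scans:
    #   SELECT body FROM file_cache WHERE name_hash = '<digest>'
    print(DDL)
    print(key_for("some/real/file name.dat"))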