I've just started looking into caching. I have thousands of items with URLs like '/item/view/ID', and I see that caching puts them in public/item/view/ID.html. I'm using Linux's ext3 filesystem, and I've run into problems before with caching and ext3's 4K entries per directory limit. How can I avoid this?

thanks
csn
On Tue, 2005-11-15 at 17:15 -0800, CSN wrote:
> I'm using Linux's ext3 filesystem and I've run into problems before with caching and ext3's 4K entries per directory limit. How can I avoid this?

Wow, this is an excellent question.

ext3's performance with super large directories can actually be pretty decent with dir_index (it's depressingly bad without it). You could alter ext3 itself to allow more entries, but honestly, with that many files in a directory you probably spend more time searching the directory for the file than you would spend generating the page dynamically. You could delete the oldest files in the directory every few minutes, or at whatever interval seems appropriate, with a scheduled job.

It'd be very interesting to see how other large sites have handled this. I'd push for either the cron job or rethinking what you cache (i.e., don't cache whole pages, just cache the parts that repeat). With more than a few hundred files in a directory you'll lose a lot of performance no matter what filesystem you use.

Looking forward to hearing from some more experienced deployment folk,
-Matt B
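For illustration, the scheduled clean-up suggested here could look roughly like the sketch below. It is only a minimal example under stated assumptions: the method name `prune_cache`, the directory path, and the limit of 1000 files are all made up for the example and do not come from this thread.

```ruby
# Sketch of a cache-pruning job meant to be run from cron.
# The method name, path and file limit are illustrative assumptions.

# Deletes the oldest *.html files in cache_dir until at most max_files
# remain. Returns the list of paths that were removed.
def prune_cache(cache_dir, max_files)
  pages = Dir.glob(File.join(cache_dir, "*.html"))
  excess = pages.size - max_files
  return [] if excess <= 0

  # Oldest first, judged by modification time.
  doomed = pages.sort_by { |path| File.mtime(path) }.first(excess)
  doomed.each { |path| File.delete(path) }
  doomed
end

# Example call from a script run by cron (hypothetical path):
#   prune_cache("/var/www/app/public/item/view", 1000)
```

Deleting by modification time keeps the most recently generated pages. A page that is removed is simply regenerated and cached again on its next request, so the job only trades some extra page generation for a bounded directory size.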
Matthew Beale wrote:
> ext3's performance with super large directories can actually be pretty decent with dir_index (it's depressingly bad without it).

Matthew,

What are the performance characteristics of ext3 filesystems with and without dir_index, from small directories up to large ones? How many files do you need in a directory before dir_index is worth it?

Right now none of my filesystems have dir_index enabled, so enabling it would require some downtime.

Regards,
Blair
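One way to get a rough answer for a particular machine is to measure it. The sketch below is a hypothetical micro-benchmark, not a rigorous one: the method name `time_lookups`, the file counts, and the number of lookups are assumptions chosen for the example. Results will also be flattered by the kernel's caches once the directory has been read.

```ruby
require "benchmark"
require "tmpdir"

# Creates `count` empty files in a fresh temporary directory under `base`
# (the system temp dir when base is nil), then times `lookups` random
# File.exist? calls against it. Returns the elapsed time in seconds.
# The temporary directory is removed afterwards.
def time_lookups(count, lookups = 10_000, base = nil)
  Dir.mktmpdir("dirbench", base) do |dir|
    count.times { |i| File.write(File.join(dir, "#{i}.html"), "") }
    names = Array.new(lookups) { File.join(dir, "#{rand(count)}.html") }
    Benchmark.realtime { names.each { |name| File.exist?(name) } }
  end
end

# Example: compare a few directory sizes (sizes are arbitrary).
#   [100, 1_000, 10_000, 50_000].each do |n|
#     puts format("%6d files: %.4f s", n, time_lookups(n))
#   end
```

To compare filesystems with and without dir_index, pass a `base` directory that lives on the filesystem being tested. The system temp directory may be on a different filesystem altogether.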
On Wed, 2005-11-16 at 11:00 -0800, Blair Zajac wrote:
> What are the performance characteristics of ext3 filesystems with and without dir_index, from small directories up to large ones?

I don't know of any benchmarks, but I wouldn't say "stunning" is a bad word for the difference. Basically, instead of keeping a directory's entries in a linear list, it uses B-trees, the same kind of structure that makes reiserfs directories so damn fast. The feature description reads:

dir_index - Use hashed b-trees to speed up lookups in large directories.

> How many files do you need in a directory before dir_index is worth it?

I don't know. But if you have 4000, I'd say that's a good place to start :)

> Right now none of my filesystems have dir_index enabled, so enabling it would require some downtime.

Yeah, that's the crappy part. In theory you can enable it with tune2fs, but in practice I've only gotten it working with mke2fs. The tools I used when setting it up were mostly from an older Debian release, though, and this was unheard-of stuff when they were written.

Good luck! Let us know how you fare.

-Matthew Beale
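Before planning any downtime it may help to confirm whether a filesystem already has the feature. `tune2fs -l <device>` prints a "Filesystem features:" line, and the sketch below checks that line for dir_index. The method name `dir_index_enabled?` and the device name in the example are assumptions for illustration. For the tune2fs route mentioned above, the e2fsprogs man pages describe `tune2fs -O dir_index <device>` followed by `e2fsck -fD` on the unmounted filesystem so that existing directories get indexed. Whether that works with a given set of tools is worth testing on a scratch filesystem first.

```ruby
# Returns true if the text printed by `tune2fs -l <device>` lists
# dir_index among the filesystem features, false otherwise.
def dir_index_enabled?(tune2fs_output)
  line = tune2fs_output.each_line.find do |l|
    l.start_with?("Filesystem features:")
  end
  return false if line.nil?

  line.sub("Filesystem features:", "").split.include?("dir_index")
end

# Example (hypothetical device name; tune2fs usually needs root):
#   output = `tune2fs -l /dev/sda1`
#   puts dir_index_enabled?(output)
```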