Ralf Hildebrandt
2010-Aug-13 10:00 UTC
Squid and first-level subdirectories & second-level subdirectories on ext3/4
Squid, a proxy, by its nature stores large numbers of relatively small files in its cache.

As config options it offers:

# 'Level-1' is the number of first-level subdirectories which
# will be created under the 'Directory'. The default is 16.
#
# 'Level-2' is the number of second-level subdirectories which
# will be created under each first-level directory. The default
# is 256.

Meaning one has

/squid-cache/(16 dirs)/(256 dirs)/(the small files)

so the small files in the cache are (hopefully) evenly distributed across 16*256 directories.

But is that optimal for an ext3/4 filesystem? What is the point of using 16 for the first level and 256 for the second? Wouldn't 64*64 (which equals 16*256) be better when it comes to finding the files on disk?

--
Ralf Hildebrandt
Geschäftsbereich IT | Abteilung Netzwerk
Charité - Universitätsmedizin Berlin
Campus Benjamin Franklin
Hindenburgdamm 30 | D-12203 Berlin
Tel. +49 30 450 570 155 | Fax: +49 30 450 570 962
ralf.hildebrandt at charite.de | http://www.charite.de
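
To make the arithmetic in the question concrete, here is a minimal Python sketch comparing the two layouts. The function name `layout_stats` and the figure of one million cached objects are made up for illustration; nothing here comes from Squid itself. It only shows that both layouts end up with the same number of leaf directories and differ in how many entries sit at each level.

```python
def layout_stats(level1: int, level2: int, total_files: int) -> dict:
    """Summarise a two-level cache directory layout (illustrative only)."""
    leaf_dirs = level1 * level2
    return {
        "leaf_dirs": leaf_dirs,                       # directories that hold the files
        "entries_in_cache_root": level1,              # first-level subdirectories
        "entries_per_level1_dir": level2,             # second-level subdirectories in each
        "avg_files_per_leaf_dir": total_files / leaf_dirs,
    }


if __name__ == "__main__":
    total = 1_000_000  # hypothetical number of cached objects
    for l1, l2 in [(16, 256), (64, 64)]:
        print(f"{l1}x{l2}:", layout_stats(l1, l2, total))
```

With an even distribution, the average number of files per leaf directory is identical for 16*256 and 64*64, so any difference would have to come from how the directories themselves are placed on disk.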
Andreas Dilger
2010-Aug-15 23:05 UTC
Squid and first-level subdirectories & second-level subdirectories on ext3/4
On 2010-08-13, at 04:00, Ralf Hildebrandt wrote:
> Squid, a proxy, by its nature stores large numbers of relatively
> small files in its cache.
>
> As config options it offers:
>
> # 'Level-1' is the number of first-level subdirectories which
> # will be created under the 'Directory'. The default is 16.
> #
> # 'Level-2' is the number of second-level subdirectories which
> # will be created under each first-level directory. The default
> # is 256.
>
> Meaning one has
>
> /squid-cache/(16 dirs)/(256 dirs)/(the small files)
>
> so the small files in the cache are (hopefully) evenly
> distributed across 16*256 directories.
>
> But is that optimal for an ext3/4 filesystem? What is the point of
> using 16 for the first level and 256 for the second?

In ext3/4 the top-level inodes are spread around the filesystem, on the assumption that something like /home or / is allocating trees of unrelated subdirectories at the top level, but that files within those subdirectories ARE related and should be allocated together.

Depending on how many files are in your cache, the 256 * {small files} is likely too big to fit into a single block group (32k inodes, 32k blocks), so you may want to consider marking the first level of subdirectories with the "TOPDIR" flag, which indicates that the second-level (256) subdirs should also be spread around the disk.

> Wouldn't 64*64 (which equals 16*256) be better when it comes to
> finding the files on disk?

Benchmarking tells all...

Cheers, Andreas
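
As a starting point for the benchmarking suggested above, here is a rough Python sketch. It does two things: it checks whether the files under one first-level directory would fit in the 32k inodes per block group quoted in the reply, and it times `os.stat()` lookups over a synthetic cache built with each layout. All function names are hypothetical, the round-robin mapping of file numbers to directories is an assumption and not Squid's own scheme, and the inode figure depends on how the filesystem was created.

```python
import os
import tempfile
import time

# Figure quoted in the reply; the real value depends on mkfs options.
INODES_PER_GROUP = 32 * 1024


def fits_in_one_group(level2, files_per_leaf, inodes_per_group=INODES_PER_GROUP):
    """True if one first-level subtree (files plus its level-2 dirs) fits in a group."""
    return level2 * files_per_leaf + level2 <= inodes_per_group


def build_cache(root, level1, level2, total_files):
    """Create a synthetic two-level cache; returns the list of file paths.

    Files are assigned round-robin to directories (an assumed mapping).
    """
    paths = []
    for n in range(total_files):
        d = os.path.join(root, f"{n % level1:02X}", f"{(n // level1) % level2:02X}")
        os.makedirs(d, exist_ok=True)
        p = os.path.join(d, f"{n:08X}")
        with open(p, "wb") as f:
            f.write(b"x")  # one-byte stand-in for a cached object
        paths.append(p)
    return paths


def time_lookups(paths):
    """Seconds taken to stat every path once, in creation order."""
    start = time.perf_counter()
    for p in paths:
        os.stat(p)
    return time.perf_counter() - start


if __name__ == "__main__":
    for l1, l2 in [(16, 256), (64, 64)]:
        with tempfile.TemporaryDirectory() as root:
            paths = build_cache(root, l1, l2, 5_000)
            print(f"{l1}x{l2}: {time_lookups(paths):.3f}s for {len(paths)} stats")
```

As written this runs in the system temporary directory with a warm page cache, so it will mostly measure in-memory lookups. For a meaningful comparison, point `root` at a directory on the ext3/4 filesystem in question, use a realistic file count, and flush caches between building and timing. To try the TOPDIR suggestion, the flag can be set with `chattr +T` on each first-level directory before the cache is populated.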