On Mar 28, 2007  18:47 +0100, T. Horsnell wrote:
> 1. The effect on performance of large numbers of (generally) small files
> One of my ext3 filesystems has 750K files on a 36GB disk, and
> backup with tar takes forever. Even 'find /fs -type f -ls'
> to establish ownership of the various files takes some hours.
> Are there thresholds for #files-per-directory or #total-files-per-filesystem
> beyond which performance degrades rapidly?
You should enable directory indexing if you have directories with > 5000 files,
then rebuild the indexes for the existing directories:
"tune2fs -O dir_index /dev/XXX; e2fsck -fD /dev/XXX"
> 2. I have a number of filesystems on SCSI disks which I would
> like to fsck on demand, rather than have an unscheduled
> fsck at reboot because some mount-count has expired.
> I use 'tune2fs -c 0 and -t 0' to do this, and would like
> to use 'shutdown -F -r' at a chosen time to force fsck on
> reboot, and I'd then like fsck to do things in parallel.
> What are the resources (memory etc) required for parallel
> fsck'ing? Can I reasonably expect to be able to fsck say,
> 50 300GB filesystems in parallel, or should I group them into
> smaller groups? How small?
I think it was at least "(inodes_count * 7 + blocks_count * 3) / 8" per
filesystem when I last checked, but I don't recall exactly anymore.
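
As a rough worked example (my own assumptions, treating the result as bytes):
with 4kB blocks a 300GB filesystem has about 78.6M blocks, and with the
default 8kB bytes-per-inode about 39.3M inodes, so the formula comes out to
roughly 60MB per filesystem; 50 of those in parallel would then want on the
order of 3GB of RAM.  You can plug in the real counts from the superblock:

    # Sketch: read the actual inode/block counts and apply the formula
    # above (assumes the result is in bytes; /dev/XXX is your device).
    dumpe2fs -h /dev/XXX 2>/dev/null | awk '
        /^Inode count:/ { i = $3 }
        /^Block count:/ { b = $3 }
        END { printf "~%.0f MB per fsck\n", (i * 7 + b * 3) / 8 / 1048576 }'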
Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.