Bill Sprouse
2008-Apr-18 00:09 UTC
[zfs-discuss] lots of small, twisty files that all look the same
A customer has a zpool where their spectral analysis applications create a ton (millions?) of very small files that are typically 1858 bytes in length. They're using ZFS because UFS consistently runs out of inodes. I'm assuming that ZFS aggregates these little files into recordsize (128K?) blobs for writes. This seems to go reasonably well, amazingly enough. Reads are a disaster, as we might expect. To complicate things, writes are coming in over NFS. Reads may be local or may be via NFS, and may be random. Once written, data is not changed until removed. No ZFS RAID is used; the storage device is a 3510 FC array with 5+1 RAID-5 in hardware.

I would like to triage this if possible. Would changing the recordsize to something much smaller, like 8k, and tuning the vdev cache down to something like 8k be of initial benefit (S10U4)? Any other ideas gratefully accepted.

bill
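P.S. For concreteness, the sort of change I have in mind is roughly the following. The pool/dataset names are made up, and I'm going from memory on the vdev cache tunable names for U4, so please double-check before applying:

    # smaller recordsize; only affects files written after the change
    zfs set recordsize=8k tank/spectral

    # in /etc/system, shrink the vdev cache read size from 64k to 8k
    set zfs:zfs_vdev_cache_bshift = 13
    set zfs:zfs_vdev_cache_max = 8192

Existing files would have to be rewritten (or copied into a new dataset) to pick up the new recordsize, since it only applies to newly written data.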
Bart Smaalders
2008-Apr-18 17:19 UTC
[zfs-discuss] lots of small, twisty files that all look the same
Bill Sprouse wrote:
> A customer has a zpool where their spectral analysis applications
> create a ton (millions?) of very small files that are typically 1858
> bytes in length. They're using ZFS because UFS consistently runs out of
> inodes. I'm assuming that ZFS aggregates these little files into
> recordsize (128K?) blobs for writes. This seems to go reasonably well
> amazingly enough. Reads are a disaster as we might expect. To complicate
> things, writes are coming in over NFS. Reads may be local or may be via
> NFS and may be random. Once written, data is not changed until removed.
> No ZFS RAID is used. The storage device is a 3510 FC array with 5+1
> RAID-5 in hardware. I would like to triage this if possible. Would
> changing the recordsize to something much smaller like 8k and tuning
> down vdev_cache to something like 8k be of initial benefit (S10U4)? Any
> other ideas gratefully accepted.
>
> bill

For a random read workload, you need spindles, or enough RAM to keep the entire working set in memory. Doing 5+1 RAID-5 on the array turns six spindles into one as far as random I/O is concerned; don't do that. Mirror if you can. This helps a lot because not only do you get more IOPS from having more vdevs, but each half of a mirror can also satisfy independent read requests.

- Bart

--
Bart Smaalders                    Solaris Kernel Performance
barts at cyber.eng.sun.com        http://blogs.sun.com/barts
"You will contribute more with mercurial than with thunderbird."
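P.S. To make the mirroring suggestion concrete, a layout along these lines (device names invented; on the 3510 you'd present the disks as individual LUNs or simple RAID-1 LUNs rather than one RAID-5 LUN) gives ZFS three mirrored vdevs to spread random reads across:

    zpool create tank mirror c2t0d0 c2t1d0 \
                      mirror c2t2d0 c2t3d0 \
                      mirror c2t4d0 c2t5d0

Each read can be serviced by either side of whichever mirror holds the block, so random reads see something closer to six disks' worth of IOPS instead of one.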