On Nov 4, 2010, at 10:59 AM, Rob Cohen wrote:
> I have read some conflicting things regarding the ZFS record size setting.
> Could you guys verify/correct these statements:
>
> (These reflect my understanding, not necessarily the facts!)
>
> 1) The ZFS record size in a zvol is the unit that dedup happens at. So,
> for a volume that is shared to an NTFS machine, if the NTFS cluster size is
> smaller than the zvol record size, dedup will get dramatically worse, since it
> won't dedup clusters that are positioned differently in zvol records.
Not quite. Dedup happens on a per-block basis. For zvols, the volblocksize
(the zvol equivalent of recordsize) is fixed at zvol creation time, so all
blocks in a zvol have the same size. The physical and logical sizes of a
block are also used to determine whether blocks are identical; clearly, two
blocks with the same checksum but different sizes are not identical.
Setting the volblocksize larger than the client application's minimum block
size can decrease dedup effectiveness.
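For example, if the NTFS clients use 4 KB clusters, you could match that at
zvol creation time, something like this (the pool/volume names and size below
are just placeholders):

   zfs create -V 100G -o volblocksize=4K -o dedup=on tank/ntfsvol  # example names/size
   zfs get volblocksize,dedup tank/ntfsvol

Keep in mind that smaller blocks mean more blocks, and therefore more dedup
table entries, for the same amount of data.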
> 2) For shared folders, the record size is the allocation unit size, so
> large records can waste a substantial amount of space in cases with lots of
> very small files. This is different from a HW RAID stripe size, which only
> affects performance, not space usage.
No. The recordsize in a file system is dynamic: small files use the smallest
possible record size, so in practice there is very little waste. If you are
really concerned about it, enable compression. This is very different from
"HW RAID" stripe size; the two have nothing in common: apples and oranges.
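You can see this for yourself by comparing the logical and allocated sizes of
a few small files, e.g. (the dataset and file names are just examples):

   zfs set compression=on tank/smallfiles     # example dataset name
   zfs get recordsize,compression,compressratio tank/smallfiles
   ls -l /tank/smallfiles/foo ; du -h /tank/smallfiles/foo

ls shows the logical size, du shows what is actually allocated on disk.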
> 3) Although small record sizes have a large RAM overhead for dedup tables,
> as long as the dedup table working set fits in RAM, and the rest fits in L2ARC,
> performance will be good.
Dedup changes large I/Os into small I/Os, and the dedup table lookups and
updates are themselves small, random I/Os. If your pool does not perform
small I/Os well, then dedup can have a noticeable impact on performance.
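You can see how large the dedup table is (or would be) with zdb, and give it
somewhere to spill by adding an L2ARC device, e.g. (pool and device names
below are examples):

   zdb -DD tank                  # DDT histogram and entry counts for a dedup'ed pool
   zdb -S tank                   # simulate dedup to estimate the table before enabling it
   zpool add tank cache c0t2d0   # add a cache (L2ARC) device; c0t2d0 is an example

Multiplying the entry count by the reported in-core size per entry gives a
rough idea of how much RAM the table wants.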
-- richard
ZFS Tutorial at USENIX LISA '10 Conference next Monday
www.RichardElling.com