Sorry I couldn't find this anywhere yet. For deduping it is best to have the lookup table in RAM, but I wasn't too sure how much RAM is suggested?

::Assuming 128KB block sizes and 100% unique data:
1TB = 1024*1024*1024 KB; /128KB = 8388608 blocks
::Each block needs an 8-byte pointer?
8388608*8 = 67108864 bytes
::RAM suggested per TB:
67108864/1024/1024 = 64MB

So if I understand correctly we should have a minimum of 64MB of RAM per TB for deduping? *hopes my math wasn't way off* Or is there significant extra overhead stored per block for the lookup table? For example, is there some kind of redundancy on the lookup table (relevant to the RAM space requirements) to counter corruption?

I read some articles and they all mention that there is a significant performance loss if the table isn't in RAM, but none really mentioned how much RAM one should have per TB of deduped data.

Thanks, hope someone can confirm my numbers *or give me the real ones*. I know the block size is variable; I'm most interested in the default ZFS setup right now.
-- 
This message posted from opensolaris.org
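A minimal Python sketch of the arithmetic above, assuming 128KB blocks and the 8-byte-per-block pointer from the post (which, as the replies below note, is not the real per-entry cost):

# Naive DDT size estimate: 8 bytes per unique 128 KiB block (the post's assumption).
TIB = 1024**4                       # 1 TiB of unique data, in bytes
BLOCK = 128 * 1024                  # 128 KiB recordsize

blocks = TIB // BLOCK               # 8,388,608 blocks per TiB
table_bytes = blocks * 8            # assumed 8-byte pointer per block
print(table_bytes // (1024 * 1024), "MiB")   # -> 64 MiB per TiB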
On 2010-Oct-20 08:36:30 +0800, Never Best <quickx at hotmail.com> wrote:
>Sorry I couldn't find this anywhere yet. For deduping it is best to
>have the lookup table in RAM, but I wasn't too sure how much RAM is
>suggested?

*Lots*

>::Assuming 128KB block sizes and 100% unique data:
>1TB = 1024*1024*1024 KB; /128KB = 8388608 blocks
>::Each block needs an 8-byte pointer?
>8388608*8 = 67108864 bytes
>::RAM suggested per TB:
>67108864/1024/1024 = 64MB
>
>So if I understand correctly we should have a minimum of 64MB of RAM
>per TB for deduping? *hopes my math wasn't way off* Or is there
>significant extra overhead stored per block for the lookup table?

The rule of thumb is 270 bytes per DDT entry - that means a minimum of 2.2GB of RAM (or fast L2ARC) per TB. And note that 128KB is the maximum block size - it's quite likely that you will have smaller blocks (which implies more RAM). I know my average block size is only a few KB.

-- 
Peter Jeremy
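A quick sanity check of that rule of thumb in Python (270 bytes/entry is the list's rule of thumb, not an exact on-disk figure):

# Rule of thumb from the thread: ~270 bytes of DDT per unique block.
TIB = 1024**4
blocks = TIB // (128 * 1024)        # 8,388,608 blocks at the 128 KiB maximum recordsize
ddt_bytes = blocks * 270
print(round(ddt_bytes / 1e9, 2), "GB per TiB of unique data")   # -> 2.26 GB per TiB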
Ouch. I was thinking a DDT entry basically just needs an 8-byte pointer to wherever the data is located on disk, with an O(1) hash table for lookup, and maybe some redundancy/error-correction data. Maybe that should get optimized; a lightweight version for NB ;). I guess it is doing more than I thought it was, maybe with some performance boosts at the cost of DDT size *will read up a bit more*?

Ah well, I can still use it for specific folders for now and look into an SSD for L2ARC (this is how it's done, I'm guessing) to dedup the entire RAID ;). Thanks
-- 
This message posted from opensolaris.org
Sometimes you read about people having low performance when deduping: it is because they have too little RAM.
-- 
This message posted from opensolaris.org
Orvar Korvar wrote:
> Sometimes you read about people having low performance when deduping: it is
> because they have too little RAM.

I mostly heard they have low performance when they start deleting deduplicated data, not before that. So do you think that with 2.2GB of RAM per 1TB of storage, with 128KB blocks, deduplication will have no performance impact when deleting deduped data? Or is it, as everyone was saying, that slow deletion of deduplicated data is something that is to be fixed in further ZFS development?
Never Best wrote:
> Sorry I couldn't find this anywhere yet. For deduping it is best to
> have the lookup table in RAM, but I wasn't too sure how much RAM is
> suggested?
> [...]

There were several detailed discussions about this over the past 6 months that should be in the archives. I believe most of the info came from Richard Elling.
On 10/22/2010 8:44 PM, Haudy Kazemi wrote:
> Never Best wrote:
>> Sorry I couldn't find this anywhere yet. For deduping it is best to
>> have the lookup table in RAM, but I wasn't too sure how much RAM is
>> suggested?
>> [...]
> There were several detailed discussions about this over the past 6
> months that should be in the archives. I believe most of the info
> came from Richard Elling.

Look for both my name and Richard's, going back about a year. In particular, this thread started a good discussion:

http://www.mail-archive.com/zfs-discuss at opensolaris.org/msg35349.html

Bottom line: 270 bytes per record.

So, for a 4k record size, that works out to be 67GB per 1TB of unique data. A 128k record size means about 2GB per 1TB.

Dedup means buy a (big) SSD for L2ARC.

-- 
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
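A sketch of those numbers across a range of record sizes, using the 270-byte rule of thumb from this thread (Python):

# Approximate DDT RAM per TiB of unique data, at various record sizes
# (270 bytes per DDT entry, the rule of thumb from this thread).
TIB = 1024**4
DDT_ENTRY = 270

def ddt_gib_per_tib(recordsize_bytes):
    """DDT size in GiB per TiB of unique data for a given recordsize."""
    return (TIB // recordsize_bytes) * DDT_ENTRY / 2**30

for kib in (4, 8, 16, 32, 64, 128):
    print("%3d KiB records: %5.1f GiB DDT per TiB" % (kib, ddt_gib_per_tib(kib * 1024)))
#   4 KiB records:  67.5 GiB DDT per TiB
#   8 KiB records:  33.8 GiB DDT per TiB
#  16 KiB records:  16.9 GiB DDT per TiB
#  32 KiB records:   8.4 GiB DDT per TiB
#  64 KiB records:   4.2 GiB DDT per TiB
# 128 KiB records:   2.1 GiB DDT per TiB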
Comments inline below...

On Oct 23, 2010, at 1:48 AM, Erik Trimble wrote:
> On 10/22/2010 8:44 PM, Haudy Kazemi wrote:
>> Never Best wrote:
>>> Sorry I couldn't find this anywhere yet. For deduping it is best to
>>> have the lookup table in RAM, but I wasn't too sure how much RAM is
>>> suggested?
>>> [...]
>> There were several detailed discussions about this over the past 6
>> months that should be in the archives. I believe most of the info
>> came from Richard Elling.
>
> Look for both my name and Richard's, going back about a year. In
> particular, this thread started a good discussion:
>
> http://www.mail-archive.com/zfs-discuss at opensolaris.org/msg35349.html
>
> Bottom line: 270 bytes per record.

Sometimes we see bigger sizes, but you have to have a lot of references before the DDT entry gets bigger than 512 bytes. Or, another way to look at this is: for every record, you will be updating 512 bytes (or the minimum sector size). This is why you'll hear me say that dedup changes big I/O into little I/O, but it doesn't eliminate I/O. Fortunately, modern SSDs do little I/O well. Unfortunately, HDDs are better optimized for big I/O and are lousy for little I/O.

> So, for a 4k record size, that works out to be 67GB per 1TB of unique
> data. A 128k record size means about 2GB per 1TB.

Divide by 4: the DDT is considered metadata, and the metadata limit is 1/4 of the ARC size, so only a quarter of the ARC can hold the DDT (put another way, you need roughly 4x that much ARC). Yes, there is an open bug on this. No, it didn't make b147. Yes, it is a trivial fix and can be tuned in the field.

> Dedup means buy a (big) SSD for L2ARC.

L2ARC directory entries take space, too. SWAG around 200 bytes for each L2ARC record.
 -- richard

-- 
OpenStorage Summit, October 25-27, Palo Alto, CA
http://nexenta-summit2010.eventbrite.com
USENIX LISA '10 Conference, November 7-12, San Jose, CA
ZFS and performance consulting
http://www.RichardElling.com
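Folding both caveats into the same back-of-envelope sketch (Python). The 1/4 metadata limit and the ~200 bytes per L2ARC header are the figures quoted in this thread, and the one-header-per-unique-block assumption below is only an illustration, not necessarily how the L2ARC fills in practice:

# Back-of-envelope ARC/L2ARC sizing for dedup, per TiB of unique data.
TIB = 1024**4
RECORDSIZE = 128 * 1024    # assume the default/maximum 128 KiB recordsize
DDT_ENTRY = 270            # rule-of-thumb bytes per DDT entry
L2ARC_HDR = 200            # rough bytes of in-RAM header per L2ARC record (thread's SWAG)

entries = TIB // RECORDSIZE
ddt = entries * DDT_ENTRY

# The DDT counts as metadata; with the default metadata limit of 1/4 of the
# ARC, the ARC must be roughly 4x the DDT size unless that limit is tuned.
print("DDT size:                 %5.1f GiB" % (ddt / 2**30))        # ~2.1 GiB
print("ARC needed at 1/4 limit:  %5.1f GiB" % (4 * ddt / 2**30))    # ~8.4 GiB

# If blocks are pushed to an L2ARC SSD instead, each L2ARC record still costs
# RAM for its header; illustrated here as one header per unique block.
print("L2ARC headers in RAM:     %5.1f GiB" % (entries * L2ARC_HDR / 2**30))  # ~1.6 GiB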