Roy Sigurd Karlsbakk
2010-Apr-02 00:39 UTC
[zfs-discuss] dedup and memory/l2arc requirements
Hi all

I've been told (on #opensolaris, irc.freenode.net) that OpenSolaris needs a lot of memory and/or L2ARC for dedup to function properly. How much memory or L2ARC should I get for a 12TB zpool (8x2TB in RAIDz2), and then, how much for 125TB (after RAIDz2 overhead)? Is there a function into which I can plug my recordsize and volume size to get the appropriate numbers?

Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy at karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented intelligibly. It is an elementary imperative for all pedagogues to avoid excessive use of idioms of foreign origin. In most cases, adequate and relevant synonyms exist in Norwegian.
Roy Sigurd Karlsbakk
2010-Apr-02 04:33 UTC
[zfs-discuss] dedup and memory/l2arc requirements
> > I might add some swap I guess. I will have to try it on another
> > machine with more RAM and less pool, and see how the size of the zdb
> > image compares to the calculated size of DDT needed. So long as zdb
> > is the same or a little smaller than the DDT it predicts, the tool's
> > still useful, just sometimes it will report "DDT too big but not sure
> > by how much", by coredumping/thrashing instead of finishing.
>
> In my experience, more swap doesn't help break through the 2GB memory
> barrier. As zdb is an intentionally unsupported tool, methinks a recompile
> may be required (or write your own).

I guess this tool might not work too well, then, with 20TiB in 47M files?

Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy at karlsbakk.net
http://blogg.karlsbakk.net/
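A rough sanity check of that worry, using the ~250 bytes per in-core DDT entry figure quoted later in this thread, and assuming at least one DDT entry per file (large files add more entries, one per unique block, so this is a lower bound):

  # Back-of-the-envelope only: 47M entries at ~250 bytes each
  awk 'BEGIN { entries = 47e6; printf "approx in-core DDT: %.1f GiB\n", entries * 250 / 2^30 }'

That is on the order of 11 GiB, well beyond what a 32-bit zdb can address, which matches the out-of-memory behaviour reported later in the thread.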
Roy Sigurd Karlsbakk
2010-Apr-02 04:34 UTC
[zfs-discuss] dedup and memory/l2arc requirements
> You can estimate the amount of disk space needed for the deduplication table
> and the expected deduplication ratio by using "zdb -S poolname" on your
> existing pool.

This is all good, but it doesn't work too well for planning. Is there a rule of thumb I can use for a general overview?

Say I want 125TB of space and I want to dedup that for backup use. The dedup will probably be quite efficient, as long as the alignment matches. By the way, is there a way to auto-align data for dedup in the backup case? Or does ZFS do this by itself?

Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy at karlsbakk.net
http://blogg.karlsbakk.net/
On Apr 1, 2010, at 5:39 PM, Roy Sigurd Karlsbakk wrote:
> Hi all
>
> I've been told (on #opensolaris, irc.freenode.net) that opensolaris needs a lot of memory and/or l2arc for dedup to function properly. How much memory or l2arc should I get for a 12TB zpool (8x2TB in RAIDz2), and then, how much for 125TB (after RAIDz2 overhead)? Is there a function into which I can plug my recordsize and volume size to get the appropriate numbers?

You can estimate the amount of disk space needed for the deduplication table and the expected deduplication ratio by using "zdb -S poolname" on your existing pool. Be patient: for an existing pool with lots of objects, this can take some time to run.

# ptime zdb -S zwimming
Simulated DDT histogram:

bucket             allocated                      referenced
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   -----   -----   -----
     1    2.27M    239G    188G    194G    2.27M    239G    188G    194G
     2     327K   34.3G   27.8G   28.1G     698K   73.3G   59.2G   59.9G
     4    30.1K   2.91G   2.10G   2.11G     152K   14.9G   10.6G   10.6G
     8    7.73K    691M    529M    529M    74.5K   6.25G   4.79G   4.80G
    16      673   43.7M   25.8M   25.9M    13.1K    822M    492M    494M
    32      197   12.3M   7.02M   7.03M    7.66K    480M    269M    270M
    64       47   1.27M    626K    626K    3.86K    103M   51.2M   51.2M
   128       22    908K    250K    251K    3.71K    150M   40.3M   40.3M
   256        7    302K     48K   53.7K    2.27K   88.6M   17.3M   19.5M
   512        4    131K   7.50K   7.75K    2.74K    102M   5.62M   5.79M
    2K        1      2K      2K      2K    3.23K   6.47M   6.47M   6.47M
    8K        1    128K      5K      5K    13.9K   1.74G   69.5M   69.5M
 Total    2.63M    277G    218G    225G    3.22M    337G    263G    270G

dedup = 1.20, compress = 1.28, copies = 1.03, dedup * compress / copies = 1.50

real     8:02.391932786
user     1:24.231855093
sys        15.193256108

In this file system, 2.63 million blocks are allocated. The in-core size of a DDT entry is approximately 250 bytes. So the math is pretty simple:

    in-core size = 2.63M * 250 = 657.5 MB

If your dedup ratio is 1.0, then this number will scale linearly with size. If the dedup ratio is > 1.0, then it will not scale linearly; it will be less. So you can use the linear scale as a worst-case approximation.
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com
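A minimal sketch of that arithmetic, so the same worst-case estimate can be reproduced for any pool. It assumes the ~250 bytes per in-core DDT entry figure given above, and takes as input the allocated-blocks total that "zdb -S" reports (2.63M in the example):

  # Worst-case in-core DDT size from the "Total" allocated-blocks count of zdb -S.
  # Assumes ~250 bytes per DDT entry, per the figure quoted above.
  blocks=2630000        # 2.63M, from the example histogram
  awk -v blocks="$blocks" 'BEGIN {
      printf "worst-case in-core DDT: %.1f MB\n", blocks * 250 / 1e6
  }'

With the example numbers this reproduces the 657.5 MB figure; for planning, the block count can be scaled linearly with pool size as the worst case, as noted above.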
>>>>> "re" == Richard Elling <richard.elling at gmail.com> writes:

    re> # ptime zdb -S zwimming Simulated DDT histogram:
    re> refcnt blocks LSIZE PSIZE DSIZE blocks LSIZE PSIZE DSIZE
    re> Total 2.63M 277G 218G 225G 3.22M 337G 263G 270G

    re> in-core size = 2.63M * 250 = 657.5 MB

Thanks, that is really useful! It'll probably make the difference between trying dedup and not, for me.

It is not working for me yet. It got to this point in prstat:

  6754 root     2554M 1439M sleep   60    0   0:03:31 1.9% zdb/106

and then ran out of memory:

  $ pfexec ptime zdb -S tub
  out of memory -- generating core dump

I might add some swap I guess. I will have to try it on another machine with more RAM and less pool, and see how the size of the zdb image compares to the calculated size of DDT needed. So long as zdb is the same or a little smaller than the DDT it predicts, the tool's still useful; just sometimes it will report "DDT too big but not sure by how much", by coredumping/thrashing instead of finishing.
On Apr 2, 2010, at 2:03 PM, Miles Nordin wrote:
>>>>>> "re" == Richard Elling <richard.elling at gmail.com> writes:
>
> re> # ptime zdb -S zwimming Simulated DDT histogram:
> re> refcnt blocks LSIZE PSIZE DSIZE blocks LSIZE PSIZE DSIZE
> re> Total 2.63M 277G 218G 225G 3.22M 337G 263G 270G
>
> re> in-core size = 2.63M * 250 = 657.5 MB
>
> Thanks, that is really useful! It'll probably make the difference
> between trying dedup and not, for me.
>
> It is not working for me yet. It got to this point in prstat:
>
> 6754 root 2554M 1439M sleep 60 0 0:03:31 1.9% zdb/106
>
> and then ran out of memory:
>
> $ pfexec ptime zdb -S tub
> out of memory -- generating core dump

This is annoying. By default, zdb is compiled as a 32-bit executable and it can be a hog. Compiling it yourself is too painful for most folks :-(

> I might add some swap I guess. I will have to try it on another
> machine with more RAM and less pool, and see how the size of the zdb
> image compares to the calculated size of DDT needed. So long as zdb
> is the same or a little smaller than the DDT it predicts, the tool's
> still useful; just sometimes it will report "DDT too big but not sure
> by how much", by coredumping/thrashing instead of finishing.

In my experience, more swap doesn't help break through the 2GB memory barrier. As zdb is an intentionally unsupported tool, methinks a recompile may be required (or write your own).
 -- richard
On Apr 1, 2010, at 9:34 PM, Roy Sigurd Karlsbakk wrote:
>> You can estimate the amount of disk space needed for the deduplication table
>> and the expected deduplication ratio by using "zdb -S poolname" on your
>> existing pool.
>
> This is all good, but it doesn't work too well for planning. Is there a rule of thumb I can use for a general overview?

If you know the average record size for your workload, then you can calculate the average number of records when given the total space. This should get you in the ballpark.

> Say I want 125TB of space and I want to dedup that for backup use. It'll probably be quite efficient dedup, as long as the alignment matches. By the way, is there a way to auto-align data for dedup in the backup case? Or does ZFS do this by itself?

ZFS does not change alignment.
 -- richard
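Applying that rule of thumb to the 125TB case gives a rough worst-case figure. This is only a sketch: it assumes the default 128 KiB ZFS recordsize as the average record size (a smaller average record size raises the estimate proportionally) and the ~250 bytes per in-core DDT entry quoted earlier in the thread:

  # Planning estimate only: worst-case DDT (dedup ratio 1.0) for ~125 TiB of data.
  awk 'BEGIN {
      space   = 125 * 2^40      # ~125 TiB in bytes
      recsize = 128 * 1024      # assumed average record size
      entries = space / recsize
      printf "records: %.2e   worst-case in-core DDT: %.0f GiB\n", entries, entries * 250 / 2^30
  }'

That comes out to around a billion records and roughly 240-250 GiB of DDT in the worst case; the actual requirement shrinks with the dedup ratio achieved, and the DDT can be held in L2ARC rather than sitting entirely in RAM, which is the memory-and/or-L2ARC trade-off this thread starts from.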
On 03/04/2010 00:57, Richard Elling wrote:
>
> This is annoying. By default, zdb is compiled as a 32-bit executable and
> it can be a hog. Compiling it yourself is too painful for most folks :-(

/usr/sbin/zdb is actually a link to /usr/lib/isaexec

$ ls -il /usr/sbin/zdb /usr/lib/isaexec
300679 -r-xr-xr-x  92 root  bin    8248 Nov 16 10:26 /usr/lib/isaexec*
300679 -r-xr-xr-x  92 root  bin    8248 Nov 16 10:26 /usr/sbin/zdb*

$ ls -il /usr/sbin/i86/zdb /usr/sbin/amd64/zdb
200932 -r-xr-xr-x   1 root  bin  173224 Mar 15 10:20 /usr/sbin/amd64/zdb*
200933 -r-xr-xr-x   1 root  bin  159960 Mar 15 10:20 /usr/sbin/i86/zdb*

This means both 32- and 64-bit versions are already available, and if the kernel is 64-bit then the 64-bit version of zdb will be run when you invoke /usr/sbin/zdb.

-- 
Darren J Moffat
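To see which flavour actually ends up running, the kernel's instruction set can be checked and, if desired, the 64-bit binary invoked directly using the paths from the listing above. The pool name "tub" is just the one from Miles' example, and whether the 64-bit zdb then has enough address space for any given pool is not established in this thread:

  # Show the kernel's native instruction set (amd64 means isaexec picks the 64-bit zdb)
  isainfo -k
  # Or bypass isaexec and run the 64-bit binary explicitly
  pfexec ptime /usr/sbin/amd64/zdb -S tub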