I haven't seen much discussion on how deduplication affects performance.
I've enabled dedup on my 4-disk raidz array and have seen a significant
drop in write throughput, from about 100 MB/s to 3 MB/s. I can't
imagine such a decrease is normal.

> # zpool iostat nest 1 (with dedup enabled):
> ...
> nest        1.05T   411G     91     18    197K  2.35M
> nest        1.05T   411G    147     15    443K  1.98M
> nest        1.05T   411G     82     28    174K  3.59M

> # zpool iostat nest 1 (with dedup disabled):
> ...
> nest        1.05T   410G      0    787       0  96.9M
> nest        1.05T   410G      1    899    253K  95.0M
> nest        1.05T   409G      0    533       0  48.5M

I do notice when dedup is enabled that the drives sound like they are
constantly seeking. iostat shows average service times around 20 ms,
which is normal for my drives, and prstat shows that my processor and
memory aren't a bottleneck. What could cause such a marked decrease in
throughput? Is anyone else experiencing similar effects?

Thanks,

James
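A minimal sketch of that on/off comparison, assuming a pool named nest
as above (dedup is a per-dataset property and only affects blocks
written after the change):

    # show whether dedup is currently enabled
    zfs get dedup nest

    # toggle it, rewrite the test data, and watch throughput while writing
    zfs set dedup=on nest
    zpool iostat nest 1      # pool-level throughput, as in the output above
    iostat -xn 1             # per-device service times (asvc_t column)

    zfs set dedup=off nest   # only blocks written after this are affected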
On Fri, Jan 08, 2010 at 10:00:14AM -0800, James Lee wrote:
> I haven't seen much discussion on how deduplication affects performance.
> I've enabled dedup on my 4-disk raidz array and have seen a significant
> drop in write throughput, from about 100 MB/s to 3 MB/s. I can't
> imagine such a decrease is normal.

Seems like I've seen other posts with similar numbers (maybe 9 MB/s or
so?). Sounded like adding an SSD for caching really improved
performance, however.

> > # zpool iostat nest 1 (with dedup enabled):
> > ...
> > nest        1.05T   411G     91     18    197K  2.35M
> > nest        1.05T   411G    147     15    443K  1.98M
> > nest        1.05T   411G     82     28    174K  3.59M
>
> > # zpool iostat nest 1 (with dedup disabled):
> > ...
> > nest        1.05T   410G      0    787       0  96.9M
> > nest        1.05T   410G      1    899    253K  95.0M
> > nest        1.05T   409G      0    533       0  48.5M
>
> I do notice when dedup is enabled that the drives sound like they are
> constantly seeking. iostat shows average service times around 20 ms,
> which is normal for my drives, and prstat shows that my processor and
> memory aren't a bottleneck. What could cause such a marked decrease in
> throughput? Is anyone else experiencing similar effects?
>
> Thanks,
>
> James

Ray
See the reads on the pool with the low I/O? I suspect reading the DDT
causes the writes to slow down.

See this bug:
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6913566
It seems to give some background.

Can you test setting "primarycache=metadata" on the volume you test?
This would be my initial test. My suggestion is that it may improve the
situation because your ARC can then be better utilized for the DDT.
(This does not make much sense for production without an SSD cache,
because you practically disable all read caching of data if you have no
L2ARC (aka SSD)!)

As I read the bug report above, it seems that if the DDT (deduplication
table) does not fit into memory, or is dropped from there, the DDT has
to be read from disk, causing massive random I/O.
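A minimal sketch of that test, assuming the dataset under test is the
pool's root dataset nest (primarycache and secondarycache are standard
ZFS properties; any dataset name works):

    zfs get primarycache,secondarycache nest

    # cache only metadata (which includes the DDT) in the ARC for this
    # dataset, then rerun the write test and compare zpool iostat numbers
    zfs set primarycache=metadata nest

    # restore the default afterwards, otherwise normal data reads stay uncached
    zfs set primarycache=all nest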
James Lee wrote:
> I haven't seen much discussion on how deduplication affects performance.
> I've enabled dedup on my 4-disk raidz array and have seen a significant
> drop in write throughput, from about 100 MB/s to 3 MB/s. I can't
> imagine such a decrease is normal.

What is your data? I've found that data which lends itself to
deduplication writes slightly faster, while data that does not (video,
ISO images) writes dramatically slower. So I turn dedup (and
compression) off for filesystems containing "random" data.

--
Ian.
On Fri, Jan 8, 2010 at 1:44 PM, Ian Collins <ian at ianshome.com> wrote:
> James Lee wrote:
>> I haven't seen much discussion on how deduplication affects performance.
>> I've enabled dedup on my 4-disk raidz array and have seen a significant
>> drop in write throughput, from about 100 MB/s to 3 MB/s. I can't
>> imagine such a decrease is normal.
>
> What is your data?

I have seen the same: fsstat reports 4-7 seconds of small writes, then
bursts of 40-80 MB/s, but without dedup I see 80-150 MB/s writes on my
4x 500 GB SATA drives, split between two controllers, with 6 GB of RAM
and about 1.5 TB of storage with 1.2 TB used. If I disable dedup, speed
goes back up.

While doing dedup writes, zfs destroy pool/filesystem takes about 100x
as long as usual, even if the filesystem being destroyed is empty;
reports say it is far worse when over 100 GB of data is on it. My dedup
ratio for the pool is 1.15x. Read performance seems about the same or
slightly faster; I didn't really benchmark that workload since my
clients seem to be the bottleneck.

As money is tight at the moment I don't have the funds for an SSD to
test with, but I do have space on an under-utilized disk to try. I
haven't researched the effect of adding and removing (if possible)
L2ARC or ZIL log slices on a pool. It would be great to enable a
5-50 GB slice off a SATA drive as a log device for greater performance.

James Dickens
uadmin.blogspot.com

> I've found that data which lends itself to deduplication writes
> slightly faster, while data that does not (video, ISO images) writes
> dramatically slower. So I turn dedup (and compression) off for
> filesystems containing "random" data.
>
> --
> Ian.
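A minimal sketch of trying a spare slice as a cache or log device; the
device names below are placeholders, and log-device removal only works
on a sufficiently recent pool version:

    # add a spare slice as an L2ARC (read cache) device
    zpool add nest cache c1t2d0s3

    # add another slice as a separate ZIL (log) device
    zpool add nest log c1t2d0s4

    # cache devices can be removed again at any time; removing a log
    # device needs a recent enough pool version (see zpool upgrade -v)
    zpool remove nest c1t2d0s3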
On 01/08/2010 02:42 PM, Lutz Schumann wrote:
> See the reads on the pool with the low I/O? I suspect reading the
> DDT causes the writes to slow down.
>
> See this bug:
> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6913566
> It seems to give some background.
>
> Can you test setting "primarycache=metadata" on the volume you test?
> This would be my initial test. My suggestion is that it may improve
> the situation because your ARC can then be better utilized for the
> DDT. (This does not make much sense for production without an SSD
> cache, because you practically disable all read caching of data if
> you have no L2ARC (aka SSD)!)
>
> As I read the bug report above, it seems that if the DDT
> (deduplication table) does not fit into memory, or is dropped from
> there, the DDT has to be read from disk, causing massive random I/O.

The symptoms described in that bug report do match up with mine. I have
also experienced long hang times (>1 hr) destroying a dataset while the
disks just thrash. I tried setting "primarycache=metadata", but that
did not help.

I pulled the DDT statistics for my pool, but don't know how to
determine its physical size-on-disk from them. If deduplication ends up
requiring what amounts to a separate log device, that will be a real
shame.

> # zdb -DD nest
> DDT-sha256-zap-duplicate: 780321 entries, size 338 on disk, 174 in core
> DDT-sha256-zap-unique: 6188123 entries, size 335 on disk, 164 in core
>
> DDT histogram (aggregated over all DDTs):
>
> bucket              allocated                       referenced
> ______   ______________________________   ______________________________
> refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
> ------   ------   -----   -----   -----   ------   -----   -----   -----
>      1    5.90M    752G    729G    729G    5.90M    752G    729G    729G
>      2     756K   94.0G   93.7G   93.6G    1.48M    188G    187G    187G
>      4    5.36K    152M   80.3M   81.5M    22.4K    618M    325M    330M
>      8      258   4.05M   1.93M   2.00M    2.43K   36.7M   16.3M   16.9M
>     16       30    434K     42K   50.9K      597   10.2M    824K   1003K
>     32        5    255K   65.5K   66.6K      204   10.5M   3.26M   3.30M
>     64       20   2.02M    906K    910K    1.41K    141M   62.0M   62.2M
>    128        4      2K      2K   2.99K      723    362K    362K    541K
>    256        1     512     512     766      277    138K    138K    207K
>    512        2      1K      1K   1.50K    1.62K    830K    830K   1.21M
>  Total    6.65M    846G    823G    823G    7.41M    941G    917G    917G
>
> dedup = 1.11, compress = 1.03, copies = 1.00, dedup * compress / copies = 1.14
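As a rough back-of-the-envelope estimate from the zdb -DD summary lines
above, reading the "size ... on disk, ... in core" figures as average
bytes per entry (an assumption, but the usual interpretation):

    # 780321 duplicate entries at ~174 bytes each in core / 338 on disk, plus
    # 6188123 unique entries at ~164 bytes each in core / 335 on disk
    echo $(( 780321 * 174 + 6188123 * 164 ))   # ~1.15e9 bytes (~1.1 GB) in RAM
    echo $(( 780321 * 338 + 6188123 * 335 ))   # ~2.3e9 bytes (~2.3 GB) on disk

If the ARC cannot keep that roughly 1 GB of DDT resident alongside
everything else, each deduped write turns into random DDT reads from
disk, which matches the behaviour described in the bug report above.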
We're having to split data across multiple pools if we enable dedup,
1+ TB each (one 6x 750 GB pool is particularly bad). The timeouts cause
COMSTAR / iSCSI to fail; Windows clients are dropping the persistent
targets due to timeouts (> 15 seconds, it seems). This is causing
bigger problems. Disabling dedup is an option, but I wouldn't think it
should be *THAT* much load.

Keeping the DDT on a cache drive is reasonable; however, if this is
required, OpenSolaris should add something like a DDTCacheDevice so we
can dedicate a device to it, separate from the secondarycache. I'll
drop in a 150 GB cache drive tonight to see if it improves things.

Steve Radich
www.BitShop.com
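There is no dedicated DDT device today; the closest approximation with
existing knobs is to add the drive as L2ARC and restrict the L2ARC to
metadata, which the DDT is part of. A sketch, using a placeholder
device name and assuming the pool name tank1 from the follow-up:

    # add the new drive as an L2ARC device
    zpool add tank1 cache c7t1d0

    # keep only metadata (DDT included) on the L2ARC, so bulk file data
    # does not compete with it for the 150 GB
    zfs set secondarycache=metadata tank1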
I should note that running "zfs set primarycache=metadata tank1" took a
few minutes. It seems changing what is cached in RAM should be instant
(we don't need to flush the data out of RAM, just stop putting it back
in). Disk I/O seemed slow during this, though that could have been
unrelated.
http://www.bitshop.com/Blogs/tabid/95/EntryId/78/Bug-in-OpenSolaris-SMB-Server-causes-slow-disk-i-o-always.aspx

This explains just how major a bug this issue is, IMHO. Judging from
the symptoms, I now think the SMB slowdown from Windows 2003 is doing
something odd in the kernel; see the rsync performance tests in that
post. Our file move used to render the server almost unusable (in fact
some SAN clients would report the iSCSI host had disappeared and shut
down). Now, during the same copy / load on the disks, the iSCSI clients
are insanely fast; the only difference is that server/smb is disabled.
I think ZFS dedup just made it appear worse.
>>>>> "srbi" == Steve Radich, BitShop, Inc <stever at bitshop.com> writes:srbi> http://www.bitshop.com/Blogs/tabid/95/EntryId/78/Bug-in-OpenSolaris-SMB-Server-causes-slow-disk-i-o-always.aspx I''m having trouble understanding many things in here like ``our file move'''' (moving what from where to where with what protocol?) and ``with SMB running'''' (with the server enabled on Solaris, with filesystems mounted, with activity on the mountpoints? what does running mean?) and ``RAID-0/stripe reads is the slow point'''' (what does this mean? How did you determine which part of the stack is limiting the observed speed? This is normally quite difficult and requires comparing several experiments, not doing just one experiment like ``a file move between zfs pools''''.). What is ``bytes the negotiated protocol allows''''? mtu, mss, window size? Can you show us in what tool you see one number and where you see the other number that''s too big? -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 304 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20100324/b4f60a1e/attachment.bin>