Just curious if anyone has looked into the relationship between zpool dedupe, zfs send dedupe, memory use, and network throughput.

For example, does 'zfs send -D' use the same DDT as the pool? Or does it require more memory for its own DDT, thus impacting performance of both?

If you have a deduped pool on both ends of the send, does -D make any difference?

If neither pool is deduped, does -D make a difference?

We're waiting on a replacement backplane for our newest zfs-based storage box, so won't be able to look into this ourselves until next week at the earliest. Thought I'd check if anyone else has already done some comparisons or benchmarks.

Cheers,
Freddie
fjwcadh at gmail.com
On Sep 6, 2011, at 9:01 PM, Freddie Cash wrote:

> Just curious if anyone has looked into the relationship between zpool dedupe, zfs send dedupe, memory use, and network throughput.

Yes.

> For example, does 'zfs send -D' use the same DDT as the pool?

No.

> Or does it require more memory for its own DDT, thus impacting performance of both?

Yes, no.

> If you have a deduped pool on both ends of the send, does -D make any difference?

Yes, if the data is deduplicable.

> If neither pool is deduped, does -D make a difference?

Yes, if the data is deduplicable.

> We're waiting on a replacement backplane for our newest zfs-based storage box, so won't be able to look into this ourselves until next week at the earliest. Thought I'd check if anyone else has already done some comparisons or benchmarks.

I'm not aware of any benchmarks, and I'd be surprised if they could be applied to real-world cases. zfs send deduplication is very, very, very dependent on the data being sent. It is also dependent on the release, since it is broken in many OpenSolaris and derived builds. Fixes have recently been submitted into the illumos source tree. Recent Nexenta distributions also have the fixes.

 -- richard
On Tue, Sep 06, 2011 at 10:05:54PM -0700, Richard Elling wrote:
> On Sep 6, 2011, at 9:01 PM, Freddie Cash wrote:
>
> > For example, does 'zfs send -D' use the same DDT as the pool?
>
> No.

My understanding was that 'zfs send -D' would use the pool's DDT in building its own, if present. If blocks were known by the filesystem to be duplicate, it would use that knowledge to skip some work seeding its own DDT and stream back-references. This doesn't change the stream contents vs what it would have generated without these hints, so "No" still works as a short answer :)

That understanding was based on discussions and blog posts at the time, not looking at code. At least in theory, it should help avoid reading and checksumming extra data blocks if this knowledge can be used, so less work regardless of measurable impact on send throughput. (It's more about diminished impact to other concurrent activities.)

The point has mostly been moot in practice, though, because I've found "zfs send -D" just plain doesn't work and often generates invalid streams, as you note. Good to know there are fixes.

--
Dan.
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-bounces at opensolaris.org] On Behalf Of Richard Elling
>
> > For example, does 'zfs send -D' use the same DDT as the pool?
>
> No.
>
> > Or does it require more memory for its own DDT, thus impacting
> > performance of both?
>
> Yes, no.

How can this be? If zfs send -D does not use the same DDT as the pool, then it must require memory for its own DDT. But Richard, the second half of your answer seems to contradict this. Perhaps you are denying that the extra memory usage impacts performance of the system?

> If you have a deduped pool on both ends of the send, does -D make any
> difference?
>
> If neither pool is deduped, does -D make a difference?

Yes. If the originating pool is dedup'd on disk, then it's just dedup'd on disk. And if the recipient pool is dedup'd on disk, then it's just dedup'd on disk. In either case, traditionally the data would not be dedup'd in transit (zfs send). zfs send -D only causes the data to be dedup'd in the data stream from the sender to the receiver. This presumably saves network bandwidth and speeds up the transfer.
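To make the "dedup'd in the data stream" idea concrete, here is a minimal Python sketch. It is a toy model only, not the real zfs send stream format: the record layout, `dedup_stream`, and `restore_stream` are all made up for illustration. The sender keeps its own in-memory table keyed by SHA-256 digest, separate from any pool DDT, and emits a back-reference instead of the data whenever a block repeats.

```python
import hashlib


def dedup_stream(blocks):
    """Toy sender: emit each unique block once, then a back-reference
    (index of the earlier data record) for every later copy."""
    seen = {}      # SHA-256 digest -> index of the record that carried the data
    records = []
    for block in blocks:
        digest = hashlib.sha256(block).digest()
        if digest in seen:
            records.append(("ref", seen[digest]))
        else:
            seen[digest] = len(records)
            records.append(("data", block))
    return records


def restore_stream(records):
    """Toy receiver: expand back-references into full blocks again."""
    blocks = []
    for kind, payload in records:
        if kind == "ref":
            blocks.append(records[payload][1])  # payload is a record index
        else:
            blocks.append(payload)              # payload is the block itself
    return blocks


# Four 4 KiB blocks, three of them identical (hypothetical sample data).
blocks = [b"A" * 4096, b"B" * 4096, b"A" * 4096, b"A" * 4096]
records = dedup_stream(blocks)
payload_bytes = sum(len(p) for kind, p in records if kind == "data")
```

In this toy run only two of the four blocks travel as data, and the receiver still reconstructs all four. The `seen` table is also where the extra memory goes: it grows with the number of unique blocks in the stream, regardless of whether either pool has dedup enabled.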
On 09/ 6/11 11:45 PM, Daniel Carosone wrote:
> On Tue, Sep 06, 2011 at 10:05:54PM -0700, Richard Elling wrote:
>> On Sep 6, 2011, at 9:01 PM, Freddie Cash wrote:
>>
>>> For example, does 'zfs send -D' use the same DDT as the pool?
>> No.
> My understanding was that 'zfs send -D' would use the pool's DDT in
> building its own, if present.

It does not use the pool's DDT, but it does use the SHA-256 checksums that have already been calculated for on-disk dedup, thus speeding the generation of the send stream.

> If blocks were known by the filesystem
> to be duplicate, it would use that knowledge to skip some work seeding
> its own DDT and stream back-references. This doesn't change the stream
> contents vs what it would have generated without these hints, so "No"
> still works as a short answer :)
>
> That understanding was based on discussions and blog posts at the
> time, not looking at code. At least in theory, it should help avoid
> reading and checksumming extra data blocks if this knowledge can be
> used, so less work regardless of measurable impact on send throughput.
> (It's more about diminished impact to other concurrent activities)
>
> The point has mostly been moot in practice, though, because I've found
> "zfs send -D" just plain doesn't work and often generates invalid
> streams, as you note. Good to know there are fixes.
>
> --
> Dan.
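The checksum-reuse point can be illustrated with a small Python sketch. This is an assumption-laden toy, not ZFS code: `stream_key`, `sha256_counted`, and the `stored` list are hypothetical names, with `stored` standing in for SHA-256 checksums that were already computed when the blocks were written. The only thing it shows is that a sender which can pick up an existing checksum skips the hashing work, while the resulting table keys are the same either way.

```python
import hashlib

hash_calls = 0  # counts how often the sender has to hash a block itself


def sha256_counted(data):
    """SHA-256 with a counter, so the saved work is visible."""
    global hash_calls
    hash_calls += 1
    return hashlib.sha256(data).digest()


def stream_key(block, stored_checksum=None):
    """Key for the sender's own table: reuse a stored checksum if one
    exists, otherwise hash the block now."""
    if stored_checksum is not None:
        return stored_checksum
    return sha256_counted(block)


blocks = [b"x" * 512, b"y" * 512, b"x" * 512]
# Stand-in for checksums computed earlier, at write time (hypothetical).
stored = [hashlib.sha256(b).digest() for b in blocks]

hash_calls = 0
keys_reused = [stream_key(b, s) for b, s in zip(blocks, stored)]
calls_with_reuse = hash_calls

hash_calls = 0
keys_fresh = [stream_key(b) for b in blocks]
calls_without_reuse = hash_calls
```

Both runs produce identical keys, so the stream contents would not differ; only the amount of hashing done at send time does.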
Thanks for the replies everyone. That was along the lines of what I was thinking (-D is a "win" for network usage savings, if it works) but wanted to double-check before I started playing with our new boxes.

Will be interesting to see whether or not -D works with ZFSv28 in FreeBSD 8-STABLE/9-BETA. And whether or not "zfs send" is faster/better/easier/more reliable than rsyncing snapshots (which is what we do currently).

Thanks for the info.

--
Freddie Cash
fjwcash at gmail.com
On Wed, Sep 07, 2011 at 08:47:36AM -0600, Lori Alt wrote:
> On 09/ 6/11 11:45 PM, Daniel Carosone wrote:
>> My understanding was that 'zfs send -D' would use the pool's DDT in
>> building its own, if present.
> It does not use the pool's DDT, but it does use the SHA-256 checksums
> that have already been calculated for on-disk dedup, thus speeding the
> generation of the send stream.

Ah, thanks for the clarification. Presumably the same is true if the pool is using checksum=sha256, without dedup?

Still a moot point for now :)

--
Dan.
On 09/ 7/11 02:20 PM, Daniel Carosone wrote:
> On Wed, Sep 07, 2011 at 08:47:36AM -0600, Lori Alt wrote:
>> On 09/ 6/11 11:45 PM, Daniel Carosone wrote:
>>> My understanding was that 'zfs send -D' would use the pool's DDT in
>>> building its own, if present.
>> It does not use the pool's DDT, but it does use the SHA-256 checksums
>> that have already been calculated for on-disk dedup, thus speeding the
>> generation of the send stream.
> Ah, thanks for the clarification. Presumably the same is true if the
> pool is using checksum=sha256, without dedup?

Yes, I think so.

> Still a moot point for now :)
>
> --
> Dan.
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-bounces at opensolaris.org] On Behalf Of Freddie Cash
>
> Will be interesting to see whether or not -D works with ZFSv28 in FreeBSD
> 8-STABLE/9-BETA. And whether or not "zfs send" is faster/better/easier/more
> reliable than rsyncing snapshots (which is what we do currently).

Holy crap, quit wasting time on the -D thing and/or dedup, and yes, start using zfs send | zfs receive instead of using rsync! (Presuming you want to replicate your whole filesystem, and no exclusions and stuff like that.)

rsync needs to walk the whole directory tree, and perform all sorts of comparison operations at a level above the filesystem to determine what changed and so forth... And then needs to calculate diffs... ZFS instantly knows which blocks changed incrementally so it doesn't need to do any of that work. ZFS just instantly starts streaming all the changed blocks, with magnificent efficiency.

Typically people abandoning rsync in favor of zfs send | receive will experience a couple orders of magnitude performance gain. Depends on your data usage patterns, but that's typical.
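For anyone who has not set this up before, an incremental replication step might look roughly like the Python sketch below. It only builds and prints the command line; nothing is executed. The dataset, snapshot, and host names are hypothetical, and piping through ssh is an assumption about the transport, not something prescribed in this thread. `zfs send -i` takes the previous and the new snapshot and sends just the difference between them.

```python
import shlex


def replication_pipeline(dataset, prev_snap, new_snap, remote_host, target):
    """Build (but do not run) an incremental zfs send | zfs receive pipeline.

    Returns the sender and receiver commands as argument lists.
    """
    send = ["zfs", "send", "-i",
            f"{dataset}@{prev_snap}", f"{dataset}@{new_snap}"]
    # Assumption: the stream is carried to the other box over ssh.
    recv = ["ssh", remote_host, "zfs", "receive", target]
    return send, recv


# Hypothetical names, for illustration only.
send, recv = replication_pipeline(
    "tank/data", "monday", "tuesday", "backuphost", "backup/data")
command_line = " | ".join(shlex.join(cmd) for cmd in (send, recv))
print(command_line)
```

The first transfer to a new target would be a full send of one snapshot (no `-i`); after that, each run only needs the previous snapshot to still exist on both sides.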