On OpenSolaris build 134, upgraded from older versions, I have an rpool on which I had dedup switched on for a few weeks. After that I switched it back off. Now the dedup ratio seems to be stuck at a value of 1.68. Even when I copy more than 90 GB of data, it still remains at 1.68.

Any ideas?

Paul

Here is some evidence.

Before the copy:

$ zpool list
NAME   SIZE  ALLOC  FREE  CAP  DEDUP  HEALTH  ALTROOT
rpool  931G   132G  799G  14%  1.68x  ONLINE  -
$

After the copy:

$ zpool list
NAME   SIZE  ALLOC  FREE  CAP  DEDUP  HEALTH  ALTROOT
rpool  931G   225G  706G  24%  1.68x  ONLINE  -
$

It was only enabled for 11 days last month:

$ pfexec zpool history | grep dedup
2010-02-11.21:19:42 zfs set dedup=verify rpool
2010-02-22.21:38:15 zfs set dedup=off rpool

And it is off on all filesystems:

$ zfs get -r dedup rpool
NAME  PROPERTY  VALUE  SOURCE
rpool  dedup  off  local
rpool@20100227  dedup  -  -
rpool/ROOT  dedup  off  inherited from rpool
rpool/ROOT@20100227  dedup  -  -
rpool/ROOT/b131-zones  dedup  off  inherited from rpool
rpool/ROOT/b131-zones@20100227  dedup  -  -
rpool/ROOT/b132  dedup  off  inherited from rpool
rpool/ROOT/b132@20100227  dedup  -  -
rpool/ROOT/b133  dedup  off  inherited from rpool
rpool/ROOT/b134  dedup  off  inherited from rpool
rpool/ROOT/b134@install  dedup  -  -
rpool/ROOT/b134@2010-02-07-11:19:05  dedup  -  -
rpool/ROOT/b134@2010-02-20-15:59:22  dedup  -  -
rpool/ROOT/b134@20100227  dedup  -  -
rpool/ROOT/b134@2010-03-11-19:18:51  dedup  -  -
rpool/dump  dedup  off  inherited from rpool
rpool/dump@20100227  dedup  -  -
rpool/export  dedup  off  inherited from rpool
rpool/export@20100227  dedup  -  -
rpool/export/home  dedup  off  inherited from rpool
rpool/export/home@20100227  dedup  -  -
rpool/export/home/beheer  dedup  off  inherited from rpool
rpool/export/home/beheer@20100227  dedup  -  -
rpool/export/home/paulz  dedup  off  inherited from rpool
rpool/export/home/paulz@20100227  dedup  -  -
rpool/export/share  dedup  off  inherited from rpool
rpool/export/share@20100227  dedup  -  -
rpool/local  dedup  off  inherited from rpool
rpool/local@20100227  dedup  -  -
rpool/paulzmail  dedup  off  inherited from rpool
rpool/paulzmail@20100227  dedup  -  -
rpool/pkg  dedup  off  inherited from rpool
rpool/pkg@20100227  dedup  -  -
rpool/swap  dedup  off  inherited from rpool
rpool/swap@20100227  dedup  -  -
rpool/zones  dedup  off  inherited from rpool
rpool/zones@20100227  dedup  -  -
rpool/zones/buildzone  dedup  off  inherited from rpool
rpool/zones/buildzone@20100227  dedup  -  -
rpool/zones/buildzone/ROOT  dedup  off  inherited from rpool
rpool/zones/buildzone/ROOT@20100227  dedup  -  -
rpool/zones/buildzone/ROOT/zbe-1  dedup  off  inherited from rpool
rpool/zones/buildzone/ROOT/zbe-1@20100227  dedup  -  -
rpool/zones/buildzone/ROOT/zbe-2  dedup  off  inherited from rpool
rpool/zones/buildzone/ROOT/zbe-2@20100227  dedup  -  -
rpool/zones/buildzone/ROOT/zbe-3  dedup  off  inherited from rpool
rpool/zones/buildzone/ROOT/zbe-4  dedup  off  inherited from rpool
rpool/zones/buildzone/ROOT/zbe-4@2010-02-07-11:19:09  dedup  -  -
rpool/zones/buildzone/ROOT/zbe-4@2010-02-20-15:59:27  dedup  -  -
rpool/zones/buildzone/ROOT/zbe-4@20100227  dedup  -  -
rpool/zones/buildzone/ROOT/zbe-4@2010-03-11-19:18:56  dedup  -  -
rpool/zones/dev  dedup  off  inherited from rpool
rpool/zones/dev@20100227  dedup  -  -
rpool/zones/dev/ROOT  dedup  off  inherited from rpool
rpool/zones/dev/ROOT@20100227  dedup  -  -
rpool/zones/dev/ROOT/zbe-1  dedup  off  inherited from rpool
rpool/zones/dev/ROOT/zbe-1@20100227  dedup  -  -
rpool/zones/dev/ROOT/zbe-2  dedup  off  inherited from rpool
rpool/zones/dev/ROOT/zbe-2@20100227  dedup  -  -
rpool/zones/dev/ROOT/zbe-3  dedup  off  inherited from rpool
rpool/zones/dev/ROOT/zbe-4  dedup  off  inherited from rpool
rpool/zones/dev/ROOT/zbe-4@2010-02-07-11:19:11  dedup  -  -
rpool/zones/dev/ROOT/zbe-4@2010-02-20-15:59:29  dedup  -  -
rpool/zones/dev/ROOT/zbe-4@20100227  dedup  -  -
rpool/zones/dev/ROOT/zbe-4@2010-03-11-19:18:58  dedup  -  -
rpool/zones/dns  dedup  off  inherited from rpool
rpool/zones/dns@20100227  dedup  -  -
rpool/zones/dns/ROOT  dedup  off  inherited from rpool
rpool/zones/dns/ROOT@20100227  dedup  -  -
rpool/zones/dns/ROOT/zbe-1  dedup  off  inherited from rpool
rpool/zones/dns/ROOT/zbe-1@20100227  dedup  -  -
rpool/zones/dns/ROOT/zbe-2  dedup  off  inherited from rpool
rpool/zones/dns/ROOT/zbe-2@20100227  dedup  -  -
rpool/zones/dns/ROOT/zbe-3  dedup  off  inherited from rpool
rpool/zones/dns/ROOT/zbe-4  dedup  off  inherited from rpool
rpool/zones/dns/ROOT/zbe-4@2010-02-07-11:19:10  dedup  -  -
rpool/zones/dns/ROOT/zbe-4@2010-02-20-15:59:28  dedup  -  -
rpool/zones/dns/ROOT/zbe-4@20100227  dedup  -  -
rpool/zones/dns/ROOT/zbe-4@2010-03-11-19:18:57  dedup  -  -
rpool/zones/gate  dedup  off  inherited from rpool
rpool/zones/gate/ROOT  dedup  off  inherited from rpool
rpool/zones/gate/ROOT/zbe  dedup  off  inherited from rpool
rpool/zones/gate/ROOT/zbe-1  dedup  off  inherited from rpool
rpool/zones/gate/ROOT/zbe-1@2010-03-11-19:19:01  dedup  -  -
rpool/zones/mailzone  dedup  off  inherited from rpool
rpool/zones/mailzone@20100227  dedup  -  -
rpool/zones/mailzone/ROOT  dedup  off  inherited from rpool
rpool/zones/mailzone/ROOT@20100227  dedup  -  -
rpool/zones/mailzone/ROOT/zbe-1  dedup  off  inherited from rpool
rpool/zones/mailzone/ROOT/zbe-1@20100227  dedup  -  -
rpool/zones/mailzone/ROOT/zbe-2  dedup  off  inherited from rpool
rpool/zones/mailzone/ROOT/zbe-2@20100227  dedup  -  -
rpool/zones/mailzone/ROOT/zbe-3  dedup  off  inherited from rpool
rpool/zones/mailzone/ROOT/zbe-4  dedup  off  inherited from rpool
rpool/zones/mailzone/ROOT/zbe-4@2010-02-07-11:19:08  dedup  -  -
rpool/zones/mailzone/ROOT/zbe-4@2010-02-20-15:59:25  dedup  -  -
rpool/zones/mailzone/ROOT/zbe-4@20100227  dedup  -  -
rpool/zones/mailzone/ROOT/zbe-4@2010-03-11-19:18:54  dedup  -  -
rpool/zones/punch  dedup  off  inherited from rpool
rpool/zones/punch@20100227  dedup  -  -
rpool/zones/punch/ROOT  dedup  off  inherited from rpool
rpool/zones/punch/ROOT@20100227  dedup  -  -
rpool/zones/punch/ROOT/zbe  dedup  off  inherited from rpool
rpool/zones/punch/ROOT/zbe-1  dedup  off  inherited from rpool
rpool/zones/punch/ROOT/zbe-1@20100227  dedup  -  -
rpool/zones/punch/ROOT/zbe-1@2010-03-11-19:18:59  dedup  -  -
rpool/zones/webzone  dedup  off  inherited from rpool
rpool/zones/webzone@20100227  dedup  -  -
rpool/zones/webzone/ROOT  dedup  off  inherited from rpool
rpool/zones/webzone/ROOT@20100227  dedup  -  -
rpool/zones/webzone/ROOT/zbe-2  dedup  off  inherited from rpool
rpool/zones/webzone/ROOT/zbe-2@20100227  dedup  -  -
rpool/zones/webzone/ROOT/zbe-3  dedup  off  inherited from rpool
rpool/zones/webzone/ROOT/zbe-3@20100227  dedup  -  -
rpool/zones/webzone/ROOT/zbe-4  dedup  off  inherited from rpool
rpool/zones/webzone/ROOT/zbe-5  dedup  off  inherited from rpool
rpool/zones/webzone/ROOT/zbe-5@2010-02-07-11:19:07  dedup  -  -
rpool/zones/webzone/ROOT/zbe-5@2010-02-20-15:59:24  dedup  -  -
rpool/zones/webzone/ROOT/zbe-5@20100227  dedup  -  -
rpool/zones/webzone/ROOT/zbe-5@2010-03-11-19:18:53  dedup  -  -
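A listing this long is tedious to check by eye. The following is only a minimal sketch, assuming the text printed by `zfs get -r dedup rpool` has been saved into a string; the function name `datasets_with_dedup_enabled` is hypothetical and not part of any ZFS tool. Snapshot rows, which report `-`, are skipped.

```python
def datasets_with_dedup_enabled(zfs_get_output):
    """Return names of datasets whose dedup VALUE is neither 'off' nor '-'.

    Expects the text printed by `zfs get -r dedup <pool>`: a header line
    followed by rows of NAME PROPERTY VALUE SOURCE.
    """
    enabled = []
    for line in zfs_get_output.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and rows for other properties.
        if len(fields) < 3 or fields[0] == "NAME" or fields[1] != "dedup":
            continue
        name, value = fields[0], fields[2]
        # Snapshots report "-" because the property does not apply to them.
        if value not in ("off", "-"):
            enabled.append(name)
    return enabled


# Illustrative sample in the same layout as the listing above.
sample = """\
NAME            PROPERTY  VALUE  SOURCE
rpool           dedup     off    local
rpool@20100227  dedup     -      -
rpool/ROOT      dedup     off    inherited from rpool
"""
print(datasets_with_dedup_enabled(sample))  # prints []
```

An empty list means no filesystem in the listing still has dedup enabled, which matches what the output above shows.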
Someone correct me if I'm wrong, but it could just be a coincidence. That is, perhaps the data that you copied happens to lead to the same dedup ratio as the data that's already on there. You could test this out by copying a few gigabytes of data you know is unique (like maybe a DVD video file or something), and that should change the dedup ratio.

--
This message posted from opensolaris.org
On 16 mrt 2010, at 19:48, valrhona at gmail.com wrote:

> Someone correct me if I'm wrong, but it could just be a coincidence.
> That is, perhaps the data that you copied happens to lead to the same
> dedup ratio as the data that's already on there. You could test this
> out by copying a few gigabytes of data you know is unique (like maybe
> a DVD video file or something), and that should change the dedup ratio.

The first copy of that data was unique, and dedup is even switched off for the entire pool, so it seems to be a bug in the calculation of the dedupratio, or it uses a method that gives unexpected results.

Paul
On 3/17/10 1:21 AM, Paul van der Zwan wrote:

> On 16 mrt 2010, at 19:48, valrhona at gmail.com wrote:
>
>> Someone correct me if I'm wrong, but it could just be a coincidence.
>> That is, perhaps the data that you copied happens to lead to the same
>> dedup ratio as the data that's already on there. You could test this
>> out by copying a few gigabytes of data you know is unique (like maybe
>> a DVD video file or something), and that should change the dedup ratio.
>
> The first copy of that data was unique, and dedup is even switched off
> for the entire pool, so it seems to be a bug in the calculation of the
> dedupratio, or it uses a method that gives unexpected results.
>
> Paul

beadm list -a
and/or other snapshots that were taken before turning off dedup?
On 17 mrt 2010, at 10:56, zfs ml wrote:

> On 3/17/10 1:21 AM, Paul van der Zwan wrote:
>>
>> On 16 mrt 2010, at 19:48, valrhona at gmail.com wrote:
>>
>>> Someone correct me if I'm wrong, but it could just be a coincidence.
>>> That is, perhaps the data that you copied happens to lead to the same
>>> dedup ratio as the data that's already on there. You could test this
>>> out by copying a few gigabytes of data you know is unique (like maybe
>>> a DVD video file or something), and that should change the dedup ratio.
>>
>> The first copy of that data was unique, and dedup is even switched off
>> for the entire pool, so it seems to be a bug in the calculation of the
>> dedupratio, or it uses a method that gives unexpected results.
>>
>> Paul
>
> beadm list -a
> and/or other snapshots that were taken before turning off dedup?

Possibly, but that should not matter. If I triple the amount of data in the pool with dedup switched off, the dedupratio should IMHO change, because the amount of non-deduped data has changed.

Paul
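To put a number on that expectation, here is a rough sketch. It assumes, hypothetically, that dedupratio were computed over the whole pool as logical size divided by allocated size, and that all of the new data (225G - 132G = 93G in the `zpool list` output from the first message) is unique. The function name is made up for illustration.

```python
def whole_pool_ratio_after_copy(alloc_before, ratio_before, alloc_after):
    """Ratio expected IF dedupratio covered the whole pool (an assumption).

    logical = allocated * ratio; unique new data adds equally to both.
    """
    logical_before = alloc_before * ratio_before
    added = alloc_after - alloc_before  # new data, assumed entirely unique
    return (logical_before + added) / alloc_after


# ALLOC values in GB from the two `zpool list` runs in the first message.
expected = whole_pool_ratio_after_copy(132, 1.68, 225)
print(f"{expected:.2f}x")  # prints 1.40x
```

Under that assumption the ratio would have dropped from 1.68x to about 1.40x, so a figure that stays at exactly 1.68x suggests the calculation does not cover the whole pool.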
Hello,

On 17 mar 2010, at 16.22, Paul van der Zwan <Paul.Vanderzwan at Sun.COM> wrote:

> On 16 mrt 2010, at 19:48, valrhona at gmail.com wrote:
>
>> Someone correct me if I'm wrong, but it could just be a coincidence.
>> That is, perhaps the data that you copied happens to lead to the same
>> dedup ratio as the data that's already on there. You could test this
>> out by copying a few gigabytes of data you know is unique (like maybe
>> a DVD video file or something), and that should change the dedup ratio.
>
> The first copy of that data was unique, and dedup is even switched off
> for the entire pool, so it seems to be a bug in the calculation of the
> dedupratio, or it uses a method that gives unexpected results.

I wonder if the dedup ratio is calculated from the contents of the DDT or from all the data in the whole pool; I've only looked at the ratio for datasets which had dedup on for their whole lifetime. If the former, data added while it's switched off will never alter the ratio (until rewritten with dedup on). The source should have the answer, but I'm on mail only for a few weeks.

It's probably for the whole dataset, that makes the most sense; just a thought.

Regards

Henrik
http://sparcv9.blogspot.com
On 18 mrt 2010, at 10:07, Henrik Johansson wrote:

> Hello,
>
> On 17 mar 2010, at 16.22, Paul van der Zwan <Paul.Vanderzwan at Sun.COM> wrote:
>
>> On 16 mrt 2010, at 19:48, valrhona at gmail.com wrote:
>>
>>> Someone correct me if I'm wrong, but it could just be a coincidence.
>>> That is, perhaps the data that you copied happens to lead to the same
>>> dedup ratio as the data that's already on there. You could test this
>>> out by copying a few gigabytes of data you know is unique (like maybe
>>> a DVD video file or something), and that should change the dedup ratio.
>>
>> The first copy of that data was unique, and dedup is even switched off
>> for the entire pool, so it seems to be a bug in the calculation of the
>> dedupratio, or it uses a method that gives unexpected results.
>
> I wonder if the dedup ratio is calculated from the contents of the DDT
> or from all the data in the whole pool; I've only looked at the ratio
> for datasets which had dedup on for their whole lifetime. If the former,
> data added while it's switched off will never alter the ratio (until
> rewritten with dedup on). The source should have the answer, but I'm on
> mail only for a few weeks.
>
> It's probably for the whole dataset, that makes the most sense; just a
> thought.

It looks like the ratio only gets updated while dedup is switched on, and freezes if you switch dedup off for the entire pool, like I did. I tried to have a look at the source, but it was way too complex to figure out in the time I had available so far.

Best regards,
Paul van der Zwan
Sun Microsystems Nederland

> Regards
>
> Henrik
> http://sparcv9.blogspot.com
I remembered reading a post about this a couple of months back. This post by Jeff Bonwick confirms that the dedupratio is calculated only on the data that you've attempted to deduplicate, i.e. only the data written whilst dedup is turned on: http://mail.opensolaris.org/pipermail/zfs-discuss/2009-December/034721.html

Regards,
Craig
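The behaviour described there can be illustrated with a toy model. This is only a sketch under stated assumptions, not the actual ZFS code: the class `ToyPool` and its methods are hypothetical, blocks are counted rather than sized, and SHA-256 stands in for whatever checksum keys the dedup table.

```python
import hashlib


class ToyPool:
    """Toy model: only blocks written with dedup on enter the dedup table."""

    def __init__(self):
        self.ddt = {}          # checksum -> reference count
        self.plain_blocks = 0  # blocks written while dedup was off

    def write(self, block, dedup):
        if dedup:
            key = hashlib.sha256(block).hexdigest()
            self.ddt[key] = self.ddt.get(key, 0) + 1
        else:
            self.plain_blocks += 1  # bypasses the table entirely

    def dedupratio(self):
        # Referenced blocks divided by unique blocks, table entries only.
        if not self.ddt:
            return 1.0
        return sum(self.ddt.values()) / len(self.ddt)


pool = ToyPool()
# Five writes with dedup on, three of them unique.
for block in (b"a", b"a", b"b", b"b", b"c"):
    pool.write(block, dedup=True)
before = pool.dedupratio()

# A large amount of unique data written with dedup off.
for i in range(1000):
    pool.write(str(i).encode(), dedup=False)
after = pool.dedupratio()

print(before == after)  # prints True
```

In this model the ratio is fixed by the table contents alone, so writes made with dedup off leave it untouched no matter how much data they add, which is consistent with the ratio staying at 1.68x after the copy.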
On 18 mar 2010, at 18.38, Craig Alder <craig.alder at sun.com> wrote:

> I remembered reading a post about this a couple of months back.
> This post by Jeff Bonwick confirms that the dedupratio is calculated
> only on the data that you've attempted to deduplicate, i.e. only the
> data written whilst dedup is turned on:
> http://mail.opensolaris.org/pipermail/zfs-discuss/2009-December/034721.html

Ah, I was on the right track with the DDT then :) I guess most people have it turned on/off from the beginning until BP rewrite, to ensure everything is deduplicated (which is probably a good idea).

Regards

Henrik
http://sparcv9.blogspot.com
As noted, the ratio calculation applies to the data you attempted to dedup, not the whole pool. However, I saw a commit go by just in the last couple of days about the dedupratio calculation being misleading, though I didn't check the details. Presumably this will be reported differently from the next builds.

--
Dan.