I'm running zpool version 23 (via ZFS-FUSE on Linux) and have a zpool
with deduplication turned on.  I am testing how well deduplication
works for storing many similar ISO files, and so far I'm seeing
unexpected results (or perhaps my expectations are wrong).

The ISOs I'm testing with are the 32-bit and 64-bit versions of the
RHEL5 DVD ISOs.  While the two have their differences, they also
contain a lot of similar data.

If I explode both ISO files and copy their contents to my ZFS
filesystem, I see about a 1.24x dedup ratio.  However, if I store only
the ISO files themselves on the ZFS filesystem, the ratio is 1.00x --
no savings at all.

Does this make sense?  I'm going to experiment with other combinations
of ISO files as well...

Thanks,
Ray
On Fri, Jun 4, 2010 at 9:30 AM, Ray Van Dolson <rvandolson at esri.com> wrote:
> The ISOs I'm testing with are the 32-bit and 64-bit versions of the
> RHEL5 DVD ISOs.  While the two have their differences, they also
> contain a lot of similar data.

Similar != identical.

Dedup works on blocks in ZFS, so unless the ISO files have identical
data aligned at 128k boundaries you won't see any savings.

> If I explode both ISO files and copy their contents to my ZFS
> filesystem, I see about a 1.24x dedup ratio.

Each file starts a new block, so identical files in the two exploded
trees can be deduped against each other.

-B

--
Brandon High : bhigh at freaks.com
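As a quick way to see where a given setup stands, something along these
lines should work (a minimal sketch; the pool name "tank" and the
dataset "tank/isos" are placeholders for your own):

  # Show the block size the dataset writes with and whether dedup is on.
  zfs get recordsize,dedup tank/isos

  # Show the deduplication ratio the pool is actually achieving.
  zpool get dedupratio tank

Note that dedupratio is reported pool-wide, so it reflects every
dedup-enabled dataset in the pool, not just the one being tested.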
On Fri, Jun 04, 2010 at 11:16:40AM -0700, Brandon High wrote:
> On Fri, Jun 4, 2010 at 9:30 AM, Ray Van Dolson <rvandolson at esri.com> wrote:
> > The ISOs I'm testing with are the 32-bit and 64-bit versions of the
> > RHEL5 DVD ISOs.  While the two have their differences, they also
> > contain a lot of similar data.
>
> Similar != identical.
>
> Dedup works on blocks in ZFS, so unless the ISO files have identical
> data aligned at 128k boundaries you won't see any savings.
>
> > If I explode both ISO files and copy their contents to my ZFS
> > filesystem, I see about a 1.24x dedup ratio.
>
> Each file starts a new block, so identical files in the two exploded
> trees can be deduped against each other.
>
> -B

Makes sense.  So, as someone else suggested, decreasing my block size
may improve the deduplication ratio.

recordsize, I presume, is the value to tweak?

Thanks,
Ray
> Makes sense.  So, as someone else suggested, decreasing my block size
> may improve the deduplication ratio.
>
> recordsize, I presume, is the value to tweak?

It is, but keep in mind that ZFS needs about 150 bytes of metadata for
each block.  1TB stored in 128k blocks needs about 1GB of memory for
the dedup index to stay in RAM; 64k blocks need double that, and so
on.  An L2ARC will help a lot if memory is low.

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy at karlsbakk.net
http://blogg.karlsbakk.net/
--
I all pedagogikk er det essensielt at pensum presenteres intelligibelt.
Det er et elementært imperativ for alle pedagoger å unngå eksessiv
anvendelse av idiomer med fremmed opprinnelse. I de fleste tilfeller
eksisterer adekvate og relevante synonymer på norsk.
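Roy's rule of thumb can be reproduced with quick shell arithmetic; the
150 bytes per DDT entry is his estimate here, not an exact figure:

  # 1 TiB of unique data in 128k blocks, at ~150 bytes of DDT per block
  echo $(( (1024**4 / (128 * 1024)) * 150 ))
  # 1258291200 bytes, i.e. roughly 1.2 GiB; halving the block size doubles it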
On Fri, Jun 04, 2010 at 12:37:01PM -0700, Ray Van Dolson wrote:
> On Fri, Jun 04, 2010 at 11:16:40AM -0700, Brandon High wrote:
> > On Fri, Jun 4, 2010 at 9:30 AM, Ray Van Dolson <rvandolson at esri.com> wrote:
> > > The ISOs I'm testing with are the 32-bit and 64-bit versions of the
> > > RHEL5 DVD ISOs.  While the two have their differences, they also
> > > contain a lot of similar data.
> >
> > Similar != identical.
> >
> > Dedup works on blocks in ZFS, so unless the ISO files have identical
> > data aligned at 128k boundaries you won't see any savings.
> >
> > > If I explode both ISO files and copy their contents to my ZFS
> > > filesystem, I see about a 1.24x dedup ratio.
> >
> > Each file starts a new block, so identical files in the two exploded
> > trees can be deduped against each other.
> >
> > -B
>
> Makes sense.  So, as someone else suggested, decreasing my block size
> may improve the deduplication ratio.
>
> recordsize, I presume, is the value to tweak?

Yes, but I'd not expect that much commonality between 32-bit and
64-bit Linux ISOs...  Do the same check again with the ISOs
"exploded", as you say.

Nico
--
On Fri, Jun 4, 2010 at 12:37 PM, Ray Van Dolson <rvandolson at esri.com> wrote:
> Makes sense.  So, as someone else suggested, decreasing my block size
> may improve the deduplication ratio.

It might.  It might make your performance tank, too.

Decreasing the block size increases the size of the dedup table (DDT).
Every entry in the DDT uses somewhere around 250-270 bytes.  If the DDT
gets too large to fit in memory, it will have to be read from disk,
which will destroy any sort of write performance (although an L2ARC on
SSD can help).

If you move to 64k blocks, you'll double the DDT size and may not
actually increase your ratio.  Moving to 8k blocks will increase your
DDT by a factor of 16, and still may not help.

Changing the recordsize will not affect files that are already in the
dataset.  You'll have to recopy them to rewrite them with the smaller
block size.

-B

--
Brandon High : bhigh at freaks.com
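A minimal sketch of what that recopy step looks like in practice; the
dataset and file names are placeholders, and the point is only that the
property change applies to blocks written after it is set:

  # Smaller blocks apply only to data written after the change.
  zfs set recordsize=64k tank/isos

  # Files that already exist keep their old block size until rewritten.
  cp /tank/isos/rhel5-x86_64.iso /tank/isos/rhel5-x86_64.iso.new
  mv /tank/isos/rhel5-x86_64.iso.new /tank/isos/rhel5-x86_64.iso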
On Fri, Jun 04, 2010 at 01:03:32PM -0700, Brandon High wrote:
> On Fri, Jun 4, 2010 at 12:37 PM, Ray Van Dolson <rvandolson at esri.com> wrote:
> > Makes sense.  So, as someone else suggested, decreasing my block size
> > may improve the deduplication ratio.
>
> It might.  It might make your performance tank, too.
>
> Decreasing the block size increases the size of the dedup table (DDT).
> Every entry in the DDT uses somewhere around 250-270 bytes.  If the DDT
> gets too large to fit in memory, it will have to be read from disk,
> which will destroy any sort of write performance (although an L2ARC on
> SSD can help).
>
> If you move to 64k blocks, you'll double the DDT size and may not
> actually increase your ratio.  Moving to 8k blocks will increase your
> DDT by a factor of 16, and still may not help.
>
> Changing the recordsize will not affect files that are already in the
> dataset.  You'll have to recopy them to rewrite them with the smaller
> block size.
>
> -B

Gotcha.  Just trying to make sure I understand how all this works, and
whether I _would_ in fact see an improvement in dedup ratio by tweaking
the recordsize with our data set.

Once we know that, we can decide if it's worth the extra cost in
RAM/L2ARC.

Thanks all.

Ray
On 05.06.10 00:10, Ray Van Dolson wrote:
> On Fri, Jun 04, 2010 at 01:03:32PM -0700, Brandon High wrote:
>> On Fri, Jun 4, 2010 at 12:37 PM, Ray Van Dolson <rvandolson at esri.com> wrote:
>>> Makes sense.  So, as someone else suggested, decreasing my block size
>>> may improve the deduplication ratio.
>> It might.  It might make your performance tank, too.
>>
>> Decreasing the block size increases the size of the dedup table (DDT).
>> Every entry in the DDT uses somewhere around 250-270 bytes.  If the DDT
>> gets too large to fit in memory, it will have to be read from disk,
>> which will destroy any sort of write performance (although an L2ARC on
>> SSD can help).
>>
>> If you move to 64k blocks, you'll double the DDT size and may not
>> actually increase your ratio.  Moving to 8k blocks will increase your
>> DDT by a factor of 16, and still may not help.
>>
>> Changing the recordsize will not affect files that are already in the
>> dataset.  You'll have to recopy them to rewrite them with the smaller
>> block size.
>>
>> -B
>
> Gotcha.  Just trying to make sure I understand how all this works, and
> whether I _would_ in fact see an improvement in dedup ratio by tweaking
> the recordsize with our data set.
>

You can use zdb -S to assess how effective deduplication would be
without actually turning it on for your pool.

regards
victor
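For reference, the simulation Victor mentions looks like this (assuming
the pool is named tank; it walks the pool's data, so it can take a
while and use a fair amount of memory on a large pool):

  # Print a simulated DDT histogram and the dedup ratio the pool
  # would get if dedup were enabled, without changing anything.
  zdb -S tank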
----- "Brandon High" <bhigh at freaks.com> skrev:
> Decreasing the block size increases the size of the dedup table
> (DDT).
> Every entry in the DDT uses somewhere around 250-270 bytes.

Are you sure it's that high?  I was told it's ~150 bytes per block, or
~1.2GB per terabyte of storage with only 128k blocks.

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy at karlsbakk.net
http://blogg.karlsbakk.net/
--
I all pedagogikk er det essensielt at pensum presenteres intelligibelt.
Det er et elementært imperativ for alle pedagoger å unngå eksessiv
anvendelse av idiomer med fremmed opprinnelse. I de fleste tilfeller
eksisterer adekvate og relevante synonymer på norsk.
On Sun, Jun 6, 2010 at 3:26 AM, Roy Sigurd Karlsbakk <roy at karlsbakk.net> wrote:
> ----- "Brandon High" <bhigh at freaks.com> skrev:
>> Decreasing the block size increases the size of the dedup table
>> (DDT).
>> Every entry in the DDT uses somewhere around 250-270 bytes.
>
> Are you sure it's that high?  I was told it's ~150 bytes per block, or
> ~1.2GB per terabyte of storage with only 128k blocks.

No, but that's the number that stuck in my head.

Even if the DDT entry is smaller, the point I was making doesn't
change.  A smaller record size will improve the dedup ratio at the
cost of a larger DDT.  Once the DDT is too large to fit in the ARC,
performance tanks.

-B

--
Brandon High : bhigh at freaks.com
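If the plan is to lean on an L2ARC as discussed above, adding a cache
device is a one-liner; a sketch, assuming a pool named tank, a
platform that supports cache vdevs, and a Solaris-style device name
(on zfs-fuse/Linux the device path would look more like /dev/sdX):

  # Attach an SSD as an L2ARC cache device so a DDT that outgrows RAM
  # can be read from flash instead of spinning disk.
  zpool add tank cache c1t2d0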
----- "Brandon High" <bhigh at freaks.com> skrev:
> On Sun, Jun 6, 2010 at 3:26 AM, Roy Sigurd Karlsbakk
> <roy at karlsbakk.net> wrote:
> > ----- "Brandon High" <bhigh at freaks.com> skrev:
> >> Decreasing the block size increases the size of the dedup table
> >> (DDT).
> >> Every entry in the DDT uses somewhere around 250-270 bytes.
> >
> > Are you sure it's that high?  I was told it's ~150 bytes per block,
> > or ~1.2GB per terabyte of storage with only 128k blocks.
>
> No, but that's the number that stuck in my head.
>
> Even if the DDT entry is smaller, the point I was making doesn't
> change.  A smaller record size will improve the dedup ratio at the
> cost of a larger DDT.  Once the DDT is too large to fit in the ARC,
> performance tanks.

I cannot but agree with that - if you want to use dedup, (a) wait for
the next release (since there are several rather bad bugs in build 134)
and (b) get a truckload of SSDs for L2ARC to make the system usable.

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy at karlsbakk.net
http://blogg.karlsbakk.net/
--
I all pedagogikk er det essensielt at pensum presenteres intelligibelt.
Det er et elementært imperativ for alle pedagoger å unngå eksessiv
anvendelse av idiomer med fremmed opprinnelse. I de fleste tilfeller
eksisterer adekvate og relevante synonymer på norsk.
On Sun, Jun 6, 2010 at 10:46 AM, Brandon High <bhigh at freaks.com> wrote:
> No, but that's the number that stuck in my head.

Here's a reference from Richard Elling
(http://mail.opensolaris.org/pipermail/zfs-discuss/2010-March/038018.html):

"Around 270 bytes, or one 512 byte sector."

-B

--
Brandon High : bhigh at freaks.com
----- "Brandon High" <bhigh at freaks.com> skrev:
> On Sun, Jun 6, 2010 at 10:46 AM, Brandon High <bhigh at freaks.com> wrote:
> > No, but that's the number that stuck in my head.
>
> Here's a reference from Richard Elling
> (http://mail.opensolaris.org/pipermail/zfs-discuss/2010-March/038018.html):
> "Around 270 bytes, or one 512 byte sector."

I guess this means the rule of thumb needs to change from 1GiB of RAM
per 1TiB deduplicated to 2GiB per 1TiB, and way more with smaller
blocks...

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy at karlsbakk.net
http://blogg.karlsbakk.net/
--
I all pedagogikk er det essensielt at pensum presenteres intelligibelt.
Det er et elementært imperativ for alle pedagoger å unngå eksessiv
anvendelse av idiomer med fremmed opprinnelse. I de fleste tilfeller
eksisterer adekvate og relevante synonymer på norsk.
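The revised figure works out the same way as the earlier estimate; a
quick check with shell arithmetic (270 bytes per entry is the figure
quoted above, not a guarantee):

  # 1 TiB of unique data in 128k blocks, at ~270 bytes of DDT per block
  echo $(( (1024**4 / (128 * 1024)) * 270 ))
  # 2264924160 bytes, about 2.1 GiB per TiB at 128k, and 16x that at 8k blocks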
----- "Ray Van Dolson" <rvandolson at esri.com> skrev:
> FYI;
>
> With a 4K recordsize, I am seeing a 1.26x dedup ratio between the
> RHEL 5.4 ISO and the RHEL 5.5 ISO file.
>
> However, it took about 33 minutes to copy the 2.9GB ISO file onto the
> filesystem. :)  Definitely would need more RAM in this setup...
>
> Ray

With a 4k recordsize you won't have enough memory slots for all the
RAM you'd need.  Grab a few X25-Ms or something to do the buffering.

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy at karlsbakk.net
http://blogg.karlsbakk.net/
--
I all pedagogikk er det essensielt at pensum presenteres intelligibelt.
Det er et elementært imperativ for alle pedagoger å unngå eksessiv
anvendelse av idiomer med fremmed opprinnelse. I de fleste tilfeller
eksisterer adekvate og relevante synonymer på norsk.
On Fri, Jun 04, 2010 at 01:10:44PM -0700, Ray Van Dolson wrote:
> On Fri, Jun 04, 2010 at 01:03:32PM -0700, Brandon High wrote:
> > On Fri, Jun 4, 2010 at 12:37 PM, Ray Van Dolson <rvandolson at esri.com> wrote:
> > > Makes sense.  So, as someone else suggested, decreasing my block size
> > > may improve the deduplication ratio.
> >
> > It might.  It might make your performance tank, too.
> >
> > Decreasing the block size increases the size of the dedup table (DDT).
> > Every entry in the DDT uses somewhere around 250-270 bytes.  If the DDT
> > gets too large to fit in memory, it will have to be read from disk,
> > which will destroy any sort of write performance (although an L2ARC on
> > SSD can help).
> >
> > If you move to 64k blocks, you'll double the DDT size and may not
> > actually increase your ratio.  Moving to 8k blocks will increase your
> > DDT by a factor of 16, and still may not help.
> >
> > Changing the recordsize will not affect files that are already in the
> > dataset.  You'll have to recopy them to rewrite them with the smaller
> > block size.
> >
> > -B
>
> Gotcha.  Just trying to make sure I understand how all this works, and
> whether I _would_ in fact see an improvement in dedup ratio by tweaking
> the recordsize with our data set.
>
> Once we know that, we can decide if it's worth the extra cost in
> RAM/L2ARC.
>
> Thanks all.

FYI;

With a 4K recordsize, I am seeing a 1.26x dedup ratio between the
RHEL 5.4 ISO and the RHEL 5.5 ISO file.

However, it took about 33 minutes to copy the 2.9GB ISO file onto the
filesystem. :)  Definitely would need more RAM in this setup...

Ray
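For anyone wanting to repeat this kind of test, a minimal sketch of the
workflow; the pool, dataset, and ISO file names are illustrative only:

  # A dedicated dataset with a small recordsize and dedup enabled.
  zfs create -o recordsize=4k -o dedup=on tank/isotest

  # Copy the two similar ISOs in and check the pool-wide dedup ratio.
  cp rhel-server-5.4-x86_64-dvd.iso /tank/isotest/
  cp rhel-server-5.5-x86_64-dvd.iso /tank/isotest/
  zpool get dedupratio tank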