Folks,

Let's say I have a volume being shared over iSCSI, and dedup has been turned on.

Let's say I copy the same file twice under different names at the initiator end, and each file ends up taking 5 blocks.

For dedup to work, each block of one file must match the corresponding block of the other file. Essentially, each pair of blocks being compared must start at the same offset into the actual data. For a shared filesystem, ZFS may internally ensure that the block starts match. However, over iSCSI the initiator does not even know about ZFS's block mechanism; it is just sending raw bytes to the target. This makes me wonder whether dedup actually works over iSCSI.

Can someone please enlighten me on what I am missing?

Thank you in advance for your help.

Regards,
Peter
On 10/22/10 15:34, Peter Taps wrote:
> For dedup to work, each block of one file must match the corresponding
> block of the other file. Essentially, each pair of blocks being compared
> must start at the same offset into the actual data.

No, ZFS doesn't care about the file offset, just that the checksum of the blocks matches.
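To make that concrete, here is a minimal Python sketch of the idea (purely illustrative, not ZFS code; BLOCK_SIZE, split_blocks, and dedup_write are made-up names): the dedup table is keyed only by each block's checksum, so which file a block belongs to, or at what offset, never enters the comparison.

    import hashlib
    import os

    BLOCK_SIZE = 128 * 1024   # hypothetical block size; real recordsize/volblocksize varies

    def split_blocks(data, block_size=BLOCK_SIZE):
        """Carve a byte string into fixed-size blocks, mimicking ZFS's block grid."""
        return [data[i:i + block_size] for i in range(0, len(data), block_size)]

    def dedup_write(data, dedup_table):
        """Write data block by block; a block whose checksum is already known is not stored again."""
        newly_written = 0
        for block in split_blocks(data):
            key = hashlib.sha256(block).digest()   # dedup table is keyed by checksum only
            if key not in dedup_table:
                dedup_table[key] = block           # new block: allocate it
                newly_written += 1
            # duplicate block: just reference the existing copy; no file or offset is involved
        return newly_written

    table = {}
    file_a = os.urandom(5 * BLOCK_SIZE)            # a 5-block file
    file_b = file_a                                # the same file copied under another name
    print(dedup_write(file_a, table))              # 5 -- all blocks are new
    print(dedup_write(file_b, table))              # 0 -- every block deduped against the first copy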
Hi Neil,

If the file offsets do not match, the chance that the checksums would match, especially with sha256, is almost zero.

Maybe I am missing something. Let's say I have a file that contains 11 letters, ABCDEFGHIJK, and the block size is 5.

For the first file, the block contents are "ABCDE", "FGHIJ", and "K".

For the second copy, let's say the data is shifted by one byte, so the blocks are " ABCD", "EFGHI", and "JK".

The chance that any checksum would match is very low. The chance that any "checksum+verify" would match is even lower.

Regards,
Peter
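Peter's example can be reproduced with a few lines of Python (again just an illustration with made-up helper names): once the second copy is shifted by one byte relative to the block grid, none of the per-block checksums line up.

    import hashlib

    def block_hashes(data, block_size=5):
        """Split data into fixed-size blocks and return a short hash of each one."""
        return [hashlib.sha256(data[i:i + block_size]).hexdigest()[:8]
                for i in range(0, len(data), block_size)]

    aligned = b"ABCDEFGHIJK"    # first copy:  ABCDE | FGHIJ | K
    shifted = b" ABCDEFGHIJK"   # second copy, one byte later:  ABCD | EFGHI | JK

    print(block_hashes(aligned))
    print(block_hashes(shifted))
    # The two lists share no hashes, so block-level dedup finds nothing to share --
    # which is exactly the failure mode Peter is describing for misaligned copies.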
On 10/22/10 17:28, Peter Taps wrote:
> If the file offsets do not match, the chance that the checksums would
> match, especially with sha256, is almost zero.

The block size and contents have to match for ZFS dedup. See
http://blogs.sun.com/bonwick/entry/zfs_dedup

Neil.
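The Bonwick post also covers the "verify" option Peter mentions: when a checksum hit is found, the candidate block is additionally compared byte for byte before it is shared. A rough sketch of that extra step, with illustrative names of my own (dedup_write_verify is not a real ZFS interface):

    import hashlib

    def dedup_write_verify(block, dedup_table):
        """Dedup one block; on a checksum hit, confirm byte-for-byte before sharing it."""
        key = hashlib.sha256(block).digest()
        existing = dedup_table.get(key)
        if existing is None:
            dedup_table[key] = block      # first block with this checksum: allocate it
            return "written"
        if existing == block:             # checksum hit confirmed by comparing the bytes
            return "deduped"
        return "written (collision)"      # same checksum, different contents: do not share

    table = {}
    print(dedup_write_verify(b"ABCDE" * 100, table))   # written
    print(dedup_write_verify(b"ABCDE" * 100, table))   # deduped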
Neil Perrin wrote:
> No, ZFS doesn't care about the file offset, just that the checksum of
> the blocks matches.

One conclusion is that one should be careful not to mess up file alignment when working with large files (as you might have in virtualization scenarios). That is, if you have a bunch of virtual machine image clones, they'll dedupe quite well initially. However, if you then make seemingly minor changes inside some of those clones (like changing their partition offsets to do 1 MB alignment), you'll lose most or all of the dedupe benefits.

General-purpose compression tends to be less susceptible to changes in data offsets, but it also has its limits based on algorithm and dictionary size. I think dedupe can be viewed as a special case of compression that happens to work quite well for certain workloads when given ample hardware resources (compared to what would be needed to run without dedupe).
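A short sketch of the alignment point (hypothetical Python over an imaginary 8 KB block grid; BLOCK and block_hash_set are just illustrative names): moving identical data by a multiple of the block size leaves the per-block checksums intact, while moving it by anything else changes every block and defeats dedup.

    import hashlib
    import os

    BLOCK = 8 * 1024   # hypothetical block size for the illustration

    def block_hash_set(data):
        """Return the set of per-block checksums for data laid out on the block grid."""
        return {hashlib.sha256(data[i:i + BLOCK]).digest()
                for i in range(0, len(data), BLOCK)}

    image = os.urandom(64 * BLOCK)               # stand-in for a VM image
    shifted_by_block = bytes(BLOCK) + image      # data moved by exactly one block
    shifted_by_512 = os.urandom(512) + image     # data moved by 512 bytes (e.g. a repartition)

    base = block_hash_set(image)
    print(len(base & block_hash_set(shifted_by_block)))   # 64 -- every image block still dedups
    print(len(base & block_hash_set(shifted_by_512)))     # 0  -- nothing lines up any more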