Is it inappropriate to post a relevant news item to this list? Please
correct me off-list if so.

http://blogs.sun.com/bonwick/en_US/entry/zfs_dedup
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi!

> Is it inappropriate to post a relevant news item to this list? Please
> correct me off-list if so.
>
> http://blogs.sun.com/bonwick/en_US/entry/zfs_dedup

In any case, that's a nice and interesting feature, thanks! :)

With best regards from the Soul,
Alex.
On Tue, 3 Nov 2009 07:51:52 am Alex Dedul wrote:

> In any case, that's a nice and interesting feature, thanks! :)

My concern would be that it increases the impact of a corruption of the block
that has been de-dup'd - in other words, if the block that now represents the
same data in lots of files gets trashed, then all those files have corrupt
data. This implies that you would want to keep around at least one other copy
of that data to be resilient in the face of corruption (checksums will let you
detect it, but not necessarily recover from it without duplicate copies).

Given that ZFS prides itself on detecting errors, it would be strange if they
hadn't considered this in the implementation, but I couldn't see any mention
of it.

cheers,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC

This email may come with a PGP signature as a file. Do not panic.
For more info see: http://en.wikipedia.org/wiki/OpenPGP
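[Editorial note: the failure mode Chris describes can be sketched with a toy
content-addressed block store. This is a hypothetical illustration, not ZFS
internals: many files reference one stored block, so trashing that single
block corrupts every file pointing at it, and the checksum can only detect
the damage, not repair it.]

```python
import hashlib

class DedupStore:
    """Toy dedup store (illustration only, not ZFS internals)."""

    def __init__(self):
        self.blocks = {}   # checksum -> block data (each unique block stored once)
        self.files = {}    # filename -> list of block checksums

    def write(self, name, data):
        key = hashlib.sha256(data).hexdigest()
        self.blocks.setdefault(key, data)          # shared by every later writer
        self.files.setdefault(name, []).append(key)

    def read(self, name):
        out = b""
        for key in self.files[name]:
            data = self.blocks[key]
            # The checksum detects corruption, but with only one copy
            # on "disk" there is nothing to recover from.
            if hashlib.sha256(data).hexdigest() != key:
                raise IOError("%s: block %s failed checksum" % (name, key[:8]))
            out += data
        return out

store = DedupStore()
for n in ("a.txt", "b.txt", "c.txt"):
    store.write(n, b"shared payload")   # all three files share one block

# Simulate on-disk corruption of the single shared block:
key = next(iter(store.blocks))
store.blocks[key] = b"trashed!!!"
# Every file that referenced the block now fails its checksum on read.
```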
I don't think ZFS has any goals of being resilient to errors, as far as
recovering from them without backups / redundant disks. Errors would still be
detected, though more data, of course, could potentially be unavailable.

On Nov 2, 2009, at 6:04 PM, Chris Samuel wrote:

> On Tue, 3 Nov 2009 07:51:52 am Alex Dedul wrote:
>
>> In any case, that's a nice and interesting feature, thanks! :)
>
> My concern would be that it increases the impact of a corruption of the block
> that has been de-dup'd - in other words, if the block that now represents the
> same data in lots of files gets trashed, then all those files have corrupt
> data. This implies that you would want to keep around at least one other copy
> of that data to be resilient in the face of corruption (checksums will let you
> detect it, but not necessarily recover from it without duplicate copies).
>
> Given that ZFS prides itself on detecting errors, it would be strange if they
> hadn't considered this in the implementation, but I couldn't see any mention
> of it.
>
> cheers,
> Chris
On Mon, Nov 2, 2009 at 3:04 PM, Chris Samuel <chris@csamuel.org> wrote:

> On Tue, 3 Nov 2009 07:51:52 am Alex Dedul wrote:
>
>> In any case, that's a nice and interesting feature, thanks! :)
>
> My concern would be that it increases the impact of a corruption of the block
> that has been de-dup'd - in other words, if the block that now represents the
> same data in lots of files gets trashed, then all those files have corrupt
> data. This implies that you would want to keep around at least one other copy
> of that data to be resilient in the face of corruption (checksums will let you
> detect it, but not necessarily recover from it without duplicate copies).
>
> Given that ZFS prides itself on detecting errors, it would be strange if they
> hadn't considered this in the implementation, but I couldn't see any mention
> of it.

In the PSARC description and mailing list thread, there's mention of a
setting for how many references before another copy of the block is kept. The
default was going to be 100. So after 100 references to the same block, a
second copy of the block would be stored on disk. After 200 references, a
third copy would be stored. And so on. There was some confusion, though, over
whether the 100 would be the minimum or the maximum.

Either way, the admin sets the policy for how many extra copies to keep
around. And it would play nicely with the "copies=X" setting as well (i.e. if
copies=2, then you'd start with 2 copies of each deduped block, and after 100
references you'd store 2 more copies on disk, and so on).

--
Freddie Cash
fjwcash@gmail.com
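[Editorial note: on the reading where a new set of copies is added each time
the reference count crosses a multiple of the threshold, the policy Freddie
describes could be sketched as below. The threshold of 100 and the
interaction with copies=X come from the post; the function name and exact
formula are hypothetical, since the thread itself notes the min/max ambiguity.]

```python
def copies_needed(refcount, base_copies=1, threshold=100):
    """Toy model of the policy described above: start with `base_copies`
    of each deduped block, and store `base_copies` more each time the
    reference count passes another multiple of `threshold`."""
    extra_sets = refcount // threshold
    return base_copies * (1 + extra_sets)

# 100 references -> a second copy; 200 -> a third; and so on.
print(copies_needed(99))                   # 1
print(copies_needed(100))                  # 2
print(copies_needed(200))                  # 3
# With copies=2, each threshold crossing adds two more copies:
print(copies_needed(100, base_copies=2))   # 4
```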