Hello,

I've stumbled upon this article:
http://storagemojo.com/2011/06/27/de-dup-too-much-of-good-thing/

Reportedly, the SandForce SF1200 SSD controller performs block-level data de-duplication internally. This effectively removes the additional protection given by writing multiple metadata copies. The technique may already be used, or may be used in the future, by manufacturers of other drives too.

I would like to ask whether the metadata copies written to a btrfs file system with metadata mirroring enabled are identical, or whether something makes them unique on disk and therefore prevents their de-duplication. I tried googling for the answer, but didn't find anything that answers my question.

If the metadata copies are identical, would it be possible to change this without major disruption? I know that changes to the on-disk format aren't made lightly, but I'd be grateful for any comments.

The increased risk of file system corruption introduced by data de-duplication on SandForce controllers was downplayed in the vendor's reply included in the article. Still, what's the point of duplicating metadata at the file system level if the storage below can remove that redundancy?

Regards,
Paweł
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Saturday 09 of July 2011 08:19:30 Paweł Brodacki wrote:
> Hello,
>
> I've stumbled upon this article:
> http://storagemojo.com/2011/06/27/de-dup-too-much-of-good-thing/
>
> Reportedly Sandforce SF1200 SSD controller does internally block-level
> data de-duplication. This effectively removes the additional
> protection given by writing multiple metadata copies. This technique
> may be used, or can be used in the future by manufacturers of other
> drives too.

This is only a problem in a single-disk installation.

> I would like to ask, if the metadata copies written to a btrfs system
> with enabled metadata mirroring are identical, or is there something
> that makes them unique on-disk, therefore preventing their
> de-duplication. I tried googling for the answer, but didn't net
> anything that would answer my question.

There is a difference between the root inode copies, but I don't think there's any difference between the metadata tree copies. I'm quite certain they are bit-for-bit identical.

> If the metadata copies are identical, I'd like to ask if it would be
> possible to change this without major disruption? I know that changes
> to on-disk format aren't a thing made lightly, but I'd be grateful for
> any comments.

That would be a big change for little to no benefit.

> The increase of the risk of file system corruption introduced by data
> de-duplication on Sandforce controllers was down-played in the
> vendor's reply included in the article, but still, what's the point of
> duplicating metadata on file system level, if storage below can remove
> that redundancy?

You shouldn't depend on a single drive. Metadata RAID is there to protect against single bad blocks, not against a disk crash. If you want redundancy, use multiple disks, either HDD or SSD. And have readable backups.

Regards,
Hubert
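The concern discussed above can be made concrete with a minimal sketch. It assumes a drive that de-duplicates blocks by content hash; the `DedupStore` class, its method names, and the 4096-byte block size are hypothetical and say nothing about how the SF1200 actually works. It only shows that two bit-for-bit identical metadata copies written to different logical addresses end up as one physical block, so damage to that block hits both copies.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


class DedupStore:
    """Toy content-addressed block store (hypothetical, not a real controller)."""

    def __init__(self):
        self.physical = {}  # content digest -> stored block data
        self.mapping = {}   # logical block address -> content digest

    def write(self, lba, data):
        digest = hashlib.sha256(data).digest()
        # Identical content is stored only once.
        self.physical.setdefault(digest, data)
        self.mapping[lba] = digest

    def read(self, lba):
        return self.physical[self.mapping[lba]]

    def corrupt_physical(self, lba):
        """Simulate a bad physical block behind the given logical address."""
        digest = self.mapping[lba]
        block = bytearray(self.physical[digest])
        block[0] ^= 0xFF
        self.physical[digest] = bytes(block)


# Two mirrored metadata copies, bit-for-bit identical, at different addresses.
metadata = b"btrfs tree block".ljust(BLOCK_SIZE, b"\0")
store = DedupStore()
store.write(100, metadata)
store.write(900, metadata)

physical_blocks = len(store.physical)  # how many blocks were really stored

store.corrupt_physical(100)
both_copies_damaged = (store.read(100) != metadata
                       and store.read(900) != metadata)
```

In this model, mirroring on a single device gives no protection against a bad block, which is the point Paweł raises.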
On Sat, 2011-07-09 at 08:19 +0200, Paweł Brodacki wrote:
> Hello,
>
> I've stumbled upon this article:
> http://storagemojo.com/2011/06/27/de-dup-too-much-of-good-thing/
>
> Reportedly Sandforce SF1200 SSD controller does internally block-level
> data de-duplication. This effectively removes the additional
> protection given by writing multiple metadata copies. This technique
> may be used, or can be used in the future by manufacturers of other
> drives too.
>
> I would like to ask, if the metadata copies written to a btrfs system
> with enabled metadata mirroring are identical, or is there something
> that makes them unique on-disk, therefore preventing their
> de-duplication. I tried googling for the answer, but didn't net
> anything that would answer my question.
>
> If the metadata copies are identical, I'd like to ask if it would be
> possible to change this without major disruption? I know that changes
> to on-disk format aren't a thing made lightly, but I'd be grateful for
> any comments.
>
> The increase of the risk of file system corruption introduced by data
> de-duplication on Sandforce controllers was down-played in the
> vendor's reply included in the article, but still, what's the point of
> duplicating metadata on file system level, if storage below can remove
> that redundancy?
>
> Regards,
> Paweł

Hello,

Sorry to add my $0.03.

It is possible to work around this by using encryption. If a mode other than ECB is used, elements that are identical when unencrypted are stored as different data on disk.

The drawbacks:
- Encryption overhead (you may want to use a non-secure mode, as you're not interested in security).
- There is an avalanche effect (the whole [encryption] block gets corrupted even if a single bit of that block is corrupted).

Regards
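A minimal sketch of the encryption workaround, under loud assumptions: `toy_crypt` below is a hypothetical, insecure stand-in built from SHA-256 and XOR, not a real cipher, and the key and sector numbers are made up. The only property it shares with a real per-sector mode is the one that matters here: the output depends on the sector number, so identical plaintext written to two different sectors produces different bytes on the device, and a content-hash de-duplicator has nothing to merge. Because it is a plain XOR stream, it does not reproduce the avalanche drawback listed above.

```python
import hashlib


def keystream(key: bytes, sector: int, length: int) -> bytes:
    """Derive a sector-specific keystream (toy construction, NOT secure)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        seed = key + sector.to_bytes(8, "little") + counter.to_bytes(8, "little")
        out += hashlib.sha256(seed).digest()
        counter += 1
    return bytes(out[:length])


def toy_crypt(key: bytes, sector: int, data: bytes) -> bytes:
    """XOR data with the sector's keystream; the same call decrypts."""
    ks = keystream(key, sector, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))


key = b"illustrative key"  # hypothetical key
metadata = b"btrfs tree block".ljust(4096, b"\0")

# The same metadata block, mirrored to two different sectors.
copy_a = toy_crypt(key, 100, metadata)
copy_b = toy_crypt(key, 900, metadata)

# What a content-hash de-duplicator would see on the device.
distinct_digests = len({hashlib.sha256(copy_a).digest(),
                        hashlib.sha256(copy_b).digest()})
```

A real setup would use an established block cipher mode with a per-sector IV or tweak rather than anything like this toy.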
On Mon, 19 Sep 2011, 06:15:51 EST, Hubert Kario <hubert@kario.pl> wrote:
> You shouldn't depend on single drive, metadata
> raid is there to protect against single bad
> blocks, not disk crash.

I guess the issue here is that you no longer even have that protection with this sort of dedup.

cheers,
Chris
--
Chris Samuel - http://www.csamuel.org/