I noticed in the compression support that the checksum is over the uncompressed data.

While this has the advantages that the checksum does not have to be changed as transformations are changed, and that the system might catch errors in the compression layer, this design decision will be problematic if/when encryption is supported: plaintext checksums would leak substantial amounts of information about the content of files.

The system could be switched to a keyed cryptographic hash, but then you will have made the "checksum" part of the file system intimately tied to the cryptographic part (including having to deal with key management, and not being able to check blocks whose keys are currently unavailable, which would break automated scrubbing), and made it a potential source of security problems.

I think there is currently enough space to store, per block, a 64-bit checksum for integrity, a 64-bit nonce for uniqueness, and a 128-bit cryptographic hash for authentication.

A minor additional point: by applying the checksum before other transformations you lose the straightforward algebraic relationship between the disk bits and the check data. One advantage checksums and RS codes have over cryptographic hashes is that advanced recovery tools could be created which utilize all available data (multiple mirror blocks, RAID data, known corrupted sectors from the disk, and multiple check) to provide maximum-likelihood decodes, for cases where known-perfect decoding is not possible and the data can't otherwise be replaced. I don't know if anyone will bother building such tools, but maybe someone cares.

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

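The per-block budget proposed above (64 + 64 + 128 = 256 bits, i.e. 32 bytes) can be sketched as a packed record. This is only an illustration of the arithmetic: the field order, the little-endian encoding, and the names `BLOCK_TAG`, `pack_tag` and `unpack_tag` are assumptions made for this example, not the btrfs on-disk format.

```python
import struct

# Hypothetical layout: 64-bit checksum, 64-bit nonce, 128-bit authentication hash.
# "<" selects little-endian with no padding, so the record is exactly 32 bytes.
BLOCK_TAG = struct.Struct("<QQ16s")


def pack_tag(checksum: int, nonce: int, auth_hash: bytes) -> bytes:
    """Pack the three per-block fields into one 32-byte record."""
    if len(auth_hash) != 16:
        raise ValueError("auth_hash must be exactly 16 bytes (128 bits)")
    return BLOCK_TAG.pack(checksum, nonce, auth_hash)


def unpack_tag(raw: bytes) -> tuple:
    """Return (checksum, nonce, auth_hash) from a 32-byte record."""
    return BLOCK_TAG.unpack(raw)


tag = pack_tag(0x1122334455667788, 42, bytes(16))
```

The point of keeping the three fields separate is that the unkeyed checksum can still be verified by a scrubber that holds no keys, while the 128-bit field is only meaningful to someone who does.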
On Thursday 06 November 2008, Gregory Maxwell wrote:
>
> While this has the advantages that the checksum does not have to be
> changed as transformations are changed and the system might catch
> errors in the compression layer, this design decision will be
> problematic if/when encryption is supported: Plaintext checksums
> would leak substantial amounts of information about the content of
> files. The system could be switched to a keyed cryptographic hash,

Indeed. The most obvious (and quite trivial) attack is to build a huge database of checksums for known files or chunks of files. AFAIK this has already been done by law enforcement/security agencies to detect "illegal" files, so it's definitely an issue that would affect any future encryption code implemented in btrfs.

Regards

Cláudio

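The checksum-database attack described above can be sketched in a few lines. Everything here is an assumption for illustration: `zlib.crc32` stands in for whatever checksum the file system really uses, `toy_encrypt` is a deliberately trivial XOR that is NOT real encryption, and the 4096-byte block size is arbitrary. The only point is that a checksum stored over the plaintext stays matchable no matter what is done to the bytes on disk.

```python
import zlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def checksum(block: bytes) -> int:
    # CRC-32 from zlib as a stand-in for the file system's checksum.
    return zlib.crc32(block)


def toy_encrypt(block: bytes, key: bytes) -> bytes:
    # NOT real encryption: a repeating-key XOR, used only so that the
    # on-disk bytes differ from the plaintext.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(block))


# The attacker's database: checksums of blocks from well-known files.
known_block = b"contents of a well-known file".ljust(BLOCK_SIZE, b"\0")
database = {checksum(known_block): "well-known file, block 0"}

# The victim's disk: encrypted data, but checksums taken over the plaintext.
key = bytes(range(1, 33))
private_block = b"private notes".ljust(BLOCK_SIZE, b"\0")
disk = [
    (toy_encrypt(known_block, key), checksum(known_block)),
    (toy_encrypt(private_block, key), checksum(private_block)),
]

# The attacker never needs the key: the stored checksums alone are enough.
matches = [
    (index, database[stored])
    for index, (_ciphertext, stored) in enumerate(disk)
    if stored in database
]
```

Here `matches` identifies block 0 as the well-known block even though the attacker only looked at the checksum column.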
Thanks for entertaining my comments.

On Thu, Nov 6, 2008 at 9:40 AM, Chris Mason <chris.mason@oracle.com> wrote:
> For encryption we have a few choices. We can checksum the encrypted
> blocks and disallow compression, or we can use a stronger checksum.
[snip]

The first sounds viable, although mutually exclusive features aren't very user-friendly.

The latter would probably need to be a secret-keyed HMAC to prevent watermarking attacks and information leakage. I don't think simply salting with the sector number or other public information would be sufficient: I could still take your encrypted disk and answer the question "Does Chris have block X?" by checking the checksums.

But if the checksum is a keyed transform, we can't sweep the file system for bit rot without having all the keys available. I expect the common use case for encrypted btrfs is similar to eCryptfs, where each user may have their own keys which are only available when they are logged in (single-key whole-disk encryption can still easily be done with dm-crypt). So having all the keys at once may not be especially realistic. But being able to scan for failing blocks is important for catching errors before there is too much damage.

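The difference between a publicly salted hash and a secret-keyed HMAC can be made concrete with a small sketch. The function names, the truncation to 128 bits, the choice of SHA-256, and the example sector number are all assumptions for illustration; nothing here is taken from btrfs.

```python
import hashlib
import hmac


def salted_checksum(block: bytes, sector: int) -> bytes:
    # Unkeyed hash salted with public information (the sector number).
    return hashlib.sha256(sector.to_bytes(8, "little") + block).digest()[:16]


def keyed_checksum(block: bytes, sector: int, key: bytes) -> bytes:
    # HMAC-SHA256 under a secret key, truncated to 128 bits.
    message = sector.to_bytes(8, "little") + block
    return hmac.new(key, message, hashlib.sha256).digest()[:16]


block_x = b"block X".ljust(4096, b"\0")
sector = 1234
secret_key = b"k" * 32  # hypothetical per-user key

# What ends up stored next to the encrypted block under each scheme.
stored_salted = salted_checksum(block_x, sector)
stored_keyed = keyed_checksum(block_x, sector, secret_key)

# An attacker who knows block X and the sector number, but not the key:
salted_leaks = hmac.compare_digest(
    salted_checksum(block_x, sector), stored_salted
)
wrong_key = bytes(32)
keyed_leaks = hmac.compare_digest(
    keyed_checksum(block_x, sector, wrong_key), stored_keyed
)

# A scrubber, however, is in the same position as the attacker:
# it can verify stored_keyed only if it is handed secret_key.
scrub_with_key = hmac.compare_digest(
    keyed_checksum(block_x, sector, secret_key), stored_keyed
)
```

With the public salt the attacker can recompute the stored value and confirm that block X is present; with the HMAC the comparison fails without the key, which is exactly why a key-less scrub also becomes impossible.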
On 6 November 2008 at 09:58, Gregory Maxwell wrote:
> The latter would probably need to be a secret-keyed HMAC
> to prevent watermarking attacks and information leakage, [...]

dm-crypt on every disk seems like a good alternative, doesn't it?
You would use dm-crypt for your swap anyway.

Did I miss something?

--
Xavier Nicollet

On Thu, Nov 6, 2008 at 10:27 AM, Xavier Nicollet <nicollet@jeru.org> wrote:
> On 6 November 2008 at 09:58, Gregory Maxwell wrote:
>> The latter would probably need to be a secret-keyed HMAC
>> to prevent watermarking attacks and information leakage, [...]
>
> Dm-crypt on every disks seems a good alternative, doesn't it ?
> You would use dm-crypt for your swap anyway.
>
> Did I miss something ?

dm-crypt is fine but a rather blunt tool: it's all or nothing, and there is only a single key. It also cannot store a unique nonce per block update, which may create some (theoretical) security weaknesses. The whole thing will need to be mounted with keys in memory even when you only care about a few files (so someone who gains access to the system could access high-security files even if the system was just being used for web browsing at the time).

With a more intelligent scheme you could have per-subvolume keying or, even better, per-file keying, allowing the encrypted file system to contain a mix of files with differing security classes.

Take a look at http://ecryptfs.sourceforge.net/ for an example of a more sophisticated file system encryption feature set.

At the least I think it would be useful if btrfs provided dm-crypt functionality per subvolume, though full eCryptfs-level functionality would be quite interesting.

* Gregory Maxwell:
> I noticed in the compression support that the checksum is over the
> uncompressed data.
>
> While this has the advantages that the checksum does not have to be
> changed as transformations are changed and the system might catch
> errors in the compression layer, this design decision will be
> problematic if/when encryption is supported: Plaintext checksums
> would leak substantial amounts of information about the content of
> files.

Would this be an issue if metadata (including file names) were encrypted as well?

--
Florian Weimer <fweimer@bfk.de>
BFK edv-consulting GmbH http://www.bfk.de/
Kriegsstraße 100 tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

On Thu, Nov 6, 2008 at 11:31 AM, Florian Weimer <fweimer@bfk.de> wrote:
> * Gregory Maxwell:
>
>> I noticed in the compression support that the checksum is over the
>> uncompressed data.
>>
>> While this has the advantages that the checksum does not have to be
>> changed as transformations are changed and the system might catch
>> errors in the compression layer, this design decision will be
>> problematic if/when encryption is supported: Plaintext checksums
>> would leak substantial amounts of information about the content of
>> files.
>
> Would this be an issue if metadata (including file names) were
> encrypted as well?

If the checksum is encrypted, no, at least not obviously. But as I've mentioned, if the checksum is encrypted then you can't scrub the FS for checksum errors without the keys.

A lack of metadata encryption would be another possible information leak, but at least it's one which can probably be understood by users. It would be nice if subvolume-level encryption provided metadata encryption.

If metadata is encrypted but block checksums are unencrypted and computed over plaintext, then information about the files will leak, even if the checksum is replaced with an unkeyed or non-secret-keyed cryptographic hash. A secret-keyed hash is equivalent to an encrypted checksum.

On Thu, Nov 06, 2008 at 12:15:12PM +0000, Claudio Martins spake thusly:
> AFAIK this has already been done by law enforcement/security agencies to
> detect "illegal" files, so it's definitely an issue that would affect any
> future encryption code implemented in btrfs.

Indeed, it has:

http://www.schneier.com/blog/archives/2008/11/us_court_rules.html

--
Tracy Reed http://tracyreed.org

On Thu, 2008-11-06 at 09:40 -0500, Chris Mason wrote:
> On Thu, 2008-11-06 at 01:34 -0500, Gregory Maxwell wrote:
> > I noticed in the compression support that the checksum is over the
> > uncompressed data.
>
> Thanks for looking things over, more eyes always helps.
>
> >
> > While this has the advantages that the checksum does not have to be
> > changed as transformations are changed and the system might catch
> > errors in the compression layer, this design decision will be
> > problematic if/when encryption is supported: Plaintext checksums
> > would leak substantial amounts of information about the content of
> > files.
>
> We checksum the uncompressed data because it allows us to layer other
> transformations without confusing the code, and because the checksums
> are strictly tied to logical offsets in the file. Additional metadata
> would be required to do things differently. It's possible but I'd
> prefer not to introduce that complexity.
>

Just FYI, the new disk format I've pushed out checksums the data on disk instead of the uncompressed (or unencrypted) data.

There are lots of tradeoffs here, but I think this is a much better system overall. Thanks for your feedback; it got me thinking about this now, before we tried to finalize the disk format.

-chris

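One consequence of checksumming the data as it sits on disk is that a scrub needs neither keys nor knowledge of the transformation. The sketch below illustrates only that property and is not the btrfs implementation: `zlib.crc32` is a stand-in checksum, `toy_transform` is a placeholder XOR that is NOT real encryption or compression, and `write_block` and `scrub` are hypothetical helper names.

```python
import zlib


def toy_transform(block: bytes, key: bytes) -> bytes:
    # Placeholder for compression and/or encryption. NOT real encryption:
    # a repeating-key XOR, used only to produce different on-disk bytes.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(block))


def write_block(plaintext: bytes, key: bytes) -> tuple:
    # Transform first, then checksum what is actually written to disk.
    on_disk = toy_transform(plaintext, key)
    return on_disk, zlib.crc32(on_disk)


def scrub(on_disk: bytes, stored_checksum: int) -> bool:
    # Needs neither the key nor the plaintext.
    return zlib.crc32(on_disk) == stored_checksum


key = bytes(range(1, 33))
plaintext = b"secret".ljust(4096, b"\0")
on_disk, stored = write_block(plaintext, key)

scrub_clean = scrub(on_disk, stored)

# Flip one bit on "disk"; a CRC always detects a single-bit error.
corrupted = bytes([on_disk[0] ^ 0x01]) + on_disk[1:]
scrub_detects_rot = not scrub(corrupted, stored)
```

The tradeoff raised earlier in the thread still applies: a checksum over the on-disk bytes no longer catches bugs in the transformation layer itself, and how little it reveals about the plaintext depends on the strength of the real cipher, which this toy XOR does not model.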