I've been thinking about how best to combine encryption and ZFS on FreeBSD, and though I've seen some discussion of each individual option, I haven't found an explicit compare/contrast. I'm considering one of two setups:

1. "zfs on geli": encrypt the devices directly via geli, then make a ZFS pool out of the foo.eli devices once they're "geli attach"ed.

2. "geli on zfs": make a ZFS pool out of the unencrypted devices, provide a zvol from the pool as a raw device, encrypt that with geli, and lay a UFS (or even another ZFS) filesystem down on top.

While zfs on geli is less complex (in the sense that geli on zfs involves two layers of filesystems), I'm concerned about whether encrypting the devices will somehow affect zfs' ability to detect silent corruption, self-heal, or in any other way adversely affect zfs' functionality. In my mind, if I use geli on zfs, then zfs sits directly on the devices, and the zvol it provides transparently gains the benefits of zfs' various features: a "safety layer" against device failure and silent corruption that I'm not sure geli would detect.

Put more simply, the question is this: if there is some damage to a device that zfs could putatively work around, and geli sits directly on that device instead, is it the case that geli would _not_ provide this protection, leaving the device less robust than if zfs were directly on it? If this is not the case, then it seems clear to me that the less complicated zfs-on-geli route is the way to go, but I'm curious whether anyone has any input on this.

Thanks!

Todd
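For concreteness, the two layouts above could be sketched as command plans. All names here are my own hypothetical choices (disks ada1-ada3, pool "tank", a 100G zvol); the script only prints the commands rather than executing them, so it can be read anywhere, and on a real FreeBSD box you would run them as root without the "echo":

```shell
#!/bin/sh
# Sketch of the two layouts. Disk names (ada1..ada3), pool name ("tank"),
# and zvol size are hypothetical. Commands are printed, not executed.
set -eu

plan_zfs_on_geli() {
    for d in ada1 ada2 ada3; do
        echo "geli init -s 4096 /dev/$d"   # one-time: write geli metadata, set passphrase
        echo "geli attach /dev/$d"         # creates /dev/$d.eli
    done
    echo "zpool create tank raidz ada1.eli ada2.eli ada3.eli"
}

plan_geli_on_zfs() {
    echo "zpool create tank raidz ada1 ada2 ada3"
    echo "zfs create -V 100G tank/vol0"            # zvol appears as /dev/zvol/tank/vol0
    echo "geli init -s 4096 /dev/zvol/tank/vol0"
    echo "geli attach /dev/zvol/tank/vol0"
    echo "newfs /dev/zvol/tank/vol0.eli"           # or lay another filesystem on top
}

plan_zfs_on_geli
plan_geli_on_zfs
```

The second plan makes the "two layers of filesystems" concern visible: zfs under the zvol, plus whatever is formatted onto the .eli device above it.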
> While zfs on geli is less complex (in the sense that geli on zfs
> involves two layers of filesystems), I'm concerned as to whether
> encrypting the device will somehow affect zfs' ability to detect
> silent corruption, self-heal, or in any other way adversely affect
> zfs' functionality. In my mind, if I use geli on zfs, then I've
> got zfs directly on a device and the zvol it's providing will be
> transparently gaining the benefits of zfs' various features, providing
> a "safety layer" against device failure and silent corruption that
> I'm not sure if geli would detect.

These are very good questions. I ran ZFS on top of geli for a long time, and what I found was that when there were problems with the underlying discs, geli would have problems and those would not be reported back to ZFS properly. I got lockups under those circumstances. When I switched to ZFS directly on top of the discs, what I got instead was discs dropping out and ZFS properly continuing with the remaining drives.

I'm afraid I never managed to characterise it well enough to file a PR, though: it only ever happened with failing hardware, which made it hard to reproduce.

-pete.
On Tue, Jun 28, 2011 at 8:45 PM, Todd Wasson <tsw5@duke.edu> wrote:

> While zfs on geli is less complex (in the sense that geli on zfs involves
> two layers of filesystems), I'm concerned as to whether encrypting the
> device will somehow affect zfs' ability to detect silent corruption,
> self-heal, or in any other way adversely affect zfs' functionality. In my
> mind, if I use geli on zfs, then I've got zfs directly on a device and the
> zvol it's providing will be transparently gaining the benefits of zfs'
> various features, providing a "safety layer" against device failure and
> silent corruption that I'm not sure if geli would detect.

I'm going out on a limb here, but that's how I see it, without intimate knowledge of the geom/geli and zfs code paths involved. Basically, you leave the transparent encryption of sectors to geli and the integrity checking and repairing to zfs, and you should be *mostly* secure, except for the failure modes below.

.eli devices behave just like normal disks, in the sense that they are block devices that transparently encrypt and decrypt sectors as they are accessed. So what could go wrong:

1. Single sectors may be corrupted on disk (e.g. bits flipping).
2. geli metadata (keys etc.) is destroyed. The same goes for glabel metadata.

Case 1 is probably harmless, because geli would return the corrupted sector's content to zfs, which zfs will likely detect because it wouldn't checksum correctly. So zfs will correct it out of redundant storage and write it back through a new encryption. BE CAREFUL: don't enable HMAC integrity checks in geli, as that would prevent geli from returning corrupted data and would result in hangs!

Case 2 is a bigger problem. If a sector containing vital geli metadata (perhaps portions of keys?) gets corrupted, and geli had no way to detect and/or correct this (e.g. by using redundant sectors on the same .eli volume!), the whole .eli device, or maybe some stripes of it, could become useless.
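Case 1 is exactly what a scrub exercises: zfs reads every block, and any block whose checksum fails is rewritten from redundancy. A sketch of how you'd check this on a zfs-on-geli pool (the pool name "tank" is my own assumption; commands are printed rather than run):

```shell
#!/bin/sh
# Sketch: surfacing and repairing case-1 corruption on a zfs pool built
# from .eli devices. Pool name "tank" is hypothetical; the plan is printed,
# not executed, so this is safe to read anywhere.
set -eu

scrub_plan() {
    echo "zpool scrub tank"        # read every block; bad checksums are repaired from redundancy
    echo "zpool status -v tank"    # the CKSUM column shows corrupted blocks per .eli vdev
}

scrub_plan
```

If geli silently decrypts a flipped-bit sector into garbage, it shows up here as a CKSUM error on that vdev rather than as a read error, which is consistent with the "geli returns corrupted data, zfs catches it" description above.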
ZFS couldn't repair this at all, at least not automatically. You'd have to MANUALLY reformat the failed .eli device, and resilver it from zfs' redundant storage later.

There may be other failure modes involved as well; I don't know. But for most practical day-to-day uses, with enough redundancy and regular backups, zfs-over-geli should be good enough.

I wouldn't put {zfs,ufs}-over-geli-over-raw-zpool, though, as this would involve considerable overhead, IMHO. In that case, I'd rather use a gmirror as a backend, as in a setup like:

  {zfs,ufs}-over-geli-over-{gmirror,graid3}

or something similar. But I've never tried this.

-cpghost.

--
Cordula's Web. http://www.cordula.ws/
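The manual recovery described above would presumably amount to re-initialising geli on the affected disk and letting zfs resilver it. A sketch, with hypothetical names (disk ada1, pool "tank") and the commands printed rather than run:

```shell
#!/bin/sh
# Sketch: manually recovering a .eli vdev whose geli metadata was destroyed
# (case 2). Disk "ada1" and pool "tank" are hypothetical; the plan is
# printed, not executed.
set -eu

recover_plan() {
    echo "geli detach ada1.eli"            # drop the stale attachment, if still present
    echo "geli init -s 4096 /dev/ada1"     # reformat: write fresh geli metadata
    echo "geli attach /dev/ada1"           # /dev/ada1.eli exists again, but holds no pool data
    echo "zpool replace tank ada1.eli"     # resilver the vdev from zfs' redundant storage
}

recover_plan
```

This only works if the pool has enough remaining redundancy to rebuild the vdev, which is the point being made: zfs can heal around a destroyed .eli device, but not automatically.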
Thanks to both C. P. and Pete for your responses. Comments inline:

> Case 1.) is probably harmless, because geli would return a
> corrupted sectors' content to zfs... which zfs will likely detect
> because it wouldn't checksum correctly. So zfs will correct it
> out of redundant storage, and write it back through a new
> encryption. BE CAREFUL: don't enable hmac integrity checks
> in geli, as that would prevent geli from returning corrupted
> data and would result in hangs!

Perhaps the HMAC integrity checks were related to the lack of reporting of problems back to zfs that Pete referred to? Maybe we need someone with more technical experience with the filesystem / disk access infrastructure to weigh in, but it still doesn't seem clear to me what the best option is.

> Case 2.) is a bigger problem. If a sector containing vital
> geli metadata (perhaps portions of keys?) gets corrupted,
> and geli had no way to detect and/or correct this (e.g. by
> using redundant sectors on the same .eli volume!), the whole
> .eli, or maybe some stripes out of it, could become useless.
> ZFS couldn't repair this at all... at least not automatically.
> You'll have to MANUALLY reformat the failed .eli device, and
> resilver it from zfs redundant storage later.

This is precisely the kind of thing that made me think about putting zfs directly on the disks instead of geli: this, and other unknown issues that could crop up and are outside geli's ability to guard against.

> There may be other failure modes involved as well. I don't know.
> But in most practical day to day uses, with enough redundancy
> and regular backups, a zfs-over-geli should be good enough.

I understand the point here, but I'm specifically thinking about my backup server. As I understand it, part of the purpose of zfs is to be reliable enough to run on a backup server itself, given some redundancy, as you say.
Perhaps asking for encryption as well is asking too much (at least unless zfs v30 with zfs-crypto ever gets open-sourced and ported), but I'd really like to maintain zfs' stability while also having an option for encryption.

> I wouldn't put {zfs,ufs}-over-geli-over-raw-zpool though, as this
> would involve considerable overhead, IMHO. In this case, I'd
> rather use a gmirror as a backend, as in a setup:
> {zfs,ufs}-over-geli-over-{gmirror,graid3}
> or something similar. But I've never tried this though.

I understand about the overhead, but I'm interested in using zfs with raidz, rather than gmirror or graid3, because of the benefits (detection of silent corruption, etc.) that raidz gives you. I think your suggestion is a pretty good one in terms of the performance/reliability tradeoff, though. In my specific case I'm more likely to accept a performance cost than a reliability cost, but only because my server spends most of its time idling, and throughput isn't really an issue. Thanks regardless, though.

Todd