The Validated Execution project is investigating how to use ZFS snapshots as the basis of a "validated" filesystem. Given that the blocks of the dataset form a Merkle tree of hashes, it seemed straightforward to validate the individual objects in the snapshot and then sign the hash of the root as a means of indicating that the contents of the dataset were validated.

Unfortunately, the block hashes assure the integrity of the *physical* representation of the dataset. Those hash values can be updated during scrub operations, or even during data-error recovery, while the logical content of the dataset remains intact. This would invalidate the signature mechanism proposed above, even though the logical content is undisturbed.

We want to build on the data integrity ZFS gives us. However, we need some means of knowing that the dataset we are currently using is in fact the same snapshot that was validated earlier. We can't use the name, since cloning, promotion, and renaming can lead to a different snapshot having the name under which the prior snapshot was validated. My attempt to forge a replacement snapshot stumbled over the creation-time property, but that seems capable of duplication with minimal effort.

Does the snapshot dataset include identity information? While a dataset index would be a help, is there perhaps a UUID generated when the snapshot is taken?

With regard to the signing mechanism, it might be useful to be able to set properties on a snapshot. Since ZFS expressly prohibits this, how feasible would it be to provide for creation of a snapshot from a snapshot while setting a specific property on the child snapshot, thus avoiding the exposure to modification of the filesystem objects that cloning and re-snapshotting would entail?

Thanks -JZ
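[The root-signing idea above can be sketched as follows. This is an illustrative Python example, not ZFS code: `merkle_root` and the HMAC "signature" (a stand-in for a real private-key signature over the root hash) are assumptions for the sake of the sketch.]

```python
import hashlib
import hmac

def merkle_root(leaves: list[bytes]) -> bytes:
    """Compute the root of a binary Merkle tree over the given leaf data."""
    level = [hashlib.sha256(leaf).digest() for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:                 # odd count: carry the last node up
            level.append(level[-1])
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]

def sign_root(root: bytes, key: bytes) -> bytes:
    # Stand-in for an asymmetric signature over the Merkle root.
    return hmac.new(key, root, hashlib.sha256).digest()

objects = [b"/etc/passwd contents", b"/usr/bin/ls contents"]
root = merkle_root(objects)
sig = sign_root(root, key=b"validation-key")
# Verification recomputes the root from the objects and checks the signature.
assert hmac.compare_digest(sig, sign_root(merkle_root(objects), b"validation-key"))
```

As the rest of the message explains, the catch is which hashes the tree is built over: a tree over logical object contents (as above) is stable, while ZFS's on-disk block checksums track the physical layout.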
John Zolnowsky x69422/408-404-5064 wrote:
> Does the snapshot dataset include identity information? While a dataset
> index would be a help, is there perhaps a UUID generated when the
> snapshot is taken?

Each dataset, including snapshots, has a unique guid (uint64_t).

> With regard to the signing mechanism, it might be useful to be able to
> set properties on a snapshot. Since ZFS expressly prohibits this, how
> feasible would it be to provide for creation of a snapshot from a
> snapshot while setting a specific property on the child snapshot, thus
> avoiding the exposure to modification of the filesystem objects that
> cloning and snapshotting would entail?

You *can* set some properties on a snapshot.
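[One way that guid could address the identity question: include it in the signed message, so the signature is bound to the snapshot's identity rather than its mutable name. A minimal Python sketch, with assumptions labeled: the HMAC stands in for a real signature, the key and guid values are made up, and in practice the guid would be read with something like `zfs get -Hp -o value guid tank/fs@s`.]

```python
import hashlib
import hmac
import struct

KEY = b"validation-key"  # stand-in for a real signing key

def sign_snapshot(guid: int, content_root: bytes) -> bytes:
    """Sign (guid, logical-content hash) so a renamed or substituted
    snapshot cannot reuse the signature."""
    msg = struct.pack(">Q", guid) + content_root  # guid is a uint64_t
    return hmac.new(KEY, msg, hashlib.sha256).digest()

def verify_snapshot(guid: int, content_root: bytes, sig: bytes) -> bool:
    return hmac.compare_digest(sig, sign_snapshot(guid, content_root))

root = hashlib.sha256(b"logical content").digest()
sig = sign_snapshot(0x1234ABCD5678EF00, root)
assert verify_snapshot(0x1234ABCD5678EF00, root, sig)
assert not verify_snapshot(0x1111111111111111, root, sig)  # wrong snapshot
```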
In particular you can set "user" properties, e.g.:

# zfs set exec=off tank/fs@s
cannot set property for 'tank/fs@s': this property can not be modified for snapshots
# zfs set valex:enable=on tank/fs@s
# zfs get -r valex:enable tank/fs
NAME       PROPERTY      VALUE  SOURCE
tank/fs    valex:enable  -      -
tank/fs@s  valex:enable  on     local

You can see there that valex:enable doesn't exist on the original dataset, but we did successfully set it on the snapshot.

If user properties aren't sufficient, then you can control which properties can be set for a snapshot. For ZFS Crypto I had to ensure that the encryption property (which wouldn't normally be "copied" to the snapshot) was, so that I can ensure clones have the same encryption value as the parent they came from.

-- 
Darren J Moffat
> The Validated Execution project is investigating how to use ZFS
> snapshots as the basis of a "validated" filesystem. Given that the
> blocks of the dataset form a Merkle tree of hashes, it seemed
> straightforward to validate the individual objects in the snapshot and
> then sign the hash of the root as a means of indicating that the
> contents of the dataset were validated.

Yep, that would work.

> Unfortunately, the block hashes are used to assure the integrity of the
> physical representation of the dataset. Those hash values can be
> updated during scrub operations, or even during data error recovery,
> while the logical content of the dataset remains intact.

Actually, that's not true -- at least not today. Once you've taken a snapshot, the content will never change. Scrub, resilver, and self-heal operations repair damaged copies of data, but they don't alter the data itself, and therefore don't alter its checksum.

This will change when we add support for block rewrite, which will allow us to do things like migrate data from one device to another, or recompress existing data, which *will* affect the checksum. You may be able to tolerate this by simply precluding it, if you're targeting a restricted environment. For example, do you need this feature for anything other than the root pool?

Jeff