Hi everyone,

Sorry if these questions have really obvious answers for people running ZFS already. I promise to try it out myself at some point :)

Is it possible to query the stored checksum for a file? Ideally in some sort of read-only extended attribute accessible from scripts. This would make a tripwire-like tool very fast, and rsync more accurate.

Since everything is copy-on-write, and the checksums are stored, would it be possible to store common blocks just once?

Will encryption be supported at some point? Since there's already compression, I think the only issue would be to get the key in the zfs layer in a secure way.

Is it possible to store a zfs filesystem in an auto-expanding file? Mac OS X has sparseimage files, which contain just the used blocks of your filesystem. This, combined with encryption, gives flexible secure home directories.

Thanks for any answers,

Wout.
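To make the first question concrete, here's a rough sketch of what I have in mind, assuming a hypothetical read-only attribute named "zfs.checksum" (no such attribute exists; this is just to show the shape of it, using the Solaris attropen(3C) interface for named attributes):

/*
 * Hypothetical sketch only -- ZFS does not expose per-file checksums, and
 * the attribute name "zfs.checksum" is invented for illustration.  It uses
 * attropen(3C), the Solaris interface for opening a file's named attributes.
 */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int
print_stored_checksum(const char *path)
{
    char buf[129];          /* room for a hex SHA-256 string plus NUL */
    int fd = attropen(path, "zfs.checksum", O_RDONLY);

    if (fd == -1) {
        perror("attropen");
        return (-1);
    }

    ssize_t n = read(fd, buf, sizeof (buf) - 1);
    (void) close(fd);
    if (n < 0)
        return (-1);

    buf[n] = '\0';          /* assume the attribute is a text string */
    (void) printf("%s  %s\n", buf, path);
    return (0);
}

A tripwire-like tool could then compare that string against a previously recorded value instead of rereading and hashing every file.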
> Is it possible to query the stored checksum for a file? Ideally in
> some sort of read-only extended attribute accessible from scripts.
> This would make a tripwire-like tool very fast, and rsync more accurate.

Not that I know of.

> Since everything is copy-on-write, and the checksums are stored,
> would it be possible to store common blocks just once?

It would be, but it would require:
- a master hash table where you can look up <size, hash> -> block
- a bitwise compare of the block to be stored.

> Will encryption be supported at some point? Since there's already
> compression, I think the only issue would be to get the key in the
> zfs layer in a secure way.

It is planned.

> Is it possible to store a zfs filesystem in an auto-expanding file?
> Mac OS X has sparseimage files, which contain just the used blocks of
> your filesystem. This, combined with encryption, gives flexible
> secure home directories.

But in a rather awkward implementation; if ZFS can encrypt at a different level, then surely that is preferred.

Auto-expanding and copy-on-write would seem to give you larger files than you would want.

Casper
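Roughly, that lookup would look something like this (an illustrative sketch only, with invented names; the real work of copy-on-write table updates, reference counting and persistence is left out):

/*
 * Illustrative sketch, not ZFS code: a dedup table keyed by <size, hash>,
 * with a bitwise compare before a block is shared.  I/O and the table
 * lookup itself are stand-ins (extern), and all names are invented.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct dedup_entry {
    uint64_t            de_size;        /* block size in bytes */
    uint8_t             de_hash[32];    /* e.g. SHA-256 of the block */
    uint64_t            de_addr;        /* on-disk location of the stored copy */
    uint32_t            de_refcnt;      /* files referencing this block */
    struct dedup_entry *de_next;        /* hash-bucket collision chain */
} dedup_entry_t;

extern dedup_entry_t *dedup_lookup(uint64_t size, const uint8_t hash[32]);
extern int read_block(uint64_t addr, void *buf, uint64_t size);

/*
 * If an identical block is already stored, bump its refcount and return its
 * address so the caller can reference it instead of writing a new copy.
 */
bool
dedup_try_share(const void *data, uint64_t size, const uint8_t hash[32],
    uint64_t *addrp)
{
    uint8_t *buf = malloc(size);
    bool shared = false;

    if (buf == NULL)
        return (false);

    for (dedup_entry_t *de = dedup_lookup(size, hash); de != NULL;
        de = de->de_next) {
        if (de->de_size != size || memcmp(de->de_hash, hash, 32) != 0)
            continue;
        /* Hash matches: the bitwise compare rules out collisions. */
        if (read_block(de->de_addr, buf, size) == 0 &&
            memcmp(buf, data, size) == 0) {
            de->de_refcnt++;
            *addrp = de->de_addr;
            shared = true;
            break;
        }
    }
    free(buf);
    return (shared);
}

Even this toy version shows where the cost goes: every write that hits the table pays an extra read for the compare.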
On Thu, 2006-01-19 at 16:41, Wout Mertens wrote:
> Is it possible to query the stored checksum for a file? Ideally in
> some sort of read-only extended attribute accessible from scripts.
> This would make a tripwire-like tool very fast, and rsync more accurate.

This is RFE# 6259754. Not implemented yet.

> Will encryption be supported at some point? Since there's already
> compression, I think the only issue would be to get the key in the
> zfs layer in a secure way.

Yes, it is planned. Key management is one of the hard problems in this area. Luckily we have the Solaris Cryptographic Framework to help us out; this includes support for hardware security modules and hardware acceleration. Key management isn't just about getting the key there; that's actually the easy bit. The hard bit is getting the architecture correct so that we can support things like secure delete on demand, secure delete at time N, escrowed keys, keys that allow backup but not restore, etc.

> Is it possible to store a zfs filesystem in an auto-expanding file?
> Mac OS X has sparseimage files, which contain just the used blocks of
> your filesystem. This, combined with encryption, gives flexible
> secure home directories.

If flexible secure home directories are what you want, ZFS encryption will give you that. Just create a separate ZFS filesystem for each user. Filesystems in ZFS are really cheap and easy to manage (since we aren't tied to using /etc/vfstab and /etc/dfs/dfstab to mount and share them).

--
Darren J Moffat
Hello Wout,

Thursday, January 19, 2006, 5:41:09 PM, you wrote:

WM> Is it possible to query the stored checksum for a file? Ideally in
WM> some sort of read-only extended attribute accessible from scripts.
WM> This would make a tripwire-like tool very fast, and rsync more accurate.

I can see how it could work for tripwire-like tools, but I'm not sure it would for rsync-like ones, because generally you don't know what block size will be used after the file is copied to the other filesystem, so the checksums will actually be over different blocks. But maybe I'm overlooking something...

--
Best regards,
Robert                          mailto:rmilkowski at task.gda.pl
                                http://milek.blogspot.com
Hi Robert,

On 20 Jan 2006, at 09:43, Robert Milkowski wrote:
> WM> Is it possible to query the stored checksum for a file? Ideally in
> WM> some sort of read-only extended attribute accessible from scripts.
> WM> This would make a tripwire-like tool very fast, and rsync more accurate.
>
> I can see how it could work for tripwire-like tools, but I'm not sure it
> would for rsync-like ones, because generally you don't know what block
> size will be used after the file is copied to the other filesystem, so
> the checksums will actually be over different blocks. But maybe I'm
> overlooking something...

By "more accurate", I meant that instead of relying on mtime to see if a file changed, you have the checksum available. Normally, the cost of checksumming is prohibitive, but with ZFS you get it for free with the data protection features!

Rsync would still have to do its own block-level checksumming on both sides when synchronizing files, but generally the network is slower than the checksumming, so that's OK.

A really smart rsync could use the ZFS block checksums and ask the remote side for the checksums of the same blocks, though; that would eliminate half of the work. It would come at the cost of teaching rsync about ZFS internals, which seems to me like a very bad thing.

Wout.
On 19 Jan 2006, at 18:32, Darren J Moffat wrote:
> On Thu, 2006-01-19 at 16:41, Wout Mertens wrote:
>> Is it possible to query the stored checksum for a file? Ideally in
>> some sort of read-only extended attribute accessible from scripts.
>> This would make a tripwire-like tool very fast, and rsync more
>> accurate.
>
> This is RFE# 6259754. Not implemented yet.

Is that the actual number? How many are before it? :-(

>> Will encryption be supported at some point? Since there's already
>> compression, I think the only issue would be to get the key in the
>> zfs layer in a secure way.
>
> Yes, it is planned. Key management is one of the hard problems
> in this area. Luckily we have the Solaris Cryptographic Framework
> to help us out; this includes support for hardware security modules
> and hardware acceleration. Key management isn't just about getting
> the key there; that's actually the easy bit. The hard bit is
> getting the architecture correct so that we can support things
> like secure delete on demand, secure delete at time N, escrowed
> keys, keys that allow backup but not restore, etc.

That'll teach me to be an armchair programmer :) Can't wait to see what you guys make of it; it should be an interesting read...

>> Is it possible to store a zfs filesystem in an auto-expanding file?
>> Mac OS X has sparseimage files, which contain just the used blocks of
>> your filesystem. This, combined with encryption, gives flexible
>> secure home directories.
>
> If flexible secure home directories are what you want, ZFS
> encryption will give you that. Just create a separate ZFS filesystem
> for each user. Filesystems in ZFS are really cheap and easy to
> manage (since we aren't tied to using /etc/vfstab and
> /etc/dfs/dfstab to mount and share them).

Hmmm, very good point. The only thing you lose in this situation is the possibility of taking a user's files and putting them on a different system without knowing what they are...

Would encryption be implemented in such a way that you can access the encrypted data? I'm thinking backups here...

Thanks for your answers,

Wout.
On 19 Jan 2006, at 18:28, Casper.Dik at Sun.COM wrote:
>> Since everything is copy-on-write, and the checksums are stored,
>> would it be possible to store common blocks just once?
>
> It would be, but it would require:
> - a master hash table where you can look up <size, hash> -> block
> - a bitwise compare of the block to be stored.

That's what I was thinking... Would that be prohibitive?

Also, I know that several products implementing this kind of behaviour (disk-based backup systems) simply skip the bitwise compare. The assumption is: a hash collision is a very rare occurrence. A hash collision with the constraint that your files must make sense somehow (i.e. not random data but images, documents, programs, etc.) should be rarer still.

>> Is it possible to store a zfs filesystem in an auto-expanding file?
>> Mac OS X has sparseimage files, which contain just the used blocks of
>> your filesystem. This, combined with encryption, gives flexible
>> secure home directories.
>
> But in a rather awkward implementation; if ZFS can encrypt at a
> different level, then surely that is preferred.
>
> Auto-expanding and copy-on-write would seem to give you larger files
> than you would want.

Hmmm, aren't the released blocks reused afterwards? So you'd have a larger footprint, but still less than the amount of space you reserved initially.

But I hadn't considered just making a new filesystem; I still need to get old habits out of my head :) The only disadvantage I can see then is backups, as I mentioned in my reply to Darren.

Thanks,

Wout.
> On 19 Jan 2006, at 18:28, Casper.Dik at Sun.COM wrote:
>
>>> Since everything is copy-on-write, and the checksums are stored,
>>> would it be possible to store common blocks just once?
>>
>> It would be, but it would require:
>> - a master hash table where you can look up <size, hash> -> block
>> - a bitwise compare of the block to be stored.
>
> That's what I was thinking... Would that be prohibitive?

Not sure; but you'd also need a "copy-on-write" method of updating the hash table, reference counting for blocks, and other things. Perhaps not difficult, but certainly more than a minor matter of programming.

On a per-file basis this might be more interesting.

> Also, I know that several products implementing this kind of
> behaviour (disk-based backup systems) simply skip the bitwise
> compare. The assumption is: a hash collision is a very rare
> occurrence. A hash collision with the constraint that your files must
> make sense somehow (i.e. not random data but images, documents,
> programs, etc.) should be rarer still.

Still, that does not make good engineering sense; you know that you've built in data corruption, even though the chances are small.

> But I hadn't considered just making a new filesystem; I still need to
> get old habits out of my head :) The only disadvantage I can see then
> is backups, as I mentioned in my reply to Darren.

ZFS does require some readjustment of your mindset.

Casper
>> Since everything is copy-on-write, and the checksums are stored,
>> would it be possible to store common blocks just once?
[cut]
>> Also, I know that several products implementing this kind of
>> behaviour (disk-based backup systems) simply skip the bitwise
>> compare. The assumption is: a hash collision is a very rare
>> occurrence. A hash collision with the constraint that your files must
>> make sense somehow (i.e. not random data but images, documents,
>> programs, etc.) should be rarer still.
>
> Still, that does not make good engineering sense; you know that you've
> built in data corruption, even though the chances are small.

The systems I know of that depend on the hash as the key for the data (i.e. peer-to-peer systems) use a secure hash, not a checksum. There's a huge difference here in terms of the likelihood of a collision. I don't think we can call a checksum a hash here.

Bill la Forge
Hello Wout,

Friday, January 20, 2006, 9:55:23 AM, you wrote:

WM> Hi Robert,

WM> On 20 Jan 2006, at 09:43, Robert Milkowski wrote:
>> WM> Is it possible to query the stored checksum for a file? Ideally in
>> WM> some sort of read-only extended attribute accessible from scripts.
>> WM> This would make a tripwire-like tool very fast, and rsync more accurate.
>>
>> I can see how it could work for tripwire-like tools, but I'm not sure it
>> would for rsync-like ones, because generally you don't know what block
>> size will be used after the file is copied to the other filesystem, so
>> the checksums will actually be over different blocks. But maybe I'm
>> overlooking something...

WM> By "more accurate", I meant that instead of relying on mtime to see
WM> if a file changed, you have the checksum available. Normally, the
WM> cost of checksumming is prohibitive, but with ZFS you get it for
WM> free with the data protection features!

WM> Rsync would still have to do its own block-level checksumming on both
WM> sides when synchronizing files, but generally the network is slower
WM> than the checksumming, so that's OK.

WM> A really smart rsync could use the ZFS block checksums and ask the
WM> remote side for the checksums of the same blocks, though; that would
WM> eliminate half of the work. It would come at the cost of teaching
WM> rsync about ZFS internals, which seems to me like a very bad thing.

The problem is that when you are trying to keep two filesystems in sync, you expect the mtimes to be the same. However, as ZFS uses DIFFERENT block sizes, you probably can't guarantee that the source file and the (r)synced file use the same block sizes (and number of blocks). You would need to keep your own database of checksums for each block in a given file so that tripwire-like utilities could take advantage of this.

To make myself clear: I believe that if you have a file named A on one ZFS filesystem and then copy it to another ZFS filesystem (file A'), you can't guarantee that A and A' have the same number of blocks and use the same block sizes.

--
Best regards,
Robert                          mailto:rmilkowski at task.gda.pl
                                http://milek.blogspot.com
> The problem is that when you are trying to keep two filesystems
> in sync, you expect the mtimes to be the same. However, as ZFS uses
> DIFFERENT block sizes, you probably can't guarantee that the source
> file and the (r)synced file use the same block sizes (and number of
> blocks). You would need to keep your own database of checksums for
> each block in a given file so that tripwire-like utilities could take
> advantage of this.

Actually, I think we are talking about two different things here. I was proposing using the file-level checksum to detect whether a file had changed versus the remote copy. I wasn't proposing using the block-level checksums, since those would indeed change between filesystems.

Now, this will only work if the checksum stored in the dnode is the same for all block layouts of the file. In fact, thinking about it, that seems unlikely :-(

If that checksum changes as the file layout changes, it is only moderately useful. A resilver would probably change the checksum, no?

Wout.
On Fri, Jan 20, 2006 at 12:28:21PM +0100, Robert Milkowski wrote:
> The problem is that when you are trying to keep two filesystems
> in sync, you expect the mtimes to be the same. However, as ZFS uses
> DIFFERENT block sizes, you probably can't guarantee that the source
> file and the (r)synced file use the same block sizes (and number of
> blocks). You would need to keep your own database of checksums for
> each block in a given file so that tripwire-like utilities could take
> advantage of this.

Also, a file's ZFS checksum is affected by the checksums of its indirect blocks, and therefore by the block locations of its data and indirect blocks. All of which means that ZFS checksums are very strongly tied to the filesystem, or, rather, the dataset, where the objects in question reside.

But ZFS checksums might still be useful if you're trying to detect change locally (as would be the file's dnode block pointer, a/c/mtime, and generation number).

Nico
--
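A simplified structural sketch of why that is so (invented names, not the real ZFS on-disk format): every block pointer carries the checksum of the block it points to, and indirect blocks are themselves arrays of block pointers, so the checksum at the top of the tree covers addresses and tree shape as well as file data.

/*
 * Simplified sketch, not the actual ZFS structures: a block pointer records
 * both where a child block lives and the checksum of its contents.
 */
#include <stdint.h>

typedef struct blkptr_sketch {
    uint64_t    bp_addr;        /* on-disk address of the child block */
    uint64_t    bp_size;        /* size of the child block */
    uint8_t     bp_cksum[32];   /* checksum of the child block's bytes */
} blkptr_sketch_t;

/*
 * An indirect block is just an array of block pointers.  The checksum stored
 * for it (in the pointer one level up) therefore covers the children's
 * checksums *and* their addresses and sizes.  Writing the same file data with
 * a different block size, or to different locations, changes the indirect
 * blocks and with them every checksum above, up to the dnode -- which is why
 * the top-level checksum cannot serve as a layout-independent file digest.
 */
typedef struct indirect_sketch {
    blkptr_sketch_t ib_ptrs[128];   /* fan-out chosen for illustration */
} indirect_sketch_t;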
On 20 Jan 2006, at 19:55, Darren J Moffat wrote:
> On Fri, 2006-01-20 at 09:03, Wout Mertens wrote:
>> Would encryption be implemented in such a way that you can access the
>> encrypted data? I'm thinking backups here...
>
> That is one of the highest priority requirements that we have.
>
> We really want a method where backups can be done without the backup
> operator actually being able to read the data, and hopefully without
> the backup program requiring too much operating system privilege.
>
> This is why key management is the key, pardon the pun, to the
> whole ZFS crypto story and why it is hard to do well.

<Geeking out; sorry if this is redundant information. Consider it a fanboy post if so.>

There is a LUFS filesystem called CryptoFS that implements something akin to this:
http://linux.softpedia.com/get/System/Filesystems/CryptoFS-1474.shtml

Basically, you have a regular directory with encrypted files that is used as the backend storage. When you mount the CryptoFS, it uses the key to decrypt the filenames and the files themselves. The information you're leaking is the rough size of your files and the layout of your filesystem. You can make simple backups of the backend storage. I assume access control is through the access control of the backend storage.

(For those who wonder: if you know rough sizes and layout, you can guess which "standard" files are present, making plaintext attacks to retrieve the encryption key possible.)

Since ZFS is a lower layer and stores everything as objects, it could do one better and encrypt the directory blocks as well. (I've no idea what the directory structure on disk is, sorry.) ZFS would then only be leaking rough sizes and the number of objects, and that information can be salted by adding random junk.

BTW, a thought occurred to me: OS X sparse images are not very backup-friendly. They're multi-gigabyte files that change all the time. We actually have an rsync running on the OS X systems that use sparse images, syncing the unencrypted files, while they're mounted, to the backup server. So apart from CryptoFS (Linux only), there's no real, cheap, easy solution to encrypted storage + backups. ZFS could finally make tape backups secure without weird/expensive workarounds!

</geeking out>

Wout.
On Fri, 2006-01-20 at 09:55 +0100, Wout Mertens wrote:
> A really smart rsync could use the ZFS block checksums and ask the
> remote side for the checksums of the same blocks, though; that would
> eliminate half of the work. It would come at the cost of teaching
> rsync about ZFS internals, which seems to me like a very bad thing.

On the other hand, it would join a number of similar "layer violations" floating about -- there's a patch to gzip which adds an option that makes the files it compresses more amenable to incremental transfer by rsync, as well as a related package called "zsync" which has another way to do incremental transfers of compressed files efficiently.

- Bill
On 20 Jan 2006, at 14:43, Nicolas Williams wrote:
> Also, a file's ZFS checksum is affected by the checksums of its
> indirect blocks, and therefore by the block locations of its data and
> indirect blocks. All of which means that ZFS checksums are very
> strongly tied to the filesystem, or, rather, the dataset, where the
> objects in question reside.

Sigh, I was afraid of that. I still think a static, instant checksum of the file data would be very valuable. I tried to find some way to still have it, but the only way I can think of involves using weak checksums that obey sum(concat(a,b)) == f(sum(a), sum(b)). Not an option, methinks :-(

> But ZFS checksums might still be useful if you're trying to detect
> change locally (as would be the file's dnode block pointer, a/c/mtime,
> and generation number).

Myeah. If you keep a separate list of checksums, you would know that if the checksum is still the same, the file is still the same. If the checksums differ, you would need to use a different checksum on just that file to verify sameness. That should speed up tripwire-like tools at least.

Wout.
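For what it's worth, a checksum with that combining property really is trivially weak; a plain byte sum is the canonical example. Per-block sums can be combined into a whole-file sum without rereading any data, which is exactly the shortcut a cryptographic hash like SHA-256 is designed not to have. A toy illustration:

/*
 * Toy illustration of the sum(concat(a,b)) == f(sum(a), sum(b)) property,
 * using a plain byte sum mod 2^64.  Composable across blocks, but far too
 * weak to detect reorderings or compensating changes.
 */
#include <stddef.h>
#include <stdint.h>

uint64_t
byte_sum(const uint8_t *buf, size_t len)
{
    uint64_t sum = 0;

    for (size_t i = 0; i < len; i++)
        sum += buf[i];
    return (sum);
}

/* The byte sum of a concatenation is just the sum of the parts' sums. */
uint64_t
byte_sum_combine(uint64_t sum_a, uint64_t sum_b)
{
    return (sum_a + sum_b);
}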
On Fri, 2006-01-20 at 09:03, Wout Mertens wrote:
> On 19 Jan 2006, at 18:32, Darren J Moffat wrote:
>
>> On Thu, 2006-01-19 at 16:41, Wout Mertens wrote:
>>> Is it possible to query the stored checksum for a file? Ideally in
>>> some sort of read-only extended attribute accessible from scripts.
>>> This would make a tripwire-like tool very fast, and rsync more
>>> accurate.
>>
>> This is RFE# 6259754. Not implemented yet.
>
> Is that the actual number? How many are before it? :-(

Yes, that is the unique primary key in the database. There aren't actually that many bugs/RFEs (they use the same primary key space, and the data tables have an rfe/bug flag). There are some "holes" due to past migrations between bug databases and front-end tools. The actual number is meaningless anyway, since there is a priority, a severity, and a justification associated with bugs and RFEs as well. So no, there aren't 6,259,753 things to do before someone gets to this :-)

--
Darren J Moffat
On Fri, 2006-01-20 at 09:03, Wout Mertens wrote:
> Would encryption be implemented in such a way that you can access the
> encrypted data? I'm thinking backups here...

That is one of the highest priority requirements that we have.

We really want a method where backups can be done without the backup operator actually being able to read the data, and hopefully without the backup program requiring too much operating system privilege.

This is why key management is the key, pardon the pun, to the whole ZFS crypto story and why it is hard to do well.

--
Darren J Moffat
On Fri, 2006-01-20 at 10:57, Bill la Forge wrote:
> The systems I know of that depend on the hash as the key for the data
> (i.e. peer-to-peer systems) use a secure hash, not a checksum. There's
> a huge difference here in terms of the likelihood of a collision. I
> don't think we can call a checksum a hash here.

What is not cryptographically secure about SHA256, which is one of the checksum options for ZFS filesystems?

    zfs set ...
    checksum    YES    YES    on | off | fletcher2 | fletcher4 | sha256

--
Darren J Moffat
> A resilver would probably change the checksum, no?

No. A resilver just fixes any damaged copies of the data. The valid data doesn't change, so its checksum doesn't change either.

The idea of passing ZFS block checksums up the stack for further upstream (or over-the-wire) validation is tempting, but as folks have noted, it's tricky. It requires either that the checksum function be partitionable (which greatly reduces its strength), or that each layer above us can cope with whatever block size we give it. It's certainly possible, but it's not a cake walk.

Jeff