thr3ads.net - zfs discuss - [zfs-discuss] question about COW and snapshots [Jun 2011]

Simon Walter

2011-Jun-14 17:25 UTC

[zfs-discuss] question about COW and snapshots

I'm looking to create a NAS with versioning for non-technical users
(Windows and Mac). I want the users to be able to simply save a file,
and have a revision/snapshot created. I could use revision control
software like SVN (it has autoversioning with WebDAV), and I will if
this filesystem-level idea does not pan out. However, since this is a
NAS for any files (Word, JPEGs, MP3, etc.), not just source code, I
thought doing this at the filesystem level via some kind of COW would be
better.

So now my (ignorant) question: can ZFS make a snapshot every time it's
written to? Can all writes be available as snapshots so all previous
"versions" are available? Or does one need to explicitly create a
snapshot?

I've read about auto-snapshot. But that seems like a nice cron job
(scheduled snapshots). Am I mistaken?

I looked into ext3cow, next3, and then some. ZFS seems the most stable
and ready for use. However, is the above possible? Or how is it
possible? Is there a better way of doing this? Other filesystems?

Thanks for ideas and suggestions,

Simon

Edward Ned Harvey

2011-Jun-15 02:05 UTC

[zfs-discuss] question about COW and snapshots

> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-
> bounces at opensolaris.org] On Behalf Of Simon Walter
> 
> I'm looking to create a NAS with versioning for non-technical users
> (Windows and Mac). I want the users to be able to simply save a file,
> and a revision/snapshot is created. I could use a revision control
> software like SVN (it has autoversioning with WebDAV), and I will if
> this filesystem level idea does not pan out. However, since this is a
> NAS for any files(Word, JPEGs, MP3, etc), not just source code, I
> thought doing this at the filesystem level via some kind of COW would be
> better.

CVS works on individual files.
SVN works on a subset of a directory.
ZFS works on whole file systems.

You want to save snapshots or revs of individual files upon writes, saves,
closes, etc.  The closest things I know of to what you want are:
Google Docs
Alfresco
Sharepoint

There's probably something... I'm sure this isn't the first time anyone ever
thought of that.  But the answer isn't ZFS.

Richard Elling

2011-Jun-15 03:56 UTC

[zfs-discuss] question about COW and snapshots

On Jun 14, 2011, at 10:25 AM, Simon Walter wrote:
> I'm looking to create a NAS with versioning for non-technical users
> (Windows and Mac). I want the users to be able to simply save a file, and
> a revision/snapshot is created. I could use a revision control software like
> SVN (it has autoversioning with WebDAV), and I will if this filesystem level
> idea does not pan out. However, since this is a NAS for any files(Word,
> JPEGs, MP3, etc), not just source code, I thought doing this at the
> filesystem level via some kind of COW would be better.

I can't answer the "better" question, but with ZFS you can delegate to the
user the ability to make snapshots when they want. You can even have
applications like databases make snapshots when they want. Using this
feature, you don't need to archive every write.
> 
> So now my (ignorant) question: can ZFS make a snapshot every time it's
> written to?

I hope not, that would suck most heinously.

> Can all writes be available as snapshots so all previous "versions" are
> available?

That would suck worse.

> Or does one need to explicitly create a snapshot?

Yes.

> I've read about auto-snapshot. But that seems like nice cron job
> (scheduled snapshots). Am I mistaken?

No, that is how it works.

> I looked in to ext3cow, next3, and then some. ZFS seems the most stable and
> ready for use. However, is the above possible? Or how is it possible? Is
> there a better way of doing this? Other filesystems?
>
> Thanks for ideas and suggestions,

I use Time Machine with a ZFS repository. It takes time-based snapshots, too.
 -- richard
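
As a rough illustration of the delegation Richard describes, here is a minimal Python sketch that only builds the two command lines involved and does not run them. `zfs allow` and `zfs snapshot` are real subcommands; the user name, dataset name, and the `manual-` snapshot prefix are made up for the example, and some releases may require further delegated permissions (for example `mount`), so check zfs(1M) on your system before relying on it.

```python
from datetime import datetime, timezone


def allow_snapshot_cmd(user: str, dataset: str) -> list[str]:
    """Command an administrator would run once to delegate snapshot rights."""
    return ["zfs", "allow", user, "snapshot", dataset]


def snapshot_cmd(dataset: str, when: datetime) -> list[str]:
    """Command the delegated user could then run whenever they want a snapshot."""
    # Hypothetical naming scheme: "manual-" plus a UTC timestamp.
    label = when.strftime("%Y%m%dT%H%M%SZ")
    return ["zfs", "snapshot", f"{dataset}@manual-{label}"]


# Example values only; "simon" and "tank/home/simon" are assumptions.
admin_cmd = allow_snapshot_cmd("simon", "tank/home/simon")
user_cmd = snapshot_cmd(
    "tank/home/simon", datetime(2011, 6, 15, 3, 56, 0, tzinfo=timezone.utc)
)
print(" ".join(admin_cmd))
print(" ".join(user_cmd))
```

Passing either list to `subprocess.run(..., check=True)` would execute it; that step is left out here so the sketch stays side-effect free.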

Simon Walter

2011-Jun-15 10:44 UTC

[zfs-discuss] question about COW and snapshots

On 06/15/2011 12:56 PM, Richard Elling wrote:
>> So now my (ignorant) question: can ZFS make a snapshot every time
>> it's written to?
> I hope not, that would suck most heinously.
>
>> Can all writes be available as snapshots so all previous
>> "versions" are available?
> That would suck worse.

Why would that suck? Can ZFS not handle that number of snapshots? I must
be missing something.

Simon Walter

2011-Jun-15 10:45 UTC

[zfs-discuss] question about COW and snapshots

Thanks for the comments. So ZFS alone cannot do what I'd like.

On Linux there is Gamin. There is also a kernel patch which gives you
/proc/fschanges. I could monitor this file for changes and take a
snapshot when a change occurs or under certain circumstances. However,
none of the Linux COW-type filesystems are as production-ready as ZFS. Does
(Open)Solaris have a /proc/fschanges equivalent to monitor for FS activity?

Edward Ned Harvey

2011-Jun-15 11:29 UTC

[zfs-discuss] question about COW and snapshots

> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-
> bounces at opensolaris.org] On Behalf Of Richard Elling
> 
> That would suck worse.

Don't mind Richard.  He is of the mind that ZFS is perfect for everything
just the way it is, and anybody who wants anything different should adjust
their thought process.

Richard, just because it's not something you want, doesn't mean you should
rain on somebody else's parade.  If Simon wants something like that, kudos
to him.

I know I've certainly had many situations where people wanted to snapshot or
rev individual files every time they're modified.  As I said - perfect
example is Google Docs.  Yes it is useful.  But no, it's not what ZFS
does.

Darren J Moffat

2011-Jun-15 11:45 UTC

[zfs-discuss] question about COW and snapshots

On 06/15/11 12:29, Edward Ned Harvey wrote:
>> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-
>> bounces at opensolaris.org] On Behalf Of Richard Elling
>>
>> That would suck worse.
>
> Don't mind Richard.  He is of the mind that ZFS is perfect for everything
> just the way it is, and anybody who wants anything different should adjust
> their thought process.

I suspect it is more that Richard equated "write" to write(2) /
dmu_write() calls, and that would suck performance-wise.

I also suspect that what Simon wants isn't a snapshot of every little
write(2)-level call, but one when the file has finished being updated, maybe
on close(2) [ but that assumes the app does actually call close() ].

> I know I've certainly had many situations where people wanted to snapshot or
> rev individual files every time they're modified.  As I said - perfect
> example is Google Docs.  Yes it is useful.  But no, it's not what ZFS does.

Exactly: versions of a whole file, but that is different to a snapshot on
every write.

How you interpret "on every write" depends on where in the stack you are
coming from.  If you think about an application, a "write" is when you
save the document, but at the ZPL layer that is multiple write(2) calls
and maybe even some rename(2)/unlink(2)/close(2) calls as well.
If you move further down, then doing a snapshot on every dmu_write() call
is fundamentally at odds with how ZFS works.

-- 
Darren J Moffat

Toby Thain

2011-Jun-15 12:01 UTC

[zfs-discuss] question about COW and snapshots

On 15/06/11 7:45 AM, Darren J Moffat wrote:
> On 06/15/11 12:29, Edward Ned Harvey wrote:
>>> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-
>>> bounces at opensolaris.org] On Behalf Of Richard Elling
>>>
>>> That would suck worse.
>>
>> Don't mind Richard.  He is of the mind that ZFS is perfect for everything
>> just the way it is, and anybody who wants anything different should
>> adjust their thought process.
>
> I suspect rather than that it is more that Richard equated "write" to
> write(2) / dmu_write() calls and that would suck performance wise.
>
> I also suspect that what Simon wants isn't a snapshot of every little
> write(2) level call but when the file is completed being updated, maybe
> on close(2) [ but that assumes the app does actually call close() ].

That's how I interpreted it.

>> I know I've certainly had many situations where people wanted to
>> snapshot or rev individual files every time they're modified.  As I said -
>> perfect example is Google Docs.  Yes it is useful.  But no, it's not
>> what ZFS does.
>
> Exactly versions of a whole file, but that is different to a snapshot on
> every write.
>
> How you interpret "on every write" depends on where in the stack you are
> coming from.  If you think about an application a "write" is when you
> save the document but at the ZPL layer that is multiple write(2) calls
> and maybe even some rename(2)/unlink(2)/close(2) calls as well.

That's one big problem with the naive plan of using snapshots.

Another one is that snapshots are per-filesystem, while the intention
here is to capture a document in one user session. Taking a snapshot
will of course say nothing about the state of other user sessions. Any
document in the process of being saved by another user, for example,
will be corrupt.

The proposal seems to be aimed at the wrong part of the stack. The
comparison with Google Docs is revealing.

--Toby
> If you move further down then doing a snapshot on every dmu_write() call
> is fundamentally at odds with how ZFS works.
>

Simon Walter

2011-Jun-15 12:03 UTC

[zfs-discuss] question about COW and snapshots

On 06/15/2011 07:45 PM, Simon Walter wrote:
> Thanks for the comments. So ZFS alone cannot do what I'd like.
>
> In linux there Gamin. Or there is also a kernel patch which gives you
> /proc/fschanges. I could monitor this file for changes and take a
> snapshot when a change occurs or under certain circumstances. However,
> the Linux COW type FS are all not as production ready as ZFS. Does
> (Open)Solaris have a /proc/fschanges equivalent to monitor for FS
> activity?

Perhaps what I am looking for is DTrace? Something like this, but for the
entire FS? http://opensolaris.org/jive/message.jspa?messageID=227128

Is that possible? Is there a /proc/fschanges or similar where all FS
access is reported?

Simon Walter

2011-Jun-15 12:30 UTC

[zfs-discuss] question about COW and snapshots

On 06/15/2011 09:01 PM, Toby Thain wrote:
>>> I know I've certainly had many situations where people wanted to
>>> snapshot or rev individual files every time they're modified.  As I
>>> said - perfect example is Google Docs.  Yes it is useful.  But no, it's
>>> not what ZFS does.
>> Exactly versions of a whole file, but that is different to a snapshot on
>> every write.
>>
>> How you interpret "on every write" depends on where in the stack you are
>> coming from.  If you think about an application a "write" is when you
>> save the document but at the ZPL layer that is multiple write(2) calls
>> and maybe even some rename(2)/unlink(2)/close(2) calls as well.
> That's one big problem with the naive plan of using snapshots.
>
> Another one is that snapshots are per-filesystem, while the intention
> here is to capture a document in one user session. Taking a snapshot
> will of course say nothing about the state of other user sessions. Any
> document in the process of being saved by another user, for example,
> will be corrupt.

Would it be? I think that's pretty lame for ZFS to corrupt data. If I
were to manually create a snapshot and two users were writing to the FS,
how would ZFS handle that? Are you saying it would corrupt the data? I
thought snapshots could be taken regardless of whether there is activity.

If I monitor (via DTrace?) for what equates to a "save", would that be
sufficient? If it's a particular sequence, then should it not be possible to
monitor for it? Since it is a NAS, only one or two daemons will be
writing to the particular FS. I can get expected behaviour from these
daemons.

If it really is a retarded idea, at least with the current FSs
available, then I'll just use SVN and manage the repos somehow. I just
thought I'd see if it's an option.

Anyone know how Google Docs does it?

Michael Schuster

2011-Jun-15 12:36 UTC

[zfs-discuss] question about COW and snapshots

On 15.06.2011 14:30, Simon Walter wrote:
>> Another one is that snapshots are per-filesystem, while the intention
>> here is to capture a document in one user session. Taking a snapshot
>> will of course say nothing about the state of other user sessions. Any
>> document in the process of being saved by another user, for example,
>> will be corrupt.
>
> Would it be? I think that's pretty lame for ZFS to corrupt data.

I think "corrupt" is not the right word to use here - "inconsistent" is
probably better. ZFS has no idea when a document is "OK", so if your
snapshot happens between two writes (even from a single user), it will
be consistent from the POV of the FS, but may not be from the POV of the
application.

HTH
Michael
-- 
Michael Schuster
http://recursiveramblings.wordpress.com/

Jim Klimov

2011-Jun-15 12:56 UTC

[zfs-discuss] question about COW and snapshots

From our experience as a company working with Alfresco, I would also
suggest installing that and seeing how it goes.

It is a portable Java webapp running under Tomcat which
implements not only WebDAV and web interfaces (its own
and a SharePoint-like one), but also many common protocols -
FTP, NFS, SMB/CIFS - to access your stored files.
Hierarchy of "workspaces" may be protected by ACLs,
and it is not difficult to add some special triggers (e.g. when
an XML file is saved to a certain directory, the server calls
some XSLT transform on it so a PDF file or whatever pops
up in another directory). All saved and committed files are
versioned.

At first I was reluctant to abandon ZFS snapshots, kernel
NFS and CIFS in favor of some clunky web-app, and
many of our production files are still in ZFS, but at least
as far as Unix integrations like homedirs go - it is
possible. I wouldn't vouch for access speeds, like building
a kernel over such an NFS homedir, though. For us it is
mostly static content such as internal docs and published
distros, or storage-with-webservices for customers.

----- Original Message -----
From: Simon Walter <simon at gikaku.com>
Date: Wednesday, June 15, 2011 14:48
Subject: Re: [zfs-discuss] question about COW and snapshots
To: zfs-discuss at opensolaris.org
> Thanks for the comments. So ZFS alone cannot do what I'd like.
>
> In linux there Gamin. Or there is also a kernel patch which gives you
> /proc/fschanges. I could monitor this file for changes and take a
> snapshot when a change occurs or under certain circumstances. However,
> the Linux COW type FS are all not as production ready as ZFS. Does
> (Open)Solaris have a /proc/fschanges equivalent to monitor for FS
> activity?
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss 
 
-- 
Jim Klimov
CTO, JSC COS&HT

Jim Klimov

2011-Jun-15 13:14 UTC

[zfs-discuss] question about COW and snapshots

> >> How you interpret "on every write" depends on where in the stack you are
> >> coming from.  If you think about an application a "write" is when you
> >> save the document but at the ZPL layer that is multiple write(2) calls
> >> and maybe even some rename(2)/unlink(2)/close(2) calls as well.
> > That's one big problem with the naive plan of using snapshots.
> >
> > Another one is that snapshots are per-filesystem, while the intention
> > here is to capture a document in one user session. Taking a snapshot
> > will of course say nothing about the state of other user sessions. Any
> > document in the process of being saved by another user, for example,
> > will be corrupt.
>
> Would it be? I think that's pretty lame for ZFS to corrupt data. If I
> were to manually create a snapshot and two users were writing to the FS,
> how would ZFS handle that? Are you saying it would corrupt the data? I
> thought snapshots could be taken regardless of if there is activity.

Just as said above, ZFS write(2)s some blocks to a file system.
There may be many files open and used on this filesystem.
The moment when one file is close()d, other files may remain open.
Say at this moment you take a snapshot. This per se does not break
stuff. It just places a cutoff line - blocks on disk are in the snapshot,
unclosed files are not yet completed from its point of view, their new
blocks will go into the live dataset and ultimately into the next snapshot.

And then you find that you need to roll back to this snapshot, or
to clone it, or extract random files from its read-only view into the
filesystem. Those files which were incomplete at the moment of the
snapshot - they would remain incomplete. If you try to use these
half-files, most applications would agree that they are broken -
just as if you pressed "reset" in the middle of a heavy write.

Some programs such as databases can roll back incomplete
writes and forget they were there - but the rest of their data
in such incomplete files will become consistent. Thanks to
ZFS COW, they will have available any blocks ever written
to disk and referenced in this snapshot, so they can decide
what and how to roll back according to their internal structures.
 
For this reason many well-behaved backup tools for, say, databases, 
trigger a cache flush and read-only freeze for the database, then
they make a snapshot and unfreeze the DB. After that the snapshot
data may be sent to tape or a remote server, and it will be consistent
on-disk if it ever needs to be rolled back to.
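
The freeze / snapshot / unfreeze ordering described above can be sketched in a few lines of Python. This is only an outline under assumptions: `freeze`, `take_snapshot` and `unfreeze` are hypothetical callables supplied by the caller (for a real database they would issue its own flush-and-lock commands and then run `zfs snapshot`); nothing here is a real database or ZFS API.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def quiesced(freeze: Callable[[], None], unfreeze: Callable[[], None]) -> Iterator[None]:
    """Hold the application in a flushed, read-only state for the block's duration."""
    freeze()
    try:
        yield
    finally:
        # Always release the application, even if the snapshot step failed.
        unfreeze()


def consistent_snapshot(
    freeze: Callable[[], None],
    take_snapshot: Callable[[], None],
    unfreeze: Callable[[], None],
) -> None:
    """Take the snapshot only while the application is quiesced."""
    with quiesced(freeze, unfreeze):
        take_snapshot()
```

The point of the `finally` clause is that a failed snapshot must not leave the database frozen; the snapshot itself is near-instant, so the freeze window stays short.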


Eric D. Mudama

2011-Jun-15 18:40 UTC

[zfs-discuss] question about COW and snapshots

On Wed, Jun 15 at  7:29, Edward Ned Harvey wrote:
>> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-
>> bounces at opensolaris.org] On Behalf Of Richard Elling
>>
>> That would suck worse.
>
>Don't mind Richard.  He is of the mind that ZFS is perfect for everything
>just the way it is, and anybody who wants anything different should adjust
>their thought process.
>
>Richard, just because it's not something you want, doesn't mean you should
>rain on somebody else's parade.  If Simon wants something like that, kudos
>to him.
>
>I know I've certainly had many situations where people wanted to snapshot or
>rev individual files every time they're modified.  As I said - perfect
>example is Google Docs.  Yes it is useful.  But no, it's not what ZFS does.

suck worse = every single file would show a snapshot version for every
change anywhere in the filesystem, not just the changes unique to that file.

Imagine scrolling through a few hundred thousand snapshots in the
Windows "old version" dialog because your 5000 files were getting
edited 2-3 times/day for a month.  Imagine trying to parse the results
of 'zfs list -t snapshot'.  Picture the disaster of that system in 5
years of operation.
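
To put rough numbers on that, here is a small back-of-the-envelope calculation in Python. It assumes one snapshot per edit, a 30-day month, and 365-day years; those assumptions are for illustration only and are not stated in the message above.

```python
def snapshots_created(files: int, edits_per_file_per_day: float, days: int) -> int:
    """Number of snapshots if every edit of every file triggers one snapshot."""
    return round(files * edits_per_file_per_day * days)


# 5000 files edited 2-3 times a day for a (30-day) month.
month_low = snapshots_created(5000, 2, 30)
month_high = snapshots_created(5000, 3, 30)

# The same workload sustained for five (365-day) years, at the low rate.
five_years_low = snapshots_created(5000, 2, 365 * 5)

print(month_low, month_high, five_years_low)
```

Under these assumptions the month comes to 300,000 to 450,000 snapshots, and five years at the lower rate to about 18 million.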

IMO, this problem begs for one of today's content management systems
plus 10 minutes of training on how to use it effectively.  Save the
snapshotting for the periodic backup of the CMS and/or users'
systems.  Sure, map work areas via NFS or CIFS, and give them a
Time Machine-like picture of history for that work area (hourly for a
day, daily for a week, weekly for a month, monthly for a year, etc.)
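
One way to read that retention schedule is "keep the newest snapshot in each hour for a day, in each day for a week, in each ISO week for a month, and in each calendar month for a year". The following Python sketch implements that reading over plain timestamps. The tier boundaries (31 days for "a month", 365 for "a year") and the function name are assumptions for illustration; it decides what to keep and does not touch any real snapshots.

```python
from datetime import datetime, timedelta
from typing import Callable, Iterable

# (maximum age, function mapping a timestamp to its bucket within the tier)
TIERS: list[tuple[timedelta, Callable[[datetime], tuple]]] = [
    (timedelta(days=1), lambda t: (t.year, t.month, t.day, t.hour)),  # hourly for a day
    (timedelta(days=7), lambda t: (t.year, t.month, t.day)),          # daily for a week
    (timedelta(days=31), lambda t: tuple(t.isocalendar())[:2]),       # weekly for a month
    (timedelta(days=365), lambda t: (t.year, t.month)),               # monthly for a year
]


def snapshots_to_keep(stamps: Iterable[datetime], now: datetime) -> set[datetime]:
    """Return the timestamps worth keeping; everything else may be destroyed."""
    stamps = list(stamps)
    keep: set[datetime] = set()
    for max_age, bucket_of in TIERS:
        newest: dict[tuple, datetime] = {}
        for ts in stamps:
            if ts > now or now - ts > max_age:
                continue  # outside this tier's window
            key = bucket_of(ts)
            if key not in newest or ts > newest[key]:
                newest[key] = ts  # remember only the newest per bucket
        keep.update(newest.values())
    return keep
```

A periodic job could compare the result against the existing snapshot list and destroy whatever is not in the returned set.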

--eric

-- 
Eric D. Mudama
edmudama at bounceswoosh.org

Richard Elling

2011-Jun-15 19:21 UTC

[zfs-discuss] question about COW and snapshots

On Jun 15, 2011, at 4:45 AM, Darren J Moffat wrote:
> On 06/15/11 12:29, Edward Ned Harvey wrote:
>>> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-
>>> bounces at opensolaris.org] On Behalf Of Richard Elling
>>>
>>> That would suck worse.
>>
>> Don't mind Richard.  He is of the mind that ZFS is perfect for everything
>> just the way it is, and anybody who wants anything different should adjust
>> their thought process.

ouch

> I suspect rather than that it is more that Richard equated "write" to
> write(2) / dmu_write() calls and that would suck performance wise.

Yep, especially because you also have to write the metadata that says when you
wrote, and so on.

> I also suspect that what Simon wants isn't a snapshot of every little
> write(2) level call but when the file is completed being updated, maybe
> on close(2) [ but that assumes the app does actually call close() ].
>
>> I know I've certainly had many situations where people wanted to snapshot or
>> rev individual files every time they're modified.  As I said - perfect
>> example is Google Docs.  Yes it is useful.  But no, it's not what ZFS does.
>
> Exactly versions of a whole file, but that is different to a snapshot on
> every write.

Yes, and as others have said, the appropriate interface is critical. A WebDAV
PUT does have a bounded endpoint where version control makes sense. An
mmap(2) does not. Adding a call to zfs snapshot when a WebDAV PUT completes
seems trivial. But you could also call a traditional source code control
system check-in as well. Before long, you can have a useful
CMS system.
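
A minimal sketch of the "snapshot when a WebDAV PUT completes" idea, in Python. It assumes the WebDAV server offers some post-upload hook from which this function can be called; that hook, the dataset name, and the `put-` naming scheme are hypothetical. Only `zfs snapshot <dataset>@<name>` is a real command, and the runner is injectable so the logic can be exercised without ZFS.

```python
import subprocess
from datetime import datetime
from typing import Callable


def snapshot_after_put(
    dataset: str,
    completed_at: datetime,
    run: Callable[..., object] = subprocess.run,
) -> str:
    """Snapshot the whole dataset once an upload has fully completed."""
    # Microsecond timestamps make names unique enough for a sketch; two PUTs
    # finishing in the same microsecond would still collide.
    stamp = completed_at.strftime("%Y%m%dT%H%M%S.%fZ")
    name = f"{dataset}@put-{stamp}"
    run(["zfs", "snapshot", name], check=True)
    return name
```

Note that this still snapshots the entire filesystem, so it captures other users' half-saved files as well; it only guarantees that the file just uploaded is complete.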

There are a large number of programs that are designed to not close files, and
it will be difficult to derive meaningful snapshot opportunities in those
cases.

> How you interpret "on every write" depends on where in the stack you are
> coming from.  If you think about an application a "write" is when you save
> the document but at the ZPL layer that is multiple write(2) calls and maybe
> even some rename(2)/unlink(2)/close(2) calls as well.
> If you move further down then doing a snapshot on every dmu_write() call is
> fundamentally at odds with how ZFS works.

Yep.
 -- richard

Toby Thain

2011-Jun-15 23:21 UTC

[zfs-discuss] question about COW and snapshots

On 15/06/11 8:30 AM, Simon Walter wrote:
> On 06/15/2011 09:01 PM, Toby Thain wrote:
>>>> I know I've certainly had many situations where people wanted to
>>>> snapshot or rev individual files every time they're modified.  As I
>>>> said - perfect example is Google Docs.  Yes it is useful.  But no,
>>>> it's not what ZFS does.
>>> Exactly versions of a whole file, but that is different to a snapshot on
>>> every write.
>>>
>>> How you interpret "on every write" depends on where in the stack you are
>>> coming from.  If you think about an application a "write" is when you
>>> save the document but at the ZPL layer that is multiple write(2) calls
>>> and maybe even some rename(2)/unlink(2)/close(2) calls as well.
>> That's one big problem with the naive plan of using snapshots.
>>
>> Another one is that snapshots are per-filesystem, while the intention
>> here is to capture a document in one user session. Taking a snapshot
>> will of course say nothing about the state of other user sessions. Any
>> document in the process of being saved by another user, for example,
>> will be corrupt.
>
> Would it be? I think that's pretty lame for ZFS to corrupt data.

ZFS isn't corrupting anything (Michael is correct, "inconsistent" would
have been a better word). The inevitable inconsistency *from the
application's perspective* results from the error of thinking a snapshot
is automagically correct for document sessions (which are not aware of
what happens at levels underneath).

Likewise, you can back up your RDBMS with tar, if you like, but the
result may not have integrity from the database's point of view. (Same
applies to filesystem backups with dd, etc, etc).

> If I
> were to manually create a snapshot and two users were writing to the FS,
> how would ZFS handle that? Are you saying it would corrupt the data? I
> thought snapshots could be taken regardless of if there is activity.

Of course they can, but (as Jim explains) this cannot guarantee
consistency on higher levels without *interacting* with higher levels
(for example, quiescing a database, or fully flushing a document).

> If I monitor (via Dtrace?) for what equates to a "save", would that be
> sufficient?

Darren explained some reasons why this may not be trivial.

> If It's particular sequence, then should it not be able to
> be monitored for? Since it is a NAS, only one or two daemons will be
> writing to the particular FS. I can get expected behaviour from these
> daemons.
>
> If it really is a retarded idea, at least with the current FSs
> available,

It's not a fault of the filesystem. It's just an architectural problem
to solve, most likely on a different layer.

--Toby
> then I'll just use SVN and manage the repos somehow. I just
> thought I'd see if it's an option.
>
> Anyone know how Google Docs does it?

Erik Trimble

2011-Jun-16 00:09 UTC

[zfs-discuss] question about COW and snapshots

We had a similar discussion a couple of years ago here, under the title
"A Versioning FS". Look through the archives for the full discussion.

The gist is that application-level versioning (and consistency) is
completely orthogonal to filesystem-level snapshots and consistency.
IMHO, they should never be mixed together - there are way too many
corner cases and application-specific memes for a filesystem to ever
fully handle file-level versioning and *application*-level data
consistency.  Don't mistake one for the other, and don't try to *use*
one for the other.  They're completely different creatures.

-- 
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)

Simon Walter

2011-Jun-16 07:09 UTC

[zfs-discuss] question about COW and snapshots

On 06/16/2011 09:09 AM, Erik Trimble wrote:
> We had a similar discussion a couple of years ago here, under the
> title "A Versioning FS". Look through the archives for the full
> discussion.
>
> The gist is that application-level versioning (and consistency) is
> completely orthogonal to filesystem-level snapshots and consistency.
> IMHO, they should never be mixed together - there are way too many
> corner cases and application-specific memes for a filesystem to ever
> fully handle file-level versioning and *application*-level data
> consistency.  Don't mistake one for the other, and, don't try to *use*
> one for the other.  They're completely different creatures.

I guess that is true of the current FSs available. Though it would be
nice to essentially have a versioning FS in the kernel rather than an
application in userspace. But I digress. I'll use SVN and WebDAV.

Thanks for the advice everyone.

Erik Trimble

2011-Jun-16 07:22 UTC

[zfs-discuss] question about COW and snapshots

On 6/16/2011 12:09 AM, Simon Walter wrote:
> On 06/16/2011 09:09 AM, Erik Trimble wrote:
>> We had a similar discussion a couple of years ago here, under the
>> title "A Versioning FS". Look through the archives for the full
>> discussion.
>>
>> The gist is that application-level versioning (and consistency) is
>> completely orthogonal to filesystem-level snapshots and consistency.
>> IMHO, they should never be mixed together - there are way too many
>> corner cases and application-specific memes for a filesystem to ever
>> fully handle file-level versioning and *application*-level data
>> consistency.  Don't mistake one for the other, and, don't try to
>> *use* one for the other.  They're completely different creatures.
>
> I guess that is true of the current FSs available. Though it would be
> nice to essentially have a versioning FS in the kernel rather than an
> application in userspace. But I digress. I'll use SVN and webdav.
>
> Thanks for the advice everyone.

It's not really a technical problem, it's a knowledge locality problem.
The *knowledge* of where to checkpoint, where to version, and what data
consistency means is held at the application level, and can ONLY be
known by each individual application. There's no way a filesystem (or
anything like that) can make the proper decisions without the
application telling it what those decisions should be. So, what would
the point be in having a "smart" versioning FS? Since the intelligence
can't be built into the FS, it would still have to be built into each
and every application.

So, if your apps have to be programmed to be
versioning/consistency/checkpointing aware in any case, how would having
a fancy Versioning filesystem be any better than using what we do now?
(i.e. svn/hg/cvs/git on top of ZFS/btrfs/et al)   ZFS at least makes
significant practical advances by rolling the logical volume manager
into the filesystem level, but I can't see any such advantage for a
Versioning FS.

-- 
Erik Trimble
Java Platform Group Infrastructure
Mailstop:  usca22-317
Phone:  x17195
Santa Clara, CA
Timezone: US/Pacific (UTC-0800)

Toby Thain

2011-Jun-16 12:24 UTC

[zfs-discuss] question about COW and snapshots

On 16/06/11 3:09 AM, Simon Walter wrote:
> On 06/16/2011 09:09 AM, Erik Trimble wrote:
>> We had a similar discussion a couple of years ago here, under the
>> title "A Versioning FS". Look through the archives for the full
>> discussion.
>>
>> The gist is that application-level versioning (and consistency) is
>> completely orthogonal to filesystem-level snapshots and consistency.
>> IMHO, they should never be mixed together - there are way too many
>> corner cases and application-specific memes for a filesystem to ever
>> fully handle file-level versioning and *application*-level data
>> consistency.  Don't mistake one for the other, and, don't try to *use*
>> one for the other.  They're completely different creatures.
>
> I guess that is true of the current FSs available. Though it would be
> nice to essentially have a versioning FS in the kernel rather than an
> application in userspace. But I digress. I'll use SVN and webdav.

To use Svn correctly here, you have to resolve the same issue. Svn has a
global revision, just as a snapshot is a state for an *entire*
filesystem. You don't seem to have taken that into sufficient account
when talking about ZFS; it doesn't align with your goal of consistency
from the point of view of a *single document*.

You'll only be able to make a useful snapshot in Svn at moments when
*all* documents in the repository are in a consistent state (I'm
assuming this is a multi-user system). That's a much stronger guarantee
than you probably 'require' for your purpose, so it makes me wonder
whether what you really want is a document database (or, to be honest,
an ordinary filesystem; you can "snapshot" single documents in an
ordinary filesystem using, say, hard links) where the state of each
session/document is *independent*. You can see that the latter model is
much more like Google Docs, not to mention simpler; and the "snapshot"
model is not like it at all.
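
The hard-link idea above could be sketched roughly as follows. This is
only an illustration under assumptions: the function name
snapshot_file and the versions directory layout are made up here, and
the trick only preserves the old contents if the next save replaces
the file (writes a new file and renames it over the old name) rather
than overwriting it in place, because a hard link shares the same
inode. Both paths must be on the same filesystem.

```python
import os
import time


def snapshot_file(path, versions_dir):
    """Hard-link `path` into `versions_dir` under a timestamped name.

    Hypothetical helper: the naming scheme is an assumption, not
    something taken from the thread.
    """
    os.makedirs(versions_dir, exist_ok=True)
    stamp = time.strftime("%Y%m%dT%H%M%S")
    base = os.path.basename(path)
    n = 0
    while True:
        target = os.path.join(versions_dir, f"{base}.{stamp}.{n}")
        try:
            # No data is copied; the link is a second name for the inode.
            os.link(path, target)
            return target
        except FileExistsError:
            # Same second, same name: try the next counter value.
            n += 1
```

Each document gets its own independent history this way, which is the
per-document model described above, as opposed to a filesystem-wide
snapshot.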

--Toby
> 
> Thanks for the advice everyone.
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>

Frank Van Damme

2011-Jun-16 13:45 UTC

[zfs-discuss] question about COW and snapshots

On 15-06-11 05:56, Richard Elling wrote:
> You can even have applications like databases make snapshots when
> they want.

Makes me think of a backup utility called mylvmbackup, which is written
with Linux in mind - basically it locks MySQL tables, takes an LVM
snapshot and releases the lock (and then you back up the database files
from the snapshot). Should work at least as well with ZFS.
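
A rough sketch of that lock / snapshot / unlock sequence with ZFS in
place of LVM might look like this. It is not how mylvmbackup itself is
written; the names read_locked, snapshot_database and execute_sql are
hypothetical, and execute_sql is assumed to be a callable that runs one
statement on a single open MySQL connection (the lock only lasts while
that connection stays open). The dataset and snapshot names are
whatever you pass in.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def read_locked(execute_sql):
    """Hold a global MySQL read lock for the duration of the block.

    `execute_sql` is a hypothetical callable supplied by the caller.
    """
    execute_sql("FLUSH TABLES WITH READ LOCK")
    try:
        yield
    finally:
        # Always release the lock, even if the snapshot command fails.
        execute_sql("UNLOCK TABLES")


def snapshot_database(execute_sql, dataset, name, run=subprocess.run):
    """Take a ZFS snapshot of `dataset` while the tables are locked."""
    snap = f"{dataset}@{name}"
    with read_locked(execute_sql):
        run(["zfs", "snapshot", snap], check=True)
    return snap
```

Since the snapshot itself is quick, the tables should only stay locked
briefly; the actual backup can then be read from the snapshot at
leisure.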

-- 
No part of this copyright message may be reproduced, read or seen,
dead or alive or by any means, including but not limited to telepathy
without the benevolence of the author.

Frank Van Damme

2011-Jun-16 13:47 UTC

[zfs-discuss] question about COW and snapshots

On 15-06-11 14:30, Simon Walter wrote:
> Anyone know how Google Docs does it?

Anyone from Google on the list? :-)

Seriously, this is the kind of feature to be found in Serious CMS
applications, like, as already mentioned, Alfresco.

-- 
No part of this copyright message may be reproduced, read or seen,
dead or alive or by any means, including but not limited to telepathy
without the benevolence of the author.

Casper.Dik at oracle.com

2011-Jun-16 13:51 UTC

[zfs-discuss] question about COW and snapshots

>On 15-06-11 05:56, Richard Elling wrote:
>> You can even have applications like databases make snapshots when
>> they want.
>
>Makes me think of a backup utility called mylvmbackup, which is written
>with Linux in mind - basically it locks MySQL tables, takes an LVM
>snapshot and releases the lock (and then you back up the database files
>from the snapshot). Should work at least as well with ZFS.

If a database engine or another application keeps both the data and the
log in the same filesystem, a snapshot wouldn't create inconsistent data
(I think this would be true with vim and a large number of database
engines; vim will detect the swap file, and the database should be able
to detect the inconsistency, roll back, and re-apply the log file).

Casper

Nico Williams

2011-Jun-16 19:58 UTC

[zfs-discuss] question about COW and snapshots

On Thu, Jun 16, 2011 at 8:51 AM,  <Casper.Dik at oracle.com> wrote:
> If a database engine or another application keeps both the data and the
> log in the same filesystem, a snapshot wouldn't create inconsistent data
> (I think this would be true with vim and a large number of database
> engines; vim will detect the swap file and database should be able to
> detect the inconsistency and rollback and re-apply the log file.)

Correct.  SQLite3 will be able to recover automatically from restores
of mid-transaction snapshots.

VIM does not recover automatically, but it does notice the swap file
and warns the user and gives them a way to handle the problem.

(When you save a file, VIM renames the old one out of the way, creates
a new file with the original name, writes the new contents to it,
closes it, then unlinks the swap file.  On recovery VIM notices the
swap file and gives the user a menu of choices.)

I believe this is the best solution: write applications so they can
recover from being restarted with data restored from a mid-transaction
snapshot.
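
One common way to get that property for plain files (a different
technique from the Vim save sequence described above) is to write to a
temporary file and rename it over the original, so any snapshot sees
either the complete old file or the complete new one. A minimal sketch,
assuming POSIX-style rename semantics; the names atomic_write and
cleanup_stale_temps and the ".tmp-" prefix are made up for
illustration:

```python
import os
import tempfile


def atomic_write(path, data: bytes, prefix=".tmp-"):
    """Replace `path` with `data` so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory, hence the same filesystem.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=prefix)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to stable storage first
        os.replace(tmp, path)     # atomic swap of the directory entry
    except BaseException:
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise


def cleanup_stale_temps(directory, prefix=".tmp-"):
    """On startup, remove leftovers from a save that never completed.

    A snapshot taken mid-save may contain such a file next to the
    intact original; deleting it is the whole recovery step.
    """
    removed = []
    for name in os.listdir(directory):
        if name.startswith(prefix):
            os.unlink(os.path.join(directory, name))
            removed.append(name)
    return removed
```

An application restarted on data restored from a mid-save snapshot
would call cleanup_stale_temps once and carry on with the last
complete version.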

Nico
--

Nico Williams

2011-Jun-16 20:02 UTC

[zfs-discuss] question about COW and snapshots

That said, losing committed transactions when you needed and thought
you had ACID semantics... is bad.  But that's implied in any
restore-from-backups situation.  So you replicate/distribute
transactions so that restore from backups (or snapshots) is an
absolutely last resort matter, and if you ever have to restore from
backups you also spend time manually tracking down (from
counterparties, "paper" trails kept elsewhere, ...) any missing
transactions.

Nico
--

Richard Elling

2011-Jun-16 20:20 UTC

[zfs-discuss] question about COW and snapshots

On Jun 16, 2011, at 12:09 AM, Simon Walter wrote:
> On 06/16/2011 09:09 AM, Erik Trimble wrote:
>> We had a similar discussion a couple of years ago here, under the
>> title "A Versioning FS". Look through the archives for the full
>> discussion.
>>
>> The gist is that application-level versioning (and consistency) is
>> completely orthogonal to filesystem-level snapshots and consistency.
>> IMHO, they should never be mixed together - there are way too many
>> corner cases and application-specific memes for a filesystem to ever
>> fully handle file-level versioning and *application*-level data
>> consistency.  Don't mistake one for the other, and don't try to *use*
>> one for the other.  They're completely different creatures.
>>
>
> I guess that is true of the current FSs available. Though it would be
> nice to essentially have a versioning FS in the kernel rather than an
> application in userspace.

You can run OpenVMS :-)
 -- richard

Paul Kraus

2011-Jun-16 20:32 UTC

[zfs-discuss] question about COW and snapshots

On Thu, Jun 16, 2011 at 4:20 PM, Richard Elling
<richard.elling at gmail.com> wrote:
> You can run OpenVMS :-)

Since *you* brought it up (I was not going to :-), how does VMS'
versioning FS handle those issues?

I know that SAM-FS has rules for _when_ copies of a file are made, so
that intermediate states are not captured. The last time I touched
SAM-FS there was _not_ a nice user interface to the previous version,
you had to trudge through log files and then pull the version you
wanted directly from secondary storage (but they did teach us how to
do that in the SAM-FS / QFS class).

-- 
{--------1---------2---------3---------4---------5---------6---------7---------}
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company ( http://www.sloctheater.org/ )
-> Technical Advisor, RPI Players

Freddie Cash

2011-Jun-16 21:14 UTC

[zfs-discuss] question about COW and snapshots

The OpenVMS filesystem is what you are looking for.

On Thu, Jun 16, 2011 at 12:09 AM, Simon Walter <simon at gikaku.com> wrote:
> On 06/16/2011 09:09 AM, Erik Trimble wrote:
>
>> We had a similar discussion a couple of years ago here, under the
>> title "A Versioning FS". Look through the archives for the full
>> discussion.
>>
>> The gist is that application-level versioning (and consistency) is
>> completely orthogonal to filesystem-level snapshots and consistency.
>> IMHO, they should never be mixed together - there are way too many
>> corner cases and application-specific memes for a filesystem to ever
>> fully handle file-level versioning and *application*-level data
>> consistency.  Don't mistake one for the other, and don't try to *use*
>> one for the other.  They're completely different creatures.
>>
>>
> I guess that is true of the current FSs available. Though it would be
> nice to essentially have a versioning FS in the kernel rather than an
> application in userspace. But I digress. I'll use SVN and WebDAV.
>
> Thanks for the advice everyone.


-- 
Freddie Cash
fjwcash at gmail.com

Erik Trimble

2011-Jun-16 23:23 UTC

[zfs-discuss] question about COW and snapshots

On 6/16/2011 1:32 PM, Paul Kraus wrote:
> On Thu, Jun 16, 2011 at 4:20 PM, Richard Elling
> <richard.elling at gmail.com>  wrote:
>
>> You can run OpenVMS :-)
> Since *you* brought it up (I was not going to :-), how does VMS'
> versioning FS handle those issues ?
>

It doesn't, per se.  VMS's filesystem has a "versioning" concept (i.e.
every time you do a close() on a file, it creates a new file with the
version number appended, e.g.  foo;1  and foo;2  are the same file,
different versions).  However, it is completely missing the rest of the
features we're talking about, like data *consistency* in that file.
It's still up to the app using the file to figure out what data
consistency means, and such.  Really, all VMS adds is versioning,
nothing else (no API, no additional features, etc.).
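
To make the foo;1 / foo;2 naming concrete, here is a small user-space
imitation of that numbering. It is only a sketch and says nothing
about how VMS implements it internally; next_version_path and
save_version are hypothetical names, and like VMS it does nothing at
all about the consistency of what is inside each version.

```python
import os
import re


def next_version_path(path):
    """Return `path;N`, with N one higher than any existing version."""
    directory = os.path.dirname(os.path.abspath(path))
    base = os.path.basename(path)
    pattern = re.compile(re.escape(base) + r";(\d+)")
    highest = 0
    for name in os.listdir(directory):
        m = pattern.fullmatch(name)
        if m:
            highest = max(highest, int(m.group(1)))
    return os.path.join(directory, f"{base};{highest + 1}")


def save_version(path, data: bytes):
    """Write `data` as a brand-new numbered version; never overwrite."""
    target = next_version_path(path)
    # Mode "x" fails if the file already exists, so an old version is
    # never clobbered.
    with open(target, "xb") as f:
        f.write(data)
    return target
```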

> I know that SAM-FS has rules for _when_ copies of a file are made, so
> that intermediate states are not captured. The last time I touched
> SAM-FS there was _not_ a nice user interface to the previous version,
> you had to trudge through log files and then pull the version you
> wanted directly from secondary storage (but they did teach us how to
> do that in the SAM-FS / QFS class).

I'd have to look, but I *think* there is a better way to get to the file
history/version information now.

-- 
Erik Trimble
Java Platform Group Infrastructure
Mailstop:  usca22-317
Phone:  x17195
Santa Clara, CA
Timezone: US/Pacific (UTC-0800)

Ross Walker

2011-Jun-18 02:02 UTC

[zfs-discuss] question about COW and snapshots

On Jun 16, 2011, at 7:23 PM, Erik Trimble <erik.trimble at oracle.com> wrote:
> On 6/16/2011 1:32 PM, Paul Kraus wrote:
>> On Thu, Jun 16, 2011 at 4:20 PM, Richard Elling
>> <richard.elling at gmail.com>  wrote:
>>
>>> You can run OpenVMS :-)
>> Since *you* brought it up (I was not going to :-), how does VMS'
>> versioning FS handle those issues ?
>>
> It doesn't, per se.  VMS's filesystem has a "versioning" concept (i.e.
> every time you do a close() on a file, it creates a new file with the
> version number appended, e.g.  foo;1  and foo;2  are the same file,
> different versions).  However, it is completely missing the rest of
> the features we're talking about, like data *consistency* in that
> file.  It's still up to the app using the file to figure out what data
> consistency means, and such.  Really, all VMS adds is versioning,
> nothing else (no API, no additional features, etc.).

I believe NTFS was built on the same concept of file streams the VMS FS
used for versioning.

It's a very simple versioning system.

Personally I use SharePoint, but there are other content management
systems out there that provide what you're looking for, so no need to
bring out the crypt keeper.

-Ross

Michael Sullivan

2011-Jun-18 04:44 UTC

[zfs-discuss] question about COW and snapshots

On 17 Jun 11, at 21:02 , Ross Walker wrote:
> On Jun 16, 2011, at 7:23 PM, Erik Trimble <erik.trimble at oracle.com> wrote:
>
>> On 6/16/2011 1:32 PM, Paul Kraus wrote:
>>> On Thu, Jun 16, 2011 at 4:20 PM, Richard Elling
>>> <richard.elling at gmail.com>  wrote:
>>>
>>>> You can run OpenVMS :-)
>>> Since *you* brought it up (I was not going to :-), how does VMS'
>>> versioning FS handle those issues ?
>>>
>> It doesn't, per se.  VMS's filesystem has a "versioning" concept (i.e.
>> every time you do a close() on a file, it creates a new file with the
>> version number appended, e.g.  foo;1  and foo;2  are the same file,
>> different versions).  However, it is completely missing the rest of
>> the features we're talking about, like data *consistency* in that
>> file.  It's still up to the app using the file to figure out what data
>> consistency means, and such.  Really, all VMS adds is versioning,
>> nothing else (no API, no additional features, etc.).
>
> I believe NTFS was built on the same concept of file streams the VMS FS
> used for versioning.
>
> It's a very simple versioning system.
>
> Personally I use SharePoint, but there are other content management
> systems out there that provide what you're looking for, so no need to
> bring out the crypt keeper.
>

I think from following this whole discussion people are wanting
"Versions", which will be offered by OS X Lion soon. However, it is
dependent upon applications playing nice, behaving and using the
"standard" APIs.

It would likely take a major overhaul in the way ZFS handles snapshots
to create them at the object level rather than the filesystem level.
Might be a nice exploratory exercise for those in the know with the ZFS
roadmap, but then there are two "roadmaps", right?

Also, consistency and integrity cannot be guaranteed on the object level,
since an application may have more than a single filesystem object in
use at a time and operations would need to be transaction based with
commits and rollbacks.

Way off-topic, but Smalltalk and its variants do this by maintaining the
state of everything in an operating environment image.

But then again, I could be wrong.

Mike

---
Michael Sullivan                   
mps at axsh.us
http://www.axsh.us/
Phone: +1-662-259-8888
Mobile: +1-662-202-7716

Toby Thain

2011-Jun-18 15:45 UTC

[zfs-discuss] question about COW and snapshots

On 18/06/11 12:44 AM, Michael Sullivan wrote:
> ...
> Way off-topic, but Smalltalk and its variants do this by maintaining the
> state of everything in an operating environment image.
>

...Which is in memory, so things are rather different from the world of
filesystems.

--Toby
> But then again, I could be wrong.
> 
> Mike
> 
> ---
> Michael Sullivan                   
> mps at axsh.us
> http://www.axsh.us/
> Phone: +1-662-259-8888
> Mobile: +1-662-202-7716
> 
