Hello,

with reference to bug id #4852821: user undo

I have implemented a basic prototype that has the current functionality:

1) deleted files/directories are moved to /your_pool/your_fs/.zfs/deleted
   Unfortunately, it is non-trivial to completely reproduce the namespace
   of deleted files: for now, deleting "/foo/bar" will result in
   ".zfs/deleted/bar".
2) As a result of 1, deleted files move out of .zfs/deleted in FIFO.
   Ie. if you remove /foo/bar twice, the most recent copy will be the one
   remaining in .zfs/deleted.
3) If another user deletes /foo/bar, and you try to delete /foo/bar,
   you will be denied permissions. Again, this is due to namespace
   clashes.

I'm leaning towards completely reproducing the namespace, but would
like to get a feel for whether the benefits outweigh the code
complexity. Advice would be appreciated.

Also, I presume I can request-sponsor for 4852821 and get someone from
the zfs team to mentor me?

Thanks again for all your time. :)

--
Regards,
Jeremy
Jeremy Teo wrote:
> Hello,
>
> with reference to bug id #4852821: user undo
>
> I have implemented a basic prototype that has the current functionality:
>
> 1) deleted files/directories are moved to /your_pool/your_fs/.zfs/deleted
> Unfortunately, it is non-trivial to completely reproduce the namespace
> of deleted files: for now, deleting "/foo/bar" will result in
> ".zfs/deleted/bar".

I had exactly the same issue when I implemented this for ext2 years ago :-)
[ I never finished it - got bored with it since I didn't actually need it
myself :-) ].

> 2) As a result of 1, deleted files move out of .zfs/deleted in FIFO.
> Ie. if you remove /foo/bar twice, the most recent copy will be the one
> remaining in .zfs/deleted.

That seems okay, and you have that issue even if you do reproduce the
namespace, for example:

$ rm foo/bar
  < you now have .zfs/deleted/foo/bar >
$ mkdir foo/bar
  < you now have .zfs/deleted/foo/bar and foo/bar >
$ rm foo/bar

Now what do we do? The FIFO seems reasonable to me; if you need better
than that, use snapshots.

> 3) If another user deletes /foo/bar, and you try to delete /foo/bar,
> you will be denied permissions. Again, this is due to namespace
> clashes.

That's not good, and I think this would violate POSIX requirements for
unlink(2).

> I'm leaning towards completely reproducing the namespace, but would
> like to get a feel for whether the benefits outweigh the code
> complexity. Advice would be appreciated.

POSIX compliance is a must. The FIFO idea actually sounds pretty good.

How does this interact with snapshots?

> Also, I presume I can request-sponsor for 4852821 and get someone from
> the zfs team to mentor me?

The request-sponsor is really for when you are done. If you want code
help, I've found asking on zfs-code at opensolaris.org gets great feedback.

--
Darren J Moffat
On 5/24/06, Jeremy Teo <white.wristband at gmail.com> wrote:
> Hello,
>
> with reference to bug id #4852821: user undo
>
> I have implemented a basic prototype that has the current functionality:
>
> 1) deleted files/directories are moved to /your_pool/your_fs/.zfs/deleted
> Unfortunately, it is non-trivial to completely reproduce the namespace
> of deleted files: for now, deleting "/foo/bar" will result in
> ".zfs/deleted/bar".
> 2) As a result of 1, deleted files move out of .zfs/deleted in FIFO.
> Ie. if you remove /foo/bar twice, the most recent copy will be the one
> remaining in .zfs/deleted.
> 3) If another user deletes /foo/bar, and you try to delete /foo/bar,
> you will be denied permissions. Again, this is due to namespace
> clashes.
>

How about changing the name of the file to uid-filename or
username-filename? This at least lets each user delete their own files,
and it shouldn't be much work.

Another possible enhancement would be adding any field from stat(2) to
the file's name after it's deleted. This would be set per filesystem:
mode, uid, username (the code should do the conversion), gid, size, mtime.
The code would just parse a format string like $mtime-$name.

James Dickens
uadmin.blogspot.com

> I'm leaning towards completely reproducing the namespace, but would
> like to get a feel for whether the benefits outweigh the code
> complexity. Advice would be appreciated.
>
> Also, I presume I can request-sponsor for 4852821 and get someone from
> the zfs team to mentor me?
>
> Thanks again for all your time. :)
>
> --
> Regards,
> Jeremy
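As a rough illustration of the format-string idea above, here is a minimal Python sketch. The function name `deleted_name`, the field names, and the dictionary standing in for a stat result are all assumptions made for the example; nothing here reflects how the prototype actually names entries.

```python
import string

def deleted_name(fmt, name, st):
    """Expand a per-filesystem format string such as "$mtime-$name".

    `st` is a plain dict standing in for the fields of a stat result.
    """
    fields = {
        "name": name,
        "uid": st["uid"],
        "gid": st["gid"],
        "size": st["size"],
        "mtime": st["mtime"],
        "mode": format(st["mode"], "o"),  # octal, as ls/chmod show it
    }
    # string.Template uses the same $field syntax as the suggestion.
    return string.Template(fmt).substitute(fields)

# Hypothetical attributes of a deleted file called "bar".
st = {"uid": 1001, "gid": 10, "size": 4096, "mtime": 1148491342, "mode": 0o644}

by_mtime = deleted_name("$mtime-$name", "bar", st)
by_uid = deleted_name("$uid-$name", "bar", st)
print(by_mtime)
print(by_uid)
```

With a `$uid-$name` format, two users deleting files with the same name no longer collide, although one user deleting the same name twice still does.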
On Wed, 2006-05-24 at 12:22, James Dickens wrote:
> how about changing the name of the file to uid or username-filename
> this atleast gets you the ability to let each user the ability to
> delete there own file, shouldn't be much work. Another possible
> enhancement would be adding anything field in stat(stat) in the files
> name after its deleted. This would be set per filesystem. mod, uid,
> username(the code should do the conversion), gid, size, mtime and just
> parse a format string like $mtime-$name.

A number of (generally older) systems have had the concept of numbered
file versions. (I recall seeing this during casual use of ITS, TOPS-20,
and VMS; GNU Emacs and its derivatives emulate this via the use of .~NN~
backup copies, but this "pollutes" the directory namespace).

Adding a version/generation number to the filename in the "deleted"
directory would allow multiple versions to coexist.

It might also make sense to populate the "deleted" directory with an
older version when file contents are deleted via an
open(..., ...|O_TRUNC) or when a file is deleted via rename.

- Bill
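A small Python sketch of the generation-number idea, under stated assumptions: the VMS-style "name;N" spelling and the helper `next_versioned_name` are illustrative choices only, and the set of existing names stands in for a listing of the "deleted" directory.

```python
def next_versioned_name(existing, name):
    """Return "name;N", with N one higher than any generation already present."""
    prefix = name + ";"
    generations = [
        int(entry[len(prefix):])
        for entry in existing
        if entry.startswith(prefix) and entry[len(prefix):].isdecimal()
    ]
    return f"{prefix}{max(generations, default=0) + 1}"

# Simulate deleting "bar" three times into the same "deleted" directory.
deleted_dir = set()
for _ in range(3):
    deleted_dir.add(next_versioned_name(deleted_dir, "bar"))

print(sorted(deleted_dir))
```

Each deletion gets its own entry, so older copies stay recoverable until something reclaims them.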
Other possibilities:

 - put a .deleted directory in every directory (not on by default, for
   POSIX compliance)
 - put a link in .deleted named after the file's dnode and append a text
   ({fname, dnode#}) entry to a log file so it can more easily be found

Ultimately deleted files' space has to be reclaimed though, so something
has to delete .deleted files, no?

Nico
--
Ummmm.

Remind me why we should support "undo" (or, more aptly named, "safe
delete") in ZFS?

Isn't this an application feature, not a filesystem feature? I would
expect something like this behavior when using Nautilus, but certainly
not when using "rm".

That is, maybe there should be a library which has a "safe delete"
system call for use by applications, and has code specific to the
various filesystems to implement the feature, but I can't really see the
point in implementing "safe delete" at the filesystem level. It screws
with too many long-standing assumptions.

If you want to avoid a namespace collision, you'll probably have to
implement the "recycle bin" as a DB and file collection. Move the file
being deleted over to /your_pool/your_fs/.zfs/deleted/, and rename it by
a unique ID (whatever the ZFS equivalent of inodes is, for example).
Keep the ACL/permissions on the file, but put the complete pathname
(relative to the root of the filesystem) [and, maybe something like
create_date as well] in a hashtable with the ID key. For directories,
you might need something a little more fancy in the DB (like keeping the
full ACL/perm metadata there).

Thus, you'd end up with something like:

% ls /your_pool/your_fs/.zfs/deleted/
files.db dirs.db 10021238 01924132 13243542

And, once again, I've totally forgotten where the ZFS bug list lives.
Pointers, please?

--
Erik Trimble
Java System Support
Mailstop: usca14-102
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
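To make the "DB and file collection" layout concrete, here is a hedged Python sketch that works on an ordinary temporary directory. The `.deleted` directory name, the JSON file used in place of a real database, and the helpers `recycle` and `lookup` are assumptions for illustration; a real ZFS implementation would key on object numbers inside the filesystem, not on `st_ino` from user space.

```python
import json
import os
import tempfile

BIN = ".deleted"    # stand-in for .zfs/deleted (assumed name)
INDEX = "files.db"  # path index; a JSON file here, not a real DB

def recycle(fs_root, rel_path):
    """Move fs_root/rel_path into the bin, named after its inode number."""
    bin_dir = os.path.join(fs_root, BIN)
    os.makedirs(bin_dir, exist_ok=True)
    src = os.path.join(fs_root, rel_path)
    entry_id = str(os.stat(src).st_ino)
    os.rename(src, os.path.join(bin_dir, entry_id))

    # Record the original path under the ID so it can be found again.
    index_path = os.path.join(bin_dir, INDEX)
    index = {}
    if os.path.exists(index_path):
        with open(index_path) as f:
            index = json.load(f)
    index[entry_id] = rel_path
    with open(index_path, "w") as f:
        json.dump(index, f)
    return entry_id

def lookup(fs_root, entry_id):
    """Return the original relative path recorded for a bin entry."""
    with open(os.path.join(fs_root, BIN, INDEX)) as f:
        return json.load(f)[entry_id]

# Demo: delete foo/bar twice; both copies survive under distinct IDs.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "foo"))
    ids = []
    for text in ("first", "second"):
        with open(os.path.join(root, "foo", "bar"), "w") as f:
            f.write(text)
        ids.append(recycle(root, os.path.join("foo", "bar")))
    paths = [lookup(root, i) for i in ids]
    kept = sorted(os.listdir(os.path.join(root, BIN)))

print(ids, paths, kept)
```

Because the first copy is still alive in the bin when the second file is created, the two get different inode numbers, so the names in the bin cannot clash.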
On Wed, May 24, 2006 at 11:22:23AM -0700, Erik Trimble wrote:
> Ummmm.
>
> Remind me why we should support "undo" (or, more aptly named, "safe
> delete") in ZFS?
>
> Isn't this an application feature, not a filesystem feature? I would
> expect something like this behavior when using Nautilus, but certainly
> not when using "rm".

This is exactly why it should be supported. It is
application-independent. If you count the number of Solaris users who
type 'rm' versus the number that click-and-drag files to the trash bin,
I'd wager that you'd find many, many orders of magnitude more folks who
don't _want_ to rely on application features. The recycle bin is also
per-user, not per-filesystem. The location of the copy is dependent on
who did the original deletion, and may not be accessible (i.e. over NFS)
in the same way as the original filesystem.

> That is, maybe there should be a library which has a "safe delete"
> system call for use by applications, and has code specific to the
> various filesystems to implement the feature, but I can't really see the
> point in implementing "safe delete" at the filesystem level. It screws
> with too many long-standing assumptions.

You don't have to use it; it would be a property like anything else
in ZFS, and one which would default to 'off'. It obviously cannot be on
by default because it would violate too many POSIX rules.

- Eric

--
Eric Schrock, Solaris Kernel Development    http://blogs.sun.com/eschrock
On Wed, 2006-05-24 at 11:31 -0700, Eric Schrock wrote:
> On Wed, May 24, 2006 at 11:22:23AM -0700, Erik Trimble wrote
> > Isn't this an application feature, not a filesystem feature? I would
> > expect something like this behavior when using Nautilus, but certainly
> > not when using "rm".
>
> This is exactly why it should be supported. It is
> application-independent. If you count the number of Solaris users who
> type 'rm' versus the number that click-and-drag files to the trash bin,
> I'd wager that you'd find many, many, orders of magnitude more folks who
> don't _want_ to rely on application features. The recycle bin is also
> per-user, not per-filesystem. The location of the copy is dependent on
> who did the original deletion, and may not be accessible (i.e. over NFS)
> in the same way as the original filesystem.

But my point being that "undo" is appropriate at the APPLICATION level,
not the FILESYSTEM level. An application (whether Nautilus or "rm")
should have the ability to call a system library to support "undo",
which has the relevant code, but ZFS itself should have no concept of
"undo". This keeps the applications FS-agnostic, so you support "undo"
across ZFS, UFS, NFS, etc.

So, our mythical system library (libundelete.so) should support a couple
of generic functions (say: int safe_unlink(const char *path), and void
empty_recyclebin(const char *path) which look for an ENV variable to
determine if they should recycle or should delete, as appropriate) for
applications to call, and then the library code has to support
implementing this on various FSes.

Maybe, MAYBE, after we implement a generic system library call to
support "undo" across all (reasonable) FSes, we consider putting in
"undo" in the actual FS for performance reasons, so that the library can
simply call the FS libraries to do the "undo".

--
Erik Trimble
Java System Support
Mailstop: usca14-102
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
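The library interface proposed above is described as mythical, so the following Python sketch only shows one way the two calls could behave. The environment variable name `UNDELETE_BIN` is invented for the example, and `empty_recyclebin` here takes no argument, unlike the C prototype in the message.

```python
import os
import shutil
import tempfile

RECYCLE_ENV = "UNDELETE_BIN"  # hypothetical variable naming the bin directory

def safe_unlink(path):
    """Recycle `path` if RECYCLE_ENV is set, otherwise really unlink it.

    Returns True when the file was recycled, False when it was removed.
    """
    bin_dir = os.environ.get(RECYCLE_ENV)
    if not bin_dir:
        os.unlink(path)
        return False
    os.makedirs(bin_dir, exist_ok=True)
    shutil.move(path, os.path.join(bin_dir, os.path.basename(path)))
    return True

def empty_recyclebin():
    """Permanently remove everything in the bin, if one is configured."""
    bin_dir = os.environ.get(RECYCLE_ENV)
    if bin_dir and os.path.isdir(bin_dir):
        shutil.rmtree(bin_dir)

# Demo in a scratch directory.
with tempfile.TemporaryDirectory() as root:
    a = os.path.join(root, "a.txt")
    b = os.path.join(root, "b.txt")
    for p in (a, b):
        open(p, "w").close()

    os.environ.pop(RECYCLE_ENV, None)      # variable unset: plain unlink
    removed_for_real = not safe_unlink(a)

    os.environ[RECYCLE_ENV] = os.path.join(root, "bin")
    recycled = safe_unlink(b)              # variable set: file is moved aside
    in_bin = os.listdir(os.environ[RECYCLE_ENV])
    empty_recyclebin()
    bin_gone = not os.path.exists(os.environ[RECYCLE_ENV])
    del os.environ[RECYCLE_ENV]

print(removed_for_real, recycled, in_bin, bin_gone)
```

Note that this sketch keys the bin on the basename only, so it has the same namespace clash as the original prototype; it is meant to show the opt-in switch, not a complete design.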
On Wed, May 24, 2006 at 12:18:48PM -0700, Erik Trimble wrote:
>
> But my point being that "undo" is appropriate at the APPLICATION level,
> not the FILESYSTEM level. An application (whether Nautilus or "rm")
> should have the ability to call a system library to support "undo",
> which has the relevant code, but ZFS itself should have no concept of
> "undo". This keeps the applications FS-agnostic, so you support "undo"
> across ZFS, UFS, NFS, etc.
>
> So, our mythical system library (libundelete.so) should support a couple
> of generic functions (say: int safe_unlink(const char *path), and void
> empty_recyclebin(const char *path) which look for an ENV variable to
> determine if they should recycle or should delete, as appropriate) for
> applications to call, and then the library code has to support
> implementing this on various FSes.
>
> Maybe, MAYBE, after we implement a generic system library call to
> support "undo" across all (reasonable) FSes, we consider putting in
> "undo" in the actual FS for performance reasons, so that the library can
> simply call the FS libraries to do the "undo".
>

No, this is not the point of this RFE. We are not trying to implement a
wide-ranging subsystem that understands how to manage semantically valid
undo points. This would never, ever, be supported by any significant
number of applications, and is probably impossible at the filesystem
level.

The point is rather to provide "undelete", which will account for 99% of
all the times that someone would want to have 'undo'. This is a vastly
simpler problem, and probably more useful. Feel free to think of it as
'undelete' instead of 'undo' if it makes things easier.

- Eric

--
Eric Schrock, Solaris Kernel Development    http://blogs.sun.com/eschrock
On Wed, May 24, 2006 at 02:43:38PM -0700, Eric Schrock wrote:
> No, this is not the point of this RFE. We are not trying to implement a
> wide-ranging subsystem that understands how to manage semantically valid
> undo points. This would never, ever, be supported by any significant
> number of applications, and is probably impossible at the filesystem
> level.
>
> The point is rather to provide "undelete", which will account for 99% of
> all the times that someone would want to have 'undo'. This is a vastly
> simpler problem, and probably more useful. Feel free to think of it as
> 'undelete' instead of 'undo' if it makes things easier.

While we can probably make some 'versions' of files, deleted or
otherwise, naturally show up in .zfs/.deleted/.version/.something
directories, I wonder if we might not want an API that could let one get
at all 'versions' (one version per-txg) of a file still available --
i.e., going backwards in the transaction group history, if all old
blocks for a given 'version' of a file are still not reclaimed, you can
then re-create the file.

ACLs get interesting. For deleted files that show up in
<root>/.zfs/deleted we have the problem that directory permissions in
the path(s) to the file are lost, so the deleted file name really needs
to be something not meaningful to humans (say, dnode/gen numbers), and
any indexing needs to be per-file owner.

For file versions available through some API, which ACL should be
checked? The current one, or the old one(s), or both/all?

An API might evolve to allow for per-file snapshots/clones.

Nico
--
On Wed, 2006-05-24 at 14:43 -0700, Eric Schrock wrote:
> No, this is not the point of this RFE. We are not trying to implement a
> wide-ranging subsystem that understands how to manage semantically valid
> undo points. This would never, ever, be supported by any significant
> number of applications, and is probably impossible at the filesystem
> level.
>
> The point is rather to provide "undelete", which will account for 99% of
> all the times that someone would want to have 'undo'. This is a vastly
> simpler problem, and probably more useful. Feel free to think of it as
> 'undelete' instead of 'undo' if it makes things easier.
>
> - Eric

Sorry, semantics on my part. I mean "undelete", in a manner identical to
having the Recycling Bin functionality of Nautilus or Windows Explorer.
That is, when you "delete" a file, it is actually moved aside to some
hidden place, where it can be recovered easily by another command.

All my arguments are concerning this kind of functionality, which I'm
trying to say belongs up in the app. Otherwise, it gets _very_
confusing.

Let's say that you implement "undelete" in ZFS, which, in order to work,
has to (a) be an enabled attribute of the ZFS pool or filesystem, and
(b) use some sort of an ENV var to indicate that a given user's tools
will do "undelete" instead of permanent remove.

Now, you end up with a situation where behavior of an app varies
significantly across filesystem boundaries, which are _supposed_ to be
invisible to the end-user. That is, the behavior of "rm" varies
according to where in the filesystem tree I sit. Additionally, it
doesn't allow for variation; that is, deleting a file via "rm" and
"nautilus" does the exact same thing, even if I wanted "rm" to actually
remove the file and not just send it to the recycle bin.

Rather, I would submit that for better consistency, having a new global
libundelete.so (containing a modified "unlink") which implements
"undelete" in a FS-agnostic way is better. You get the feature across
all Filesystem Types that way, and it's portable. It would also allow
apps to decide if they want to support "undelete" or vanilla "unlink" on
an app-by-app basis. The apps would have to link against the new
libundelete.so to get the functionality, which I think is reasonable.

--
Erik Trimble
Java System Support
Mailstop: usca14-102
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
Cool - I can see my old favs from Netware 3.12 making a comeback. It
was always great to be able to salvage things from a disk that someone
did not mean to kill. :)

ah - salvage - my old friend... Does this also usher in the return of
purge too? :)

Nathan.

Erik Trimble wrote:
> On Wed, 2006-05-24 at 14:43 -0700, Eric Schrock wrote:
>
>>No, this is not the point of this RFE. We are not trying to implement a
>>wide-ranging subsystem that understands how to manage semantically valid
>>undo points. This would never, ever, be supported by any significant
>>number of applications, and is probably impossible at the filesystem
>>level.
>>
>>The point is rather to provide "undelete", which will account for 99% of
>>all the times that someone would want to have 'undo'. This is a vastly
>>simpler problem, and probably more useful. Feel free to think of it as
>>'undelete' instead of 'undo' if it makes things easier.
>>
>>- Eric
>
> Sorry, semantics on my part. I mean "undelete", in a manner identical to
> having the Recycling Bin functionality of Nautilus or Windows Explorer.
> That is, when you "delete" a file, it is actually moved aside to some
> hidden place, where it can be recovered easily by another command.
>
> All my arguments are concerning this kind of functionality, which I'm
> trying to say belongs up in the app. Otherwise, it gets _very_
> confusing.
>
> Let's say that you implement "undelete" in ZFS, which, in order to work,
> has to (a) be an enabled attribute of the ZFS pool or filesystem, and
> (B) uses some sort of an ENV var to indicate that a given user's tools
> will do "undelete" instead of permanent remove.
>
> Now, you end up with a situation where behavior of an app varies
> significantly across filesystem boundaries, which are _supposed_ to be
> invisible to the end-user. That is, the behavior of "rm" varies
> according to where in the filesystem tree I sit. Additionally, it
> doesn't allow for variation; that is, deleting a file via "rm" and
> "nautilus" does the exact same thing, even if I wanted "rm" to actually
> remove the file and not just send it to the recycle bin.
>
> Rather, I would submit that for better consistency, having a new global
> libundelete.so (containing a modified "unlink") which implements
> "undelete" in a FS-agnostic way is better. You get the feature across
> all Filesystem Types that way, and it's portable. It would also allow
> apps to decide if they want to support "undelete" or vanilla "unlink" on
> an app-by-app basis. The apps would have to link against the new
> libundelete.so to get the functionality, which I think is reasonable.
>
On 5/24/06, Erik Trimble <Erik.Trimble at sun.com> wrote:
> So, our mythical system library (libundelete.so) should support a couple
> of generic functions (say: int safe_unlink(const char *path), and void
> empty_recyclebin(const char *path) which look for an ENV variable to
> determine if they should recycle or should delete, as appropriate) for
> applications to call, and then the library code has to support
> implementing this on various FSes.

If it were unlink(3C) rather than unlink(2), an interposer library
could make this functionality generally available. Surely there must
be a dtrace hack that could redirect calls destined for unlink() to
safe_unlink(), subject to environment information.

I suspect, however, that protecting every file from deletion may be a
bit aggressive. Consider "here documents" from shell scripts, browser
cache files, compiler temp files, etc.

Mike

--
Mike Gerdts
http://mgerdts.blogspot.com/
On Wed, May 24, 2006 at 07:10:52PM -0500, Mike Gerdts wrote:
> If it were unlink(3C) rather than unlink(2), an interposer library
> could make this functionality generally available. Surely there must
> be a dtrace hack that could redirect calls destined for unlink() to
> safe_unlink(), subject to environment information.

You most certainly can interpose on system calls just as with any C
function calls -- after all, applications have to call function stubs
for them that in turn do the actual trapping to the kernel.

Nico
--
"Jeremy Teo" <white.wristband at gmail.com> wrote:
> Hello,
>
> with reference to bug id #4852821: user undo
>
> I have implemented a basic prototype that has the current functionality:
>
> 1) deleted files/directories are moved to /your_pool/your_fs/.zfs/deleted
> Unfortunately, it is non-trivial to completely reproduce the namespace
> of deleted files: for now, deleting "/foo/bar" will result in
> ".zfs/deleted/bar".
> 2) As a result of 1, deleted files move out of .zfs/deleted in FIFO.
> Ie. if you remove /foo/bar twice, the most recent copy will be the one
> remaining in .zfs/deleted.
> 3) If another user deletes /foo/bar, and you try to delete /foo/bar,
> you will be denied permissions. Again, this is due to namespace
> clashes.

How about appending the decimal inode number to the file name?

Jörg

--
 EMail:joerg at schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       js at cs.tu-berlin.de                (uni)
       schilling at fokus.fraunhofer.de     (work) Blog: http://schily.blogspot.com/
 URL:  http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily
Joerg Schilling wrote:
> "Jeremy Teo" <white.wristband at gmail.com> wrote:
>
>>Hello,
>>
>>with reference to bug id #4852821: user undo
>>
>>I have implemented a basic prototype that has the current functionality:
>>
>>1) deleted files/directories are moved to /your_pool/your_fs/.zfs/deleted
>>Unfortunately, it is non-trivial to completely reproduce the namespace
>>of deleted files: for now, deleting "/foo/bar" will result in
>>".zfs/deleted/bar".
>>2) As a result of 1, deleted files move out of .zfs/deleted in FIFO.
>>Ie. if you remove /foo/bar twice, the most recent copy will be the one
>>remaining in .zfs/deleted.
>>3) If another user deletes /foo/bar, and you try to delete /foo/bar,
>>you will be denied permissions. Again, this is due to namespace
>>clashes.
>
> How about appending the decimal inode number to the file name?
>

Anything that attempts to append characters on the end of the filename
will run into trouble when the file name is already at NAME_MAX.

-Mark
>Anything that attempts to append characters on the end of the filename
>will run into trouble when the file name is already at NAME_MAX.

One simple solution is to restrict the total length of the name to
NAME_MAX, truncating the original filename as necessary to allow
appending. This does introduce the possibility of conflicts with very
long names which happen to end in numeric strings, but that is likely to
be rare and could be resolved in an ad hoc fashion (e.g. flipping a bit
in the representation of "inode number" until a unique name is achieved).

This message posted from opensolaris.org
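The truncate-then-append approach can be sketched in a few lines of Python. The 255-byte limit, the "." separator, and the helper name `bin_name` are assumptions for the example; real code would query the limit for the filesystem in question.

```python
NAME_MAX = 255  # assumed limit in bytes for this sketch

def bin_name(name, inode, limit=NAME_MAX):
    """Append ".<inode>" to name, truncating name so the result fits in limit bytes."""
    suffix = f".{inode}"
    keep = limit - len(suffix)
    if keep < 1:
        raise ValueError("limit too small for the inode suffix")
    raw = name.encode("utf-8")[:keep]
    # errors="ignore" drops a multi-byte character cut in half by the slice.
    return raw.decode("utf-8", errors="ignore") + suffix

short = bin_name("bar", 42)
long_ascii = bin_name("a" * 255, 12345)
long_utf8 = bin_name("é" * 200, 7)
print(short, len(long_ascii.encode()), len(long_utf8.encode()))
```

Two different long names with the same leading bytes would still need distinct inode numbers to stay unique, which holds as long as both files exist at once.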
Hi,
the current discussion on how to implement "undo" seems to circulate around
concepts and tweaks for replacing any "rm" like action with "mv" and then
fix the problems associated with namespaces, ACLs etc.
Why not use snapshots?
A snapshot-oriented implementation of undo would:
- Create a snapshot of the FS whenever anything is attempted that someone
   might want to undo. This could be done even at the most fundamental level
   (i.e. before any "zpool" or "zfs" command, where the potential damage to
   be undone is biggest).
- The undo-feature would then exchange the live FS with the snapshot taken
   prior to the revoked action. Just tweak one or two pointers and the undo
   is done.
- This would transparently work with any app, user action, even admin action,
   depending on where the snapshotting code would be hooked up to.
- As an alternative to undo, the user can browse the .zfs hierarchy in search
   of that small file which got lost in an rm -rf orgy without having to restore
   the snapshot with all the other unwanted files.
- When ZFS wants to reclaim blocks, it would start deleting the oldest
   undo-snapshots.
- To separate undo-snapshots from user-triggered ones, the undo-code could
   place its snapshots in .zfs/snapshots/undo .
Did I miss something why undo can't be implemented with snapshots?
Best regards,
    Constantin
-- 
Constantin Gonzalez                            Sun Microsystems GmbH, Germany
Platform Technology Group, Client Solutions                http://www.sun.de/
Tel.: +49 89/4 60 08-25 91                   http://blogs.sun.com/constantin/
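The bookkeeping side of the snapshot proposal (take one before each risky action, reclaim the oldest first, undo by returning to the most recent) can be modelled without touching ZFS at all. This Python sketch is a pure simulation: the class `UndoSnapshots`, the "undo/NNNNNN" naming, and the fixed count limit are assumptions, and no zfs command is invoked.

```python
from collections import deque

class UndoSnapshots:
    """Track automatic undo snapshots, discarding the oldest beyond a limit."""

    def __init__(self, limit):
        self.limit = limit
        self.snaps = deque()  # oldest on the left, newest on the right
        self.counter = 0

    def take(self, reason):
        """Record a snapshot before an action; return (name, destroyed names)."""
        self.counter += 1
        name = f"undo/{self.counter:06d}"
        self.snaps.append((name, reason))
        destroyed = []
        while len(self.snaps) > self.limit:
            destroyed.append(self.snaps.popleft()[0])
        return name, destroyed

    def undo(self):
        """Return the snapshot to roll back to, or None if none is left."""
        if not self.snaps:
            return None
        return self.snaps.pop()[0]

history = UndoSnapshots(limit=2)
first, dropped1 = history.take("rm -rf project")
second, dropped2 = history.take("zfs destroy tank/home")
third, dropped3 = history.take("rm notes.txt")
rollback_target = history.undo()
print(third, dropped3, rollback_target)
```

A real implementation would reclaim by space pressure instead of a fixed count, as the list above suggests.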
Hello Constantin,

On 5/29/06, Constantin Gonzalez <Constantin.Gonzalez at sun.com> wrote:
> Hi,
>
> the current discussion on how to implement "undo" seems to circulate around
> concepts and tweaks for replacing any "rm" like action with "mv" and then
> fix the problems associated with namespaces, ACLs etc.
>
> Why not use snapshots?

I hadn't considered that yet: I was quite fixated on solving this at
the ZPL level. :(

> A snapshot-oriented implementation of undo would:
>
> - Create a snapshot of the FS whenever anything is attempted that someone
>   might want to undo. This could be done even at the most fundamental level
>   (i.e. before any "zpool" or "zfs" command, where the potential damage to
>   be undone is biggest).
>
> - The undo-feature would then exchange the live FS with the snapshot taken
>   prior to the revoked action. Just tweak one or two pointers and the undo
>   is done.
>
> - This would transparently work with any app, user action, even admin action,
>   depending on where the snapshotting code would be hooked up to.
>
> - As an alternative to undo, the user can browse the .zfs hierarchy in search
>   of that small file which got lost in an rm -rf orgy without having to restore
>   the snapshot with all the other unwanted files.
>
> - When ZFS wants to reclaim blocks, it would start deleting the oldest
>   undo-snapshots.
>
> - To separate undo-snapshots from user-triggered ones, the undo-code could
>   place its snapshots in .zfs/snapshots/undo .
>
> Did I miss something why undo can't be implemented with snapshots?

No, you didn't. Given the points you have raised, this does seem like the
way to go. I'll dig around in the snapshot code and see if I can whip
up something more lightweight, i.e. a more targeted snapshot that can
snapshot a subset of the original filesystem.

Thanks for the suggestion! :)

--
Regards,
Jeremy
Constantin Gonzalez wrote On 05/29/06 02:50,:
> Hi,
>
> the current discussion on how to implement "undo" seems to circulate around
> concepts and tweaks for replacing any "rm" like action with "mv" and then
> fix the problems associated with namespaces, ACLs etc.
>
> Why not use snapshots?
>
> A snapshot-oriented implementation of undo would:
>
> - Create a snapshot of the FS whenever anything is attempted that someone
>   might want to undo. This could be done even at the most fundamental level
>   (i.e. before any "zpool" or "zfs" command, where the potential damage to
>   be undone is biggest).
>
> - The undo-feature would then exchange the live FS with the snapshot taken
>   prior to the revoked action. Just tweak one or two pointers and the undo
>   is done.
>
> - This would transparently work with any app, user action, even admin action,
>   depending on where the snapshotting code would be hooked up to.
>
> - As an alternative to undo, the user can browse the .zfs hierarchy in search
>   of that small file which got lost in an rm -rf orgy without having to restore
>   the snapshot with all the other unwanted files.
>
> - When ZFS wants to reclaim blocks, it would start deleting the oldest
>   undo-snapshots.
>
> - To separate undo-snapshots from user-triggered ones, the undo-code could
>   place its snapshots in .zfs/snapshots/undo .
>
> Did I miss something why undo can't be implemented with snapshots?

Well, creating a snapshot isn't exactly free. It requires flushing out
all the current in-progress transactions to ensure the containing
transaction group is committed on disk. This can be quick or can take a
few seconds depending on the current load. So it isn't practical to
snapshot before every remove - but perhaps a coarser grain might work?

>
> Best regards,
>     Constantin
>

--
Neil
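One possible reading of "a coarser grain" is to let a burst of removes share a single snapshot. The Python sketch below is only a model of that policy: the class `SnapshotThrottle` and the 60-second interval are assumptions, timestamps are passed in explicitly so the behaviour is deterministic, and no snapshot is actually taken.

```python
class SnapshotThrottle:
    """Allow at most one undo snapshot per `min_interval` seconds."""

    def __init__(self, min_interval):
        self.min_interval = min_interval
        self.last = None  # time of the most recent snapshot, if any

    def should_snapshot(self, now):
        """Return True if a new snapshot should be taken at time `now`."""
        if self.last is None or now - self.last >= self.min_interval:
            self.last = now
            return True
        return False

throttle = SnapshotThrottle(min_interval=60)

# Times (in seconds) at which files are removed, e.g. during an "rm -rf".
remove_times = [0, 1, 2, 30, 59, 60, 61, 200]
decisions = [throttle.should_snapshot(t) for t in remove_times]
print(decisions)
```

The trade-off is that an undo then restores the state from up to one interval before the mistake, not the state immediately before it.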
Once again, I hate to be a harpy on this one, but are we really
convinced that having an "undo" (I'm going to call it RecycleBin from now
on) function for file deletion built into ZFS is a good thing?

Since I've seen nothing to the contrary, I'm assuming that we're doing
this by changing the actual effects of an "unlink(2)" sys lib call
against a file in ZFS, and having some other library call added to take
care of actual deletion.

Even with it being a ZFS option parameter, I can see soooo many places
that it breaks assumptions and causes problems that I can't think it's a
good thing to blindly turn on for everything.

And, I've still not seen a good rebuttal to the idea of moving this up
to the Application level, and using a new library to implement the
functionality (and require Apps to specifically (and explicitly)
support RecycleBin in the design).

You will notice that Windows does this. The Recycle Bin is usable from
within Windows Explorer, but if you use "del" from a command prompt, it
actually deletes the file. I see no reason why we shouldn't support the
same functionality (i.e. RecycleBin from within Nautilus (as it already
does), and true deletion via "rm").

-Erik
Hi,
so we have two questions:
1. Is it really ZFS' job to provide an undo functionality?
2. If it turns out to be a feature that needs to be implemented by
    ZFS, what is the better approach: Snapshot based or file-based?
My personal opinion on 1) is:
- The purpose of any Undo-like action is to provide a safety net to the user
   in case she commits an error that she wants to undo.
- So, it depends on how we define "user" here. If by user we mean your regular
   file system user with a GUI etc., then of course it's a matter of the
   application.
- But if user=sysadmin, I guess a more fundamental way of implementing "undo" is
   in order. We could either restrict the undo functionality to some admin
   interface and force admins to use just that, then it would still be a feature
   that the admin interface needs to implement.
   But in order to save all admins from shooting themselves into their knees, the
   best way would be to provide an admin-savvy safety net.
- Now, coming from the other side, ZFS provides a nice and elegant way of
   implementing snapshots. That's where I count 1+1: If ZFS knew how to do
   snapshots right before any significant administrator or user action and if
   ZFS had a way of managing those snapshots so admins and users could easily
   undo any action (including zfs destroy, zpool destroy, or just rm -rf /*),
   then the benefit/investment ratio for implementing such a feature should
   be extremely interesting.
One more step towards a truly foolproof filesystem.
But: If it turns out that providing an undo function via snapshots is not
possible/elegantly feasible/cheap or if there's any significant roadblock that
prevents ZFS from providing an undo feature in an elegant way, then it might not
be a good idea after all and we should just forget it.
So I guess it boils down to: Can the ZFS framework be used to implement an undo
feature much more elegantly than your classic filemanager while extending the 
range of undo customers to even the CLI based admin?
Best regards,
    Constantin
Erik Trimble wrote:
> Once again, I hate to be a harpy on this one, but are we really
> convinced that having an "undo" (I'm going to call it RecycleBin from now
> on) function for file deletion built into ZFS is a good thing?
>
> Since I've seen nothing to the contrary, I'm assuming that we're doing
> this by changing the actual effects of an "unlink(2)" sys lib call
> against a file in ZFS, and having some other library call added to take
> care of actual deletion.
>
> Even with it being a ZFS option parameter, I can see soooo many places
> that it breaks assumptions and causes problems that I can't think it's a
> good thing to blindly turn on for everything.
>
> And, I've still not seen a good rebuttal to the idea of moving this up
> to the Application level, and using a new library to implement the
> functionality (and require Apps to specifically (and explicitly)
> support RecycleBin in the design).
>
> You will notice that Windows does this.  The Recycle Bin is usable from
> within Windows Explorer, but if you use "del" from a command prompt, it
> actually deletes the file.  I see no reason why we shouldn't support the
> same functionality (i.e. RecycleBin from within Nautilus (as it already
> does), and true deletion via "rm").
>
> -Erik
-- 
Constantin Gonzalez                            Sun Microsystems GmbH, Germany
Platform Technology Group, Client Solutions                http://www.sun.de/
Tel.: +49 89/4 60 08-25 91                   http://blogs.sun.com/constantin/
hey All,

On Tue, 2006-05-30 at 16:50 +0200, Constantin Gonzalez Schmitz wrote:
> - The purpose of any Undo-like action is to provide a safety net to the user
>   in case she commits an error that she wants to undo.

So, what if the user was able to specify which applications they wanted
such a safety net for (thus lessening the load on the filesystem, watching
*every* delete) - or were able to specify a few sub-directories they
wanted to take special care with?

[ eg. "ZFS, please provide me undo capability for files in
/home/timf/Documents/plans-to-takeover-world when I'm using nautilus" ]

With a tiny bit of DTrace hackery, you could have something like:

------ snapshot-on-delete.d --------
#!/usr/sbin/dtrace -qws

syscall::unlink:entry
/pid==$1/
{
	this->file = copyinstr(arg0);
	system ("/usr/sbin/take-undo-snapshot.sh %s",this->file);
}
------------------------------------

Something like:

% ./snapshot-on-delete.d `pgrep nautilus`

Where the shell script "take-undo-snapshot.sh" would take another snapshot
in some known namespace, up to some pre-defined limit, if that file was
found to be resident on a zfs filesystem (and optionally in some given
directory).

Now, it probably will scale badly if you have hundreds of users, running
hundreds of applications, each one invoking a shell script on each file
delete, and as Neil pointed out, many many snapshots aren't cheap. But as
a proof-of-concept, this would work fine. It'd be interesting to see how
badly people wanted this functionality, before boiling the ocean (again!)
to provide it :-)

Of course, "redo" is a little trickier, as your application would need to
know about the snapshot namespace, but at least your data is safe.

cheers,
tim

-- 
Tim Foster, Sun Microsystems Inc, Operating Platforms Group
Engineering Operations            http://blogs.sun.com/timf
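
The "take another snapshot in some known namespace, up to some pre-defined limit" step could be sketched roughly as below. This is only an illustration in Python of the bookkeeping such a script would need, not Tim's script: the function name, the "@undo-" naming scheme, and the oldest-first ordering of the input list are all assumptions made for the example.

```python
def plan_undo_snapshot(existing, filesystem, stamp, limit):
    """Decide which snapshot to create and which old ones to destroy.

    existing   -- snapshot names, oldest first (assumed ordering)
    filesystem -- dataset name, e.g. "home/timf"
    stamp      -- unique suffix for the new snapshot (hypothetical scheme)
    limit      -- maximum number of undo snapshots to keep (>= 1)
    """
    prefix = filesystem + "@undo-"  # hypothetical "known namespace"
    ours = [name for name in existing if name.startswith(prefix)]
    new_snapshot = prefix + stamp
    # Keep at most `limit` undo snapshots, counting the new one.
    excess = len(ours) + 1 - limit
    to_destroy = ours[:excess] if excess > 0 else []
    return new_snapshot, to_destroy


new, doomed = plan_undo_snapshot(
    ["home/timf@undo-001", "home/timf@undo-002", "home/timf@nightly"],
    "home/timf", "003", limit=2)
print(new, doomed)
```

A real script would then hand the returned names to 'zfs snapshot' and 'zfs destroy'; snapshots outside the undo namespace (the "nightly" one above) are never touched.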
On Mon, May 29, 2006 at 11:00:29PM -0700, Erik Trimble wrote:
> Once again, I hate to be a harpy on this one, but are we really
> convinced that having an "undo" (I'm going to call it RecycleBin from now
> on) function for file deletion built into ZFS is a good thing?
>
> Since I've seen nothing to the contrary, I'm assuming that we're doing
> this by changing the actual effects of an "unlink(2)" sys lib call
> against a file in ZFS, and having some other library call added to take
> care of actual deletion.

No. The idea is a FIFO queue, bounded in space. There is no explicit
'actual deletion'. Things just pass in one way and out the other once
the space is needed. If you accidentally delete something, you can
quickly go back and get it, but it's not a replacement for regular
snapshots. For example, you might do:

# zfs set undelete=1m home/eschrock

Which would keep just 1 MB of deleted files around, which would allow
for recovery of most useful file types (text files, documents, etc).

> Even with it being a ZFS option parameter, I can see soooo many places
> that it breaks assumptions and causes problems that I can't think it's a
> good thing to blindly turn on for everything.

There is no change in assumption. 'rm' will still remove a file; it is
only that its deletion may be delayed. This is an optional feature, and
is no different than 'rm' in the face of snapshots. If you don't like it,
don't use the feature (just as you probably wouldn't want snapshots).
There is no "really really remove" command - the point is that the user
doesn't have to think about this, it "just works".

> And, I've still not seen a good rebuttal to the idea of moving this up
> to the Application level, and using a new library to implement the
> functionality (and require Apps to specifically (and explicitly)
> support RecycleBin in the design).

I've tried over and over, with several different points:

- doesn't work over NFS/CIFS (recycle bin 'location' may not be
  accessible on all hosts, or may require cross-network traffic to
  delete a file).
- inherently user-centric, not filesystem-centric (location of stored
  file depends on who's doing the deletion)
- requires application changes (which means it can _NEVER_ scale beyond
  a handful of apps)
- ignores the predominant need (accidental 'rm')

These are real requirements, whether or not you think they are good.

> You will notice that Windows does this. The Recycle Bin is usable from
> within Windows Explorer, but if you use "del" from a command prompt, it
> actually deletes the file. I see no reason why we shouldn't support the
> same functionality (i.e. RecycleBin from within Nautilus (as it already
> does), and true deletion via "rm").

Except that the whole point of this RFE is that people _want_ 'rm' to
have undelete functionality. The whole world doesn't use
Nautilus/insertapphere. I'm all for having some common 'recycle bin'
functionality (though I think no one will use it beyond a handful of
apps), but that is independent of this RFE.

- Eric

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
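
To make the "FIFO queue, bounded in space" idea concrete, here is a minimal Python model of it. It is a sketch of the behaviour described above and not ZFS code: the class and method names are invented for illustration, and two details are assumptions on my part, namely that deleting the same name again replaces the older copy and that a file larger than the whole bound is not retained at all.

```python
from collections import OrderedDict


class UndeleteQueue:
    """Space-bounded FIFO of deleted files (hypothetical model)."""

    def __init__(self, limit_bytes):
        self.limit_bytes = limit_bytes
        self.used_bytes = 0
        self.entries = OrderedDict()  # name -> size, oldest first

    def delete(self, name, size):
        """Record a deletion and return the names that fell out the far end."""
        # Assumption: a newer deletion of the same name replaces the old copy.
        if name in self.entries:
            self.used_bytes -= self.entries.pop(name)
        # Assumption: a file bigger than the whole bound is simply gone.
        if size > self.limit_bytes:
            return [name]
        evicted = []
        while self.used_bytes + size > self.limit_bytes:
            old_name, old_size = self.entries.popitem(last=False)
            self.used_bytes -= old_size
            evicted.append(old_name)
        self.entries[name] = size
        self.used_bytes += size
        return evicted

    def undelete(self, name):
        """Take a file back out of the queue; KeyError if it has expired."""
        size = self.entries.pop(name)
        self.used_bytes -= size
        return size


queue = UndeleteQueue(limit_bytes=1_000_000)
queue.delete("a.txt", 400_000)
queue.delete("b.txt", 400_000)
print(queue.delete("c.txt", 400_000))  # a.txt is pushed out to make room
```

Note that there is no explicit purge operation in this model: space is reclaimed only as a side effect of later deletions, which matches the "no really really remove command" point.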
On Tue, 2006-05-30 at 09:48 -0700, Eric Schrock wrote:
> - doesn't work over NFS/CIFS (recycle bin 'location' may not be
>   accessible on all hosts, or may require cross-network traffic to
>   delete a file).
> - inherently user-centric, not filesystem-centric (location of stored
>   file depends on who's doing the deletion)

Aah right, okay - those are reasons against my previous post about
having an application register its interest in getting undelete
capability. Good points Eric!

cheers,
tim

-- 
Tim Foster, Sun Microsystems Inc, Operating Platforms Group
Engineering Operations            http://blogs.sun.com/timf
(I'm going to combine Constantin's and Eric's replies together, so I
apologize for the possible confusion):

On Tue, 2006-05-30 at 16:50 +0200, Constantin Gonzalez Schmitz wrote:
> Hi,
>
> so we have two questions:
>
> 1. Is it really ZFS's job to provide an undo functionality?
>
> 2. If it turns out to be a feature that needs to be implemented by
>    ZFS, what is the better approach: Snapshot based or file-based?
>
> My personal opinion on 1) is:
>
> - The purpose of any Undo-like action is to provide a safety net to the user
>   in case she commits an error that she wants to undo.
>
> - So, it depends on how we define "user" here. If by user we mean your regular
>   file system user with a GUI etc., then of course it's a matter of the
>   application.

Agreed. :-)

> - But if user=sysadmin, I guess a more fundamental way of implementing "undo" is
>   in order. We could either restrict the undo functionality to some admin
>   interface and force admins to use just that, then it would still be a feature
>   that the admin interface needs to implement.
>
>   But in order to save all admins from shooting themselves into their knees, the
>   best way would be to provide an admin-savvy safety net.

As an admin, that certainly sounds noble. However, given Eric's
implementation details:

On Tue, 2006-05-30 at 09:48 -0700, Eric Schrock wrote:
> No. The idea is a FIFO queue, bounded in space. There is no explicit
> 'actual deletion'. Things just pass in one way and out the other once
> the space is needed. If you accidentally delete something, you can
> quickly go back and get it, but it's not a replacement for regular
> snapshots. For example, you might do:
>
> # zfs set undelete=1m home/eschrock

I can't imagine that any admin would set this FIFO space to anything more
than a very small amount. There will always be pressure from the user
base to have more usable disk space (without actually buying more disk),
so the best case I can picture is that the FIFO is under 1GB for a TB
filesystem. Now, for critical filesystems (such as root), which have a
relatively fixed size, I could see setting it to, say, 100MB for a 10GB
root partition.

The problem here is that it's _very_ easy to blow right through the FIFO
size limit with just a single "rm -rf". Or, if you are on a filesystem
that has multiple users, the likelihood that several of us combine to
exceed the limit (thus making it likely that what you wanted to
"restore" is gone for good) is much higher. This limits the usefulness
of the feature considerably.

> - Now, coming from the other side, ZFS provides a nice and elegant way of
>   implementing snapshots. That's where I count 1+1: If ZFS knew how to do
>   snapshots right before any significant administrator or user action and if
>   ZFS had a way of managing those snapshots so admins and users could easily
>   undo any action (including zfs destroy, zpool destroy, or just rm -rf /*),
>   then the benefit/investment ratio for implementing such a feature should
>   be extremely interesting.
>
> One more step towards a truly foolproof filesystem.

The problem is that you are attempting to divine user INTENTIONS (did I
_really_ want to do that?). That's a losing proposition. At best, you
will be wrong a significant minority of the time, which will be far
above people's threshold for tolerance.

Take a look at the "-i" option for rm. Do you know how annoying that is
for an admin (and that virtually no admin ever uses it)? Yet it provides
about the same level of protection as "undo" would.

Part of being an administrator is learning proper procedures. One of the
biggest problems with Windows is that it provides the ILLUSION that any
JoeBlow can be an administrator. Yet, time and time again Windows has
huge failures linked directly to incompetent (or untrained)
administrators. We don't want to go down this road with Solaris.

Providing tools which give an 80% solution is generally very useful in
the user space, but is VERY frustrating (and, in my opinion,
counterproductive) in the admin space. Accidental file deletion (or, as
Constantin pointed out, other advanced admin commands such as "zpool
destroy") is a problem. HOWEVER, you want to provide only a 100%
solution to the problem. Am I going to like a solution which
"sorta-kinda-might" restore the file, or one which WILL restore the file?

What I'm trying to say is that a competent admin will STILL have to
maintain version-controlled config files, and back things up on a
regular basis. ZFS snapshots are very nice in these cases, as they
provide a PERMANENT picture of things before changes are made. ZFS undo
doesn't alleviate the need to do any of this, but it CAN provide the
ILLUSION that you can skip this.

Short of full version control and complete history retention for all
files in a ZFS filesystem, filesystem-level undo isn't a good idea. The
person asking for the RFE doesn't (in my opinion) have a competent admin
staff, and is asking for a BandAid to a problem requiring skilled
surgeons. Yes, that is an exaggeration, and yes, there are times when
I've fat-fingered something and said "aaaargh. I want that back right
now!". But, once again, proper sysadmin procedures already protect one
from that. Guaranteed.

> But: If it turns out that providing an undo function via snapshots is not
> possible/elegantly feasible/cheap or if there's any significant roadblock that
> prevents ZFS from providing an undo feature in an elegant way, then it might not
> be a good idea after all and we should just forget it.
>
> So I guess it boils down to: Can the ZFS framework be used to implement an undo
> feature much more elegantly than your classic filemanager while extending the
> range of undo customers to even the CLI based admin?
>
> Best regards,
>    Constantin

On Tue, 2006-05-30 at 09:48 -0700, Eric Schrock wrote:
> I've tried over and over, with several different points:
>
> - doesn't work over NFS/CIFS (recycle bin 'location' may not be
>   accessible on all hosts, or may require cross-network traffic to
>   delete a file).

So, how am I supposed to get back a file from your proposed "undo"
solution? Does every directory have a .zfs/deleted directory inside it,
with the "temporarily deleted" files residing in there? If not, how do I
access a ZFS undo directory across NFS? If so, how is this any different
than having a .recyclebin directory for each normal directory?

> - inherently user-centric, not filesystem-centric (location of stored
>   file depends on who's doing the deletion)

With an Application-level RecycleBin, the assumption is that all the
clients have a common API to call (some lib installed locally) that
implements the generic RecycleBin technology across all filesystem
types. Take a look at Samba's implementation of RecycleBin - it works
just fine for user-centric applications.

> - requires application changes (which means it can _NEVER_ scale beyond
>   a handful of apps)
> - ignores the predominant need (accidental 'rm')

Which is EXACTLY what is needed. The RFE is for "accidental rm". There
are only going to be a small handful of applications that really can
make use of an "undo" function. Those apps are going to be the ones
which allow the user to directly manipulate the filesystem.

Put it another way: look at the $HOME/tmp directory. How many
applications create/delete files there on a very regular basis? You sure
you want ZFS to put those into the FIFO "undo"-reserved space? Or
/var/run. Or /tmp. Or any of a dozen different applications which
regularly create and delete files themselves, without any user
intervention. If I enable ZFS undo on the / filesystem to protect
myself, think of all the apps which do this kind of create/delete. So,
how long will your FIFO last again?

-- 
Erik Trimble
Java System Support
Mailstop: usca14-102
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
On Wed, 2006-05-31 at 03:48, Erik Trimble wrote:
> (I'm going to combine Constantine & Eric's replies together, so I
> apologize for the possible confusion):

Apology accepted. :)

Anyhoo - What do you think the chances are that any application vendor
is going to write in special handling for Solaris file removal? I'm
guessing slim to none, but have been wrong before...

On the other hand, who hasn't removed something and thought to
themselves "dang. I wish I could get that back right now..."

As a frequent ex-user of the good old Netware Salvage, I can tell you
I'm a real fan of that type of functionality.

Delete something, but it's not actually *really* deleted until we need
that space for something else. At any point, you can fire up the salvage
utility and grab the files back out of that directory.

It was not perfect either, but it was wayyyy faster than having to get
tapes out...

In the case of taking a snapshot before some mythical event that looks
like it's going to seriously change the system, where do we draw the
line? What do we do when we get something like a

for i in *
do
	rm $i
done

You want a 100% solution, but is your 100% solution my, or anyone else's,
100% solution?

I for one would *much* prefer the filesystem to make that stuff
available to me, so regardless of what removed the file, I at least have
a chance to get it back.

Then again, I'd also much prefer that the files be recoverable right up
until we need the space back. Something like what Eric had suggested,
but set the space that the deleted files *cannot* use, so we still
always have 'free' blocks ready for new allocations...

So, as opposed to

# zfs set undelete=1m home/eschrock

something like

# zfs set undelete-queue-size=1m home/eschrock

and

# zfs set undelete-unusable-size=100m home/eschrock

and these two options being mutually exclusive...

From an implementation perspective, I'll be interested to see how we get
things back, particularly in the case of multiple directories being
removed, and NOT wanting to blow away the files in the directories that
I (or my app) might have partially reconstructed in the meanwhile...

<fantasy>
/my/directory/important/rubbish # rm -rf .*
"ARGH!"
cd ..
(Not there...)
cd /my/directory
zfs undo $PWD/important
(Or whatever interface we use!)
"ah. :)"
</fantasy>

Wow. Even thinking about how the ZFS guys might implement that breaks my
head...

Nathan.
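
The difference between the two proposed knobs is easiest to see as arithmetic. The following Python sketch computes how many bytes deleted files could occupy under each policy; the function and parameter names are made up for the illustration, and neither property exists in ZFS, they are only the suggestions from this message.

```python
def undelete_budget(fs_size, live_bytes, queue_size=None, unusable_size=None):
    """Bytes that deleted files may occupy under one of two hypothetical policies.

    queue_size    -- fixed cap on retained deleted data ("undelete-queue-size")
    unusable_size -- space deleted data must never use ("undelete-unusable-size")
    Exactly one of the two must be given, since they are mutually exclusive.
    """
    if (queue_size is None) == (unusable_size is None):
        raise ValueError("set exactly one of queue_size or unusable_size")
    free = fs_size - live_bytes
    if queue_size is not None:
        # Fixed cap: never more than the cap, and never more than is free.
        return max(0, min(queue_size, free))
    # Reserve: everything that is free, minus the part kept for new data.
    return max(0, free - unusable_size)


MB = 10**6
# A 10 GB filesystem holding 4 GB of live data:
print(undelete_budget(10_000 * MB, 4_000 * MB, queue_size=1 * MB))
print(undelete_budget(10_000 * MB, 4_000 * MB, unusable_size=100 * MB))
```

With the fixed cap the budget stays small no matter how empty the filesystem is, while the reserve-style setting lets deleted files linger in whatever space nobody else needs yet.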
Nathan Kroenert wrote:
> Anyhoo - What do you think the chances are that any application vendor
> is going to write in special handling for Solaris file removal? I'm
> guessing slim to none, but have been wrong before...

Agreed. However, to this I reply: Who Cares? I'm guessing that 99% of
the possible use of the feature is what people keep talking about, which
is accidentally typing "rm" on something. So, if you fix "rm" and
possibly "nautilus" to use the application-layer RecycleBin, then you've
pretty much satisfied everyone - people like me who don't want unlink(2)
hijacked and don't want most apps to use "undo", and people who want
"oops-proof" 'rm' capability.

If we have a standard library which provides application-layer
RecycleBin features, then any app vendor can do it if they so choose. I
suspect that very few will, but the few who do will have
FILESYSTEM-AGNOSTIC RecycleBin. So we get it for UFS, ZFS, NFS, Samba,
LOFS, and everything else.

> On the other hand, who hasn't removed something and thought to
> themselves "dang. I wish I could get that back right now..."
>
> As a frequent ex-user of the good old Netware Salvage, I can tell you
> I'm a real fan of that type of functionality.
>
> Delete something, but it's not actually *really* deleted until we need
> that space for something else. At any point, you can fire up the salvage
> utility and grab the files back out of that directory.
>
> It was not perfect either, but, it was wayyyy faster than having to get
> tapes out...

The problem of _easy_ recovery of accidentally deleted files is a SMPSA
(Simple Matter of Proper System Administration). The AFS implementation
at MIT had a nice feature: in everyone's home directory, there was a
.afs/backup directory. It contained what would now be known as a
snapshot of last night's backup. Having that around solved 90% of the
accidental deletion problems, since the main issue usually involves easy
access to the recovery media. This kind of setup is trivial to configure
in the current ZFS.

> In the case of taking a snapshot before some mythical event that looks
> like it's going to seriously change the system, where do we draw the
> line? What do we do when we get something like a
>
> for i in *
> do
>	rm $i
> done
>
> You want a 100% solution, but is your 100% solution my, or anyone elses
> 100% solution?

The original problem as stated isn't the whole problem domain, so when I
say 100% solution, I mean the solution to the (reasonably restricted)
general case, which in this instance is "I want to be able to recover
any file previously deleted".

Snapshots aren't a solution to the problem. They're useful as a recovery
strategy, but aren't a solution, and if I implied so, then I didn't mean
to.

> I for one, would *much* prefer the filesystem to make that stuff
> available to me, so regardless of what removed the file, I at least have
> a chance to get it back.
>
> Then again, I'd also much prefer that the files be recoverable right up
> until we need the space back. Something like what Eric had suggested,
> but set the space that the deleted files *cannot* use, so we still
> always have 'free' blocks ready for new allocations...

But, once again, you get into the half-solution of
"some-files-are-available-for-some-time". There are clearly far too many
likely scenarios which will blow through any deleted-file repository
UNLESS you require that the deleted files are NEVER removed until
explicitly told to (e.g. "emptying" the Windows Recycle Bin). And, there
again, we're back to 'oh, did I _really_ mean to do that?'

And people are going to be upset (and complain that the feature doesn't
work right, etc.) if we implement "undo" with some sort of auto
expiration (either a limit on total size, or whatever), mainly because
they're not going to understand the limitations. And, as I pointed out
before, it leads to lazy administrative practices.

Turning on an "undo" for everything at the filesystem level is a
nightmare waiting to happen.

-Erik
Interesting thread - a few comments:

Finite-sized validation checksums aren't a 100% solution either, but
they're certainly good enough to be extremely useful.

NetApp has built a rather decent business at least in part by providing
less-than-100% user-level undo-style facilities via snapshots (not that
novel a feature these days, but it was when they introduced it).  More
recently, 'continuous data protection' products seem to be receiving an
enthusiastic response from customers despite their hefty price tags (of
course, they *do* purport to be a '100% solution', as long as you're
willing to pay for unbounded expansion of storage).

My dim recollection is that TOPS-10 implemented its popular (but again
<100%) undelete mechanism using the same kind of 'space-available'
approach suggested here.  It did, however, support explicit 'delete - I
really mean it' facilities to help keep unwanted detritus from
shouldering out more desirable bits ('expunge' being the applicable
incantation, which had an appropriate ring of finality to it).  Tying
into user quotas such that one user can't drive another user's
most-recently-deleted content out of the system seems implicit in
eschrock's comments.

But it is likely that in at least some situations promiscuously retaining
*everything* even for a limited time would be a real problem, and that in
a lot more it would be at least sub-optimal.  Creating a directory
attribute inheritable by subdirectories and files controlling temporary
undelete-style preservation would help (one could also consider
per-file-type controls, though file extensions may not be ideal hooks and
I don't know whether ZFS uses file attributes to establish types).
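
As a rough illustration of what "a directory attribute inheritable by subdirectories and files" would mean in practice, the Python sketch below resolves the setting for a path by walking up to the nearest ancestor directory that has one. The function name, the dictionary standing in for on-disk attributes, and the default of "off" are all assumptions for the example; this says nothing about how ZFS stores attributes.

```python
import posixpath


def undelete_enabled(path, attrs, default=False):
    """Resolve a hypothetical inheritable undelete setting for `path`.

    attrs maps directory paths to True/False; the nearest ancestor
    directory with an explicit setting wins, otherwise `default` applies.
    """
    current = posixpath.dirname(posixpath.normpath(path))
    while True:
        if current in attrs:
            return attrs[current]
        parent = posixpath.dirname(current)
        if parent == current:  # reached the top without finding a setting
            return default
        current = parent


attrs = {"/home": True, "/home/timf/tmp": False}
print(undelete_enabled("/home/timf/Documents/plan.txt", attrs))
print(undelete_enabled("/home/timf/tmp/scratch.o", attrs))
```

Here /home turns preservation on for everything beneath it, while the more specific setting on /home/timf/tmp switches it back off for scratch files, which is the kind of selective control the paragraph above argues for.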

Since this is essentially a per-file mechanism, it really shouldn't
require the level of system-wide flush-synchronization that a formal
snapshot requires, should it?  Especially if it really is limited to
preserving deleted files (though it's possible that you could extend it
to cover incremental updates as well).  If a full-fledged snapshot has
too high an overhead to be left to the discretion of common users, that's
even more reason to try to implement some form of undelete facility
that's lighter in weight.

- bill
 
 
Pardon me if this scenario has been discussed already, but I haven't
seen anything as yet.

I'd like to request a 'zpool evacuate pool <device>' command.
'zpool evacuate' would migrate the data from a disk device to other
disks in the pool.

Here's the scenario:

Say I have a small server with 6x146g disks in a jbod configuration. If
I mirror the system disk with SVM (currently) and allocate the rest as a
non-raidz pool, I end up with 4x146g in a pool of approximately 548gb
capacity.

If one of the disks is starting to fail, I would need to use 'zpool
replace pool old-disk new-disk'. However, since I have no more slots in
the machine to add a replacement disk, I'm stuck.

This is where a 'zpool evacuate pool <device>' would come in handy. It
would allow me to evacuate the failing device so that it could be
replaced and re-added with 'zpool add pool <device>'.

What does the group think?

Thanks!

-----
Gregory Shaw, IT Architect
Phone: (303) 673-8273        Fax: (303) 673-8273
ITCTO Group, Sun Microsystems Inc.
1 StorageTek Drive MS 4382              greg.shaw at sun.com (work)
Louisville, CO 80028-4382                 shaw at fmsoft.com (home)
"When Microsoft writes an application for Linux, I've Won." - Linus Torvalds
>Hmm, I think I'd rather see this built into programs, such as 'rm', rather
>than into the filesystem itself.
>
>For example, if I'm using ZFS for my OpenSolaris development, I might want
>to enable this delete-history, just in case I rm a .c file that I need.
>
>But I don't want to keep a history of .o, .a or executable files created,
>either.

And rm would know this how?

The assumption you make seems to be that .a and .o files are never
valuable, whereas they may be; I believe *BSD used some form of "don't
archive this" bit to achieve this goal; the compiler/linker would set
this bit on the files they created but it would not be automatically
copied.

>Which brings me to the next point which is to say that there is probably
>a need for a "never snapshot" and "always snapshot" masks for matching
>files against.

I don't see how you can determine this on the basis of the file's name
or contents.

You can determine this on the basis of how you got this file; was it
produced by the compiler, assembler, ld, yacc, lex, rpcgen, javac?

Since the number of such programs seems rather small, and the default is
that you want to keep a file, perhaps that is the way forward.

Or you could say that you know that certain sets of processes generate
repeatable results, such as "make" and its children, and make would set
something inheritable in the process word which would mark all files
created during those processes as disposable.

Casper
On 11/06/06, Gregory Shaw <Greg.Shaw at sun.com> wrote:
> Pardon me if this scenario has been discussed already, but I haven't
> seen anything as yet.
>
> I'd like to request a 'zpool evacuate pool <device>' command.
> 'zpool evacuate' would migrate the data from a disk device to other
> disks in the pool.
>
> Here's the scenario:
>
> Say I have a small server with 6x146g disks in a jbod
> configuration. If I mirror the system disk with SVM (currently) and
> allocate the rest as a non-raidz pool, I end up with 4x146g in a pool
> of approximately 548gb capacity.
>
> If one of the disks is starting to fail, I would need to use 'zpool
> replace new-disk old-disk'. However, since I have no more slots in
> the machine to add a replacement disk, I'm stuck.
>
> This is where a 'zpool evacuate pool <device>' would come in handy.
> It would allow me to evacuate the failing device so that it could be
> replaced and re-added with 'zpool add pool <device>'.

That makes sense to me - seems a good parallel to what pvmove(8) does on
Linux LVM. Useful not just for imminent failure, but whenever you need
to free up a physical partition (you realise you need to dual-boot a
laptop, for example).

I suppose this is only useful in the 'concatenated disk' (~raid0) case
(as you could just pull the disk otherwise).

-- 
Rasputin :: Jack of All Trades - Master of Nuns
http://number9.hellooperator.net/
This only seems valuable in the case of an unreplicated pool. We already have 'zpool offline' to take a device and prevent ZFS from talking to it (because it's in the process of failing, perhaps). This gives you what you want for mirrored and RAID-Z vdevs, since there's no data to migrate anyway.

We are also planning on implementing 'zpool remove' (for more than just hot spares), which would allow you to remove an entire toplevel vdev, migrating the data off of it in the process. This would give you what you want for the case of an unreplicated pool.

Does this satisfy the usage scenario you described?

- Eric

On Sun, Jun 11, 2006 at 07:52:37AM -0600, Gregory Shaw wrote:
> Pardon me if this scenario has been discussed already, but I haven't
> seen anything as yet.
>
> I'd like to request a 'zpool evacuate pool <device>' command.
> 'zpool evacuate' would migrate the data from a disk device to other
> disks in the pool.
>
> Here's the scenario:
>
> Say I have a small server with 6x146g disks in a jbod
> configuration. If I mirror the system disk with SVM (currently) and
> allocate the rest as a non-raidz pool, I end up with 4x146g in a pool
> of approximately 548gb capacity.
>
> If one of the disks is starting to fail, I would need to use 'zpool
> replace new-disk old-disk'. However, since I have no more slots in
> the machine to add a replacement disk, I'm stuck.
>
> This is where a 'zpool evacuate pool <device>' would come in handy.
> It would allow me to evacuate the failing device so that it could be
> replaced and re-added with 'zpool add pool <device>'.
>
> What does the group think?
>
> Thanks!
>
> -----
> Gregory Shaw, IT Architect
> Phone: (303) 673-8273 Fax: (303) 673-8273
> ITCTO Group, Sun Microsystems Inc.
> 1 StorageTek Drive MS 4382 greg.shaw at sun.com (work)
> Louisville, CO 80028-4382 shaw at fmsoft.com (home)
> "When Microsoft writes an application for Linux, I've Won." - Linus Torvalds

-- 
Eric Schrock, Solaris Kernel Development http://blogs.sun.com/eschrock
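To make the 'evacuate' / 'zpool remove' idea above concrete, here is a minimal toy model in Python. It is only a sketch under stated assumptions: the Device class, the block lists and the "most free space first" placement rule are all hypothetical and say nothing about how ZFS would actually migrate data off a toplevel vdev. It just shows the basic constraint that the remaining devices must have room for everything on the device being removed.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """Hypothetical stand-in for an unreplicated toplevel vdev."""
    name: str
    capacity: int                               # capacity in blocks
    blocks: list = field(default_factory=list)  # block ids stored here

    @property
    def free(self) -> int:
        return self.capacity - len(self.blocks)


def evacuate(pool: list, victim_name: str) -> list:
    """Move every block off the named device and return the shrunken pool."""
    matches = [d for d in pool if d.name == victim_name]
    if not matches:
        raise KeyError(victim_name)
    victim = matches[0]
    survivors = [d for d in pool if d is not victim]

    # Refuse up front if the data cannot fit on the remaining devices.
    if sum(d.free for d in survivors) < len(victim.blocks):
        raise RuntimeError("not enough free space on remaining devices")

    while victim.blocks:
        # Assumed policy: place each block on the emptiest survivor.
        target = max(survivors, key=lambda d: d.free)
        target.blocks.append(victim.blocks.pop())
    return survivors
```

With three small devices, evacuate(pool, "d2") leaves a two-device pool holding every block that was there before; if the survivors are too full, it fails before moving anything.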
On Jun 11, 2006, at 03:21, can you guess? wrote:
> My dim recollection is that TOPS-10 implemented its popular (but
> again <100%) undelete mechanism using the same kind of
> 'space-available' approach suggested here. It did, however, support
> explicit 'delete - I really mean it' facilities to help keep
> unwanted detritus from shouldering out more desirable bits
> ('expunge' being the applicable incantation, which had an
> appropriate ring of finality to it). Tying into user quotas such
> that one user can't drive another user's most-recently-deleted
> content out of the system seems implicit in eschrock's comments.

There's also Plan 9's Venti [1]:

> Venti is a network storage system that permanently stores data
> blocks. A 160-bit SHA-1 hash of the data (called score by Venti)
> acts as the address of the data. This enforces a write-once policy
> since no other data block can be found with the same address. The
> addresses of multiple writes of the same data are identical, so
> duplicate data is easily identified and the data block is stored
> only once. Data blocks cannot be removed, making it ideal for
> permanent or backup storage. Venti is typically used with Fossil to
> provide a file system with permanent snapshots.

Venti works in combination with Fossil [2]:

> Fossil is the default file system in Plan 9 from Bell Labs. It
> serves the network protocol 9P and runs as a user space daemon,
> like most Plan 9 file servers. Fossil is different from most other
> file system due to its snapshot/archival feature. It can take
> snapshots of the entire file system on command or with an interval.
> These snapshots can be kept on the Fossil partition as long as disk
> space allows; if the partition fills up old snapshots will be
> removed to free up disk space. A snapshot can also be saved
> permanently to Venti. Fossil and Venti are typically installed
> together.

[1] http://en.wikipedia.org/wiki/Venti
[2] http://en.wikipedia.org/wiki/Fossil_%28file_system%29
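The write-once, address-by-hash behaviour described in the Venti quote can be sketched in a few lines of Python. This is an in-memory illustration only: the ContentStore class and its method names are made up for the example and are not Venti's interface. The only thing taken from the description is that the SHA-1 of a block is its address, so identical writes land on the same address and are stored once.

```python
import hashlib


class ContentStore:
    """Toy content-addressed block store (hypothetical, in-memory only)."""

    def __init__(self):
        self._blocks = {}  # score (hex SHA-1) -> block data

    def write(self, data: bytes) -> str:
        # The 160-bit SHA-1 of the data is its address ("score").
        score = hashlib.sha1(data).hexdigest()
        # Write-once: an existing block at this address is never replaced.
        self._blocks.setdefault(score, data)
        return score

    def read(self, score: str) -> bytes:
        return self._blocks[score]

    def __len__(self) -> int:
        return len(self._blocks)
```

Writing the same bytes twice returns the same score and leaves a single stored block, which is the deduplication property the quote mentions. There is deliberately no delete method, matching "data blocks cannot be removed".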
> But it is likely that in at least some situations promiscuously
> retaining *everything* even for a limited time would be a real
> problem, and that in a lot more it would be at least sub-optimal.
> Creating a directory attribute inheritable by subdirectories and
> files controlling temporary undelete-style preservation would help
> (one could also consider per-file-type controls, though file
> extensions may not be ideal hooks and I don't know whether ZFS uses
> file attributes to establish types).
>
> Since this is essentially a per-file mechanism, it really shouldn't
> require the level of system-wide flush-synchronization that a formal
> snapshot requires, should it? Especially if it really is limited to
> preserving deleted files (though it's possible that you could extend
> it to cover incremental updates as well). If a full-fledged snapshot
> has too high an overhead to be left to the discretion of common
> users, that's even more reason to try to implement some form of
> undelete facility that's lighter in weight.

Hmm, I think I'd rather see this built into programs, such as 'rm', rather than into the filesystem itself.

For example, if I'm using ZFS for my OpenSolaris development, I might want to enable this delete-history, just in case I rm a .c file that I need.

But I don't want to keep a history of .o, .a or executable files created, either. I want "make clean" or "make clobber" to not cause things to be kept around.

Which brings me to the next point, which is that there is probably a need for "never snapshot" and "always snapshot" masks for matching files against.

Darren
Casper.Dik at Sun.COM wrote:
>>Hmm, I think I'd rather see this built into programs, such as 'rm', rather
>>than into the filesystem itself.
>>
>>For example, if I'm using ZFS for my OpenSolaris development, I might want
>>to enable this delete-history, just in case I rm a .c file that I need.
>>
>>But I don't want to keep a history of .o, .a or executable files created,
>>either.
>>
>
>And rm would know this how?
>
>The assumption you make seems to be that .a and .o files are never valuable
>where they may be; I believe *BSD used some form of "don't archive this" bit to
>achieve this goal; the compiler/linker would set this bit on the files they
>created but it would not be automatically copied.
>
>
>>Which brings me to the next point which is to say that there is probably
>>a need for a "never snapshot" and "always snapshot" masks for matching
>>files against.
>>
>
>I don't see how you can determine this on the basis of the file's name or
>contents. You can determine this on the basis of how you got this file;
>was it produced by the compiler, assembler, ld, yacc, lex, rpcgen, javac?
>
>Since the number of such programs seems rather small, and the default is that
>you want to keep a file, perhaps that is the way forward. Or you could
>say that you know that certain sets of processes generate repeatable results,
>such as "make" and its children and make would set something inheritable
>in the process word which would mark all files created during that process
>as disposable.
>

I think the idea you've suggested here, setting an extra bit or property on the file as a part of the work flow, is a better idea than the one I had in mind.

Is passing a new flag through open(2) a way to achieve this, such as O_DISPOSABLE (that is ignored by filesystems that don't have any way to handle it), or should tools such as make check to see if they're creating a file on ZFS and set the extra bit appropriately with another system call?

Darren
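The two variants discussed above (a per-file "disposable" bit set at creation time, and an inheritable mark that make would set for everything it creates) can be modelled with a small Python toy. Everything here is hypothetical: ToyFS, the disposable argument and disposable_outputs() are illustration names only, and no such flag exists in open(2) or ZFS. The sketch only shows the intended effect on undelete: unlinking a disposable file keeps nothing, unlinking anything else preserves a copy.

```python
from contextlib import contextmanager


class ToyFS:
    """In-memory model of a filesystem with an undelete area (hypothetical)."""

    def __init__(self):
        self.files = {}      # path -> (data, disposable flag)
        self.deleted = {}    # path -> data kept for undelete
        self._mark_disposable = False

    @contextmanager
    def disposable_outputs(self):
        """Model of the inheritable mark: files created inside are disposable."""
        previous = self._mark_disposable
        self._mark_disposable = True
        try:
            yield
        finally:
            self._mark_disposable = previous

    def create(self, path, data, disposable=False):
        # The explicit argument plays the role of the proposed O_DISPOSABLE.
        self.files[path] = (data, disposable or self._mark_disposable)

    def unlink(self, path):
        data, disposable = self.files.pop(path)
        if not disposable:
            # Only files worth keeping go into the undelete area.
            self.deleted[path] = data
```

For example, creating main.c normally and main.o inside "with fs.disposable_outputs():" and then unlinking both leaves only main.c recoverable, which is the "make clean should not fill up the delete history" behaviour asked for earlier in the thread.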
Yes, if zpool remove works like you describe, it does the same thing. Is there a time frame for that feature?

Thanks!

On Jun 11, 2006, at 10:21 AM, Eric Schrock wrote:
> This only seems valuable in the case of an unreplicated pool. We
> already have 'zpool offline' to take a device and prevent ZFS from
> talking to it (because it's in the process of failing, perhaps). This
> gives you what you want for mirrored and RAID-Z vdevs, since there's
> no data to migrate anyway.
>
> We are also planning on implementing 'zpool remove' (for more than
> just hot spares), which would allow you to remove an entire toplevel
> vdev, migrating the data off of it in the process. This would give
> you what you want for the case of an unreplicated pool.
>
> Does this satisfy the usage scenario you described?
>
> - Eric

-----
Gregory Shaw, IT Architect
Phone: (303) 673-8273 Fax: (303) 673-8273
ITCTO Group, Sun Microsystems Inc.
1 StorageTek Drive MS 4382 greg.shaw at sun.com (work)
Louisville, CO 80028-4382 shaw at fmsoft.com (home)
"When Microsoft writes an application for Linux, I've Won." - Linus Torvalds
Darren Reed wrote:
> I think the idea you've suggested here, setting an extra
> bit or property on the file as a part of the work flow
> is a better idea than the one I had in mind.

Which is, I believe, covered by one or more of the following CRs:

5105713 want new security attributes on files
6417435 DOS attributes and additional timestamps to support for CIFS
4058737 RFE: new attributes for ufs

Obviously this needs a ZFS equivalent.

> Is passing a new flag through open(2) a way to achieve this,
> such as O_DISPOSABLE (that is ignored by filesystems that
> don't have any way to handle it), or should tools such as
> make check to see if they're creating a file on ZFS and set
> the extra bit appropriately with another system call?

Or use the acl(2) interface or the openat(2) interface, depending on how these are implemented.

-- 
Darren J Moffat
Just wondered if there'd been any progress in this area?

Correct me if I'm wrong, but as it stands, there's no way to remove a device you accidentally 'zpool add'ed without destroying the pool.

On 12/06/06, Gregory Shaw <Greg.Shaw at sun.com> wrote:
> Yes, if zpool remove works like you describe, it does the same
> thing. Is there a time frame for that feature?
>
> Thanks!

-- 
Rasputin :: Jack of All Trades - Master of Nuns
http://number9.hellooperator.net/
A zpool remove/shrink type function is on our list of features we want to add. We have RFE

4852783 reduce pool capacity

open to track this.

Noel

Dick Davies wrote:
> Just wondered if there'd been any progress in this area?
>
> Correct me if i'm wrong, but as it stands, there's no way
> to remove a device you accidentally 'zpool add'ed without
> destroying the pool.
Hello Noel,
Wednesday, June 28, 2006, 5:59:18 AM, you wrote:
ND> a zpool remove/shrink type function is on our list of features we want
ND> to add.
ND> We have RFE
ND> 4852783 reduce pool capacity
ND> open to track this.
Is there someone actually working on this right now?
-- 
Best regards,
 Robert                            mailto:rmilkowski at task.gda.pl
                                       http://milek.blogspot.com
Hey Robert,

Well, not yet. Right now our top two priorities are improving performance in multiple areas of zfs (soon there will be a performance page tracking progress on the zfs community page), and also getting zfs boot done. Hence, we're not currently working on heaps of brand new features. So this is definitely on our list, but it is not being worked on yet.

Noel

Robert Milkowski wrote:
> Hello Noel,
>
> Wednesday, June 28, 2006, 5:59:18 AM, you wrote:
>
> ND> a zpool remove/shrink type function is on our list of features we want
> ND> to add.
> ND> We have RFE
> ND> 4852783 reduce pool capacity
> ND> open to track this.
>
> Is there someone actually working on this right now?