Hi Tim (Foster), hi all,

First of all: Tim, thank you very much for your very useful auto-snapshot script. I believe this really is what every ZFS user needs.

As a laptop user, I wanted to make sure that I get snapshots taken even if my machine is down every night, so I added at scheduling support to zfs-auto-snapshot, where each job re-schedules its next execution, thus ending up with cron-like behaviour. The main advantage is that the standard Solaris cron also runs at jobs retrospectively, so we can live without anything like anacron. The main disadvantage is that with current onnv versions, at jobs are broken in cron, so before working on zfs-auto-snapshot, that's what I had to fix first (or at least revert to old code) - see http://www.opensolaris.org/jive/thread.jspa?threadID=64739 .

I'll upload zfs-auto-snapshot with my changes here, because I can't post files to Tim's blog (or can I?). Tim, feel free to integrate my suggestions or not - I won't feel offended if you don't - but at any rate I am very happy that you maintain this tool.

Thanks again, Nils

This message posted from opensolaris.org
and the tar file ...

-------------- next part --------------
A non-text attachment was scrubbed...
Name: zfs-auto-snapshot-0.10_atjobs.tar.bz2
Type: application/x-bzip2
Size: 13290 bytes
Desc: not available
URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20080624/d5d6616e/attachment.bin>
And how about making this an official project?
Tim Foster
2008-Jul-30 11:23 UTC
[zfs-discuss] zfs-auto-snapshot 0.11 work (was Re: zfs-auto-snapshot with at scheduling )
Hey Nils & everyone

Finally getting around to answering Nils' mail properly - only a month late! I thought I'd also let everyone else know what's been going on with the service, since 0.10 released in January this year.

On Tue, 2008-06-24 at 14:40 -0700, Nils Goroll wrote:
> first of all: Tim, thank you very much for your very useful auto-snapshot script.
> I believe this really is what every ZFS user needs.

Glad you like it!

> As a laptop user, I wanted to make sure that I get snapshots taken even if my
> machine is down every night, so I added at scheduling support to zfs-auto-snapshot,
> where each job re-schedules its next execution, thus ending up with a cron-like
> behaviour.

Okay, after careful consideration, I don't think I'm going to add this into the code. I absolutely understand the reasoning behind it, but in cases where you're powering down a laptop overnight, you don't want to take a load of snapshots after you power on for every missed cron job - you just want one (assuming your laptop hasn't magically changed data-on-disk while it was asleep!)

The other thing that worries me about this is that it exposes the user to too much implementation detail: users would need to know about at(1) timespecs, so for the same reason I didn't add support to allow people to add their own crontab time strings, I'm not sure this is good either.

All that said, the idea of catching the dates when the cron job failed to fire, and somehow doing something about it, is a good one - so instead, I think we should add a new "zfs/interval" property value: "none". In the manifest, instead of setting the interval to "hours", "months", "days" or "minutes", setting "none" would mean that a cron job wouldn't be scheduled for that instance of the service.

Why's that good? Well, then other scripts, events, etc. could still manually fire the method script, eg:

  $ /lib/svc/method/zfs-auto-snapshot svc:/system/filesystem/zfs-auto-snapshot:login

which would cause snapshots to be taken, under the policies set down by that instance. Perhaps you'd like them to fire on login to a desktop, on booting your laptop, or on connecting to a network - really whatever event you think is interesting. This would mean you'd still get all the functionality the service provides (rolling snapshots, off-site send/recv, avoiding scrub (though that bug's fixed now)) and you can control which filesystems get included in that instance as usual via the "//" "zfs/fs-name" property, or by hard-coding them in the instance itself.

As for the other changes you suggested, I've already put some slightly better svcprop caching code in, but just not your implementation (something about the block comment:

  ## NOTE/WARNING: THIS IS NOT GUARANTEED TO BE PORTABLE
  ## TO OTHER SHELLS ! Depends on whether your shell runs
  ## the while loop in a subshell or not !

made me a bit nervous! :-) I think the implementation I've got is okay, but I'm still testing it.

The other changes that will appear in 0.11 (which is nearly done) are:

* Changing the way the default "//" fs-name works - I decided to make the default instances that use this feature always snapshot recursively, for performance reasons. In my day-to-day use, whenever I'm using this feature, I nearly always end up snapshotting all child filesystems. So, the canned instances now set the "zfs/snapshot-children" property. When searching for filesystems that should be snapshotted, we only consider ones that have a locally-set property, rather than an inherited one. The down side of this is that if a filesystem further down the hierarchy sets com.sun:auto-snapshot=false, it'll still get included in the snapshots, because of the recursive flag. Turning off "zfs/snapshot-children" will get back the old behaviour, where we manually walk the dynamically generated list of filesystems to snapshot. It's slow for large numbers of children.

* Slightly saner preinstall/postremove scripts (though they still suck)

* Better shell style - I figured if I'm aiming to get this into ON someday, I might as well clean up the code now

* Bugfix reported by "Dan on March 13, 2008 at 02:29" http://blogs.sun.com/timf/entry/zfs_automatic_snapshots_0_10#comment-1205418551000

* RBAC - I've been annoyed for a while that the service runs as root all the time, and since my new day-job is actually requiring me to read up on RBAC anyway, I figured now's a good time to have it do the right thing. On first look, I was thinking of creating a "ZFS Automatic Snapshot" profile and a default user (or role? can roles run cron jobs?), which would have similar abilities to the "ZFS File System Management" profile, but would also allow the user to stop/start the service, and run a given script which would do backups. I'm still reading up on RBAC, so the above may change!

Finally, in terms of wider exposure of this service - I'm talking to some OpenSolaris desktop guys about potentially using this service in the next Indiana release:

http://mail.opensolaris.org/pipermail/indiana-discuss/2008-July/007916.html
-> http://www.opensolaris.org/os/project/indiana/resources/problem_statement/

talks about:

  DSK-5: Provide a graphical interface to allow the user to regularly back
  up their data using ZFS snapshots. The user should be able browse their
  snapshots over time, and store them remotely if desired.

so I think it'd be really cool if they could use this service on the backend, but it's still under discussion.

Anyway, long mail - hope this is of interest!

cheers,
tim
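[Editor's note: the "locally-set property, rather than an inherited one" search Tim describes maps directly onto zfs get's source filter (`zfs get -s local`). The sketch below is an illustration, not the service's actual code; the pool and filesystem names are made up, and the zfs invocation is shown only as a comment so the filtering logic stands on its own.]

```shell
# The real query would be something like:
#   zfs get -H -t filesystem -s local -o name,value com.sun:auto-snapshot
# which prints tab-separated "name<TAB>value" lines, one per filesystem
# that has the property set *locally* (inherited values are skipped).
#
# Given that output, keep only the filesystems set to "true":
list_snapshot_roots() {
    # $1: tab-separated "name value" lines, as zfs get -H would print them
    echo "$1" | while IFS="$(printf '\t')" read -r name value; do
        if [ "$value" = "true" ]; then
            echo "$name"
        fi
    done
}

# Example with canned output (made-up dataset names):
sample="$(printf 'tank/home\ttrue\ntank/scratch\tfalse\ntank/vm\ttrue')"
list_snapshot_roots "$sample"
```

Run against the canned sample, only tank/home and tank/vm survive the filter; tank/scratch is excluded because its locally-set value is "false".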
Nils Goroll
2008-Jul-31 20:45 UTC
[zfs-discuss] zfs-auto-snapshot 0.11 work (was Re: zfs-auto-snapshot with at scheduling)
Hi Tim,

> Finally getting around to answering Nil's mail properly - only a month
> late!

Not a problem.

> Okay, after careful consideration, I don't think I'm going to add this

that's fine for me, but ...

> but in cases where you're powering down a laptop overnight,
> you don't want to just take a load of snapshots after you power on for
> every missed cron job, you just want one

This is precisely what the at solution is doing: as there is only one at job for each zfs snapshot SMF instance at any one time, only one snapshot is taken per SMF instance when the machine powers up - which IS what you want: when you say daily, you want it taken daily (if possible). Briefly: each at job schedules the next one.

> The other thing that worries me about this, is that it exposes the user
> to too much implementation detail: users would need to know about at(1)
> timespecs

They would not necessarily need to. We could hide that implementation detail. If this was the only thing you disliked, I'd be happy to develop a simpler specification. But please do consider that some people (admins!) DO want to specify these things; I believe you should also keep an eye on datacentre administrators, not just "home users".

> Why's that good ? Well, then other scripts, events, etc. could still
> manually fire the method script, eg:
>
>   $ /lib/svc/method/zfs-auto-snapshot
>     svc:/system/filesystem/zfs-auto-snapshot:login

This is not what I (personally) want. I want to be able to specify things like daily and still get snapshots taken even if the machine is down for regular intervals.

> As for the other changes you suggested, I've already put some slightly
> better svcprop caching code in, but just not your implementation
> ( something about the block comment:) ...

I like to be honest... ;-) Hardly any shell code is portable, but I was somehow expecting you to step over this thing... I'll have a look at your solution when it's out.

> can roles run cron jobs ?),

No. You need a user who can take on the role.

Thanks again, and keep up the good work (and please think again about the at-vanteges ;-)

Nils
Tim Foster
2008-Jul-31 21:09 UTC
[zfs-discuss] zfs-auto-snapshot 0.11 work (was Re: zfs-auto-snapshot with at scheduling)
Hey Nils,

Nils Goroll wrote:
>> but in cases where you're powering down a laptop overnight,
>> you don't want to just take a load of snapshots after you power on for
>> every missed cron job, you just want one
>
> This is precisely what the at solution is doing: As there is only one
> at job for each zfs snapshot SMF at any one time, only one snapshot
> is taken for every SMF when the machine powers up - which IS what
> you want : When you say daily, you want it taken daily (if possible).

Ok, from talking to the desktop guys, they're looking for exactly this functionality as well, and I agree it'd be useful. So, I've got a pretty basic solution:

Every time the service starts, we check for the existence of a snapshot that was taken under the current schedule that's no more than <frequency> <interval>s old - if one doesn't exist, then we take a snapshot under the policy set down by that instance.

So, in the case of daily snapshots, where a user shuts down a machine at 23:59, missing the cron job firing, as soon as the service starts again (probably when the machine reboots) it checks for a snapshot no older than 1 day in the past, and takes a snapshot if the last snapshot was (say) 24hrs 1min ago. The same would apply for monthly, weekly, every-3-days, etc.

> But please do consider that some people (admins !) DO want to specify these
> things, I believe you should also keep an eye on datacentre administrators, not
> just "home users".

Hard to please everyone! If you felt like it, it'd be great to get the "offset" property working - that'd make the use of cron a lot more flexible for admins, I think.

> This is not what I (personally) want. I want to be able to specify
> things like daily and still get snapshots taken even if the machine
> is down for regular intervals.

Would the conditional-snapshot-on-start-method solution work for you? I know it requires a service start for it to work, but in the case of a snapshot failing for any reason, the service would drop to maintenance, and you'd be restarting the service anyway: so good for both laptop users and sysadmins, I hope?

>> As for the other changes you suggested, I've already put some slightly
>> better svcprop caching code in, but just not your implementation
>> ( something about the block comment:) ...
>
> I like to be honest... ;-) Hardly any shell code is portable, but I
> was somehow expecting you to step over this thing... I'll have a
> look at your solution when it's out.

I've attached some sample code - see what you think. It just takes the smf properties and converts them into environment variables, so we'd only ever call svcprop once.

>> can roles run cron jobs ?),
>
> No. You need a user who can take on the role.

Darn, back to the drawing board.

> Thanks again, and keep up the good work (and please think again about
> the at-vanteges ;-)

No worries!

cheers,
tim

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: t.ksh
URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20080731/d34ea006/attachment.ksh>
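[Editor's note: the t.ksh attachment was scrubbed by the archive, so its contents are unknown. The idea Tim describes - call svcprop once and turn its output into environment variables - can be sketched roughly as below. This is an illustration only (the property names, the SMF_zfs_ prefix, and the helper name are all made up), and it deliberately shows the naive approach whose weaknesses are debated later in the thread.]

```shell
# Naive svcprop cache: convert "property type value" lines (as printed by
# e.g. `svcprop -p zfs <fmri>`) into shell variable assignments.
# Assumes values contain no quotes or shell metacharacters.
cache_props() {
    # $1: svcprop output
    echo "$1" | while read -r prop _type value; do
        # strip the "zfs/" prefix and map '-' to '_' so the name is a
        # legal shell identifier (note: this mapping can collide!)
        name=$(echo "$prop" | sed -e 's,^zfs/,,' -e 's/-/_/g')
        echo "SMF_zfs_${name}=\"${value}\""
    done
}

# Example with canned svcprop output (made-up properties):
sample="zfs/interval astring hours
zfs/snapshot-children boolean true"
eval "$(cache_props "$sample")"
echo "$SMF_zfs_interval"
```

After the eval, every property is available as a plain variable, so the method script never has to shell out to svcprop again during one run.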
Darren J Moffat
2008-Aug-01 08:50 UTC
[zfs-discuss] zfs-auto-snapshot 0.11 work (was Re: zfs-auto-snapshot with at scheduling)
Tim Foster wrote:
>>> can roles run cron jobs ?),
>>
>> No. You need a user who can take on the role.
>
> Darn, back to the drawing board.

I don't have all the context on this, but Solaris RBAC roles *can* run cron jobs. Roles don't have to have users assigned to them.

Roles normally have passwords, and accounts that have valid passwords can run cron jobs. To create an account that can not login but can run cron jobs, use:

  passwd -N username

Examples of such accounts are sys, adm, lp, postgres. Accounts that are locked (by passwd -l username) can not run cron jobs.

Tim, feel free to explain to me offline what it was you were trying to use roles for.

--
Darren J Moffat
Nils Goroll
2008-Aug-03 12:24 UTC
[zfs-discuss] cron and roles (zfs-auto-snapshot 0.11 work)
My previous reply via email did not get linked to this post, so let me resend it:

>>>> can roles run cron jobs ?),
>>> No. You need a user who can take on the role.
>> Darn, back to the drawing board.
> I don't have all the context on this but Solaris RBAC roles *can* run
> cron jobs. Roles don't have to have users assigned to them.
>
> Roles normally have passwords and accounts that have valid passwords
> can run cron jobs.

Sorry for the confusion, and thanks for the clarification - I was thinking in the old nomenclature. Cron needs an *account*.
Nils Goroll
2008-Aug-03 12:57 UTC
[zfs-discuss] zfs-auto-snapshot: Use at ? SMF prop caching?
Hi Tim,

> So, I've got a pretty basic solution:
>
> Every time the service starts, we check for the existence of a snapshot
> [...] - if one doesn't exist, then we take a snapshot under the policy set
> down by that instance.

This does sound like a valid alternative solution for this requirement if you want to avoid using "at", though it will involve additional complexity for parsing timestamps of existing snapshots and calculating intervals, which I think is not that trivial in shells (consider timezone changes, leap years etc).

Also, "at" can express intervals which are not expressible with crontabs, so keeping at schedules as an additional feature could be advantageous - which would be a solution to the shortcoming you have documented in the code:

  # Adding a cron job that runs exactly every x time-intervals is hard to do
  # properly.

> Hard to please everyone! If you felt like it, it'd be great to get the
> "offset" property working - that'd make the use of cron a lot more
> flexible for admins I think.

OK, I'll let you know when (if) I start working on it, so we don't do double work.

> Would the conditional-snapshot-on-start-method solution work for you?

I think so; on the other hand, I don't see why exactly you want to avoid supporting "at" as well.

> I've attached some sample code - see what you think.

This is basically a simpler version of the same idea - put svcprops in variables. There are a couple of obstacles here:

- If you create variables with the names of svc properties, you run into the issue that shell variables can't contain all characters valid for svc properties, which you then need to work around (you are using sed to filter out some characters, e.g. by mapping - to _, but this will map more than one svcprop onto the same cache entry, which might work for zfs-auto-snapshot but is not a general solution). My suggested code uses associative arrays, which don't have this limitation.

- For your solution, how do you invalidate the cache if a property is being changed or deleted (this is trivial, but not yet implemented)?

- Does your solution handle white space, quotes etc. in svcprop values properly (I think there is an issue regarding white space, but I have not tested it)?

- Does your solution impose a security risk? (consider the eval $NAME)

Cheers, Nils
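[Editor's note: the associative-array alternative Nils refers to keeps the raw property name as the lookup key, so two distinct properties can never collide the way they can under a dash-to-underscore mangling. A minimal sketch of that idea (bash 4+/ksh93 syntax; the property names and values here are invented for illustration, not taken from either posted implementation):]

```shell
#!/usr/bin/env bash
# Cache svcprop output in an associative array keyed by the verbatim
# property name. No name mangling, hence no collisions, and no eval.
declare -A PROPCACHE

load_props() {
    # $1: svcprop-style output, one "property type value" triple per line.
    # A here-string is used instead of a pipe so the while loop runs in
    # the current shell and the array assignments survive the loop.
    while read -r prop _type value; do
        [ -n "$prop" ] && PROPCACHE["$prop"]="$value"
    done <<< "$1"
}

# Two property names that a '-' -> '_' mapping would merge into one:
sample="zfs/fs-name astring //
zfs/fs_name astring not-the-same"
load_props "$sample"
echo "${PROPCACHE[zfs/fs-name]}"
echo "${PROPCACHE[zfs/fs_name]}"
```

With the mangling approach both sample properties would land in the same cache slot; here each keeps its own entry.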
Rob
2008-Aug-06 22:58 UTC
[zfs-discuss] zfs-auto-snapshot 0.11 work (was Re: zfs-auto-snapshot with at scheduling )
> The other changes that will appear in 0.11 (which is
> nearly done) are:

Still looking forward to seeing 0.11 :) Think we can expect a release soon? (Or at least svn access, so that others can check out the trunk?)
Tim Foster
2008-Aug-08 08:49 UTC
[zfs-discuss] zfs-auto-snapshot: Use at ? SMF prop caching?
Hey Nils,

On Sun, 2008-08-03 at 05:57 -0700, Nils Goroll wrote:
> This does sound like a valid alternative solution for this requirement if you
> want to avoid using "at", though this will involve additional complexity for
> parsing timestamps of existing snapshots and calculating intervals, which
> I think is not that trivial in shells (consider timezone changes, leap years
> etc).

Right - the code I have is good enough: you're absolutely correct that it drifts across unequal months and leap years, but I'm happy with it. The code I have right now does basically this: get the last snapshot taken for this schedule, then:

----
# all calculations done according to time since epoch.
LAST_SNAP_TIME=$(zfs get -H -p -o value creation $LAST_SNAPSHOT)
LAST_SNAP_TIME_HUMAN=$(zfs get -H -o value creation $LAST_SNAPSHOT)
NOW=$(perl -e 'print time;')

MINUTE_S=60
HOUR_S=$(( $MINUTE_S * 60 ))
DAY_S=$(( $HOUR_S * 24 ))
MONTH_S=$(( $DAY_S * 30 ))

case $INTERVAL in
    "minutes")
        MULTIPLIER=$MINUTE_S
        ;;
    "hours")
        MULTIPLIER=$HOUR_S
        ;;
    "days")
        MULTIPLIER=$DAY_S
        ;;
    "months")
        MULTIPLIER=$MONTH_S
        ;;
    "none")
        return 0
        ;;
    *)
        print_log "WARNING - unknown interval encountered in check_missed_snapshots!"
        return 1
        ;;
esac

PERIOD_S=$(( $MULTIPLIER * $PERIOD ))
AGO=$(( $NOW - $LAST_SNAP_TIME ))

if [ $AGO -gt $PERIOD_S ] ; then
    print_log "Last snapshot for $FMRI taken on $LAST_SNAP_TIME_HUMAN"
    print_log "which was greater than the $PERIOD $INTERVAL schedule. Taking snapshot now."
    take_snapshot $FMRI
fi
----

(Editorial fixes to the snippet as posted: the final case pattern was quoted as "*", which only matches a literal asterisk rather than acting as a default branch; the warning branch was missing its ";;"; $LAST_SNAP_TIME_HUMAN was missing its "$" in the log message; and MONTH_S was defined but never used, so a "months" branch has been added.)

Suggestions welcome?

> # Adding a cron job that runs exactly every x time-intervals is hard to do
> # properly.

Absolutely.

>> Hard to please everyone! If you felt like it, it'd be great to get the
>> "offset" property working - that'd make the use of cron a lot more
>> flexible for admins I think.
>
> OK, I'll let you know when (if) I start working on it so we don't do double work.

Thanks!

>> Would the conditional-snapshot-on-start-method solution work for you?
>
> I think so, on the other hand I don't see why exactly you want to avoid supporting
> "at" as well.

I'd like to avoid adding at(1) support because, while I think it's a pretty neat hack, I think it also duplicates functionality, causes more maintenance and doesn't have a clean interface with the rest of the service (imho). If cron not being expressive enough is the problem, yet cron is an already accepted way of running periodic system services, wouldn't it make more sense to spend time getting cron up to spec on Solaris than having to always hack around it?

>> I've attached some sample code - see what you think.
>
> This is basically a simpler version of the same idea - put svcprops in variables.
> There are a couple of obstacles here:
>
> - If you create variables with the names of svc properties, you run into the
> issue that shell variables can't contain all characters valid for svc properties,
> which you need to work around then (you are using sed to filter out some
> characters (e.g. by mapping - to _), but this will map more than one svc-
> prop onto the same cache entry, which might work for zfs-auto-snapshot,
> but is not a general solution).

I'm not sure it needs to be a general solution, it just needs to work for this service. I'm only filtering the key names, though; I think this should be safe.

> My suggested code uses associative arrays which don't have this limitation.
>
> - For your solution, how do you invalidate the cache if a property is being
> changed or deleted (this is trivial, but not yet implemented)?

Right - the cache gets created once at the beginning of a method call; if a user changes an SMF property in the middle of that method running, the results should be undefined.

> - Does your solution handle white space, quotes etc. in svcprop values properly
> (I think there is an issue regarding white space, but I have not tested it)?

Very good point - I'll dig into this.

> - Does your solution impose a security risk?
> (consider the eval $NAME)

Not that I'm aware of - at least no more than the "zfs/backup-save-cmd" property, which allows an administrator to set an arbitrary command to process the zfs send stream. The point is, if a user can set any SMF property for this service, then they're already privileged (or should know better). With the upcoming change to running this stuff under a restricted role, this will be even less of a concern.

cheers,
tim
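[Editor's note: the heart of Tim's check_missed_snapshots snippet above is plain epoch-second arithmetic, which can be exercised without zfs or perl at all. A standalone restatement of the same logic (the function names here are invented; the real code uses zfs get and print_log around this core):]

```shell
# Period length in seconds for a schedule, mirroring the case statement
# in Tim's snippet (months deliberately approximated as 30 days, which
# is the source of the drift he mentions).
seconds_for_interval() {
    # $1: interval name, $2: period count
    case $1 in
        minutes) mult=60 ;;
        hours)   mult=3600 ;;
        days)    mult=86400 ;;
        months)  mult=2592000 ;;
        *)       echo "unknown interval: $1" >&2; return 1 ;;
    esac
    echo $(( mult * $2 ))
}

# True (exit 0) when the last snapshot is older than the period.
snapshot_overdue() {
    # $1: last snapshot time (epoch), $2: now (epoch), $3: period in seconds
    [ $(( $2 - $1 )) -gt "$3" ]
}

# Example: daily schedule, last snapshot 24h 1min ago -> overdue.
period=$(seconds_for_interval days 1)
snapshot_overdue 1000000 $(( 1000000 + 86460 )) "$period" && echo overdue
```

This is exactly the "takes a snapshot if the last snapshot was (say) 24hrs 1min ago" behaviour from Tim's earlier mail: 86460 seconds elapsed against an 86400-second period.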
Tim Foster
2008-Aug-08 09:56 UTC
[zfs-discuss] zfs-auto-snapshot 0.11 work (was Re: zfs-auto-snapshot with at scheduling )
On Wed, 2008-08-06 at 15:58 -0700, Rob wrote:
>> The other changes that will appear in 0.11 (which is
>> nearly done) are:
>
> Still looking forward to seeing .11 :)

Wow, there's one user out there at least! Thanks!

> Think we can expect a release soon? (or at least svn access so
> that others can check out the trunk?)

Nearly. Here's what's going on: I'm working closely with Niall Power of the Desktop team at Sun, who (along with Erwann Chenede) is tackling the requirement to have ZFS snapshots managed out of the box for OpenSolaris 2008.11:

http://opensolaris.org/os/project/indiana/resources/problem_statement/#DSK-5

To this end, they've decided to use the existing zfs-auto-snapshot code, and put a proper GUI on top of it, rather than start from scratch. (I suck at writing GUIs, so this is way cool!)

So far, we've come across a few parts of the core service that make it a little bit harder to have it "just work" from a GUI perspective, so I've been working to add those into the existing core codebase. These are:

* Checking for missed snapshots on service start

* A new "zfs/fs-name" keyword "##". The existing keyword "//" is inclusive: it only snapshots filesystems marked with a given property. The new keyword "##" adds exclusive support: it snapshots every filesystem except those set with a given property.

* RBAC stuff - still haven't written this yet, but I think running the service under a role with the "ZFS File System Management" profile will be enough.

* Collecting the default instances into a single group, giving us an easier way to enable/disable all of the current default service instances from the GUI. (We're still working out how best to do this, and whether that big on/off switch should be elsewhere or not.)

The new code will continue to be backwards compatible with previous releases: manifests from 0.1 upwards will still work just fine with 0.11.

The plan is to split the code into two packages, one being the core SMF service + canned instances, the other being the GUI code. For now, we've got a new Mercurial repository at:

  hg clone ssh://anon@hg.opensolaris.org/hg/jds/zfs-snapshot

It's just got most versions up to 0.10 in there at present; I'll commit the 0.11 changes as soon as we're happy with them, hopefully in the next week or two. The code is under the JDS project for now; ultimately I'd like to get the core service into ON at some point, but that'll need more ample free time than I've got right now :-)

cheers,
tim
Dave
2008-Aug-08 18:26 UTC
[zfs-discuss] zfs-auto-snapshot 0.11 work (was Re: zfs-auto-snapshot with at scheduling )
Tim Foster wrote:
> On Wed, 2008-08-06 at 15:58 -0700, Rob wrote:
>>> The other changes that will appear in 0.11 (which is
>>> nearly done) are:
>> Still looking forward to seeing .11 :)
>
> Wow, there's one user out there at least! Thanks!

Keep up the good work, Tim. There are more users of your work out there than you might think :)

--
Dave