I've recently started down the road of production use for ZFS, and am hitting my head on some paradigm shifts. I'd like to clarify whether my understanding is correct, and/or whether there are better ways of doing things. I have one question about replication, and one about backups. These questions are all about the Solaris 10 production release (U5, I believe), not Solaris Express, etc.

1. For replication purposes: is it still true that the target filesystem has to be "offline" to receive even an incremental send?! I find this difficult to understand; surely it should be possible to "receive" to a snapshot at least?

2. For backup/restore purposes: a related question to the above, I suppose. Let's say that I had "major" damage to a filesystem which is an active NFS share, or something otherwise constantly in use. Either of the following behaviours would be really nice (and both would be better still :-)

2.1 Do a receive of an earlier zfs send, to either a snapshot or a "child" filesystem, and then somehow "promote" some, or ALL, of the files to the main production filesystem, without interrupting the active NFS share too badly.

2.2 Do a receive of an earlier zfs send, to either a snapshot or a "child" filesystem, and be efficient about the disk space used. i.e.: have the receive understand, "hey, I have that file already, completely intact, so I'm not going to waste space by storing it again."

Related to the above: even if I HAVE 2x the disk space required for production purposes, so as to have a "restore staging area" on the machine... if the data gets restored to a separate filesystem, I can't just do a quick "unshare /zfs/foo ; zfs rename /zfs/foo /zfs/foo.old ; zfs rename /zfs/foo.restored /zfs/foo", because that will break all the client NFS handles, since it is a "new" filesystem, right? Or is that incorrect?

Suggestions on the above, and/or any related issues I haven't thought of, would be appreciated.
-- This message posted from opensolaris.org
Philip Brown wrote:
> I've recently started down the road of production use for ZFS, and am hitting my head on some paradigm shifts. I'd like to clarify whether my understanding is correct, and/or whether there are better ways of doing things. I have one question about replication, and one about backups. These questions are all about the Solaris 10 production release (U5, I believe), not Solaris Express, etc.

First, zfs send/recv is not a backup/restore solution. You might be happier using the available backup/restore solutions on the market.

> 1. For replication purposes: is it still true that the target filesystem has to be "offline" to receive even an incremental send?! I find this difficult to understand; surely it should be possible to "receive" to a snapshot at least?

Sends are snapshots, so you are receiving a snapshot. Snapshots are read-only, by definition. There are tricks you can play with clones, though.

> 2. For backup/restore purposes: a related question to the above, I suppose. Let's say that I had "major" damage to a filesystem which is an active NFS share, or something otherwise constantly in use.
>
> 2.1 Do a receive of an earlier zfs send, to either a snapshot or a "child" filesystem, and then somehow "promote" some, or ALL, of the files to the main production filesystem, without interrupting the active NFS share too badly.

I think you can do this, but it might need a staging file system. You might find that scp or rsync is similarly effective.

> 2.2 Do a receive of an earlier zfs send, to either a snapshot or a "child" filesystem, and be efficient about the disk space used. i.e.: have the receive understand, "hey, I have that file already, completely intact, so I'm not going to waste space by storing it again."

This question makes no sense to me. Perhaps you can rephrase?

> Related to the above: even if I HAVE 2x the disk space required for production purposes... I can't just do a quick "unshare /zfs/foo ; zfs rename /zfs/foo /zfs/foo.old ; zfs rename /zfs/foo.restored /zfs/foo", because that will break all the client NFS handles, since it is a "new" filesystem, right? Or is that incorrect?

This should be possible, but you might find cp to be an alternative which does not require blowing away the NFS file handles.
-- richard
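[Editor's note: the "tricks you can play with clones" mentioned above can be sketched roughly as follows. This is a hedged illustration, not syntax confirmed by the thread; the pool/dataset names (tank/prod, tank/restore) and the stream file are hypothetical, and tank/restore is assumed to already exist.]

```shell
# Receive the earlier send into a scratch dataset, leaving the
# live filesystem (tank/prod) untouched.
zfs receive tank/restore/prod < /backup/prod-nov1.stream

# Clone the received snapshot to get a writable copy; the clone
# shares blocks with the snapshot, so it initially costs almost
# no extra space.
zfs clone tank/restore/prod@nov1 tank/recovered

# Individual files can then be copied from the clone back into
# the live filesystem as needed.
```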
> 2.2 Do a receive of an earlier zfs send, to either a snapshot or a "child" filesystem, and be efficient about the disk space used. i.e.: have the receive understand, "hey, I have that file already, completely intact, so I'm not going to waste space by storing it again."

ZFS already does this. You wouldn't send the entire snapshot again; you would just do an incremental send of the changes between one snapshot and the next.

Of course, you're still asking for problems if you're trying to receive snapshots on a live system. Yes, you could clone a previous snapshot and make that live, allowing you to receive later 'backups'. But unless you make it a read-only system, you're going to have an almighty problem working out what has or has not changed between those two systems.

Basically, ZFS will happily keep track of the changes on your filesystem so you can send them to a remote backup, but as soon as you have both systems live, you have to track all the changes and keep them in sync yourself. If this is what you really need to do, you might be better off with AVS or some kind of distributed file system.
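[Editor's note: the incremental workflow described above might look like this in practice; the pool, filesystem, and snapshot names are invented for illustration.]

```shell
# Take a new snapshot on the source...
zfs snapshot tank/fs@tue

# ...then send only the blocks that changed between @mon and @tue.
# The receiving side must already have @mon for this to apply.
zfs send -i tank/fs@mon tank/fs@tue | ssh backuphost zfs receive tank/fs
```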
> relling wrote:
> This question makes no sense to me. Perhaps you can rephrase?

To take a really obnoxious case: let's say I have a 1 gigabyte filesystem. It has 1.5 gigabytes of physical disk allocated to it (so it is 66% full). It has 10x100 MB files in it.

"Something bad happens", and I need to do a restore. The most recent zfs send data has all 10 files in it. 9 of them have not been touched since the send was done.

Now, since ZFS has data integrity checks, yadda yadda yadda, it should be able to determine relatively easily, "the file in the zfs send is the exact same file on disk". So, when I do a zfs receive, it would be "really nice" if there were some way for ZFS to figure out, let's say, "receive to a snapshot of the filesystem; then take advantage of the fact that it is a snapshot, to NOT write to disk the 9 unaltered files that are in the snapshot; just allocate space for the altered one".

It would be really nice for ZFS to have the smarts to do this, WITHOUT having to potentially throw a laaarge amount of extra hard disk space at snapshots. I want the "snapshot" space to be allocated on TAPE, not hard disk, if you see what I mean. If one 100 MB file gets replaced every 2 days, I wouldn't want to use snapshots on the filesystem if there was a disk space limitation. (I know there are solutions such as SAM-FS for this, but I'm looking for a ZFS solution, if possible, please?)

And help with the other parts of my original email would still be appreciated :)
> So, when I do a zfs receive, it would be "really nice" if there were some way for ZFS to figure out, let's say, "receive to a snapshot of the filesystem; then take advantage of the fact that it is a snapshot, to NOT write to disk the 9 unaltered files that are in the snapshot; just allocate space for the altered one".

To follow up on my own question a bit :-) I would presume that the mandate that "incrementals MUST have a common snapshot with the target restore filesystem" is basically just a shortcut that guarantees files are identical, without having to do any actual calculation.

What about some kind of rsync-like capability, though? To have zfs receive be able to judge sameness by, "well, the timestamps and file sizes are identical: treat them as identical!" without a common snapshot.

And, for the truly paranoid, having a "binary compare" option, where it says, "hmm... timestamp and file sizes are the same... they MIGHT be identical... lemme read from disk and compare with what I'm reading from the zfs send stream. If I find a difference, then write it as a new file. Otherwise, just create a [hardlink/whatever] in the destination receive snapshot, since they really are the same!"
Ah, there is a cognitive disconnect... more below.

Philip Brown wrote:
> To take a really obnoxious case: let's say I have a 1 gigabyte filesystem. It has 1.5 gigabytes of physical disk allocated to it (so it is 66% full). It has 10x100 MB files in it.
>
> "Something bad happens", and I need to do a restore. The most recent zfs send data has all 10 files in it. 9 of them have not been touched since the send was done.
>
> It would be really nice for ZFS to have the smarts to do this, WITHOUT having to potentially throw a laaarge amount of extra hard disk space at snapshots. I want the "snapshot" space to be allocated on TAPE, not hard disk, if you see what I mean. If one 100 MB file gets replaced every 2 days, I wouldn't want to use snapshots on the filesystem if there was a disk space limitation.

The cognitive disconnect is that snapshots are blocks, not files. Therefore, a snapshot may contain only changed portions of files, and blocks from a single file may be spread across many different snapshots. Perhaps you are really looking for a more traditional, file-oriented backup capability.

> (I know there are solutions such as SAM-FS for this, but I'm looking for a ZFS solution, if possible, please?)

ADM is the SAM-FS equivalent for ZFS. http://opensolaris.org/os/project/adm/
-- richard
> Ah, there is a cognitive disconnect... more below.
>
> The cognitive disconnect is that snapshots are blocks, not files. Therefore, a snapshot may contain only changed portions of files, and blocks from a single file may be spread across many different snapshots.

I was referring to restoring TO a snapshot. However, I didn't mandate that the incoming stream WAS a snapshot :-}

Your point about snapshots being blocks, not files, is well taken. However, the limitation that a receive of a "full send" can only be done to an automatically created new filesystem is overly burdensome. Wouldn't it be more useful if it had the capability to restore to a newly created snapshot of an existing ZFS filesystem, rsync style?

Thanks for the ADM reference. I'll check that out.
Ok, I see where you're coming from now, but what you're talking about isn't zfs send/receive. If I'm interpreting correctly, you're talking about a couple of features, neither of which is in ZFS yet, and I'd need the input of more technical people to know if they are possible.

1. The ability to restore individual files from a snapshot, in the same way an entire snapshot is restored: simply using the blocks that are already stored.

2. The ability to store (and restore from) snapshots on external media.

Now, I know a few people are working on the ZFS automatic backup stuff, and are currently looking at a way to automatically send snapshots to external media. I suspect these two features would work well with that if there were a way to make them possible. However, being able to restore at a block level from an external device is almost certainly going to have a lot of gotchas at the technical level. I couldn't say whether that would even be feasible, I'm afraid, and to the best of my knowledge, neither of these features is present in ZFS yet.
Ross wrote:
> 1. The ability to restore individual files from a snapshot, in the same way an entire snapshot is restored: simply using the blocks that are already stored.
>
> 2. The ability to store (and restore from) snapshots on external media.

What makes you say this doesn't work? Exactly what do you mean here, because this will work:

$ zfs send tank@s1 | dd of=/dev/tape

Sure, it might not be useful, and I don't think that is what you mean here, so can you expand on "store snapshots on external media"?

-- Darren J Moffat
Hi Darren,

That's storing a dump of a snapshot on external media, but files within it are not directly accessible. The work Tim et al. are doing is actually putting a live ZFS filesystem on external media and sending snapshots to it.

A live ZFS filesystem is far more useful (and reliable) than a dump, and having the ability to restore individual files from that would be even better.

It still doesn't help the OP, but I think that's what he was after.

Ross

On Mon, Nov 3, 2008 at 9:55 AM, Darren J Moffat <darrenm at opensolaris.org> wrote:
> What makes you say this doesn't work? Exactly what do you mean here, because this will work:
>
> $ zfs send tank@s1 | dd of=/dev/tape
>
> Sure, it might not be useful, and I don't think that is what you mean here, so can you expand on "store snapshots on external media"?
>
> --
> Darren J Moffat
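[Editor's note: for contrast with the live-filesystem approach Ross describes, a dump-to-tape round trip along the lines of Darren's example would look something like this. The device and dataset names are assumptions; note that files inside the stream are not individually addressable on the tape.]

```shell
# Write the full snapshot stream to tape...
zfs send tank@s1 | dd of=/dev/tape

# ...and later read the whole stream back into a new dataset.
# This is all-or-nothing: the entire stream must be received.
dd if=/dev/tape | zfs receive tank/restored
```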
Ross Smith wrote:
> That's storing a dump of a snapshot on external media, but files within it are not directly accessible. The work Tim et al. are doing is actually putting a live ZFS filesystem on external media and sending snapshots to it.

Cognitive disconnect, again. Snapshots do not contain files; they contain changed blocks.

> A live ZFS filesystem is far more useful (and reliable) than a dump, and having the ability to restore individual files from that would be even better.
>
> It still doesn't help the OP, but I think that's what he was after.

Snapshots are not replacements for traditional backup/restore features. If you need the latter, use what is currently available on the market.
-- richard

_______________________________________________
zfs-discuss mailing list
zfs-discuss at opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
> Snapshots are not replacements for traditional backup/restore features.
> If you need the latter, use what is currently available on the market.
> -- richard

I'd actually say snapshots do a better job in some circumstances. Certainly they're being used that way by the desktop team:
http://blogs.sun.com/erwann/entry/zfs_on_the_desktop_zfs

None of this is stuff I'm after personally, btw. This was just my attempt to interpret the request of the OP.

Although, having said that, the ability to restore single files as fast as you can restore a whole snapshot would be a nice feature. Is that something that would be possible?

Say you had a ZFS filesystem containing a 20GB file, with a recent snapshot. Is it technically feasible to restore that file by itself, in the same way a whole filesystem is rolled back with "zfs rollback"?

If the file still existed, would this be a case of redirecting the file's top-level block (dnode?) to the one from the snapshot? If the file had been deleted, could you just copy that one block?

Is it that simple, or is there a level of interaction between files and snapshots that I've missed? (I've glanced through the tech specs, but I'm a long way from fully understanding them.)

Ross
Ross Smith wrote:
> I'd actually say snapshots do a better job in some circumstances. Certainly they're being used that way by the desktop team:
> http://blogs.sun.com/erwann/entry/zfs_on_the_desktop_zfs

Yes, this is one of the intended uses of snapshots. But snapshots do not replace backup/restore systems.

> Say you had a ZFS filesystem containing a 20GB file, with a recent snapshot. Is it technically feasible to restore that file by itself, in the same way a whole filesystem is rolled back?

cp

> If the file still existed, would this be a case of redirecting the file's top-level block (dnode?) to the one from the snapshot? If the file had been deleted, could you just copy that one block?
>
> Is it that simple, or is there a level of interaction between files and snapshots that I've missed?

It is as simple as a cp, or drag-and-drop in Nautilus. The snapshot is read-only, so there is no need to cp as long as you don't want to modify the file or destroy the snapshot.
-- richard
>> If the file still existed, would this be a case of redirecting the file's top-level block (dnode?) to the one from the snapshot? If the file had been deleted, could you just copy that one block?
>
> It is as simple as a cp, or drag-and-drop in Nautilus. The snapshot is read-only, so there is no need to cp as long as you don't want to modify the file or destroy the snapshot.
> -- richard

But that's missing the point here, which was that we want to restore this file without having to copy the entire thing back. Doing a cp or a drag-and-drop creates a new copy of the file, taking time to restore and allocating extra blocks. Not a problem for small files, but not ideal if you're, say, using ZFS to store virtual machines and want to roll back a single 20GB file from a 400GB filesystem.

My question was whether it's technically feasible to roll back a single file using the approach used for restoring snapshots, making it an almost instantaneous operation.

i.e.: if a snapshot exists that contains the file you want, you know that all the relevant blocks are already on disk. You don't want to copy all of the blocks, but since ZFS follows a tree structure, couldn't you restore the file by just restoring the one master block for that file?

I'm just thinking that if it's technically feasible, I might raise an RFE for this.
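[Editor's note: for reference, the copy-based restore being argued against here looks like the following; the paths are illustrative. The objection above is precisely that this rewrites every block of the file instead of reusing the blocks already on disk.]

```shell
# Snapshots are exposed read-only under .zfs/snapshot, so a single
# file can be pulled back with a plain cp -- at the cost of
# re-allocating all of its blocks (e.g. a full 20GB for a VM image).
cp /zfs/vmstore/.zfs/snapshot/nov1/guest.img /zfs/vmstore/guest.img
```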
> If I'm interpreting correctly, you're talking about a couple of features, neither of which is in ZFS yet,
> ...
> 1. The ability to restore individual files from a snapshot, in the same way an entire snapshot is restored: simply using the blocks that are already stored.
>
> 2. The ability to store (and restore from) snapshots on external media.

Those sound useful, particularly the ability to restore a single file, even if it was only from a "full" send instead of a snapshot. But I don't think that's what I'm asking for :-) Lemme try again.

Let's say that you have a mega source tree, in one huge ZFS filesystem (let's say, the entire ON distribution or something :-) Let's say that you had a full zfs send done Nov 1st. Then, between then and today, there were "assorted things done" to the source tree. Major things. Things that people suddenly realized were "bad". But they weren't sure exactly how/why. They just knew things worked Nov 1st but are broken now. Pretend there's no such thing as tags, etc.

So: they want to get things up and running, maybe even only in read-only mode, from the Nov 1st full send. But they also want to take a look at the changes. And they want to do it in a very space-efficient manner.

It would be REALLY REALLY NICE to be able to take a full send of /zfs/srctree, and restore it to /zfs/srctree@nov1_snap, or something like that. Given that [making up numbers] out of 1 million src files only 1000 have changed, it would be "really nice" to have those 999,000 files that have NOT changed not be doubly allocated in both /zfs/srctree and /zfs/srctree@nov1_snap. They would actually be hardlinked/snapshot-duped/whatever the terminology is.

I guess you might refer to what I'm talking about as taking a synthetic snapshot. Kinda like how Veritas backup, etc. can "synthesize" full dumps from a sequence of full + incrementals, and then write out a "real" full dump onto a single tape, as if a "full dump" had happened on the date of a particular incremental. Except that what I'm talking about for ZFS would be synthesizing a ZFS snapshot of a filesystem, matching the one that was made for the full send (even though the original snapshot has since been deleted).
Ok, I think I understand. You're going to be told that zfs send isn't a backup (and for these purposes I definitely agree), but if we ignore that, this sounds like you're talking about restoring a snapshot from external media, and then running a clone off that.

Clones are already supported, but restoring a deleted snapshot isn't. Can anybody comment on whether that would even be possible? It's an intriguing idea if so.
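[Editor's note: the clone half of this is standard ZFS today. A rough sketch with made-up names follows; whether the snapshot itself could be brought back from external media is exactly the open question above.]

```shell
# Clone an existing snapshot to get a writable head...
zfs clone tank/fs@nov1 tank/fs-recovered

# ...and, if the clone should become the "real" filesystem,
# promote it so the parent/clone dependency is reversed and the
# original filesystem can later be destroyed.
zfs promote tank/fs-recovered
```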
> Ok, I think I understand. You're going to be told that zfs send isn't a backup (and for these purposes I definitely agree), ...

Hmph. Well, even for 'replication' type purposes, what I'm talking about is quite useful. Picture two remote systems which happen to have "mostly identical" data. Perhaps they were manually synced at one time with tar, or something. Now the company wants to bring them both into full sync... but first analyze the small differences that may be present.

In that scenario, it would then be very useful to be able to do the following:

hostA# zfs snapshot zfs/prod@A
hostA# zfs send zfs/prod@A | ssh hostB zfs receive zfs/prod@A

hostB# diff -r /zfs/prod /zfs/prod/.zfs/snapshot/A > /tmp/prod.diffs

One could otherwise find "files that are different" with rsync -avn. But doing it with ZFS in this way "adds value" by allowing you to locally compare old and new files on the same machine, without having to do some ghastly manual copy of each different file to a new place and doing the compare there.
On 11/03/08 13:18, Philip Brown wrote:
> Hmph. Well, even for 'replication' type purposes, what I'm talking about is quite useful. Picture two remote systems which happen to have "mostly identical" data. Perhaps they were manually synced at one time with tar, or something. Now the company wants to bring them both into full sync... but first analyze the small differences that may be present.

um, /usr/bin/rsync? But agreed, not for huge amounts of data...

> In that scenario, it would then be very useful to be able to do the following:
>
> hostA# zfs snapshot zfs/prod@A
> hostA# zfs send zfs/prod@A | ssh hostB zfs receive zfs/prod@A
>
> hostB# diff -r /zfs/prod /zfs/prod/.zfs/snapshot/A > /tmp/prod.diffs
>
> One could otherwise find "files that are different" with rsync -avn. But doing it with ZFS in this way "adds value" by allowing you to locally compare old and new files on the same machine, without having to do some ghastly manual copy of each different file to a new place and doing the compare there.
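[Editor's note: the rsync dry-run comparison mentioned in the thread can be sketched as follows; host and path names are illustrative. `-n` lists differing files without copying anything, and `-c` forces full checksum comparison instead of trusting size and mtime.]

```shell
# Dry-run checksum comparison of a local tree against a remote one;
# the output is a list of files that differ, nothing is transferred.
rsync -avnc /zfs/prod/ hostB:/zfs/prod/ > /tmp/prod.diffs
```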