I'm sorry if this is either obvious or has been beaten to death...

I'm looking for ways to back up data on a linux server that has been using rsync with the script `rsnapshot'. Some of you may know how that works... I won't explain it here other than to say only changed data gets rsynced to the backup data.

I'm wondering if I could do something similar by making that data directory hierarchy available on an osol.ll build 110 zfs server as an NFS mount.

That is, can I nfs mount a remote filesystem on a zpool and use zfs snapshot functionality to create snapshots of that data?

I'm not at all familiar with using zfs in general but was thinking something like:

  Make a directory hierarchy on a remote linux machine available for nfs mount.

  Mount the nfs share on osol server, inside a zpool. Do whatever is the correct way on zfs to create a snapshot of that mounted data and write it onto another directory also inside the zpool.

  A week later mount the same nfs share from remote linux machine and create a second snapshot written to the same other directory.

Will that procedure produce a snapshot of first the full base data, and second time around only the changed data in comparison to the first snapshot?

So proceeding in that manner, I'd have a series of snapshots where I could trace any differences in files?
On Fri, Apr 10, 2009 at 01:18:05PM -0500, Harry Putnam wrote:
> I'm looking for ways to backup data on a linux server that has been
> using rsync with the script `rsnapshot'. Some of you may know how
> that works... I won't explain it here other than to say only changed
> data gets rsynced to the backup data.
>
> I'm wondering if I could do something similar by making that data
> directory hierarchy available on a osol.ll build 110 zfs server as an
> NFS mount.

Not unless the data is within the ZFS pool.

> That is, can I nfs mount a remote filesystem on a zpool and use zfs
> snapshot functionality to create snapshots of that data?

No. You can't mount a filesystem into a zpool.

The snapshots are easy for ZFS to do because it owns the data and it knows about every changed block. It wouldn't be able to see that on a remote NFS filesystem. The client has to scan every file (like rsync) to find changes.

So a ZFS host could own the data, then *it* could be the NFS server and it would work the way you want. But it won't work if the ZFS host is just an NFS client.

-- 
Darren
On Fri, 10 Apr 2009 13:18:05 -0500, Harry Putnam <reader at newsguy.com> wrote:

>I'm sorry if this is either obvious or has been beaten to death...
>
>I'm looking for ways to backup data on a linux server that has been
>using rsync with the script `rsnapshot'. Some of you may know how
>that works... I won't explain it here other than to say only changed
>data gets rsynced to the backup data.
>
>I'm wondering if I could do something similar by making that data
>directory hierarchy available on a osol.ll build 110 zfs server as an
>NFS mount.
>
>That is, can I nfs mount a remote filesystem on a zpool

Yes

>and use zfs snapshot functionality to create snapshots of that data?

No.

>I'm not at all familiar with using zfs in general but was thinking
>something like:
>
>  Make a directory hierarchy on a remote linux machine available for
>  nfs mount.
>
>  Mount the nfs share on osol server, inside a zpool. Do whatever is
>  the correct way on zfs to create a snapshot of that mounted data
>  and write it onto another directory also inside the zpool.
>
>  A week later mount the same nfs share from remote linux machine and
>  create a second snapshot written to the same other directory.
>
>  Will that procedure produce a snapshot of first the full base data,
>  and second time around only the changed data in comparison to the
>  first snapshot?
>
>  So proceeding in that manner, I'd have a series of snapshots where I
>  could trace any differences in files?

Won't work.
The filesystem you mount is not a zfs filesystem.
The mountpoint can be in a zfs in a zpool, but that doesn't make it a zfs.

-- 
( Kees Nuyt )
c[_]
[...]
>> Mount the nfs share on osol server, inside a zpool. Do whatever is
>> the correct way on zfs to create a snapshot of that mounted data
>> and write it onto another directory also inside the zpool.
>>
>> A week later mount the same nfs share from remote linux machine and
>> create a second snapshot written to the same other directory.
[...]
>> So proceeding in that manner, I'd have a series of snapshots where I
>> could trace any differences in files?
>
> Won't work.
> The filesystem you mount is not a zfs filesystem.
> The mountpoint can be in a zfs in a zpool, but that doesn't
> make it a zfs.

Gack... I was afraid of that... it sounded way too easy.

Thanks for the input... I guess I'll have to wait and see what is resolved in the other thread about what happens when you rsync data to a zfs filesystem and how that interplays with zfs snapshots going on too.
On Sat, Apr 11, 2009 at 4:41 AM, Harry Putnam <reader at newsguy.com> wrote:
> Thanks for the input... I guess I'll have to wait and see what is
> resolved in the other thread about what happens when you rsync data to
> a zfs filesystem and how that inter plays with zfs snapshots going on too.

Is there anything to wait for?
Didn't that thread already mention that using rsync with --inplace and zfs snapshots works as expected? Or do you have other problems?

"Fajar A. Nugraha" <fajar at fajar.net> writes:
> On Sat, Apr 11, 2009 at 4:41 AM, Harry Putnam <reader at newsguy.com> wrote:
>> Thanks for the input... I guess I'll have to wait and see what is
>> resolved in the other thread about what happens when you rsync data to
>> a zfs filesystem and how that inter plays with zfs snapshots going on too.
>
> Is there anything to wait?
> Didn't that thread already mention that using rsync with --inplace and
> zfs snapshots work as expected? Or do you ave other problems?

I guess not... I'm a little slow on the uptake and easily confused. I guess the upshot is that if one were to daily rsync data to a zfs filesystem, the changes wrought there by rsync would be reflected in zfs snapshots, maybe timed to happen right after the rsync runs, as these new blocks covering only the deltas... I don't really know what deltas are... but I guess it would be only the changed parts.

And I'm guessing further that one would be able to recover each change from the snapshots somehow.

In my OP, I mentioned the rsync and rsnapshot backup system on linux as being in some way comparable. I do understand how rsnapshot works but I'm still not seeing exactly how the zfs snapshots work.

Maybe a concrete example would be a bit easier to understand if you can give one. I'm still not really understanding COW.

Say we have files residing on a remote non-zfs filesystem. Each day these files might receive a bit more data... or not. Each day rsync is run on this filesystem. When nothing is added, rsync ignores it; when something is changed, rsync sends it.

On the local host a zfs filesystem is receiving these rsync runs. After each run of rsync... a zfs snapshot is taken of the data now on the local host. Any change is preserved in the zfs snapshot.

These snapshots would at least vaguely resemble the backups created by rsnapshot using rsync.

I know how to seek out a change to a specific file in a pile of rsnapshot backups... but not how to do so with zfs snapshots, which in my mind are vaguely similar.

So if I wanted to find a specific change in a file... that would be somewhere in the zfs snapshots... say to retrieve a certain formulation in some kind of `rc' file that worked better than a later formulation. How would I do that?
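The workflow described in the message above (rsync into a ZFS filesystem, then snapshot it) might be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the helper names, the dataset name `tank/backup`, the paths, and the date-stamped snapshot naming are all made up for the example and are not something the thread prescribes. `--inplace` is the rsync option mentioned earlier in the thread.

```python
import datetime
import subprocess


def rsync_command(source, dest):
    """Build the rsync command line (hypothetical helper)."""
    # --inplace makes rsync update existing files in place rather than
    # writing a new copy of the file and renaming it over the old one.
    return ["rsync", "-a", "--delete", "--inplace", source, dest]


def snapshot_command(dataset, when):
    """Build a `zfs snapshot` command with a date-stamped snapshot name."""
    # The dataset@YYYY-MM-DD naming scheme is an assumption of this sketch.
    return ["zfs", "snapshot", f"{dataset}@{when:%Y-%m-%d}"]


def backup_once(source, dest, dataset, run=subprocess.run):
    """Run rsync, then snapshot the dataset only if rsync succeeded."""
    # check=True raises CalledProcessError on a non-zero exit status,
    # so a failed rsync run is never followed by a snapshot.
    run(rsync_command(source, dest), check=True)
    run(snapshot_command(dataset, datetime.date.today()), check=True)
```

On the ZFS host this could be called once a day, for example as `backup_once("linuxhost:/home/", "/tank/backup/home/", "tank/backup")`, where the host name and paths are again placeholders.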
> guess the upshot is that if one were to daily rsync data to an zfs
> filesystem, the changes wrought there by rsync would be reflected in
> zfs snapshots, maybe timed to happen right after the rsync runs, as
> these new blocks covering only the deltas... I don't really know what
> deltas are... but I guess it would be only the changed parts.

I do this (roughly) for Linux backups. My ZFS server exports a "backup" dataset via NFS to a Linux machine. Twice a day (4am and 4pm) Linux rsyncs to the NFS mountpoint. Once a day (at midnight) the ZFS server snapshots the dataset.

> And I'm guessing further that one would be able to recover each change
> from the snapshots somehow.

Yes. My ZFS backup dataset has snapdir=hidden, but it's still available over the NFS mount. My Linux users can do this kind of thing:

   cd /nfs/backup/.zfs/snapshot/auto-d20090312
   more somefile

to read "somefile" from the 12 March 2009 backup.

> In my OP, I mentioned rsync and rsnapshot backup system on linux as
> being in some way comparable. I do understand how rsnapshot works but
> still not seeing exactly how the zfs snapshots work.
>
> Maybe a concrete example would be a bit easier to understand if you
> can give one. I'm still not really understanding COW.

Copy on write means that two objects (files) referring to identical data get pointers to the data instead of duplicate copies. As long as these are only read, and not written, the pointer to the same data is fine. When a write occurs, the data is copied and one of the referrers gets a pointer to the new copy. This prevents the write from affecting both referring files.

Copy on write is a description of how COW is used in virtual memory. For disk storage, "copy" isn't necessarily accurate: since the entire data block is rewritten anyway, a separate "copy" step can be optimized away.

Here's a simple illustration of COW in action. It's not necessarily an accurate depiction of ZFS, but of the general concept in terms of a filesystem.

1. When a file (file A) is written to disk, blocks are allocated for the file and data is stored in those blocks. The blocks each have a reference count, and ref counts are set to 1 because only one file refers to the blocks.

2. I copy File A to File B. The new file simply refers to all the same blocks. The ref counts are raised to 2.

3. I snapshot the filesystem. This is essentially like copying every file in it, as in #2. No blocks are copied because no new data was written, but ref counts are raised. I'm not sure about zfs's implementation, but in principle I guess an immutable snapshot should only need to raise ref ct by 1 in total, whereas a mutable snapshot (i.e., a clone) would increment once for every reference in the filesystem.

4. I rsync to the file in step #1. Let's suppose this leaves blocks 1 and 2 alone, but updates block 3. The new data for block 3 is written to a new block (call it 3bis), and block 3 is left on the disk as it is. Block 3's ref count is decremented, and 3bis's ref count is set to 1.

   File A: blocks 1, 2, 3bis
   File B: blocks 1, 2, 3

   Block 1: ref ct 3 (file A, file B, snapshot)
   Block 2: ref ct 3 (file A, file B, snapshot)
   Block 3: ref ct 2 (file B, snapshot)
   Block 3bis: ref ct 1 (file A)

5. I remove file B. Ref counts for its blocks are decremented, but since all its blocks still have ref counts > 0, they persist. No blocks are removed from the dataset.

   File A: blocks 1, 2, 3bis

   Block 1: ref ct 2 (file A, snapshot)
   Block 2: ref ct 2 (file A, snapshot)
   Block 3: ref ct 1 (snapshot)
   Block 3bis: ref ct 1 (file A)

6. I remove file A. Ref counts again decrement.

   Block 1: ref ct 1 (snapshot)
   Block 2: ref ct 1 (snapshot)
   Block 3: ref ct 1 (snapshot)
   Block 3bis: ref ct 0

   Since 3bis no longer has any referrers, it is deallocated. Blocks 1, 2, and 3 are still used by the snapshot, even though the original files A and B are no longer present.

This is a pretty simplistic view.
In practice, not only does the COW methodology apply to the files' data blocks; it also applies to their metadata, the filesystem's directories, and so on. This ensures that directory information as well as files persist in snapshots. It also explains why snapshots are virtually instantaneous: you only make a new set of pointers to all the existing data, but you don't replace any of the existing data.

> So if I wanted to find a specific change in a file... that would be
> somewhere in the zfs snapthosts... say to retrieve a certain
> formulation in some kind of `rc' file that worked better than a later
> formulation. How would I do that?

Using the .zfs/snapshot directory (see above) you can diff two different generations of a file at the same path.

-- 
-D.    dgc at uchicago.edu    NSIT    University of Chicago
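The six-step reference-count walkthrough in the message above can be replayed with a small Python model. It is a toy in the same spirit as the walkthrough: it only tracks which owners (files or a snapshot) refer to each block, and it makes no claim about how ZFS itself does its bookkeeping. The class and method names are invented for this sketch.

```python
class ToyStore:
    """Toy copy-on-write store: each block remembers who refers to it."""

    def __init__(self):
        self.files = {}  # live file name -> list of block names
        self.refs = {}   # block name -> set of owners referring to it

    def _add_ref(self, owner, block):
        self.refs.setdefault(block, set()).add(owner)

    def _drop_ref(self, owner, block):
        self.refs[block].discard(owner)
        if not self.refs[block]:
            del self.refs[block]  # ref count reached 0: deallocate

    def write(self, name, blocks):
        """Step 1: a new file gets its own freshly allocated blocks."""
        self.files[name] = list(blocks)
        for block in blocks:
            self._add_ref(name, block)

    def copy(self, src, dst):
        """Step 2: a copy shares every block of the original."""
        self.write(dst, self.files[src])

    def snapshot(self, snap):
        """Step 3: the snapshot becomes one more referrer of each live block."""
        for blocks in self.files.values():
            for block in blocks:
                self._add_ref(snap, block)

    def update(self, name, index, new_block):
        """Step 4: changed data goes to a new block; the old one is kept."""
        old_block = self.files[name][index]
        self.files[name][index] = new_block
        self._add_ref(name, new_block)
        self._drop_ref(name, old_block)

    def remove(self, name):
        """Steps 5 and 6: removing a file drops its references."""
        for block in self.files.pop(name):
            self._drop_ref(name, block)

    def counts(self):
        """Return the reference count of every block still allocated."""
        return {block: len(owners) for block, owners in self.refs.items()}


store = ToyStore()
store.write("A", ["1", "2", "3"])
store.copy("A", "B")
store.snapshot("snapshot")
store.update("A", 2, "3bis")
print(store.counts())  # {'1': 3, '2': 3, '3': 2, '3bis': 1}
store.remove("B")
store.remove("A")
print(store.counts())  # {'1': 1, '2': 1, '3': 1}
```

The two printed states correspond to the tables after steps 4 and 6: block 3bis disappears once file A is gone, while blocks 1, 2, and 3 survive because the snapshot still refers to them.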
Hello Harry,

Saturday, April 11, 2009, 5:05:47 PM, you wrote:

HP> So if I wanted to find a specific change in a file... that would be
HP> somewhere in the zfs snapthosts... say to retrieve a certain
HP> formulation in some kind of `rc' file that worked better than a later
HP> formulation. How would I do that?

There is no 'zfs diff ...' (yet), however you could do:

# cd /pool/fs/.zfs/snapshot
# diff -u 2009-04-01/etc/passwd 2009-04-02/etc/passwd

All your snapshots in zfs are presented to you as read-only file systems which you can freely access.

-- 
Best regards,
Robert Milkowski
http://milek.blogspot.com
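The same comparison of two generations of a file can be sketched in Python with the standard `difflib` module. This is a minimal sketch, assuming the snapshots are reachable under `.zfs/snapshot` below the filesystem's mountpoint as described in the thread; the function name and its arguments are made up for the example.

```python
import difflib
from pathlib import Path


def snapshot_diff(fs_root, old_snap, new_snap, rel_path):
    """Return a unified diff of one file between two snapshots.

    rel_path must be relative to the filesystem root (no leading "/"),
    for example "etc/passwd".
    """
    snap_dir = Path(fs_root) / ".zfs" / "snapshot"
    old_file = snap_dir / old_snap / rel_path
    new_file = snap_dir / new_snap / rel_path
    # Snapshots are read-only, so the files are only ever opened for reading.
    old_lines = old_file.read_text().splitlines(keepends=True)
    new_lines = new_file.read_text().splitlines(keepends=True)
    diff = difflib.unified_diff(
        old_lines, new_lines, fromfile=str(old_file), tofile=str(new_file)
    )
    return "".join(diff)
```

With the paths from the message above, the call would be `print(snapshot_diff("/pool/fs", "2009-04-01", "2009-04-02", "etc/passwd"))`. An empty result means the file is identical in both snapshots.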