Henu
2010-Feb-03 10:04 UTC
[zfs-discuss] How to get a list of changed files between two snapshots?
Hello,

Is there a way to get a list of changed files between two snapshots? Currently I do this manually, using the basic file system functions offered by the OS. I scan every byte in every file, which is of course awfully slow.

If I have understood correctly, ZFS could use its own information about which files use which blocks, and thereby calculate the difference very quickly without having to scan every byte. So far I haven't found any tool that does this. My application uses libZFS to handle ZFS.

On the other hand, I have noticed that ZFS send generates the difference very quickly, even when it has to find a small difference among many unchanged files. From this I have concluded that it may be using the ZFS information to quickly see whether a file has been modified. Do you have any idea how send works? Maybe I could use its output to get the list of changed files...

Henrik Heino
Henu
2010-Feb-03 14:53 UTC
[zfs-discuss] How to get a list of changed files between two snapshots?
Okay, so first of all, it's true that send is always fast and 100% reliable because it uses blocks to see differences. Good, and thanks for this information. If everything else fails, I can parse the information I want from the send stream :)

But am I right that there is no method other than the send command to get the list of changed files?

And in my situation I do not need to create snapshots; they already exist. The only thing I need is a list of all the changed files (and maybe the location of the differences within them, but I can do that manually if needed) between two already created snapshots.

Regards,
Henrik Heino

Quoting Andrey Kuzmin <andrey.v.kuzmin at gmail.com>:
> In the periodic snapshot/send diff scenario you presumably ask about,
> zfs_send basically creates snapshot(n+1) and then performs pruned
> tree-walk limited to blocks modified between snap(n) and snap(n+1).
> See
> http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libzfs/common/libzfs_sendrecv.c#1127
> for details.
>
> Regards,
> Andrey
Ross Walker
2010-Feb-03 15:11 UTC
[zfs-discuss] How to get a list of changed files between two snapshots?
On Feb 3, 2010, at 9:53 AM, Henu <henrik.heino at tut.fi> wrote:
> But am I right, that there is no other methods to get the list of
> changed files other than the send command?
>
> And in my situation I do not need to create snapshots. They are
> already created. The only thing that I need to do, is to get list of
> all the changed files (and maybe the location of difference in them,
> but I can do this manually if needed) between two already created
> snapshots.

Not a ZFS method, but you could use rsync with the dry run option to list all changed files between two file systems.

-Ross
Frank Cusack
2010-Feb-03 15:29 UTC
[zfs-discuss] How to get a list of changed files between two snapshots?
On February 3, 2010 12:04:07 PM +0200 Henu <henrik.heino at tut.fi> wrote:
> Is there a possibility to get a list of changed files between two
> snapshots?

Great timing, as I just looked this up last night: I wanted to verify that an install program was only changing the files on disk that it claimed to be changing. So I have to say, "come on". It took me but one Google search and the answer was one of the top 3 hits.

<http://forums.freebsd.org/showthread.php?p=65632>

    # newer files
    find /file/system -newer /file/system/.zfs/snapshot/snapname -type f
    # deleted files
    cd /file/system/.zfs/snapshot/snapname
    find . -type f -exec sh -c 'test -f "/file/system/$1" || echo "$1"' sh {} \;

The above requires GNU find (for -newer), and obviously it only finds files. If you need symlinks or directory names, modify as appropriate.

The above also obviously compares a snapshot to the current filesystem. To compare two snapshots, make the obvious modifications.

-frank
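The deleted-files check described in the message above can be wrapped in a small shell function. This is only a sketch under stated assumptions: the function name `list_deleted` and its two arguments are hypothetical, both arguments are taken to be absolute directory paths (for example a directory under `.zfs/snapshot` and the live file system), and, like the original, it only considers regular files.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print the regular files that
# exist under OLD but are missing under NEW. Both arguments should be
# absolute paths, because the function changes directory before searching.
list_deleted() {
    old=$1
    new=$2
    # Run find from inside OLD so every result is a relative path, then
    # test for the same relative path under NEW. "{} +" hands many paths
    # to one sh process instead of starting one process per file.
    ( cd "$old" && find . -type f -exec sh -c '
        new=$1
        shift
        for f in "$@"; do
            [ -f "$new/$f" ] || printf "%s\n" "$f"
        done' sh "$new" {} + )
}
```

Called as `list_deleted /file/system/.zfs/snapshot/snapname /file/system`, it prints one relative path per line. Batching paths with `{} +` is a design choice made here to avoid the per-file process start that a `\;` terminator implies.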
Frank Cusack
2010-Feb-03 15:31 UTC
[zfs-discuss] How to get a list of changed files between two snapshots?
On February 3, 2010 12:04:07 PM +0200 Henu <henrik.heino at tut.fi> wrote:
> Is there a possibility to get a list of changed files between two
> snapshots? Currently I do this manually, using basic file system
> functions offered by OS. I scan every byte in every file manually and it
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

On February 3, 2010 10:11:01 AM -0500 Ross Walker <rswwalker at gmail.com> wrote:
> Not a ZFS method, but you could use rsync with the dry run option to list
> all changed files between two file systems.

That's exactly what the OP is already doing ...

-frank
Andrey Kuzmin
2010-Feb-03 15:35 UTC
[zfs-discuss] How to get a list of changed files between two snapshots?
On Wed, Feb 3, 2010 at 6:11 PM, Ross Walker <rswwalker at gmail.com> wrote:
> On Feb 3, 2010, at 9:53 AM, Henu <henrik.heino at tut.fi> wrote:
>> But am I right, that there is no other methods to get the list of changed
>> files other than the send command?

At the zfs_send level there are no files, just DMU objects (the txg in which an object was modified is the basis for the changed/unchanged decision).

>> And in my situation I do not need to create snapshots. They are already
>> created. The only thing that I need to do, is to get list of all the changed
>> files (and maybe the location of difference in them, but I can do this
>> manually if needed) between two already created snapshots.
>
> Not a ZFS method, but you could use rsync with the dry run option to list
> all changed files between two file systems.

That's painfully resource-intensive on both (sending and receiving) ends, and IMHO it would be really beneficial to come up with an interface that lets user space (including off-the-shelf backup tools) iterate over the objects changed between two given snapshots.

Regards,
Andrey
Jens Elkner
2010-Feb-03 17:02 UTC
[zfs-discuss] How to get a list of changed files between two snapshots?
On Wed, Feb 03, 2010 at 10:29:18AM -0500, Frank Cusack wrote:
> The above is also obviously to compare a snapshot to the current
> filesystem. To compare two snapshots make the obvious modifications.

Perhaps http://iws.cs.uni-magdeburg.de/~elkner/ddiff/ may help as well with regard to the dir2dir comparison (it should be faster).

Have fun,
jel.
--
Otto-von-Guericke University     http://www.cs.uni-magdeburg.de/
Department of Computer Science   Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany         Tel: +49 391 67 12768
Frank Cusack
2010-Feb-03 17:19 UTC
[zfs-discuss] How to get a list of changed files between two snapshots?
On February 3, 2010 6:02:52 PM +0100 Jens Elkner <jel+zfs at cs.uni-magdeburg.de> wrote:
> Perhaps http://iws.cs.uni-magdeburg.de/~elkner/ddiff/ wrt. dir2dir cmp
> may help as well (should be faster).

If you don't need to know about deleted files, it wouldn't be. It's hard to be faster than walking through a single directory tree if ddiff has to walk through 2 directory trees.

If you do need to know about deleted files, the find method may still be faster, depending on how ddiff determines whether or not to do a file diff. The docs don't explain the heuristics, so I wouldn't want to guess on that.

-frank
Frank Cusack
2010-Feb-03 17:35 UTC
[zfs-discuss] How to get a list of changed files between two snapshots?
On February 3, 2010 12:19:50 PM -0500 Frank Cusack <frank+lists/zfs at linetwo.net> wrote:
> If you do need to know about deleted files, the find method still may
> be faster depending on how ddiff determines whether or not to do a
> file diff. The docs don't explain the heuristics so I wouldn't want
> to guess on that.

An improvement on finding deleted files with the find method would be to not limit your find criteria to files. Directories with deleted files will be newer than in the snapshot, so you only need to look at those directories. I think this would be faster than ddiff in most cases.
Jens Elkner
2010-Feb-03 18:15 UTC
[zfs-discuss] How to get a list of changed files between two snapshots?
On Wed, Feb 03, 2010 at 12:19:50PM -0500, Frank Cusack wrote:
> If you don't need to know about deleted files, it wouldn't be. It's hard
> to be faster than walking through a single directory tree if ddiff has to
> walk through 2 directory trees.

Yepp, but I guess the 'test ...' invocation for each file alone is much more time-consuming, and IIRC test -f path has to do several stats as well until it reaches its final target. So a lot of overhead again. However, just finding newer files via 'find' is probably unbeatable ;-)

> If you do need to know about deleted files, the find method still may
> be faster depending on how ddiff determines whether or not to do a
> file diff. The docs don't explain the heuristics so I wouldn't want
> to guess on that.

ddiff is a single process. It basically travels recursively through the directories via a DirectoryStream (side by side) and stops at the point where no more information is required to make the final decision (which depends on the command-line options). So yes, for very deep directories with a lot of entries it needs [much] more memory than find. I am not sure how DirectoryStream is implemented, but I guess it gets mapped to readdir(3C) and friends ...

Regards,
jel.
Ross Walker
2010-Feb-03 23:46 UTC
[zfs-discuss] How to get a list of changed files between two snapshots?
On Feb 3, 2010, at 12:35 PM, Frank Cusack <frank+lists/zfs at linetwo.net> wrote:
> An improvement on finding deleted files with the find method would
> be to not limit your find criteria to files. Directories with
> deleted files will be newer than in the snapshot so you only need
> to look at those directories. I think this would be faster than
> ddiff in most cases.

So was there a final consensus on the best way to find the difference between two snapshots (files/directories added, files/directories deleted and files/directories changed)?

Find won't do it, ddiff won't do it; I think the only real option is rsync. Of course you can zfs send the snap to another system and do the rsync there against a local previous version.

-Ross
Richard Elling
2010-Feb-04 00:06 UTC
[zfs-discuss] How to get a list of changed files between two snapshots?
On Feb 3, 2010, at 3:46 PM, Ross Walker wrote:
> So was there a final consensus on the best way to find the difference
> between two snapshots (files/directories added, files/directories deleted
> and file/directories changed)?
>
> Find won't do it, ddiff won't do it, I think the only real option is
> rsync. Of course you can zfs send the snap to another system and do the
> rsync there against a local previous version.

bart(1m) is designed to do this.

 -- richard
Frank Cusack
2010-Feb-04 01:59 UTC
[zfs-discuss] How to get a list of changed files between two snapshots?
On February 3, 2010 6:46:57 PM -0500 Ross Walker <rswwalker at gmail.com> wrote:
> So was there a final consensus on the best way to find the difference
> between two snapshots (files/directories added, files/directories deleted
> and file/directories changed)?
>
> Find won't do it, ddiff won't do it, I think the only real option is
> rsync.

I think you misread the thread. Either find or ddiff will do it, and either will be better than rsync.

-frank
Matthew Ahrens
2010-Feb-04 04:10 UTC
[zfs-discuss] How to get a list of changed files between two snapshots?
This is RFE 6425091 "want 'zfs diff' to list files that have changed between snapshots", which covers both file & directory changes, and file removal/creation/renaming. We actually have a prototype of zfs diff. Hopefully someday we will finish it up...

--matt

Henu wrote:
> Is there a possibility to get a list of changed files between two
> snapshots?
Ross Walker
2010-Feb-04 05:39 UTC
[zfs-discuss] How to get a list of changed files between two snapshots?
On Feb 3, 2010, at 8:59 PM, Frank Cusack <frank+lists/zfs at linetwo.net> wrote:
> I think you misread the thread. Either find or ddiff will do it and
> either will be better than rsync.

Find can find files that have been added or removed between two directory trees?

How?

-Ross
Jens Elkner
2010-Feb-04 05:55 UTC
[zfs-discuss] How to get a list of changed files between two snapshots?
On Wed, Feb 03, 2010 at 06:46:57PM -0500, Ross Walker wrote:
> So was there a final consensus on the best way to find the difference
> between two snapshots (files/directories added, files/directories
> deleted and file/directories changed)?
>
> Find won't do it, ddiff won't do it,

ddiff does exactly this. However, it never looks at any timestamp, since that is the most unimportant/unreliable path component "tag" with regard to "what has been changed", and it also does not take file permissions and xattrs into account. So ddiff is all about path names, types and content. Not more, but also not less ;-)

> I think the only real option is
> rsync. Of course you can zfs send the snap to another system and do
> the rsync there against a local previous version.

Probably the worst of all suggested alternatives ...

Have fun,
jel.
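To make the "path names only, no timestamps" idea concrete, here is a minimal shell sketch that lists entries added or removed between two directory trees by comparing sorted path listings. It is not ddiff and does not compare file types or content; the function name `tree_diff` is hypothetical, and the sketch assumes that no path contains a newline and that `mktemp -d` is available.

```shell
#!/bin/sh
# Hypothetical helper (not ddiff): print "- path" for entries that exist
# only under OLD and "+ path" for entries that exist only under NEW.
# Timestamps, permissions and file content are ignored entirely.
tree_diff() {
    old=$1
    new=$2
    tmp=$(mktemp -d) || return 1
    # Relative path listings, sorted in the C locale as comm requires.
    ( cd "$old" && find . | LC_ALL=C sort ) > "$tmp/old.list"
    ( cd "$new" && find . | LC_ALL=C sort ) > "$tmp/new.list"
    # comm -23: lines unique to the first file; comm -13: unique to the second.
    LC_ALL=C comm -23 "$tmp/old.list" "$tmp/new.list" | sed 's/^/- /'
    LC_ALL=C comm -13 "$tmp/old.list" "$tmp/new.list" | sed 's/^/+ /'
    rm -rf "$tmp"
}
```

Both trees are walked in full, which is the cost discussed elsewhere in the thread. Files whose content changed while the name stayed the same are not reported by this sketch.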
Tomas Ögren
2010-Feb-04 07:00 UTC
[zfs-discuss] How to get a list of changed files between two snapshots?
On 03 February, 2010 - Frank Cusack sent me these 0,7K bytes:
> On February 3, 2010 10:11:01 AM -0500 Ross Walker <rswwalker at gmail.com>
> wrote:
>> Not a ZFS method, but you could use rsync with the dry run option to list
>> all changed files between two file systems.
>
> That's exactly what the OP is already doing ...

rsync by default compares metadata first, and only checks through every byte if you add the -c (checksum) flag.

I would say rsync is the best tool here.

The "find -newer blah" suggested in other posts won't catch newer files with an old timestamp (which could happen for various reasons, like being copied with kept timestamps from somewhere else).

/Tomas
--
Tomas Ögren, stric at acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
Henu
2010-Feb-04 10:30 UTC
[zfs-discuss] How to get a list of changed files between two snapshots?
So do you mean I cannot gather the names and locations of changed/created/removed files just by analyzing a stream of (incremental) zfs_send?

Quoting Andrey Kuzmin <andrey.v.kuzmin at gmail.com>:
> At zfs_send level there are no files, just DMU objects (modified in
> some txg which is the basis for changed/unchanged decision).
Henu
2010-Feb-04 10:45 UTC
[zfs-discuss] How to get a list of changed files between two snapshots?
Whoa! That is exactly what I've been looking for. Is there any development version publicly available for testing?

Regards,
Henrik Heino

Quoting Matthew Ahrens <Matthew.Ahrens at sun.com>:
> This is RFE 6425091 "want 'zfs diff' to list files that have changed
> between snapshots", which covers both file & directory changes, and
> file removal/creation/renaming. We actually have a prototype of zfs
> diff. Hopefully someday we will finish it up...
>
> --matt
Ian Collins
2010-Feb-04 10:46 UTC
[zfs-discuss] How to get a list of changed files between two snapshots?
Henu wrote:
> So do you mean I cannot gather the names and locations of
> changed/created/removed files just by analyzing a stream of
> (incremental) zfs_send?

That's correct, you can't. Snapshots do not work at the file level.

--
Ian.
Darren Mackay
2010-Feb-04 12:29 UTC
[zfs-discuss] How to get a list of changed files between two snapshots?
Hi Ross,

    zdb -dddd fs@snapshot | grep "path" | nawk '{print $2}'

Enjoy!

Darren Mackay
--
This message posted from opensolaris.org
Darren Mackay
2010-Feb-04 13:16 UTC
[zfs-discuss] How to get a list of changed files between two snapshots?
Looking through some more code... I was a bit premature in my last post - it's been a long day.

Extracting the guids and querying the metadata seems to be the logical approach. I think running a zfs send just to parse the data stream is a lot of overhead when you really only need to traverse the metadata directly.

The zdb sources have most of the bits there - one just needs to unwind the deadlist (this seems to match the number of blocks that have been deleted since the last snap)...

I might look into this in the next week or two if I have time - it seems like a worthwhile project ;-)

Darren Mackay
Ross Walker
2010-Feb-04 13:21 UTC
[zfs-discuss] How to get a list of changed files between two snapshots?
On Feb 4, 2010, at 2:00 AM, Tomas Ögren <stric at acc.umu.se> wrote:
> rsync by default compares metadata first, and only checks through every
> byte if you add the -c (checksum) flag.
>
> I would say rsync is the best tool here.
>
> The "find -newer blah" suggested in other posts won't catch newer files
> with an old timestamp (which could happen for various reasons, like
> being copied with kept timestamps from somewhere else).

Find -newer doesn't catch files added or removed; it assumes identical trees.

I would be interested in comparing ddiff, bart and rsync (local comparison only) to see empirically how they match up.

-Ross
Ross Walker
2010-Feb-04 13:27 UTC
[zfs-discuss] How to get a list of changed files between two snapshots?
Interesting, can you explain what zdb is dumping exactly?

I suppose you would be looking for blocks referenced in the snapshot that have a single reference and print out the associated file/directory name?

-Ross

On Feb 4, 2010, at 7:29 AM, Darren Mackay <darren at sikkra.com> wrote:
> Hi Ross,
>
> zdb -dddd fs@snapshot | grep "path" | nawk '{print $2}'
>
> Enjoy!
>
> Darren Mackay
Darren Mackay
2010-Feb-04 13:31 UTC
[zfs-discuss] How to get a list of changed files between two snapshots?
The delete queue and related blocks need further investigation...

root@osol-dev:/data/zdb-test# zdb -dd data/zdb-test | more
Dataset data/zdb-test [ZPL], ID 641, cr_txg 529804, 24.5K, 6 objects

    Object  lvl   iblk   dblk  dsize  lsize   %full  type
         0    7    16K    16K  15.0K    16K   18.75  DMU dnode
        -1    1    16K    512     1K    512  100.00  ZFS user/group used
        -2    1    16K    512     1K    512  100.00  ZFS user/group used
         1    1    16K    512     1K    512  100.00  ZFS master node
         2    1    16K    512     1K    512  100.00  ZFS delete queue
         3    1    16K  1.50K     1K  1.50K  100.00  ZFS directory
         4    1    16K    512     1K    512  100.00  ZFS directory
        19    1    16K    512    512    512  100.00  ZFS plain file
        22    1    16K     2K     2K     2K  100.00  ZFS plain file

All the info seems to be there (otherwise we would not be able to store files at all!!). A *spare time* project for the coming couple of weeks...

Darren
Frank Cusack
2010-Feb-04 20:15 UTC
[zfs-discuss] How to get a list of changed files between two snapshots?
On 2/4/10 12:39 AM -0500 Ross Walker wrote:
> On Feb 3, 2010, at 8:59 PM, Frank Cusack <frank+lists/zfs at linetwo.net>
> wrote:
>> I think you misread the thread. Either find or ddiff will do it and
>> either will be better than rsync.
>
> Find can find files that have been added or removed between two directory
> trees?
>
> How?

When a file is added or removed in a directory, the directory's mtime is updated. So find -newer will locate those directories. Then of course you need to do a little bit more work to locate the files.

-frank
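The "little bit more work" could look roughly like the following shell sketch: find the directories modified after a reference point, then compare only those directories' entries against the snapshot copy. Everything here is an assumption for illustration: the function name `diff_changed_dirs`, its three arguments, and the use of a separate reference file whose mtime marks when the snapshot was taken. It inherits the limitation raised in the thread that entries are missed when directory timestamps do not reflect the change.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread). SNAP and LIVE are two copies
# of the same tree; REF is any file whose mtime marks when SNAP was taken.
# All three should be absolute paths because the function changes directory.
diff_changed_dirs() {
    snap=$1
    live=$2
    ref=$3
    # Step 1: only directories modified after REF can have gained or lost entries.
    ( cd "$live" && find . -type d -newer "$ref" ) |
    while IFS= read -r dir; do
        # A directory that is itself new is reported by its parent's listing.
        [ -d "$snap/$dir" ] || continue
        # Step 2: entries in the snapshot copy that are gone from the live copy.
        ls -A "$snap/$dir" | while IFS= read -r name; do
            [ -e "$live/$dir/$name" ] || [ -L "$live/$dir/$name" ] ||
                printf 'deleted: %s/%s\n' "$dir" "$name"
        done
        # Step 3: entries in the live copy that the snapshot copy lacks.
        ls -A "$live/$dir" | while IFS= read -r name; do
            [ -e "$snap/$dir/$name" ] || [ -L "$snap/$dir/$name" ] ||
                printf 'added: %s/%s\n' "$dir" "$name"
        done
    done
}
```

Only the live tree is walked in full; the snapshot is touched only for directories that changed. File names containing newlines are not handled, and the contents of newly created directories are not expanded.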
Frank Cusack
2010-Feb-04 20:23 UTC
[zfs-discuss] How to get a list of changed files between two snapshots?
On 2/4/10 8:00 AM +0100 Tomas Ögren wrote:
> rsync by default compares metadata first, and only checks through every
> byte if you add the -c (checksum) flag.
>
> I would say rsync is the best tool here.

ah, i didn't know that was the default. no wonder recently when i was incremental-rsyncing a few TB of data between 2 hosts (not using zfs) i didn't get any speedup from --size-only or whatever the flag is.

> The "find -newer blah" suggested in other posts won't catch newer files
> with an old timestamp (which could happen for various reasons, like
> being copied with kept timestamps from somewhere else).

good point. that is definitely a restriction with find -newer. but if you meet that restriction, and don't need to find added or deleted files, it will be faster since only 1 directory tree has to be walked.

but in the general case it does sound like rsync is the best. unless bart can find added and missing files. in which case bart is better because it only has to walk 1 dir tree -- assuming you have a saved manifest from a previous walk over the original dir tree.

-frank
Frank Cusack
2010-Feb-04 20:24 UTC
[zfs-discuss] How to get a list of changed files between two snapshots?
On 2/4/10 8:21 AM -0500 Ross Walker wrote:
> Find -newer doesn't catch files added or removed it assumes identical
> trees.

This may be redundant in light of my earlier post, but yes it does. Directory mtimes are updated when a file is added or removed, and find -newer will detect that.

-frank
Darren Mackay
2010-Feb-04 22:49 UTC
[zfs-discuss] How to get a list of changed files between two snapshots?
Hi Ross,

Yes - zdb -dddd is dumping out info in the form of:

    Object  lvl   iblk   dblk  dsize  lsize   %full  type
        19    1    16K    512    512    512  100.00  ZFS plain file
                                        264   bonus  ZFS znode
        dnode flags: USED_BYTES USERUSED_ACCOUNTED
        dnode maxblkid: 0
        path    /snapshot.sh
        uid     0
        gid     0
        atime   Thu Feb  4 23:04:50 2010
        mtime   Thu Feb  4 23:04:50 2010
        ctime   Thu Feb  4 23:04:50 2010
        crtime  Thu Feb  4 23:04:50 2010
        gen     529806
        mode    100755
        size    174
        parent  3
        links
        xattr   0
        rdev    0x0000000000000000

for all objects referenced in the snap.

If you wanted to script this, you could parse the above output for time stamps that are later than the previous snapshot. Deleted files (and of course new files) can be found by diffing against the list for the snapshot you want to compare with, but I assume you also want files that have been modified, hence the requirement to parse the above output.

Unfortunately time does not permit me to come up with a working solution until mid next week (really snowed under - did someone say there is meant to be a weekend in there too?). But I am sure there is enough info here for someone to hack together a script.

Cheers,
Darren Mackay
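As a starting point for such a script, the sketch below pulls the path and mtime of each object out of a text dump shaped like the excerpt above. It assumes the dump has one attribute per line, with `path` appearing before `mtime` for every object, as in the excerpt; other zdb versions may format their output differently. The function name `zdb_paths_mtimes` is hypothetical, and the function only reads standard input rather than invoking zdb itself.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): read a "zdb -dddd"-style text
# dump on stdin and print "path<TAB>mtime" for each object that has both.
# Assumes one attribute per line, with "path" before "mtime" per object.
zdb_paths_mtimes() {
    awk '
        # Remember the path; strip the label so paths with spaces survive.
        $1 == "path" {
            sub(/^[ \t]*path[ \t]+/, "")
            path = $0
        }
        # On the matching mtime line, emit the pair and reset.
        $1 == "mtime" {
            sub(/^[ \t]*mtime[ \t]+/, "")
            if (path != "") print path "\t" $0
            path = ""
        }
    '
}
```

A possible use would be `zdb -dddd fs@snapshot | zdb_paths_mtimes`, followed by filtering the mtime column against the time of the previous snapshot. Comparing the path columns of two snapshots with sort and comm would then give added and removed files.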
Jesus Cea
2010-Feb-05 13:27 UTC
[zfs-discuss] How to get a list of changed files between two snapshots?
On 02/03/2010 04:35 PM, Andrey Kuzmin wrote:
> At zfs_send level there are no files, just DMU objects (modified in
> some txg which is the basis for changed/unchanged decision).

It would be awesome if "zfs send" had an option to show the files changed (with offsets) and the mode/directory changes (showing the before & after data).

As is, "zfs send" is nice, but it requires ZFS on both sides. I would love an rsync-like tool that could avoid scanning 20 million files just to find a couple of small changes (or none at all).

--
Jesus Cea Avion
jcea at jcea.es - http://www.jcea.es/
jabber / xmpp:jcea at jabber.org
Jesus Cea
2010-Feb-05 13:36 UTC
[zfs-discuss] How to get a list of changed files between two snapshots?
On 02/04/2010 05:10 AM, Matthew Ahrens wrote:
> This is RFE 6425091 "want 'zfs diff' to list files that have changed
> between snapshots", which covers both file & directory changes, and file
> removal/creation/renaming. We actually have a prototype of zfs diff.
> Hopefully someday we will finish it up...

Can't wait! :-))

--
Jesus Cea Avion
Kjetil Torgrim Homme
2010-Feb-06 15:57 UTC
[zfs-discuss] How to get a list of changed files between two snapshots?
Frank Cusack <frank+lists/zfs at linetwo.net> writes:
> On 2/4/10 8:00 AM +0100 Tomas Ögren wrote:
>> The "find -newer blah" suggested in other posts won't catch newer
>> files with an old timestamp (which could happen for various reasons,
>> like being copied with kept timestamps from somewhere else).
>
> good point. that is definitely a restriction with find -newer. but
> if you meet that restriction, and don't need to find added or deleted
> files, it will be faster since only 1 directory tree has to be walked.

FWIW, GNU find has -cnewer

--
Kjetil T. Homme
Redpill Linpro AS - Changing the game