For those that like to assist in the testing of rsync, the CVS version now defaults to doing an incremental file-list scan when it is recursing through the directories. This avoids keeping the whole file list in memory, and allows the transfer to start working on changed files before it has completed the recursive scan of the sending side.

The code appears to be working well so far, but there are probably bugs lurking in such a large set of changes, so please be careful and do let me know if you have any questions, discover any problems, etc.

Here's what I wrote for the manpage:

    Beginning with rsync 3.0.0, the recursive algorithm used is now an incremental scan that uses much less memory than before and begins the transfer after the scanning of the first few directories has been completed. This incremental scan only affects our recursion algorithm, and does not change a non-recursive transfer (e.g. when using a fully-specified --files-from list). It is also only possible when both ends of the transfer are at least version 3.0.0.

    Some options require rsync to know the full file list, so these options disable the incremental recursion mode. These include: --delete-before, --delete-after, --delay-updates, and (currently) --hard-links. Because of this, the default delete mode when you specify --delete is now --delete-during when both ends of the connection are at least 3.0.0 (use --del or --delete-during to request this improved deletion mode explicitly). See also the --delete-delay option that is a better choice than using --delete-after.

I have some ideas for how to support --hard-links and --delay-updates in the incremental transfer mode, so those caveats may go away at some point.

..wayne..
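The rules in the manpage text above can be restated as a small predicate. The following is only a sketch of those stated rules in Python, not rsync's actual code: the function names, the `FULL_LIST_OPTIONS` set, and the plain "X.Y.Z" version format are assumptions made for illustration, and the option list is the one given in this message (it may differ in later releases).

```python
# Sketch of the eligibility rules quoted from the manpage draft above.
# All names here are hypothetical; this is not how rsync itself is written.

# Options that, per the quoted text, need the full file list up front.
FULL_LIST_OPTIONS = frozenset({
    "--delete-before",
    "--delete-after",
    "--delay-updates",
    "--hard-links",
})

MINIMUM_VERSION = (3, 0, 0)


def parse_version(version):
    """Turn a plain 'X.Y.Z' string into a tuple of ints (no pre-release suffixes)."""
    return tuple(int(part) for part in version.split("."))


def incremental_recursion_possible(options, recursive, local_version, remote_version):
    """Return True if the quoted rules would allow an incremental scan."""
    if not recursive:
        # The incremental scan only affects the recursion algorithm.
        return False
    if parse_version(local_version) < MINIMUM_VERSION:
        return False
    if parse_version(remote_version) < MINIMUM_VERSION:
        return False
    # Any option that needs the whole file list disables the mode.
    return not (set(options) & FULL_LIST_OPTIONS)


if __name__ == "__main__":
    print(incremental_recursion_possible(["--delete"], True, "3.0.0", "3.0.0"))
    print(incremental_recursion_possible(["--delete-after"], True, "3.0.0", "3.0.0"))
```

Under these assumptions, a recursive transfer with plain --delete between two 3.0.0 ends stays incremental, while adding --delete-after or talking to an older rsync falls back to the full file list.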
On 12/28/06, Wayne Davison <wayned@samba.org> wrote:
> For those that like to assist in the testing of rsync, the CVS version
> now defaults to doing an incremental file-list scan when it is recursing
> through the directories. [...]

- Would you care to explain how the scan works, or should I read the source code? Specifically, I'm curious about what areas under the source argument(s) are scanned at what time. Also, does the incremental scan rule out "file has vanished" warnings?

- I would recommend renaming incremental -> flist_incremental so people don't confuse it with !whole_file .

Matt
Matt McCutchen
2007-Aug-05 21:00 UTC
Incremental file-list recursion has landed in CVS; Re: RSYNC + iNotify
On 8/5/07, Buck Huppmann <buckh@pobox.com> wrote:
> i'm just worried that the code is gonna collapse under the weight of
> so many options at some point--sorta like Windows

That's a legitimate concern and one that I have thought about myself. I suppose it depends on what you mean by "collapse". If you're thinking of performance, I don't think 121 flags, or even 121 "if" tests per file, is going to hurt anything measurably.

If you're thinking of correctness: no one is ever going to be able to verify that rsync works properly with every possible combination of options, but that's not the goal. The goal is for it to work properly in all situations in which it is actually used. Testing each option in isolation and in combination with the other options with which it is most often used generally accomplishes this. When it doesn't, someone files a bug and the problem gets fixed.

Perhaps this rather loose perspective makes you uneasy. It used to make me uneasy, too, but then I realized that I don't use software because its correctness has been proved; I use software because it works for me and does what I need it to do. In practical software development, formal correctness is a means, not an end. Some people have to be reminded of this, myself included--take as evidence this note I sent to the list:

http://lists.samba.org/archive/rsync/2007-July/018024.html

So, no, I don't see a problem with rsync adding more and more options to satisfy more and more needs.* If a significant drop-off in reliability results, I believe that Wayne will look into cleaning up the code or removing or simplifying some options.

For comparison, look at the Linux kernel. It has about 2802 configuration options, and I'm guessing it's unbuildable or unusable in many of the possible configurations. Still, it is widely used and accepted.

* This holds as long as the options are germane to rsync's role as a general-purpose file copying tool. Often, people propose the addition of an option to rsync when there is a much better way to solve the problem. I myself fell into that trap: see https://bugzilla.samba.org/show_bug.cgi?id=2094 .

Matt
Matt McCutchen
2007-Aug-07 20:20 UTC
Incremental file-list recursion has landed in CVS; Re: RSYNC + iNotify
On 8/6/07, Buck Huppmann <buckh@pobox.com> wrote:
> sorry to drag the other guys into this; they may no longer be
> interested. for my part, getting rsync hacked up to fit into a
> tool-chain to suit my bidding was about as much ambition as i had.
> setting up this frep thing seems like Real Work, but i'll certainly
> bear it in mind

I went ahead and wrote my own continuous replication/mirroring script called "continusync". It is posted at:

http://www.kepreon.com/~matt/utils/#continusync

Currently, it appears to work correctly (I tested it locally and remotely) but is very inefficient: it runs rsync once per event. You might like to use it or improve it.

Matt
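One possible way to reduce the "one rsync per event" cost is to coalesce events that arrive close together and hand each batch to a single rsync run. The sketch below is in Python and is not taken from continusync: `batch_events`, `rsync_command`, `sync_batch`, the (timestamp, path) event shape, and the one-second window are all assumptions made for illustration. The only rsync features it relies on are -a and --files-from=- (read the file list from standard input). It only covers created or modified paths; deletions would need separate handling.

```python
import subprocess


def batch_events(events, window=1.0):
    """Group (timestamp, relative_path) events into batches.

    A batch is closed when the gap to the next event exceeds `window`
    seconds. Duplicate paths within a batch are kept only once.
    """
    batches = []
    current = []
    last_ts = None
    for ts, path in sorted(events):
        if current and ts - last_ts > window:
            batches.append(current)
            current = []
        if path not in current:
            current.append(path)
        last_ts = ts
    if current:
        batches.append(current)
    return batches


def rsync_command(src_dir, dest):
    """Build one rsync invocation that reads its file list from stdin."""
    return ["rsync", "-a", "--files-from=-", src_dir, dest]


def sync_batch(batch, src_dir, dest):
    """Run a single rsync for every path in the batch (paths relative to src_dir)."""
    file_list = "\n".join(batch) + "\n"
    subprocess.run(rsync_command(src_dir, dest), input=file_list, text=True, check=True)
```

With this shape, a burst of events on the same few files costs one rsync run instead of one per event, at the price of delaying replication by up to the chosen window.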