Douglas Wade Needham
2007-Dec-28 23:37 UTC
problems using --ignore-existing and filter rules
Greetings everyone,

I have a problem which I believe is a collision between the --ignore-existing option and filter rules. It appears that, regardless of argument order, when I specify the two on a command line, a directory that does not yet exist on the destination is transferred even if it appears in the filter list as a protect rule. But when I change the protect rules to exclude rules, the excluded files and directories are not transferred.

Now, for details... I have a sandbox which is a build of a complete OS image. I want to push the contents of this sandbox to both new and existing hosts, protecting some files which change from server to server, as well as protecting directory trees which are also server specific. Sound crazy? It isn't. It is a trick I used at CompuServe years ago to build, maintain and upgrade hundreds of UN*X servers in six data centers around the world. And when I started having some problems compiling rdist (lower level OS API changes), I figured I would give rsync a try.

Here is the version I found the problem with (which is in the NetBSD pkgsrc tree).

    viking$ rsync --version
    rsync  version 2.6.9  protocol version 29
    Copyright (C) 1996-2006 by Andrew Tridgell, Wayne Davison, and others.
    <http://rsync.samba.org/>
    Capabilities: 64-bit files, socketpairs, hard links, symlinks, batchfiles,
                  inplace, IPv6, 64-bit system inums, 64-bit internal inums

The command line used is one like the following, while chroot'ed into the sandbox, with the attached filter:

    rsync -OavzHn --filter="merge /.rsync/filter.dirs" --ignore-existing / viking:/

I have also confirmed it on the latest versions found in FC6 and CentOS 4.5 and 5.0. In this case, I copied things from under / into a directory such as /sandbox/rsync_test, added a /.rsync subdirectory to hold my test_rsync script and filter file, and, after adding a few extra files and creating /opt2 by renaming /opt, ran rsync. In each and every case, I find that /.rsync and /opt2 are transferred even if listed in a protect rule. And this is true regardless of whether or not the path specification starts and/or ends in a slash.

Now, as to why the protect rule vs. exclude rule distinction is important: I want to use the filter.dirs file to protect areas which are not a part of the OS, such as application data, home directories and such, and then have another file protect things such as configuration files which are a part of the OS and should not be pushed once they exist, but should be pushed to a server once the server is up and running with a minimal OS load. And so, I want to use the same filter file I use with a command like the one above with a command such as (untested, wrapped for readability):

    rsync -OavzH --filter="merge /.rsync/filter.dirs" \
          --filter="merge /.rsync/filter.config" \
          --delete-before --delete-excluded / viking:/

Now, with these details, I would love to hear whether folks think I am crazy to think that I should be able to do this with rsync, or whether the consensus is that there is indeed a bug which needs to be debugged and exterminated.

(Now if only rsync offered a way to run commands on the remote server when certain files were updated...hehe).

- Doug

-- 
Douglas Wade Needham - KA8ZRT
UN*X Consultant & UW/BSD kernel programmer
Email: cinnion <at> ka8zrt . com
http://www.ka8zrt.com
Disclaimer: My opinions are my own. Since I don't want them, why should my employer, or anybody else for that matter!
-------------- next part --------------
#
# Use --ignore-existing flag
#
exclude *~
protect *.orig
protect .files
protect .files.md5
protect /.rsync/
protect amd/
protect argus/
protect boot/
protect cdrom/
protect cdrom1/
protect dev/
protect distfile*
protect do_rdist.sh
protect errs*
protect floppy/
protect home/
protect homes/
protect kern/
protect mnt/
protect msdos/
protect n/
protect netbsd*
protect proc/
protect root/.Xauthority
protect root/.ksh_history
protect root/.lsof_*
protect root/.mozilla
protect root/.spamassassin
protect source/
protect sysinst.log0
protect tftpboot/
protect tmp/
protect u0/
protect u0i/
protect u0j/
protect u1/
protect u1h/
protect u2/
protect u3/
protect u4/
protect u5/
protect usr/lsrc
protect usr/mdec
protect usr/pkgsrc
protect usr/src
protect usr/xsrc
protect var/tmp/*
protect var/yp/Makefile
protect var/yp/binding
protect var/yp/ka8zrt.com
protect var/zope*
protect vol
protect work
protect www
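
[Editorial sketch, not part of the original attachment.] The reply in this thread points out that a protect rule only guards destination files against deletion, and that an exclude rule is what keeps a path from being transferred. Assuming the filter file keeps one rule per line with the long keyword form used in the attachment, a helper along these lines could produce an exclude-based copy. The function name protect_to_exclude is made up for illustration.

```shell
#!/bin/sh
# Rewrite a leading "protect" keyword as "exclude".
# Comment lines and rules of any other type pass through unchanged.
# Assumes one rule per line, using the long keyword form ("protect PATTERN").
protect_to_exclude() {
    sed 's/^protect[[:space:]]\{1,\}/exclude /'
}
```

It reads rules on standard input and writes the converted rules to standard output, e.g. `protect_to_exclude < /.rsync/filter.dirs > filter.dirs.exclude` (the output file name is only an example). Going by the rsync man page, --delete-excluded makes rsync delete excluded files on the receiving side unless a protect rule covers them, so a run that uses that option may need both rule types for the same paths.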
On Fri, 2007-12-28 at 13:03 -0500, Douglas Wade Needham wrote:
> The command line used is one like the following, while chroot'ed into
> the sandbox, with the attached filter:
>
> rsync -OavzHn --filter="merge /.rsync/filter.dirs" --ignore-existing / viking:/
>
> I have also confirmed it on the latest versions found in FC6 and
> CentOS 4.5, and 5.0. In this case, I have copied things from under /
> into a directory such as /sandbox/rsync_test, added a /.rsync
> subdirectory to hold my test_rsync script and filter file, and after
> adding a few extra files and creating a /opt2 by renaming /opt,
> running rsync. In each and every case, I find that /.rsync and /opt2
> are transferred even if listed in a protect rule. And this is true
> regardless of whether or not the path specification start and/or ends
> in a slash.

Right. As documented in the man page, the sole effect of a protect rule is to stop a destination file (or directory) from being deleted if it is extraneous. To stop a destination file from being updated, use an exclude rule.

> Now, as to why the protect rule vs. exclude rule is important, I want
> to use the filter.dirs file to protect areas which are not a part of
> the OS, such as application data, home directories and such with this
> file,

If you mean that you don't want these areas processed at all, use an exclude rule.

> and then have another file protect things such as configuration
> files which are a part of the OS, and should not be pushed once they
> exist, but should be pushed to a server once the server is up and
> running with a minimal OS load.

The --ignore-existing option makes rsync leave existing files alone throughout the destination. Rsync does not provide a way to selectively activate this behavior for some areas of the destination. If you want this behavior for certain areas, you can use two rsync runs: one with --ignore-existing for those areas, and one without --ignore-existing with those areas excluded.
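
[Editorial sketch, not part of the original reply.] The two-run approach might look roughly like the following. The configuration area /etc/ is a placeholder chosen for illustration, the function name two_pass_push is made up, and the second run assumes /.rsync/filter.dirs has been rewritten to use exclude rules. By default the script only prints the commands, and both carry -n (dry run) like the original test command.

```shell
#!/bin/sh
# Two-run push: config areas with --ignore-existing, everything else without.
# Assumptions (not from the thread): /etc/ stands in for the "configuration"
# areas, and /.rsync/filter.dirs holds exclude rules rather than protect rules.
DEST="${DEST:-viking:/}"
CONFIG_AREA="${CONFIG_AREA:-/etc/}"
RUN="${RUN-echo}"   # default: print the commands; set RUN= (empty) to execute them

two_pass_push() {
    # Run 1: only the config area (-R keeps its full path on the destination),
    # leaving alone any file that already exists there.
    $RUN rsync -OavzHRn --ignore-existing "$CONFIG_AREA" "$DEST"
    # Run 2: everything else, with the config area excluded.
    $RUN rsync -OavzHn --filter="merge /.rsync/filter.dirs" --exclude="$CONFIG_AREA" / "$DEST"
}
```

Calling `two_pass_push` prints the two command lines for review. Dropping the `n` from the option strings and running with `RUN=` would perform the real transfer.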
Still, note that --ignore-existing operates at the level of individual files; there is no way to tell rsync not to add new files to an existing directory. If you need that, then for the first run, instead of passing --ignore-existing, you should run a list of the "configuration" areas through a script on the destination machine that filters out those that already exist and then pass the resulting list to rsync with --files-from.

> (Now if only rsync offered a way to run commands on the remote server
> when certain files were updated...hehe).

You can accomplish this by using an rsync daemon on the destination with a "post-xfer exec" script that parses the daemon's log for any relevant updates and runs any appropriate commands. Or you could use a system administration tool such as Puppet ( https://reductivelabs.com/trac/puppet ) to run commands on a file update whether the update was done via rsync or other means.

Matt
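
[Editorial sketch, not part of the original reply.] The destination-side filter suggested above for the --files-from approach could be as small as the function below. The name missing_only and the list file names in the usage note are made up for illustration. Paths are assumed to be listed one per line, relative to the root directory given as the first argument.

```shell
#!/bin/sh
# Read paths (one per line, relative to ROOT) on standard input and print
# only those that do not yet exist under ROOT. Dangling symlinks count as
# existing, so they are not replaced either.
missing_only() {
    root="${1:-/}"
    while IFS= read -r path; do
        [ -n "$path" ] || continue          # skip blank lines
        if [ -e "$root/$path" ] || [ -L "$root/$path" ]; then
            continue                        # already present: leave it alone
        fi
        printf '%s\n' "$path"
    done
}
```

One way to use it would be to run it on the destination over ssh against a list such as config.list, save the output as to_send.list, and then pass that file to rsync with --files-from=to_send.list. According to the rsync man page, -a does not imply -r when --files-from is used, so directories in the list need an explicit -r if their contents should be sent.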