Hi,

I'd like to suggest a new feature for rsync.

Problem:
Currently, rsync generates a recursive list of the files existing at the source directory, modifies this list by includes and excludes, and then copies these files. That's pretty good in most, but not all, cases.

I am mirroring a Debian archive, but unfortunately Debian mixes the files of several distributions in a subtree /pool. There is no way to select only the files of a certain distribution through a simple exclude/include expression.

There is a tool called debmirror, which first downloads the distribution index files, extracts the filenames/paths of all the files needed, and then calls rsync for every single file. That's certainly not useful, especially since rsync shows the server's motd for every single file.

Therefore, I'd like to suggest a new option: allow rsync to build the list of files not by recursively walking through the source directory, but by reading a file or stdin to get the list of files to be copied.

This would make it possible to mirror the distribution index files in a first step, then build the list of files needed, and then download these files in a second step.

An alternative method would be to keep the recursive method, but to open a pipe to an external program. Before downloading a file, rsync prints the path to the pipe and reads an answer from the pipe. Thus, an external filter program can decide for each single file whether to copy it or not.

regards
Hadmut

(Please respond directly, I'm not on your mailing list)
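To make the "build the list of files needed" step concrete, here is a minimal sketch in Python. It assumes the downloaded index is a Debian Packages file in which each package stanza carries a `Filename:` field with a path relative to the archive root. The function name `extract_pool_paths` and the sample data are made up for illustration and are not part of debmirror or rsync.

```python
def extract_pool_paths(packages_text):
    """Collect the 'Filename:' values from the text of a Packages index.

    Assumption: every stanza lists its archive-relative path on a line
    of the form 'Filename: pool/...'. Returns a sorted list without
    duplicates.
    """
    paths = set()
    for line in packages_text.splitlines():
        if line.startswith("Filename:"):
            # Split only on the first colon so the path itself is kept intact.
            paths.add(line.split(":", 1)[1].strip())
    return sorted(paths)


# Hypothetical two-package index, for illustration only.
SAMPLE_INDEX = """\
Package: foo
Version: 1.0
Filename: pool/main/f/foo/foo_1.0_i386.deb

Package: bar
Version: 2.0
Filename: pool/main/b/bar/bar_2.0_i386.deb
"""

for path in extract_pool_paths(SAMPLE_INDEX):
    print(path)
```

The output is one path per line, which is the kind of list the proposed option would read from a file or stdin. rsync releases that provide `--files-from` accept a list in this form.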
Might it be possible to take the file list that you want to feed to rsync and turn it into an rsync.conf file? A simple shell script could create the config file and call rsync (with --config= to specify the temporary config file).

Something like this (syntax most likely is wrong, haven't tested it):

    #!/bin/sh
    # Start from the existing config, then append one include line per file.
    cat /etc/rsync.conf > rsync_command
    # Read the list line by line so paths containing spaces stay intact.
    while IFS= read -r EACH_FILE; do
        # Double quotes so that ${EACH_FILE} is actually expanded.
        echo " --include=\"${EACH_FILE}\"" >> rsync_command
    done < file_list.txt
    rsync --config=rsync_command

- Ed King
Hadmut Danisch wrote:
> I'd like to suggest a new feature to rsync.

> I am mirroring a debian archive, but unfortunately,
> debian mixes all files of several distributions in a
> subtree /pool. There is no way to select only the files
> of a certain distribution through a simple exclude/include
> expression.
>
> There is a tool called debmirror, which first downloads
> the distribution index files, extracts all the filenames/paths
> of the files needed and then calls rsync for every single file.
> That's certainly not useful, especially since rsync shows the
> server's motd for every single file.

I was about to suggest:

    $ rsync --include-from=list-file --exclude=\*

but of course that will exclude the parent directories of the files you want, so rsync never descends into them and the files are ignored.

This might work:

    $ rsync --include-from=list-file --include=\*\*/ --exclude=\*

although it will mirror the entire directory structure (but not unspecified files).

Probably, rsync should be taught: "If I explicitly include a file, look for it explicitly, even if I've excluded a parent directory."

Max.
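One way around the parent-directory problem without mirroring the whole directory tree is to include every ancestor directory of each wanted file explicitly. The Python sketch below generates such a rule list. It assumes the paths are relative to the transfer root and contain no rsync wildcard characters (`*`, `?`, `[`). The helper name `include_rules` and the example paths are hypothetical.

```python
def include_rules(paths):
    """Build anchored include patterns for the files and all their parents.

    For 'pool/main/f/foo/foo.deb' this yields '/pool/', '/pool/main/',
    '/pool/main/f/', '/pool/main/f/foo/' and then the file itself.
    A trailing slash makes a pattern match directories only, and a
    leading slash anchors it at the root of the transfer.
    """
    rules = []
    seen = set()

    def add(rule):
        # Keep first-seen order and skip duplicates shared between files.
        if rule not in seen:
            seen.add(rule)
            rules.append(rule)

    for path in paths:
        parts = [p for p in path.strip().strip("/").split("/") if p]
        if not parts:
            continue  # ignore blank lines in the input list
        prefix = ""
        for directory in parts[:-1]:
            prefix += "/" + directory
            add(prefix + "/")
        add(prefix + "/" + parts[-1])
    return rules


# Hypothetical example list, for illustration only.
wanted = ["pool/main/f/foo/foo.deb", "pool/main/f/bar.deb"]
print("\n".join(include_rules(wanted)))
```

Written to a file, these rules could be passed with `--include-from=` and followed by `--exclude=\*`, as in the first command above. Directories that are not ancestors of a listed file then stay excluded and are never created on the receiving side.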