Is it possible to call rsync and tell it to use a filter file if it exists, but otherwise continue without errors?

If I pass "--filter=. .rsync-filter", it will fail if .rsync-filter doesn't exist.

I know you can pass "--filter=: /.rsync-filter" to search for filter files in each directory. That won't fail if there aren't any such files. But I'm only interested in one file at the root.

Thanks,
Jacob
On Tue, 2009-10-27 at 15:38 -0700, Jacob Weber wrote:
> Is it possible to call rsync and tell it to use a filter file if it
> exists, but otherwise continue without errors?
>
> If I pass "--filter=. .rsync-filter", it will fail if .rsync-filter
> doesn't exist.
>
> I know you can pass "--filter=: /.rsync-filter" to search for filter
> files in each directory. That won't fail if there aren't any such
> files. But I'm only interested in one file at the root.

No, rsync does not have such a feature. It could be added, but I would be skeptical of letting the filter support grow organically into something unmanageable; I'd rather see it replaced with a full scripting language once and for all.

For now, you can test for the filter file in the script calling rsync. Here's the syntax for bash:

    filter_opt=()
    if [ -e .rsync-filter ]; then
        filter_opt=("--filter=. .rsync-filter")
    fi
    rsync ... "${filter_opt[@]}" ...

-- 
Matt
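[Editor's note: a minimal, runnable sketch of Matt's array trick, wrapped in a helper function so the resulting argument list can be inspected. The function name, the `dest/` path, and the use of `printf` instead of a real rsync invocation are illustrative additions, not part of the original post.]

```shell
#!/usr/bin/env bash
# Build the rsync argument list in an array so that the merge-filter
# option is passed only when the filter file actually exists at the
# transfer root. build_rsync_args and "dest/" are hypothetical names
# used for illustration.

build_rsync_args() {
    # $1: transfer root directory
    local root="$1"
    local -a filter_opt=()
    if [ -e "$root/.rsync-filter" ]; then
        # The whole option, embedded space included, is a SINGLE array
        # element, so rsync would receive it as one argument.
        filter_opt=("--filter=. $root/.rsync-filter")
    fi
    # Print one argument per line; a real script would instead run:
    #   rsync -a "${filter_opt[@]}" "$root/" dest/
    printf '%s\n' -a "${filter_opt[@]}" "$root/" "dest/"
}
```

When `.rsync-filter` is absent, `"${filter_opt[@]}"` expands to nothing at all, so rsync never sees a dangling option; when it is present, exactly one extra argument appears.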
Hi,

We have huge data to sync, usually every day, and I wish rsync could guarantee performance. I thought of splitting the directories and running parallel rsyncs on them. It may cost me some network, but I can control that from the MAX_RSYNC_PROCESS variable. Can someone evaluate the pros and cons of this design? Any help is heartily appreciated.

    #!/usr/bin/ksh
    MAX_RSYNC_PROCESS=10   # Control the parallelism from here

    sync_and_wait() {
        i=0
        while read RSYNC_COMMAND
        do
            eval "${RSYNC_COMMAND}" &   # Each line read is an rsync command line
            i=$((i+1))
            if [[ $i = ${MAX_RSYNC_PROCESS} ]]
            then
                wait
                i=0
            fi
        done
        wait
    }

Thanks,
Satish Shukla
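[Editor's note: for reference, the function above is driven by piping a list of command lines into it, one per line. The sketch below exercises the same loop (which runs unchanged under both ksh and bash) with harmless `echo` commands standing in for the real rsync command lines; the directory names are hypothetical.]

```shell
#!/usr/bin/env bash
# Satish's batching loop: start up to MAX_RSYNC_PROCESS background
# jobs, then wait for the whole batch before starting the next one.
MAX_RSYNC_PROCESS=2   # at most 2 background jobs at a time

sync_and_wait() {
    i=0
    while read RSYNC_COMMAND
    do
        eval "${RSYNC_COMMAND}" &   # each input line is one command
        i=$((i+1))
        if [[ $i = ${MAX_RSYNC_PROCESS} ]]
        then
            wait    # batch is full: wait for all of it to finish
            i=0
        fi
    done
    wait            # wait for the final, possibly partial, batch
}

# In real use each line would look something like
# "rsync -a /data/dir1/ backuphost:/data/dir1/".
sync_and_wait <<EOF
echo dir1
echo dir2
echo dir3
EOF
```

One side effect of this design worth noting: a batch only finishes when its slowest rsync does, so one large directory can leave the other slots idle.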
On 28.10.2009 10:35, Matt McCutchen wrote:
> On Wed, 2009-10-28 at 10:01 +0100, Matthias Schniedermeyer wrote:
> > On 28.10.2009 09:05, Satish Shukla wrote:
> > > We have huge data to sync usually everyday and I wish rsync could
> > > guarantee performance.
> > >
> > > I thought of spliting the directories and run parallel rsyncs on
> > > them. It may cost me some network, but I can control that from the
> > > MAX_RSYNC_PROCESS variable. Can some one evaluate pros and cons of
> > > this design?. Any help is heartily appreciated.
> >
> > That only works IF:
> > - You have SSDs (preferably good ones, both sides)
> > - Each rsync covers a different physical HDD (both sides)
> > - You have a massive array with truck-loads of HDDs and a matching
> >   controller or something along that line (again both sides).
> > - A combination of the above would also work
> >
> > Otherwise parallel rsyncs completely kill any performance you had,
> > because normal HDDs will fall into a seek-storm when more than one
> > rsync works on them.
>
> Asynchronous I/O may solve that, on OSes that support it.

No. That's a fundamental problem with ANY rotating media device.

I don't say that you can't build something for the people that have that kind of hardware, or that are constrained by high-bandwidth & high-latency network connections (you don't need it for low bandwidth and/or low latency). But it would be utterly useless for the other 95-99% of rsync users.

Bis denn

-- 
Real Programmers consider "what you see is what you get" to be just as
bad a concept in Text Editors as it is in women. No, the Real Programmer
wants a "you asked for it, you got it" text editor -- complicated,
cryptic, powerful, unforgiving, dangerous.
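[Editor's note: a sketch of the constraint Matthias describes, i.e. parallelism across physical disks but strict sequencing within one disk, so a single spindle never serves two rsyncs at once. The disk/directory layout is purely hypothetical, and `echo` stands in for the real rsync invocation.]

```shell
#!/usr/bin/env bash
# One worker per physical disk: directories on the same disk are
# synced one after another (no seek-storm on that spindle), while
# different disks proceed in parallel.

sync_disk() {
    # $@: directories that all live on the same physical disk
    for dir in "$@"; do
        # real use: rsync -a "$dir/" "backuphost:$dir/"
        echo "sync $dir"
    done
}

sync_disk /disk1/a /disk1/b &   # worker for disk 1
sync_disk /disk2/c &            # worker for disk 2
wait
```

The hard part in practice is the mapping itself: the script has to know which directories share a spindle, which is exactly the hardware knowledge Matthias says most users' setups don't benefit from.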