Brice Rebsamen
2013-Apr-02 19:07 UTC
rsync to sync time without attempting to modify the content
Hello,

I am setting up a central data repository for my team (several thousand files, totaling about 4 TB). There are multiple sources that I need to consolidate: a source may hold only a fraction of the total number of files, and there can be conflicts between different sources (though these should be very occasional). I want to detect those conflicts and merge them manually. Also, the timestamps of different copies of a file can differ, because some users have copied the same files to different locations and thereby changed their mtime (by default, cp does not preserve timestamps).

So what I want is to be able to compare files this way:

    if the file does not exist at the destination, then transfer it with timestamps (rsync -t)
    otherwise:
        if timestamps and sizes are different:
            if md5sums match
                if the source timestamp is earlier than the destination timestamp
                    update the timestamp of the destination
            otherwise report (in a log file or something, so that I can come back to those later)

I tried writing a script to do it, but it turned out to be tedious, and I have the feeling that I am reinventing the wheel: I spent a lot of time just parsing the paths on the command line the way rsync does.

I have the feeling that rsync could do this for me, but I could not find in the man page an option that prevents modifying the content of the remote file while still allowing its timestamp to be updated.

I was thinking that using rsync in several passes could do the job: a dry-run pass to compare files, sorting the files that differ into several lists according to whether their md5sums match or how their timestamps differ, then processing those lists again with rsync and the appropriate options. But how do I get rsync to tell me why files are different?

If that turns out to be too tough, I think I can do it with Unison; it's just going to take a lot of time to compute all the md5sums and review all the files for changes...

Hope it makes sense,
Brice
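The comparison rules described in this post could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not something from the thread: the names `md5sum`, `classify`, and `sync_mtime` are made up for the example, timestamps are compared at one-second resolution, "otherwise report" is read as applying to files whose md5sums do not match, and files of different sizes are reported without hashing since their checksums cannot match.

```python
import hashlib
import os


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def classify(src, dst):
    """Decide what to do with one source/destination pair.

    Returns one of (hypothetical labels):
      "copy"     destination is missing, transfer with timestamps
      "same"     size and mtime already agree, nothing to do
      "touch"    content identical, source is older: fix destination mtime
      "keep"     content identical, destination is already the older copy
      "conflict" content differs, report for manual merging
    """
    if not os.path.exists(dst):
        return "copy"
    s, d = os.stat(src), os.stat(dst)
    src_time, dst_time = int(s.st_mtime), int(d.st_mtime)
    if s.st_size == d.st_size and src_time == dst_time:
        return "same"
    # Different sizes cannot have matching checksums, so skip the hashing.
    if s.st_size != d.st_size or md5sum(src) != md5sum(dst):
        return "conflict"
    return "touch" if src_time < dst_time else "keep"


def sync_mtime(src, dst):
    """Copy the source mtime onto the destination; content is not touched."""
    s, d = os.stat(src), os.stat(dst)
    os.utime(dst, ns=(d.st_atime_ns, s.st_mtime_ns))
```

A driver script would call `classify` for each pair, call `sync_mtime` on the "touch" results, and write the "conflict" results to a log for later review.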
Steven Levine
2013-Apr-03 07:36 UTC
rsync to sync time without attempting to modify the content
In <515B2C7F.8000407 at gmail.com>, on 04/02/13 at 12:07 PM, Brice Rebsamen <brice.rebsamen at gmail.com> said:

Hi,

> options. But how to get rsync to give me info why files are different?

Check the docs for --itemize-changes. This might provide sufficient information to build your transfer lists. If not, take a look at the log format features in the rsyncd.conf docs.

There's no way I can think of to have rsync update timestamps without first making the content equal, so this might need to be a separate script. Of course, you could always contract with someone familiar with the rsync code base to add a --assume-content-ok option.

Steven

--
"Steven Levine" <steve53 at earthlink.net>
eCS/Warp/DIY etc. www.scoug.com www.ecomstation.com
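As an illustration of how --itemize-changes output might be turned into transfer lists, here is a minimal Python sketch. It assumes the 11-character "YXcstpoguax" flag field documented in the rsync 3.x man page, followed by one space and the path; other rsync versions may use a different width, so `FLAG_WIDTH` may need adjusting. The names `Change` and `parse_itemized` are invented for this example.

```python
from typing import NamedTuple, Optional

# Assumption: flag field laid out as "YXcstpoguax" (rsync 3.x man page).
FLAG_WIDTH = 11


class Change(NamedTuple):
    path: str
    is_new: bool            # item does not exist at the destination yet
    checksum_differs: bool  # 'c' in the third position
    size_differs: bool      # 's' in the fourth position
    mtime_differs: bool     # 't' or 'T' in the fifth position


def parse_itemized(line: str) -> Optional[Change]:
    """Parse one --itemize-changes line; return None for anything that is
    not an itemized regular-file entry (headers, directories, deletions)."""
    line = line.rstrip("\n")
    if len(line) <= FLAG_WIDTH + 1 or line[FLAG_WIDTH] != " ":
        return None
    flags, path = line[:FLAG_WIDTH], line[FLAG_WIDTH + 1:]
    if flags[0] not in "<>ch." or flags[1] != "f":
        return None
    return Change(
        path=path,
        is_new="+" in flags[2:],
        checksum_differs=flags[2] == "c",
        size_differs=flags[3] == "s",
        mtime_differs=flags[4] in "tT",
    )
```

Entries where only `mtime_differs` is set are candidates for the "same content, different timestamp" list; entries with `size_differs` or `checksum_differs` set would go to the list for manual review.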
Wayne Davison
2013-May-19 23:19 UTC
rsync to sync time without attempting to modify the content
On Tue, Apr 2, 2013 at 12:07 PM, Brice Rebsamen <brice.rebsamen at gmail.com> wrote:

> So what I want, is to be able to compare files this way:
>
> if the file does not exist at the destination, then transfer it with
> timestamps (rsync -t)
> otherwise:
>     if timestamps and sizes are different:
>         if md5sums match
>             if the source time stamp is earlier than the destination
>             timestamp
>                 update the timestamp of the destination
>         otherwise report (in a log file or something so that I can come
>         back to those later)
>

You can use rsync to help with that, but it won't do it for you. For instance, the first part can be done by specifying --ignore-existing (so that rsync just copies in missing files). You could then get a list of files that differ in their timestamp by running a --dry-run with --itemize-changes.

If you use --checksum, it will differentiate between identical files and those that are different, but it will take a huge amount of time to checksum all files. Instead, you may want to just have rsync list all the files that differ by time, and then put them into a --files-from file for rsync to check via --dry-run and --checksum.

..wayne..
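The three passes suggested here might be assembled along these lines. This is a hedged Python sketch rather than a tested recipe: the helper names `pass_commands` and `run`, the list file name `time-diff.txt`, and the choice of `-rt` are assumptions made for the example; only the rsync options themselves come from the thread or the rsync man page. The source directory is assumed to be given with a trailing slash, so that the paths rsync prints are relative to it and can be reused with --files-from.

```python
import subprocess


def pass_commands(src, dst, list_file="time-diff.txt"):
    """Build the rsync command lines for the three passes."""
    # Pass 1: copy only the files missing at the destination, keeping mtimes.
    copy_missing = ["rsync", "-rt", "--ignore-existing", src, dst]
    # Pass 2: dry run that itemizes what differs by size or mtime.
    list_diffs = ["rsync", "-rt", "--dry-run", "--itemize-changes", src, dst]
    # Pass 3: checksum only the files collected in list_file, still dry run.
    verify = [
        "rsync", "-t", "--dry-run", "--checksum", "--itemize-changes",
        "--files-from=" + list_file, src, dst,
    ]
    return copy_missing, list_diffs, verify


def run(command):
    """Run one rsync command and return its standard output as text."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout
```

Between pass 2 and pass 3, the script would write the paths of the files that differ by time into the list file, one per line. Because pass 3 is limited to that list, the expensive checksumming is restricted to the files that actually need it.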