Is it possible to tell rsync *not* to use file names, date stamps, etc. and only use the checksum for deciding if a file is the same?

The remote machine "normalizes" a set of file names to remove all punctuation marks and forces all file names to lower case. The files themselves are unchanged.

--checksum looks promising but it does not say anything about file names:

    -c, --checksum    Skip based on checksum, not mod-time & size

Can this be done?

-- 
I like paying taxes. With them I buy civilization.
On Thu, Jul 05, 2012 at 09:26:05AM -0700, Yan Seiner wrote:
> Is it possible to tell rsync *not* to use file names, date stamps, etc and
> only use the checksum for deciding if a file is the same?
> [...]
> Can this be done?

Does --fuzzy help at all?

Lars
On Thu, July 5, 2012 10:10 am, Lars Ellenberg wrote:
> On Thu, Jul 05, 2012 at 09:26:05AM -0700, Yan Seiner wrote:
>> Is it possible to tell rsync *not* to use file names, date stamps, etc
>> and only use the checksum for deciding if a file is the same?
>> [...]
>
> Does --fuzzy help at all?

Apparently not.

    rsync --fuzzy --checksum
    rsync --fuzzy
    rsync --checksum

all still want to copy dupes that only differ by upper/lower case...
On 05.07.2012 09:26, Yan Seiner wrote:
> Is it possible to tell rsync *not* to use file names, date stamps, etc and
> only use the checksum for deciding if a file is the same?
> [...]
> Can this be done?

A workaround comes to mind.

Compute a checksum (MD5, SHA1, whatever) of each file and hardlink the file, named after that checksum, into a (hidden) directory.

Then, when you rsync with "-H", those hardlinks (all files must be below the start directory) make sure that rsync only has to delete/create hardlinks and does not copy a file again after it has copied it the first time.

I use a similar method for a bunch of big files I have: I hardlink them into a hidden directory, and when I move the files around rsync only deletes/creates hardlinks. When I move the files onto other storage I only need to run "find .z -type f -links 1" to find the files that have only one link left. That means all other hardlinks are gone and I can remove the file ("find .z -type f -links 1 -delete").

See you

-- 
Real Programmers consider "what you see is what you get" to be just as bad a concept in Text Editors as it is in women. No, the Real Programmer wants a "you asked for it, you got it" text editor -- complicated, cryptic, powerful, unforgiving, dangerous.
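For anyone who wants to try the hardlink workaround, here is a minimal sketch of the "checksum the files and hardlink them into a hidden directory" step. It is only an illustration under assumptions: the function name link_by_checksum is made up, SHA1 is just one of the "whatever" choices, and the store directory name ".z" is borrowed from the find commands in the message. It needs a filesystem that supports hardlinks.

```python
import hashlib
import os


def link_by_checksum(root, store_name=".z"):
    """Hardlink each regular file under root into root/<store_name>/<sha1>.

    Hypothetical helper for illustration; not part of rsync.
    Returns a dict mapping each file path to its checksum-named entry.
    """
    store = os.path.join(root, store_name)
    os.makedirs(store, exist_ok=True)
    linked = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into the checksum store itself.
        dirnames[:] = [d for d in dirnames
                       if os.path.join(dirpath, d) != store]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            digest = hashlib.sha1()
            with open(path, "rb") as fh:
                # Read in 1 MiB chunks so big files are not loaded at once.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
            target = os.path.join(store, digest.hexdigest())
            # The first file seen with a given content creates the entry;
            # a later copy with the same content stays a separate inode.
            if not os.path.exists(target):
                os.link(path, target)
            linked[path] = target
    return linked
```

After running this on the source tree, rsync would be started from the directory that contains both the files and the store, with "-H" added to the usual options, so that the checksum-named entry and the "real" name travel as one hardlinked file. Renaming a file and re-running the helper does not create a new store entry, because the checksum is unchanged.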
Hello,

A patch could help you in the case of a moved or renamed file.

Patch: --detect-renamed
It looks for files that either
(1) match in size & modify-time (plus the basename, if possible), or
(2) match in size & checksum (when --checksum was also specified),
and uses each match as an alternate basis file to speed up the transfer.
http://gitweb.samba.org/?p=rsync-patches.git;a=blob;f=detect-renamed.diff;h=c3e6e846eab437e56e25e2c334e292996ee84345;hb=master

Patch options: --detect-renamed-lax and --detect-moved
http://gitweb.samba.org/?p=rsync-patches.git;a=blob;f=detect-renamed-lax.diff;h=1ff593c8f97a97e8970d43ff5a62dfad5abddd75;hb=master

Benjamin ANDRE

2012/7/5 Matthias Schniedermeyer <ms at citd.de>
> On 05.07.2012 09:26, Yan Seiner wrote:
> > Is it possible to tell rsync *not* to use file names, date stamps, etc
> > and only use the checksum for deciding if a file is the same?
> > [...]
>
> A workaround comes to mind.
>
> MD5/SHA1 (whatever) the files and hardlink them under that name into a
> (hidden) directory.
> [...]
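To make rule (2) of the --detect-renamed description concrete (match in size & checksum), here is a rough sketch of that matching idea done outside rsync. It is not the patch's code and says nothing about how the patch is implemented; the function names and the choice of SHA1 are assumptions for illustration only. It compares two local trees and reports which new source names have the same content as a differently named file on the destination side.

```python
import hashlib
import os


def _sha1(path):
    """Return the SHA1 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def _index(root):
    """Map relative path -> full path for regular files under root."""
    files = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                files[os.path.relpath(path, root)] = path
    return files


def find_renamed(src_root, dst_root):
    """Hypothetical helper: map source names missing on the destination
    to a destination file with the same size and checksum."""
    src, dst = _index(src_root), _index(dst_root)
    # Destination files whose name does not exist in the source,
    # grouped by size so checksums are only computed when sizes agree.
    by_size = {}
    for rel, path in dst.items():
        if rel not in src:
            by_size.setdefault(os.path.getsize(path), []).append((rel, path))
    matches = {}
    for rel, path in src.items():
        if rel in dst:
            continue  # same name on both sides: nothing to detect
        candidates = by_size.get(os.path.getsize(path), [])
        if not candidates:
            continue
        digest = _sha1(path)
        for old_rel, old_path in candidates:
            if _sha1(old_path) == digest:
                matches[rel] = old_rel
                break
    return matches
```

With a source file "Foo Bar!.TXT" and a destination file "foo bar.txt" holding the same bytes, find_renamed would pair the two names, which is the situation from the original question. Comparing sizes first keeps the number of checksum computations small.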