samba-bugs@samba.org
2008-Jul-04 02:08 UTC
DO NOT REPLY [Bug 5583] New: Files always updated even if time is the only difference
https://bugzilla.samba.org/show_bug.cgi?id=5583

Summary: Files always updated even if time is the only difference
Product: rsync
Version: 2.6.9
Platform: x86
OS/Version: Linux
Status: NEW
Severity: enhancement
Priority: P3
Component: core
AssignedTo: wayned@samba.org
ReportedBy: l.gumbley@auckland.ac.nz
QAContact: rsync-qa@samba.org

I am using rsync to update compact flash cards and would like to minimise the write cycles on them. The cards contain root filesystems for a number of identical robots that differ only in UUIDs, MAC addresses, hostnames etc. A large number of files are generated specially for the update (and thus always have different timestamps from the existing files on the card) but almost always correspond exactly to the files existing on the CF card.

My rsync -i output is full of:

>f..T...... etc/hostname

and other similar files, where the only thing being changed is the transfer time, which I don't care about.

I accept that the files have to be transferred as the stamp is different, but I don't see the point of writing the file if none of the data has changed.

I spent a significant amount of time trying to find an option that would prevent this, with no luck; apologies if I have overlooked something.

Finally, this is somewhat similar to bug 3229 but a little different in that it's not to do with the backup function.

--
Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug, or are watching the QA contact.
samba-bugs@samba.org
2008-Jul-04 02:31 UTC
DO NOT REPLY [Bug 5583] Files always updated even if time is the only difference
https://bugzilla.samba.org/show_bug.cgi?id=5583

------- Comment #1 from matt@mattmccutchen.net 2008-07-03 21:31 CST -------
Try --checksum.
samba-bugs@samba.org
2008-Jul-04 05:18 UTC
DO NOT REPLY [Bug 5583] Files always updated even if time is the only difference
https://bugzilla.samba.org/show_bug.cgi?id=5583

------- Comment #2 from l.gumbley@auckland.ac.nz 2008-07-04 00:18 CST -------
Thanks for your comment, Matt, but --checksum takes in excess of a hundred times longer. I cancelled it as I couldn't be bothered waiting. It calculates the checksum of every file on the source system, regardless of size or timestamp, before continuing.

It might be a solution if it were possible to calculate checksums only when the timestamps differ; however, as I say, it seems this option does not exist.
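The behaviour requested in comment #2 (checksum only when size matches but the timestamp differs) can be sketched outside rsync as follows. This is a minimal illustration, not an rsync option or rsync's code: the function names `file_digest` and `needs_rewrite` are hypothetical, SHA-256 is an arbitrary choice of hash, and both files are assumed to be readable locally.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 16):
    """Hash a file's contents in chunks (SHA-256 chosen for illustration)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)
    return h.hexdigest()


def needs_rewrite(src, dst):
    """Return True only if dst's data would actually have to change."""
    try:
        d = os.stat(dst)
    except FileNotFoundError:
        return True  # nothing on the card yet, so it must be written
    s = os.stat(src)
    if s.st_size != d.st_size:
        return True  # different size: contents cannot be identical
    if int(s.st_mtime) == int(d.st_mtime):
        return False  # size and mtime agree: trust the quick check
    # Same size, different mtime: this is the only case that pays for a hash.
    return file_digest(src) != file_digest(dst)
```

With a check like this, a file such as etc/hostname that is regenerated with identical contents would be hashed once and then skipped, while files whose size and mtime already agree are never read.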
samba-bugs@samba.org
2008-Jul-04 08:14 UTC
DO NOT REPLY [Bug 5583] Don't write out an unchanged file if all the checksums matched
https://bugzilla.samba.org/show_bug.cgi?id=5583

wayned@samba.org changed:

What    |Removed                     |Added
----------------------------------------------------------------------------
Status  |NEW                         |ASSIGNED
Summary |Files always updated even if|Don't write out an unchanged
        |time is the only difference |file if all the checksums
        |                            |matched

------- Comment #3 from wayned@samba.org 2008-07-04 03:13 CST -------
I have thought about trying to optimize out such a rewrite, and it is possible, but only by delaying the point at which the receiver begins its update. This could slow things down if the file is actually different, but would speed things up if the files were really the same.

I can see two different places to put this logic:

One would be to have the receiver delay starting a temp file until it notices that the sender has told it about a changed part of the basis file. At that point, it would need to create a temporary file, open the basis file, do a basis copy from 0 to the current position, and then proceed normally with the rest of the copy. However, if no difference was found, the update would not be needed, and would be discarded. (One potential issue: the receiver would need to have a way to get the full-file checksum from the generator so that it could do a double-check against the sender's full-file checksum, since it will not have computed one.)

Another option would be to put the short-circuit into the sender's logic so that it doesn't tell the receiver to do anything until it first finds a file difference. The protocol would be extended to have a way to convey to the receiver that the file doesn't need any updates (since the receiver probably needs to do its post-transfer attribute updating, and may need to notify the generator that the file is done). We'd still need a solution to the full-file checksum verification.

One other option that is available now is to use one of the checksum caching patches from the patches directory (such as the one that caches file-info in a DB and associates the last-known attributes with a checksum, allowing rsync to more quickly notice when files are the same).
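The first idea in comment #3 (the receiver delays creating its temp file until the sender reports a change) can be illustrated with a toy model. This is only a sketch under stated assumptions: the token format `("match", offset, length)` / `("data", bytes)` and the function `apply_delta` are hypothetical and are not rsync's wire protocol or receiver code, an in-memory buffer stands in for the temp file, and the full-file checksum double-check mentioned in the comment is omitted.

```python
import io


def apply_delta(basis, tokens):
    """Apply a token stream to basis.

    Returns None when the stream reproduces basis exactly (no write needed),
    otherwise the bytes of the updated file.
    """
    out = None  # stands in for the temp file; created only on first change
    pos = 0     # how much of the basis has been confirmed in order so far
    for tok in tokens:
        if tok[0] == "match":
            _, offset, length = tok
            if out is None and offset == pos:
                pos += length  # still identical to the basis: write nothing
                continue
            chunk = basis[offset:offset + length]
        else:
            chunk = tok[1]  # literal data sent by the sender
        if out is None:
            # First difference seen: create the output and do the
            # "basis copy from 0 to the current position".
            out = io.BytesIO()
            out.write(basis[:pos])
        out.write(chunk)
    if out is not None:
        return out.getvalue()
    if pos == len(basis):
        return None  # every block matched in place: discard the update
    return basis[:pos]  # stream ended early, so the file was truncated


basis = b"abcdefgh"
unchanged = apply_delta(basis, [("match", 0, 4), ("match", 4, 4)])
changed = apply_delta(basis, [("match", 0, 4), ("data", b"XY"), ("match", 6, 2)])
```

Here `unchanged` is `None`, so nothing would be written to the card, while `changed` holds the rebuilt contents. The cost noted in the comment shows up as the deferred `basis[:pos]` copy, which is paid only once a difference is found.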
samba-bugs at samba.org
2009-Nov-13 10:49 UTC
DO NOT REPLY [Bug 5583] Don't write out an unchanged file if all the checksums matched
https://bugzilla.samba.org/show_bug.cgi?id=5583

henrik-rsync at prak.org changed:

What    |Removed                     |Added
----------------------------------------------------------------------------
CC      |                            |henrik-rsync at prak.org

------- Comment #4 from henrik-rsync at prak.org 2009-11-13 04:49 CST -------
Here's my "me too" comment on the issue (feel free to move it to a separate bug depending on this one):

I have stumbled upon the same issue in connection with rsnapshot and rsync with the "--detect-renamed" patch.

Basically rsnapshot works like this: on the first run it creates a full copy of a directory tree /src to /dst/0. The next time, it rotates /dst/(x) to /dst/(x+1), creates a copy consisting only of hard links from /dst/1 to /dst/0, and then calls rsync to transfer the changes between /src and /dst/0, effectively creating a differential backup at the granularity of files.

I applied the detect-renamed patch to avoid multiple copies of big files when they are moved around in the directory tree. The patch works insofar as it finds the correct base files in /dst. It then uses the delta algorithm to make sure that no coincidental match of filename, size and mtime results in a false positive.

Unfortunately, using the delta algorithm creates a new copy of the file at /dst even if the content is the same as the base file (instead of using a hard link to the base file).
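The outcome asked for in comment #4 (a hard link to the base file when the contents turn out to be identical) can be approximated after the fact. The sketch below is a post-processing workaround under stated assumptions, not part of rsync, rsnapshot, or the detect-renamed patch: the function name `relink_if_identical` and the `.relink-tmp` suffix are hypothetical, and both paths are assumed to be on the same filesystem.

```python
import filecmp
import os


def relink_if_identical(new_path, base_path):
    """Replace new_path with a hard link to base_path if contents match.

    Returns True if new_path now shares an inode with base_path because of
    this call, False if nothing was changed.
    """
    if os.path.samefile(new_path, base_path):
        return False  # already the same inode
    if not filecmp.cmp(new_path, base_path, shallow=False):
        return False  # contents differ: keep the separate copy
    tmp = new_path + ".relink-tmp"  # hypothetical temporary name
    os.link(base_path, tmp)         # create the link under a temporary name
    os.replace(tmp, new_path)       # then rename it over the duplicate copy
    return True
```

Linking under a temporary name and renaming means new_path never disappears part-way through. The duplicate is still written once before being replaced, so this saves disk space but not the write itself.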