samba-bugs@samba.org
2008-Dec-08 21:17 UTC
DO NOT REPLY [Bug 5954] New: Implement something like --very-fuzzy
https://bugzilla.samba.org/show_bug.cgi?id=5954

Summary: Implement something like --very-fuzzy
Product: rsync
Version: 3.1.0
Platform: Other
OS/Version: Linux
Status: NEW
Severity: enhancement
Priority: P3
Component: core
AssignedTo: wayned@samba.org
ReportedBy: wasabi@larvalstage.net
QAContact: rsync-qa@samba.org

I'd like rsync to be able to compare all files on both sides to each other. All of them.

Use case: I have a 70GB music directory. I sync it between home and work. At home I run a retagger which tags and renames pretty much every file and directory. So they've all moved, they're all named differently, AND their contents have changed. This basically would result in 70GB of transfers.

If rsync were to calculate block hashes for EVERY FILE on BOTH SIDES, and then use those hashes as the sources of new files, this would not take the 2 weeks it would take otherwise. It might take an hour to calculate hashes. But that'd be fine.

So maybe --very-fuzzy, --really-fuzzy, or even --fuzzy --fuzzy --fuzzy.

--
Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug, or are watching the QA contact.
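The proposal above, hashing the blocks of every destination file and using any of them as a source for new files, can be illustrated with a small Python sketch. This is not rsync code and not a proposed implementation. The names `build_index` and `plan_transfer`, the 4-byte `BLOCK_SIZE`, and the use of SHA-256 are all assumptions made for illustration. The sketch matches only blocks at aligned offsets; rsync's delta algorithm uses a rolling checksum to find matches at arbitrary offsets, which would matter for a retagger that changes the length of a tag header.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical, tiny on purpose so the example is readable


def blocks(data, size=BLOCK_SIZE):
    """Yield (offset, chunk) pairs for fixed-size, aligned blocks."""
    for off in range(0, len(data), size):
        yield off, data[off:off + size]


def build_index(files, size=BLOCK_SIZE):
    """Map block digest -> (file name, offset) across ALL destination files.

    `files` is a dict of name -> bytes standing in for a directory tree.
    The first occurrence of a digest wins.
    """
    index = {}
    for name, data in files.items():
        for off, chunk in blocks(data, size):
            index.setdefault(hashlib.sha256(chunk).hexdigest(), (name, off))
    return index


def plan_transfer(new_data, index, size=BLOCK_SIZE):
    """Decide, block by block, whether to copy locally or send over the wire."""
    plan = []
    for _, chunk in blocks(new_data, size):
        hit = index.get(hashlib.sha256(chunk).hexdigest())
        if hit:
            plan.append(("copy",) + hit)   # block already exists somewhere
        else:
            plan.append(("send", chunk))   # block must be transferred
    return plan


if __name__ == "__main__":
    dest = {"old/a.mp3": b"AAAABBBBCCCC", "old/b.mp3": b"DDDDEEEE"}
    print(plan_transfer(b"BBBBXXXXDDDD", build_index(dest)))
```

In this toy run only the `XXXX` block has to be sent; the other two blocks are found in differently named destination files, which is the effect the report is asking for.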
samba-bugs@samba.org
2008-Dec-08 22:39 UTC
DO NOT REPLY [Bug 5954] Implement something like --very-fuzzy
https://bugzilla.samba.org/show_bug.cgi?id=5954

------- Comment #1 from matt@mattmccutchen.net 2008-12-08 16:40 CST -------
(In reply to comment #0)
> If rsync were to calculate block hashes for EVERY FILE on BOTH SIDES, and then
> use those hashes as the sources of new files, this would not take the 2 weeks
> it would take otherwise. It might take an hour to calculate hashes. But that'd
> be fine.

--fuzzy chooses one destination file whose name looks the most similar to the source name and uses that as a basis, so it fits nicely into rsync's existing workflow. Your suggestion of using the entire destination as a basis for every transfer is fundamentally different, so I wouldn't call it --very-fuzzy. Implementing that in rsync would be a pain.

A technique that accomplishes essentially the same thing is to tar up the source and destination and then delta-transfer the tar file, assuming you have enough disk space.
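The behaviour described in comment #1, picking the one destination file whose name looks most similar to the source name, might be sketched as follows. This is only an illustration: `pick_fuzzy_basis` is a hypothetical helper, and Python's `difflib.SequenceMatcher` is a stand-in similarity measure, not the scoring rsync actually uses.

```python
import difflib


def pick_fuzzy_basis(source_name, dest_names):
    """Return the destination name most similar to source_name, or None.

    Similarity here is difflib's ratio (an assumption for this sketch),
    ranging from 0.0 (nothing in common) to 1.0 (identical).
    """
    best, best_score = None, 0.0
    for candidate in dest_names:
        score = difflib.SequenceMatcher(None, source_name, candidate).ratio()
        if score > best_score:
            best, best_score = candidate, score
    return best


if __name__ == "__main__":
    existing = ["track_one.mp3", "notes.txt", "cover.jpg"]
    print(pick_fuzzy_basis("01 - Track One.mp3", existing))
```

The sketch also shows why this differs from the request in comment #0: exactly one basis file is chosen per transfer, based on its name alone, rather than every destination file serving as a potential source of blocks.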
samba-bugs@samba.org
2008-Dec-23 18:47 UTC
DO NOT REPLY [Bug 5954] Implement something like --very-fuzzy
https://bugzilla.samba.org/show_bug.cgi?id=5954

matt@mattmccutchen.net changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |WONTFIX

------- Comment #2 from matt@mattmccutchen.net 2008-12-23 12:48 CST -------
I don't see an argument for implementing the proposal in rsync.
samba-bugs@samba.org
2008-Dec-23 20:08 UTC
DO NOT REPLY [Bug 5954] Implement something like --very-fuzzy
https://bugzilla.samba.org/show_bug.cgi?id=5954

------- Comment #3 from wasabi@larvalstage.net 2008-12-23 14:09 CST -------
I suppose if you want to ignore my argument, then yes, there is no argument. But your response to my argument focused on the technical merits more than the practical. It would be one thing to say "I am not doing this." It's quite another to say "there is no argument at all."

If you want people to use rsync on common directory workloads, then such a proposal is not far-fetched. If you want people to have to disable rsync on their directories and go out of band when making certain types of file changes, then it is.

That's your call, though. I would appreciate something more to the point; "rsync just shouldn't handle this" would have been a nice response.
samba-bugs@samba.org
2008-Dec-23 20:33 UTC
DO NOT REPLY [Bug 5954] Implement something like --very-fuzzy
https://bugzilla.samba.org/show_bug.cgi?id=5954

------- Comment #4 from matt@mattmccutchen.net 2008-12-23 14:32 CST -------
I'm sorry, I didn't mean to be rude. It's not unusual that someone proposes a new option and I suggest an alternative approach; the argument they then have to make is that the implementation in rsync would be superior enough to the alternative to justify the added complexity. You haven't made such an argument, aside from "it would be convenient if rsync handled this so I don't have to use a separate tool for that part of the job", which applies to every feature request.

> If you want people to use rsync on common directory workloads, then such a
> proposal is not far fetched.

I don't recall anyone else raising the case of renames at the same time as small data changes. If you have evidence that this is common, please present it. Otherwise, this is just one of several use cases that rsync could optimize but doesn't; for example, see bug 5482.