samba-bugs at samba.org
2013-Nov-03 19:21 UTC
[Bug 10244] New: link-by-hash patch: speed enhancement by hash calculation on source side
https://bugzilla.samba.org/show_bug.cgi?id=10244

           Summary: link-by-hash patch: speed enhancement by hash calculation on source side
           Product: rsync
           Version: 3.1.0
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P5
         Component: core
        AssignedTo: wayned at samba.org
        ReportedBy: M_Leipold at gmx.net
         QAContact: rsync-qa at samba.org

The link-by-hash patch works perfectly for reducing the storage needed on the destination. To do so, however, changed, non-existing, renamed, or moved files are first transferred from source to destination. Only then is the hash of the file (the one used for link-by-hash) generated and the hash dir checked; if the file already exists there, the transferred file is replaced by a hard link into the hash dir.

When synchronizing two PCs/servers over a network (especially the Internet), a lot of network capacity and time could be saved if the hash (for link-by-hash) were already generated by the source-side instance of rsync. This hash could then be sent to the destination rsync to check whether the file already exists in the hash dir. If it does, only the hard link needs to be created and no file transfer is necessary.

If possible, this would not only speed up the "file transfer" but also solve the problem of renamed and moved files (at least in a setup with link-by-hash).

Could you please check whether the described setup is possible?

Thanks in advance.

--
Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug.
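The flow requested above can be sketched roughly as follows. This is only an illustration in Python, not rsync code and not how the link-by-hash patch is implemented: the function names, the use of SHA-256, and the flat "one file per digest" layout of the hash dir are all assumptions made for the example.

```python
import hashlib
import os


def source_side_digest(path, chunk_size=1 << 20):
    """Hash a file on the source side, before any data is sent.

    SHA-256 is an assumption for this sketch; the hash actually used by
    the link-by-hash patch may differ.
    """
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def link_if_known(hash_dir, digest, dest_path):
    """Destination side: hard-link dest_path to the hash-dir entry if present.

    Assumes a hypothetical flat layout where each hash-dir entry is named
    after its digest. Returns True if a link was made (no transfer needed),
    False if the file content still has to be transferred. dest_path must
    not exist yet.
    """
    candidate = os.path.join(hash_dir, digest)
    if not os.path.isfile(candidate):
        return False
    os.makedirs(os.path.dirname(dest_path) or ".", exist_ok=True)
    os.link(candidate, dest_path)
    return True
```

In this sketch the source would call source_side_digest() and send only the digest; the destination would call link_if_known() and request the file content only when it returns False. Because the lookup is by content rather than by path, a renamed or moved file would resolve to the same hash-dir entry.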
samba-bugs at samba.org
2014-Jul-26 01:08 UTC
[Bug 10244] link-by-hash patch: speed enhancement by hash calculation on source side
https://bugzilla.samba.org/show_bug.cgi?id=10244

--- Comment #1 from Dave Yost <Dave at Yost.com> 2014-07-26 01:08:58 UTC ---

rsync --link-dest could try a bit harder to find candidates for a hard link.

I suggest an option to rsync that works like this when you give it a file size argument:

Before copying, on the destination end, rsync makes a list of large files, like this:

    find /path-to-link-dest/dir -size +100M

While copying, when rsync encounters a file that can't be linked normally, and the file is larger than the threshold, rsync tries to link it with a candidate from the list before giving up and making a new copy.

The idea of the threshold is to make rsync faster by not spending time on small files.

On the destination, rsync could use threads to overlap some of the computation.
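A rough Python sketch of the candidate list described in this comment, for illustration only. The comment does not say how a candidate would be verified as identical, so matching by size plus a SHA-256 digest supplied by the sender is an assumption here, as are all function names. The size filter is meant to behave like the find invocation quoted above when threshold is 100 * 1024 * 1024.

```python
import hashlib
import os
from collections import defaultdict


def digest_of(path, chunk_size=1 << 20):
    """SHA-256 of a file's content (hash choice is an assumption)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_candidates(link_dest_dir, threshold):
    """Index regular files larger than threshold bytes by their size."""
    by_size = defaultdict(list)
    for root, _dirs, files in os.walk(link_dest_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            size = os.path.getsize(path)
            if size > threshold:
                by_size[size].append(path)
    return by_size


def try_link(by_size, size, digest, dest_path):
    """Link dest_path to a same-size candidate whose digest matches.

    Returns the candidate path used, or None if a new copy is needed.
    dest_path must not exist yet.
    """
    for candidate in by_size.get(size, []):
        if digest_of(candidate) == digest:
            os.link(candidate, dest_path)
            return candidate
    return None
```

Indexing by size first means a digest is only computed for candidates that could possibly match, which is in the spirit of the threshold: avoid spending time where it is unlikely to pay off.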