samba-bugs at samba.org
2013-Nov-03 19:21 UTC
[Bug 10244] New: link-by-hash patch: speed enhancement by hash calculation on source side
https://bugzilla.samba.org/show_bug.cgi?id=10244
Summary: link-by-hash patch: speed enhancement by hash calculation on source side
Product: rsync
Version: 3.1.0
Platform: All
OS/Version: All
Status: NEW
Severity: enhancement
Priority: P5
Component: core
AssignedTo: wayned at samba.org
ReportedBy: M_Leipold at gmx.net
QAContact: rsync-qa at samba.org
The link-by-hash patch works well at reducing the storage needed on the
destination. To do so, however, changed, nonexistent, renamed, or moved files
are first transferred from source to destination. Only then is the hash of the
file (the one for link-by-hash) generated and the hash dir checked; if the file
already exists there, the transferred file is replaced by a hard link to the
hash dir.
When synchronizing two PCs/servers over a network (especially the Internet), a
lot of network capacity and time could be saved if the hash (for link-by-hash)
were already generated by the source-side instance of rsync. This hash could
then be sent to the destination rsync to check whether the file already exists
in the hash dir. If it does, only the hard link needs to be created and no
file transfer is necessary.
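
A minimal sketch of the flow proposed above, not taken from the link-by-hash
patch itself. It assumes SHA-256 as the hash and a flat hash dir whose entries
are named by hex digest; the patch's real hash algorithm and directory layout
may differ, and the function names file_digest and link_if_known are
hypothetical.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Source side: hash the file contents in chunks (SHA-256 is an assumption)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def link_if_known(hash_dir, digest, dest_path):
    """Destination side: hard-link dest_path to the hash-dir entry if present.

    Assumes a flat hash dir with one file per hex digest (hypothetical layout).
    Returns True if the link was made (no transfer needed), False otherwise.
    """
    candidate = os.path.join(hash_dir, digest)
    if not os.path.isfile(candidate):
        return False  # unknown content: fall back to a normal transfer
    os.makedirs(os.path.dirname(dest_path) or ".", exist_ok=True)
    if os.path.lexists(dest_path):
        os.remove(dest_path)  # replace a stale file at the destination path
    os.link(candidate, dest_path)
    return True
```

In this sketch the source would send only the digest first, and the file data
would follow only when link_if_known returns False.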
If possible, this would not only speed up the "file transfer" but also solve
the problem of renamed and moved files (at least in a setup with link-by-hash).
Could you please check whether the described setup is possible?
Thanks in advance.
--
Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug.
samba-bugs at samba.org
2014-Jul-26 01:08 UTC
[Bug 10244] link-by-hash patch: speed enhancement by hash calculation on source side
https://bugzilla.samba.org/show_bug.cgi?id=10244
--- Comment #1 from Dave Yost <Dave at Yost.com> 2014-07-26 01:08:58 UTC ---
rsync --link-dest could try a bit harder to find candidates for a hard link.
I suggest an option to rsync that works like this when you give it a file size
argument:
Before copying, on the destination end, rsync makes a list of large files, like
this:
    find /path-to-link-dest/dir -size +100M
While copying, when rsync encounters a file that can't be linked normally, if
the file is larger than the threshold, rsync tries to link it with a candidate
from the list before giving in and making a new copy.
The threshold idea is to make rsync faster by not spending time on small files.
On the destination, rsync could use threads to overlap some of the computation.
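
A rough sketch of the candidate-list idea in Comment #1, under stated
assumptions: the names index_large_files and find_link_candidate are
hypothetical, the threshold is given in bytes, and candidates are confirmed by
a byte-for-byte comparison of two local files. In a real transfer the new file
lives on the source, so the comparison would have to use a checksum sent by
the sender instead.

```python
import filecmp
import os
from collections import defaultdict


def index_large_files(link_dest_dir, min_size):
    """Map file size -> paths for regular files larger than min_size bytes."""
    by_size = defaultdict(list)
    for root, _dirs, files in os.walk(link_dest_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks and anything that is not a regular file
            size = os.path.getsize(path)
            if size > min_size:
                by_size[size].append(path)
    return by_size


def find_link_candidate(by_size, new_file):
    """Return an indexed path whose contents equal new_file, or None."""
    for candidate in by_size.get(os.path.getsize(new_file), []):
        # Same size is only a hint; confirm the contents before linking.
        if filecmp.cmp(candidate, new_file, shallow=False):
            return candidate
    return None
```

Grouping by size first keeps the expensive content comparison limited to files
that could actually match, which fits the comment's goal of not spending time
on small files.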