samba-bugs at samba.org
2017-Jan-19 08:05 UTC
[Bug 12530] New: [REQ] Improve fuzzy using files being uploaded
https://bugzilla.samba.org/show_bug.cgi?id=12530

            Bug ID: 12530
           Summary: [REQ] Improve fuzzy using files being uploaded
           Product: rsync
           Version: 3.1.2
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: P5
         Component: core
          Assignee: wayned at samba.org
          Reporter: ben.rubson at gmail.com
        QA Contact: rsync-qa at samba.org

Hello,

Let's imagine the sender is uploading a bunch of files which are quite similar.
For example, the following dir :

/directory
|-backup1.iso
|-backup2.iso
|-backup3.iso
|-backup4.iso
|-backup5.iso

For the moment, if no remote fuzzy basis is found at the very beginning of the transfer, every file will be fully uploaded.

Goal would then be to improve rsync so that once the first file has been uploaded, fuzzy algorithm could look at this new file as a fuzzy basis file for the other new files arriving.
Same thing once the second file has been uploaded etc...

Perhaps it could be done once for all at the very beginning of the transfer, also taking the list of files which will be uploaded (sent by the sender), and their properties, to feed the fuzzy algorithm.

This would speed-up transfer in a number of situations.

Thank you very much !

Ben

--
You are receiving this mail because:
You are the QA Contact for the bug.
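The request above can be illustrated with a small sketch. It is not rsync code and does not reproduce rsync's actual fuzzy matching; it only shows the idea of letting each newly received file join the pool of basis candidates. The function names (`pick_fuzzy_basis`, `plan_transfer`), the `min_ratio` threshold and the use of `difflib.SequenceMatcher` for name similarity are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def pick_fuzzy_basis(name, candidates, min_ratio=0.6):
    """Return the candidate whose name is most similar to `name`, or None.

    Name similarity via SequenceMatcher is a stand-in chosen for this
    sketch, not the scoring rsync really uses.
    """
    best, best_ratio = None, min_ratio
    for cand in candidates:
        ratio = SequenceMatcher(None, name, cand).ratio()
        if ratio > best_ratio:
            best, best_ratio = cand, ratio
    return best


def plan_transfer(incoming, existing):
    """Map each incoming file to a basis file (or None for a full upload).

    `existing` is what the receiver already has; every file received
    during this run is added to the candidate pool for the files after it.
    """
    pool = list(existing)
    plan = {}
    for name in incoming:
        plan[name] = pick_fuzzy_basis(name, pool)
        pool.append(name)  # the file just received becomes a candidate
    return plan


if __name__ == "__main__":
    files = ["backup1.iso", "backup2.iso", "backup3.iso"]
    for name, basis in plan_transfer(files, existing=[]).items():
        print(name, "<-", basis)
```

With an empty destination, only the first file has no basis; each later file can be matched against one that arrived earlier in the same run.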
samba-bugs at samba.org
2023-Oct-17 16:21 UTC
[Bug 12530] [REQ] Improve fuzzy using files being uploaded
https://bugzilla.samba.org/show_bug.cgi?id=12530

Ulrich Sibilller <ulrich.sibiller at atos.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ulrich.sibiller at atos.net

--- Comment #1 from Ulrich Sibilller <ulrich.sibiller at atos.net> ---
I would go one step further than this: rsync should not only look for a file to reference in fuzzy mode but also take into account what it transferred previously. So instead of throwing away the information it gathered for the first file once it is done, it could keep the transfer information and reuse it. It would then
a) automatically fulfill your request by already having the information for the first iso
b) not rely on similarity by size and/or name only, but on the data itself!

Of course this would increase memory usage, but whether that is worth it is something the user can decide.
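As a rough sketch of what comment #1 suggests, the following keeps block digests from earlier files in the same run and counts how much of each later file could be reused. It is heavily simplified and is not how rsync works internally: blocks are fixed-size and aligned, there is no rolling checksum, and the `BlockCache` class and its `transfer` method are hypothetical names invented for this example.

```python
import hashlib


class BlockCache:
    """Remember digests of blocks already transferred in this run."""

    def __init__(self, block_size=4):
        # A tiny block size keeps the example readable; real block
        # sizes would be far larger.
        self.block_size = block_size
        self.seen = {}  # digest -> (file name, offset) of first occurrence

    def transfer(self, name, data):
        """Return (literal_bytes, reused_bytes) for `data`.

        Blocks whose digest is already cached count as reused; new blocks
        count as literal data and are added to the cache.
        """
        literal = reused = 0
        for off in range(0, len(data), self.block_size):
            block = data[off:off + self.block_size]
            digest = hashlib.sha256(block).digest()
            if digest in self.seen:
                reused += len(block)
            else:
                literal += len(block)
                self.seen[digest] = (name, off)
        return literal, reused


if __name__ == "__main__":
    cache = BlockCache(block_size=4)
    print(cache.transfer("backup1.iso", b"AAAABBBBCCCC"))
    print(cache.transfer("backup2.iso", b"AAAAXXXXCCCC"))
```

The `seen` dictionary grows with every distinct block transferred, which is the memory cost the comment mentions.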