samba-bugs at samba.org
2013-Dec-30 17:25 UTC
[Bug 10353] New: link-by-hash collision detection
https://bugzilla.samba.org/show_bug.cgi?id=10353
Summary: link-by-hash collision detection
Product: rsync
Version: 3.1.1
Platform: All
OS/Version: All
Status: NEW
Severity: normal
Priority: P5
Component: core
AssignedTo: wayned at samba.org
ReportedBy: jimklimov at gmail.com
QAContact: rsync-qa at samba.org
Link-by-hash should include a mode that verifies that the content of the
original file is indeed identical to the content of the file it would be
hardlinked to based on the hash value.
If the hash algorithm happens to be weak (allowing two files of the same size
with the same hash but different content, i.e. a hash collision), the
hash filenames should include a unique suffix (e.g. 123abcd.1024.0;1 and
123abcd.1024.0;2 to differentiate two files with different contents). If
such filename patterns exist, all copies should be considered for link-by-hash
deduplication.
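
The proposal above could be sketched roughly as follows. This is a minimal
Python illustration, not code from the rsync link-by-hash patch: the function
names, the `<hash>.<size>;<n>` layout of the hash directory (simplified from
the filenames quoted above), the temporary-file suffix, and the choice of MD5
are all assumptions made for the example.

```python
import filecmp
import hashlib
import os


def file_digest(path, algo="md5"):
    """Return the hex digest of a file, reading it in 1 MiB chunks."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def link_by_hash(src, hash_dir, digest=None):
    """Hardlink src to a byte-identical entry in hash_dir.

    Entries are named "<digest>.<size>;<n>" (hypothetical layout). Every
    existing suffix is compared against src before a new one is created.
    Returns the path of the entry that src ends up sharing an inode with.
    """
    digest = digest or file_digest(src)
    size = os.path.getsize(src)
    base = os.path.join(hash_dir, f"{digest}.{size}")
    n = 1
    while True:
        candidate = f"{base};{n}"
        if not os.path.exists(candidate):
            # No entry holds this content yet: src becomes a new entry.
            os.link(src, candidate)
            return candidate
        if os.path.samefile(src, candidate):
            # Already deduplicated against this entry.
            return candidate
        if filecmp.cmp(src, candidate, shallow=False):
            # Byte-for-byte identical: swap src for a hardlink to the entry.
            tmp = src + ".lbh-tmp"  # hypothetical temporary name
            os.link(candidate, tmp)
            os.replace(tmp, src)
            return candidate
        # Same hash and size but different content: try the next suffix.
        n += 1
```

Passing `digest` explicitly makes it possible to simulate a collision in a
test. Note that the full content comparison runs for every candidate with a
matching name, so each deduplicated file is read at least twice.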
--
Configure bugmail: https://bugzilla.samba.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug.
samba-bugs at samba.org
2014-Jan-19 22:43 UTC
[Bug 10353] link-by-hash collision detection
https://bugzilla.samba.org/show_bug.cgi?id=10353
Wayne Davison <wayned at samba.org> changed:
What       |Removed |Added
----------------------------------------------------------------------------
Status     |NEW     |RESOLVED
Resolution |        |WONTFIX
Severity   |normal  |enhancement
Severity|normal |enhancement
--- Comment #1 from Wayne Davison <wayned at samba.org> 2014-01-19 22:43:46 UTC ---
I added the size to the filename to avoid having to worry about this -- a hash
conflict with the same file size is very (very) unlikely, and forcing the code
to compare file contents before figuring out which hash conflict is the right
one is super slow.