Matt McCutchen
2007-Jun-24 17:03 UTC
Factor out .rsyncsums logic into a separate checksum-caching library?
Wayne,

I notice that .rsyncsums is starting to look a lot like the index that the git version control system ( http://git.or.cz/ ) uses to determine whether a file has changed since it was last staged for committing. The git index has been heavily used and tested, so you might find it helpful when implementing a checksum cache for rsync. Specifically, it has protection against being fooled when a file's checksum is cached and the file is modified again in the same second; .rsyncsums could use this.

I think it would be even better to factor out logic for caching file checksums into a separate library used by both rsync and git. This would have two advantages: the subtleties of implementing a 100% correct cache only have to be addressed once, and different programs can make use of each other's cached checksums. GNU make and Beagle desktop search might also use the library.

Matt
Wayne Davison
2007-Jun-30 19:21 UTC
Factor out .rsyncsums logic into a separate checksum-caching library?
On Sun, Jun 24, 2007 at 01:03:03PM -0400, Matt McCutchen wrote:
> The git index has been heavily used and tested, so you might find it
> helpful when implementing a checksum cache for rsync.

The problem with this is that the git cache is SHA1, and rsync needs both MD4 and MD5, depending on what protocol version is in effect. It should be possible to adapt their code for rsync's purpose, but it's probably overkill. The idea behind the new checksum patch is mainly to allow servers to provide cached checksums for their files, especially servers whose content is slow to change.

> Specifically, it has protection against being fooled when a file's
> checksum is cached and the file is modified again in the same second;
> .rsyncsums could use this.

I tried to find a description for this algorithm, but didn't see it mentioned in any of the web searches I made. Is the algorithm described anywhere? Or is my only choice to dig into the source and try to find it?

..wayne..
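For readers following the thread, here is a rough sketch of the kind of same-second protection Matt describes. It is not code from rsync or git. It only illustrates the general rule, as commonly described for git's index, that a cached entry whose mtime is not strictly older than the time the cache itself was written cannot be trusted and must be rehashed. The names ChecksumCache, mark_saved and hash_count are hypothetical, timestamps are assumed to have one-second granularity, and the digest algorithm is a parameter because, as Wayne notes, rsync and git need different ones.

```python
import hashlib
import os
import time
from dataclasses import dataclass


@dataclass
class CacheEntry:
    size: int      # file size when the digest was computed
    mtime: int     # file mtime in whole seconds when the digest was computed
    digest: str    # hex digest of the file contents


class ChecksumCache:
    """Hypothetical checksum cache with same-second ("racy") protection."""

    def __init__(self, algorithm="md5"):
        self.algorithm = algorithm   # any name accepted by hashlib.new()
        self.entries = {}            # path -> CacheEntry
        self.saved_at = 0            # time (seconds) the cache was last written
        self.hash_count = 0          # number of full-file hashes performed

    def _hash_file(self, path):
        self.hash_count += 1
        h = hashlib.new(self.algorithm)
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(65536), b""):
                h.update(chunk)
        return h.hexdigest()

    def checksum(self, path):
        st = os.stat(path)
        mtime = int(st.st_mtime)
        entry = self.entries.get(path)
        # Trust the cached digest only if size and mtime match AND the file
        # was last modified strictly before the cache was written. If the
        # mtime is the same second as the save (or later), the file could
        # have been changed again without its timestamp moving.
        if (entry is not None
                and entry.size == st.st_size
                and entry.mtime == mtime
                and entry.mtime < self.saved_at):
            return entry.digest
        digest = self._hash_file(path)
        self.entries[path] = CacheEntry(st.st_size, mtime, digest)
        return digest

    def mark_saved(self, now=None):
        # Call when the cache is written out; a real tool would also
        # serialize self.entries here.
        self.saved_at = int(time.time()) if now is None else int(now)
```

The cost of this rule is that files modified in the same second the cache is saved get rehashed once more on the next run; after a later save their mtime is strictly older than the cache and the cached value is used again.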