Hi,
I have a query regarding rsync's rolling checksum algorithm:
Suppose fileA is a 100 MB database file on my local machine. I back it up
to the server for the first time (a full backup) using rsync, with a
block_size of 30 KB and the --compress option so data is compressed as it
is transferred.
Later, I append another 100 MB of new content to fileA (assuming my
database appends new data at the end of the physical file).
I then run rsync again to back up fileA to the server. rsync performs an
incremental transfer of fileA using its rolling checksum and, whenever a
match is found, verifies it with the stronger checksum.
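
As I understand it, the weak checksum is designed so that sliding the
window forward by one byte costs only a couple of additions. A rough
sketch of that idea (not the actual checksum.c code; weak_init, weak_roll
and weak_digest are names I made up for illustration):

    #include <stddef.h>
    #include <stdint.h>

    /* Running sums over an initial window of "len" bytes. */
    static void weak_init(const unsigned char *buf, size_t len,
                          uint32_t *s1, uint32_t *s2)
    {
        uint32_t a = 0, b = 0;
        for (size_t i = 0; i < len; i++) {
            a += buf[i];
            b += (uint32_t)(len - i) * buf[i];
        }
        *s1 = a & 0xffff;
        *s2 = b & 0xffff;
    }

    /* Slide the window one byte: drop "out", add "in".  This is O(1),
     * which is what keeps per-byte rolling cheap. */
    static void weak_roll(uint32_t *s1, uint32_t *s2, size_t len,
                          unsigned char out, unsigned char in)
    {
        *s1 = (*s1 - out + in) & 0xffff;
        *s2 = (*s2 - (uint32_t)(len * out) + *s1) & 0xffff;
    }

    /* Combined 32-bit weak checksum, used as the hash-table key. */
    static uint32_t weak_digest(uint32_t s1, uint32_t s2)
    {
        return (s2 << 16) | s1;
    }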
My query: during the rolling checksum pass, the initial 100 MB is found to
match, so those blocks are not actually transferred over the network.
However, when rsync reaches the newly appended 100 MB towards the end of
the physical fileA, it starts rolling a byte at a time to see whether it
can find a matching block in the hash table, repeating the rolling
checksum step at each offset. Since all of that content is new, it never
finds a match, and the literal data is transmitted over the network.
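
My mental model of that matching loop is roughly the following, heavily
simplified: a linear search stands in for the real 16-bit hash table in
match.c, the strong checksum (MD4 in rsync) is only hinted at in a
comment, and find_matches/block_sum are names I invented. It assumes the
weak_init/weak_roll/weak_digest helpers sketched above:

    #include <stdio.h>

    struct block_sum {
        uint32_t weak;            /* weak checksum of one receiver block */
        unsigned char strong[16]; /* strong checksum (MD4 in rsync)      */
    };

    static void find_matches(const unsigned char *buf, size_t len,
                             const struct block_sum *sums, size_t nsums,
                             size_t blen)
    {
        size_t off = 0;
        uint32_t s1, s2;

        if (len < blen)
            return;                     /* whole file is literal data */

        weak_init(buf, blen, &s1, &s2);

        while (off + blen <= len) {
            uint32_t w = weak_digest(s1, s2);
            int matched = 0;

            /* Look for a receiver block with the same weak sum; real
             * rsync then confirms it with the strong checksum. */
            for (size_t i = 0; i < nsums; i++) {
                if (sums[i].weak == w /* && strong checksums match */) {
                    printf("match at offset %zu -> block %zu\n", off, i);
                    off += blen;        /* jump ahead by a whole block */
                    if (off + blen <= len)
                        weak_init(buf + off, blen, &s1, &s2);
                    matched = 1;
                    break;
                }
            }

            if (!matched) {
                if (off + blen == len)
                    break;              /* short tail: rest is literal */
                /* No match: buf[off] goes out as literal data and the
                 * window rolls forward one byte in O(1). */
                weak_roll(&s1, &s2, blen, buf[off], buf[off + blen]);
                off++;
            }
        }
    }

In the appended-100 MB case, once the last matching block has been
consumed, a loop like this would take the no-match branch for every
remaining byte of the file.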
Does this per-byte rolling (adding one byte, removing another at each
step) slow rsync down and introduce latency in incremental backups? If
not, how is this case handled in match.c or the other associated files?
Should block_size be adjusted for different file sizes to optimize this
case?
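
For context, my understanding (possibly wrong) is that when --block-size
is not given, recent rsync versions already scale the default block
length roughly with the square root of the file size, somewhere in
generator.c. A simplified sketch of that kind of heuristic, with made-up
bounds rather than rsync's exact constants:

    #include <stdint.h>

    #define MIN_BLOCK_LEN   700            /* illustrative lower bound */
    #define MAX_BLOCK_LEN   (128 * 1024)   /* illustrative upper bound */

    /* Pick a block length that grows roughly as sqrt(file_len), so the
     * number of blocks and the size of each block stay balanced. */
    static uint32_t pick_block_len(int64_t file_len)
    {
        int64_t blen;

        if (file_len <= (int64_t)MIN_BLOCK_LEN * MIN_BLOCK_LEN)
            return MIN_BLOCK_LEN;

        /* crude integer square root, rounded up in steps of 8 */
        for (blen = MIN_BLOCK_LEN; blen * blen < file_len; blen += 8)
            ;

        return (uint32_t)(blen > MAX_BLOCK_LEN ? MAX_BLOCK_LEN : blen);
    }

The tradeoff, as far as I can tell, is that a larger block means fewer
checksums to generate, send and look up, but coarser-grained matching
when only small parts of the file change.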
Any help is appreciated.
Thanks in anticipation,
Regards,
Naveen