Hello everyone,

I have a question regarding syncing of large files. We have a setup where we need to sync a 500GB file with changes in random blocks. The changes are made in such a way that moving blocks is not probable (actually it will never happen).

As I understand it, the way rsync works is:

- remote server calculates all checksum blocks and transmits these to the client
- local client calculates all checksum blocks
- local client compares local blocks to remote blocks, checking if a block could have moved
- changes are synced

Is it possible to have rsync calculate one block, transmit its checksum, let the local side decide whether to transfer or not, and then move on to the next block? The way it works now it takes a lot of time before anything is actually transferred, and it uses a LOAD of memory (around 500MB). We're using 64K blocks.

If we do the rsync with the --whole-file option it takes 17 hours to complete the sync. Without --whole-file (checksumming enabled) it takes 5 days! (We've got lots of bandwidth at the moment, btw.)

Thanks!

-- 
Best regards,
Lars Karlslund
Pharma Nord IT department
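For a rough sense of the numbers involved here - this is back-of-the-envelope arithmetic, and the per-block metadata size below is an assumption, not a figure taken from rsync's source:

    /* Back-of-the-envelope numbers for the case above (assumptions: 500 GiB
     * file, 64 KiB blocks, and roughly 28 bytes of per-block checksum
     * metadata, which is an estimate rather than rsync's actual figure). */
    #include <stdio.h>

    int main(void)
    {
        unsigned long long file_size  = 500ULL << 30;   /* 500 GiB             */
        unsigned long long block_size = 64ULL  << 10;   /* 64 KiB              */
        unsigned long long nblocks    = file_size / block_size;
        unsigned long long per_block  = 28;             /* assumed bytes/block */

        printf("blocks:          %llu\n", nblocks);               /* 8,192,000 */
        printf("checksum data: ~%llu MiB\n",
               (nblocks * per_block) >> 20);                      /* ~218 MiB  */
        return 0;
    }

That puts the raw checksum data alone in the low hundreds of megabytes before any per-entry overhead, which is in the same ballpark as the ~500MB observed.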
On Thu, Feb 24, 2005 at 02:01:59PM +0100, Lars Karlslund wrote:
> As I understand it, the way rsync works is:
> - remote server calculates all checksum blocks and transmits these to client
> - local client calculates all checksum blocks
> - local client compares local blocks to remote blocks, checking if a block
> could have moved
> - changes are synced

Not quite. There is no need to pre-calculate all the checksums on the sending side, though the sending side does put all the checksum data it receives into memory in order to be able to look it up randomly as it reads through the file and sends data.

It would certainly be possible to change the algorithm to not cache that data (and thus only allow the current block to be compared), but I don't think that idea has general enough interest for me to work on it for inclusion in rsync. You might want to look into coding it up yourself.

However, you should be sure to have measured what is causing the slowdown first, to know how much that would help. If it is not memory that is swapping on the sender, it may be that computing the checksums is maxing out your CPU, and removing the caching of the remote checksums won't buy you as much as you think.

You could use some of the librsync tools (e.g. rdiff) to measure how long various actions take on each system (i.e. try running rdiff on each system, outputting to /dev/null, to see how long computing the checksums takes).

..wayne..
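Another rough way to measure the checksum cost in isolation is a small self-contained program like the one below; its checksum is a Fletcher/Adler-style sum in the spirit of rsync's weak checksum, not rsync's actual code, so treat any timing as a ballpark only:

    /* Reads a file in 64 KiB blocks and computes a simple two-part weak
     * checksum per block, to get a feel for the cost of the checksum pass. */
    #include <stdio.h>
    #include <stdint.h>

    #define BLOCK_LEN (64 * 1024)

    static uint32_t weak_sum(const unsigned char *buf, size_t len)
    {
        uint32_t s1 = 0, s2 = 0;

        for (size_t i = 0; i < len; i++) {
            s1 += buf[i];
            s2 += s1;               /* position-weighted part */
        }
        return (s1 & 0xffff) | (s2 << 16);
    }

    int main(int argc, char **argv)
    {
        if (argc != 2) {
            fprintf(stderr, "usage: %s FILE\n", argv[0]);
            return 1;
        }

        FILE *f = fopen(argv[1], "rb");
        if (!f) {
            perror("fopen");
            return 1;
        }

        static unsigned char buf[BLOCK_LEN];
        size_t n;
        uint32_t acc = 0;
        unsigned long long blocks = 0;

        while ((n = fread(buf, 1, BLOCK_LEN, f)) > 0) {
            acc ^= weak_sum(buf, n);   /* keep the result live so the loop
                                          isn't optimized away */
            blocks++;
        }
        fclose(f);

        printf("%llu blocks, combined sum %08x\n", blocks, (unsigned)acc);
        return 0;
    }

Running it under time(1) on both machines against the same file shows whether the checksum pass alone already accounts for a large slice of the five days.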
Kevin Day wrote:
> I would *strongly* recommend that you dig into the thesis a bit (just
> the section that describes the rsync algorithm itself).

I tried a few weeks ago. I started to print it, and my printer ran out of ink :-). I will read it electronically eventually (I hope).

> Now, if you have huge files, then the 16 bit checksum may not be
> sufficient to keep the hit rate down. At that point, you may want to
> consider a few algorithmic alternatives:
>
> 1. Break up your file into logical blocks and rsync each block
> individually. If the file is an "append only" file, then this is
> fine. However, if the contents of the file get re-ordered across
> block boundaries, then the efficiency of the rsync algorithm would be
> seriously degraded.
>
> 2. Use a larger hash table. Instead of 16 bits, expand it to 20 bits
> - it will require 16 times as much memory for the hash table, but that
> may not be an issue for you - you are probably working with some
> relatively beefy hardware anyway, so what the heck.

Now, here's the part where I don't get it. We have X blocks checksummed, covering Y bytes each (actually, with the square-root rule we have X blocks covering X bytes each, but that's not important). This means we know, before we even receive the list of checksums, how many there will be.

As for the hash table size - that's standard engineering. Alpha is the load factor: the ratio of the number of entries stored to the number of buckets in the table. Around 0.8 is considered a good value.

What I propose is to make the hash table size a function of X. Take Lars' case: his 500GB file in 64KB blocks is about 8 million blocks, so he ideally needs a hash table with roughly ten million buckets to get reasonable performance. We only have 65 thousand. His alpha is about 125. No wonder he is getting abysmal performance.

On the other hand, if I'm syncing a 100K file, I'm only going to have 320 blocks. A good hash table size for me would be about 400 buckets. Having 65536 buckets instead means I'm less likely to get memory cache hits, and performance suffers again. My alpha value is about 0.005 (instead of 0.8).

If my proposal is accepted, the CPU-memory trade-off becomes adaptive.

> I'll leave the exercise of converting the full rsync 32 bit rolling
> checksum into a 20 bit value to you.

A simple modulo ought to be good enough. If the checksum is 42891 and I have 320 buckets, it goes into bucket 11. Assuming the checksums are sufficiently random-like, this is good enough.

> Cheers,
>
> Kevin

Shachar

-- 
Shachar Shemesh
Lingnu Open Source Consulting ltd.
Have you backed up today's work? http://www.lingnu.com/backup.html
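The load-factor arithmetic for the two cases above, assuming the fixed 65,536-bucket table, works out roughly like this:

    /* Rough load-factor arithmetic for the two cases discussed above
     * (500 GiB with 64 KiB blocks vs. a 100 KB file with 320-byte blocks),
     * assuming a fixed 65,536-bucket table. */
    #include <stdio.h>

    static void report(const char *what, double nblocks, double nbuckets)
    {
        printf("%s: %.0f blocks into %.0f buckets -> %.3f entries per bucket\n",
               what, nblocks, nbuckets, nblocks / nbuckets);
    }

    int main(void)
    {
        report("500 GiB / 64 KiB blocks",
               500.0 * 1024 * 1024 * 1024 / 65536, 65536);
        /* -> 8,192,000 blocks, about 125 entries per bucket */

        report("100 KB / 320-byte blocks", 320, 65536);
        /* -> 320 blocks, about 0.005 entries per bucket */
        return 0;
    }

About 125 entries per bucket in the one case, and roughly one entry per 200 buckets in the other - both far from the ~0.8 sweet spot.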
Kevin Day wrote:
> Shachar-
>
> True enough - with one additional thought - if the block size is set
> to be the square root of the file size, then the load factor on the
> hash table becomes dynamic in and of itself (bigger block size = less
> master table entries = fewer hash collisions).

And I'm suggesting making it static, by adjusting the hash table's size according to the number of blocks. Just do "hashtablesize = (numblocks/8+1)*10;" and you should be set.

> In the case of relatively low bandwidth connections, you will get MUCH
> better performance improvement by messing with the block size than the
> size of hash table, because the hash table isn't sent over the wire -
> the block table IS sent over the wire, so reducing its size can have
> a big impact on performance if your file isn't changing much.

True, but irrelevant. The hash table performance does not come at the cost of extra bandwidth, so I see no reason not to optimize both.

> In Andrew's original thesis, he looked at several very large...
> The problem you face, of course, is...

Kevin, I think you are confusing a couple of things:

1. It's not me with the big files. It's Lars. I can't run tests on files I don't have. I am merely trying to figure out what stopped Lars, so that rsync can be made better.

2. The size of each block has nothing to do with the question of hash table size. Once you've chosen the number of blocks your file will have, in whatever way you did, there is an unrelated question of what hash table size you should use. Using a 65536-bucket hash table on a 500GB file divided into 64KB blocks (as Lars is doing) means you have, on average, 125 entries per bucket. Regardless of whether using this block size is smart or not, rsync could handle it better. That's what I'm talking about.

> Trying to increase the size of the hash table may just not be worth it
> - are you certain that the performance hit you are experiencing

I'm not. Lars is.

> is caused by processing on the recipient side, and not data transfer
> of the block table? In my testing (which is actually with my own
> implementation of the algorithm, so I may have optimizations/ or lack
> thereof compared to the rsync you are running), the block table
> transfer is the biggest cause of elapsed time for big files that don't
> change much.

Don't forget that Lars manages to transfer the whole file in 17 hours. I doubt transferring some information about each block takes longer than transferring the 64K block itself (as in Lars' case).

> It may be that the square root of file size for block size isn't
> appropriate for files as big as you are working with...

It certainly is appropriate for files as big as I work with, as I don't work with such large files - Lars does. And while we're at it, he is not using the square root rule; he is using 64KB blocks.

Kevin, I'm trying to make rsync better. Lars' problem is an opportunity to find a potential bottleneck. Trying to solve his use of possibly non-optimal values won't help rsync, though it will help him. Let's keep this part of the thread on its subject - is the hash table size optimal? I don't see how modifying the block sizes changes that question.

Shachar

-- 
Shachar Shemesh
Lingnu Open Source Consulting ltd.
Have you backed up today's work? http://www.lingnu.com/backup.html
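Spelled out as code, the sizing rule proposed above is simply the following (this is the formula as stated here; it is amended later in the thread to force an odd size):

    /* The proposed adaptive hash table size: about 1.25 buckets per block,
     * i.e. a load factor of roughly 0.8 regardless of file size. */
    #include <stdio.h>

    static unsigned long hash_table_size(unsigned long numblocks)
    {
        return (numblocks / 8 + 1) * 10;
    }

    int main(void)
    {
        /* Lars' case: 8,192,000 blocks -> 10,240,010 buckets (load ~0.80) */
        printf("%lu\n", hash_table_size(8192000UL));

        /* Small-file case: 320 blocks -> 410 buckets (load ~0.78) */
        printf("%lu\n", hash_table_size(320UL));
        return 0;
    }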
Kevin Day wrote:
> As a quick FYI, the block size absolutely has an impact on the
> effectiveness of the checksum table - larger blocks means fewer
> blocks, which means fewer hash collisions.

Since you wouldn't expect that many blocks in the file, a 32-bit weak checksum would only produce about 4 or 5 real collisions. Mind you, this is me doing educated guesses; I haven't worked out the actual math yet. I don't think we need to worry about this particular problem just yet. Hash table collisions, however, are much more likely, and those are what I'm trying to solve here.

> That said, however, I completely agree that for very large files, the
> number of buckets in the current implementation is not optimal.
> Perhaps having a mode for really large files would be appropriate.

I don't see why such a mode would be necessary.

> One caution on increasing the size of the hash: The current
> implementation gives 16 bits of spread, so modding that value with the
> desired bucket count would work fine.

That's not what I read into it. It seems to me that the checksum function gives a 32-bit result, and we are squashing that into a 16-bit hash table. Can you point me to the code? Wayne?

> However, if you choose to start with the 32 bit rolling hash and mod
> that, you will have problems. The rolling checksum has two distinct
> parts, and modding will only pull info from the low order bits,

Why? This may be something I missed in the code.

> which will likely get you considerably less than even the 16 bits that
> the current implementation gives.

If the source is 16-bit, doing any hash table size bigger than 65536 buckets would make no sense, true. Is it 16-bit?

> I'd recommend using a real 32 bit hashing function to mix the two
> rolling checksum components,

What two parts? If I understand rsync correctly, we have a rolling checksum and a strong checksum. The rolling checksum is used to single out potential matches, and the strong checksum makes sure these are, indeed, real matches. We only need to put the first one into the hash table, as we never do mass lookups on the second. Am I missing something basic here?

Shachar

-- 
Shachar Shemesh
Lingnu Open Source Consulting ltd.
Have you backed up today's work? http://www.lingnu.com/backup.html
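To make the distinction concrete, here is a toy sketch of the two-level check being discussed: the weak (rolling) checksum picks a bucket and filters candidates, and only on a weak match is the strong checksum compared. The data and structures below are synthetic stand-ins, not rsync's own:

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>

    #define NBUCKETS 8      /* tiny on purpose, to force bucket collisions */
    #define NBLOCKS  4

    struct entry {
        uint32_t weak;          /* rolling checksum of the block      */
        char     strong[16];    /* strong checksum (fake value here)  */
        int      next;          /* next entry in the chain, -1 = end  */
    };

    int main(void)
    {
        struct entry blocks[NBLOCKS] = {
            { 0x00001234, "strong-A", -1 },
            { 0x00011234, "strong-B", -1 },  /* same bucket as A: table collision    */
            { 0x00001234, "strong-C", -1 },  /* same weak sum as A: checksum collision */
            { 0x0000beef, "strong-D", -1 },
        };
        int heads[NBUCKETS];
        memset(heads, -1, sizeof heads);

        /* Build the chained hash table over the weak checksums. */
        for (int i = 0; i < NBLOCKS; i++) {
            int b = blocks[i].weak % NBUCKETS;
            blocks[i].next = heads[b];
            heads[b] = i;
        }

        /* Look up one incoming window: weak sum first, strong sum only on a hit. */
        uint32_t weak = 0x00001234;
        const char *strong = "strong-C";

        for (int i = heads[weak % NBUCKETS]; i != -1; i = blocks[i].next) {
            if (blocks[i].weak != weak)
                continue;                            /* bucket collision: cheap skip */
            if (strcmp(blocks[i].strong, strong) == 0) {
                printf("matched block %d\n", i);     /* true match */
                break;
            }
            /* weak sums equal but strong sums differ: a real checksum collision */
        }
        return 0;
    }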
Kevin Day wrote:
> Shachar-
>
> I think Wayne is mostly pointing you to the correct location here. If
> you look at the code where the checksum is computed, you'll find that
> the rolling checksum actually consists of two parts - one is the sum
> of all of the byte values in the window, the other is the offset
> weighted sum of all of the byte values in the window. The first of
> these is then left shifted 16 bits and added to the other to come up
> with the official "32 bit" rolling checksum. This works fine as long
> as you aren't counting on a random distribution of bits among the 32 -
> if you mod the value, you are giving much greater importance to the
> lower XX bits, effectively dropping the distribution of the high order
> bits...
>
> Anyway, the two 16 bit values may be random enough that my concern is
> not founded, but it should be tested before assuming that the rolling
> checksum is really a 32 bit value that can easily be divided up into
> buckets.
>
> PS - none of the above has anything to do with the strong signature of
> the window - just the rolling check sum.
>
> Cheers!
>
> - K

Some modern algebra for you, then. We have two 16-bit numbers; one of them is always multiplied by 2^16 before the two are combined. We want both numbers to be able to affect the bucket into which the combined checksum falls. If we choose a hash table size that is co-prime to that multiplier, we achieve exactly that: multiplying by 2^16 just permutes the residues modulo the table size, so the shifted half still gets to influence the bucket choice.

Well, guess what? 2^16 factors into nothing but twos. In other words, any odd hash table size (i.e. one that does not have two among its prime factors) is co-prime to the shift factor, which guarantees that both halves of the checksum get an equal chance of affecting which bucket it actually falls into.

It therefore follows that I have to amend my previously proposed formula for choosing the hash table size. The new formula is:

(numblocks/8+1)*10+1

And you're done. Of course, this can also be written as:

(numblocks/8)*10+11

which is slightly more efficient.

Shachar

-- 
Shachar Shemesh
Lingnu Open Source Consulting ltd.
Have you backed up today's work? http://www.lingnu.com/backup.html
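A quick demonstration of why the odd size matters - the exact packing of the two halves in rsync may differ from the (s1 << 16) + s2 used below, but the argument is the same either way:

    #include <stdio.h>
    #include <stdint.h>

    static unsigned bucket(uint32_t combined, unsigned table_size)
    {
        return combined % table_size;
    }

    int main(void)
    {
        uint16_t s2        = 0x1234;   /* low half stays fixed              */
        unsigned even_size = 65536;    /* power-of-two table                */
        unsigned odd_size  = 65537;    /* odd table, co-prime with 2^16     */

        for (unsigned s1 = 1; s1 <= 3; s1++) {
            uint32_t sum = ((uint32_t)s1 << 16) + s2;
            printf("s1=%u: even-size bucket %u, odd-size bucket %u\n",
                   s1, bucket(sum, even_size), bucket(sum, odd_size));
        }
        /* With the 65536-bucket table every s1 lands in bucket 0x1234;
         * with 65537 buckets the bucket changes as soon as s1 changes. */
        return 0;
    }

With a power-of-two table the shifted half is thrown away entirely; with an odd table both halves move the result around, which is exactly the co-primality point above.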