thr3ads.net - rsync - stalling during delta processing [Jun 2004]

Wallace Matthews

2004-Jun-14 19:52 UTC

stalling during delta processing

I have a 29 Gig full backup, called Kbup_1, on a remote server (let's call it
fedor).
I have a 1.3 Gig incremental backup on my local server.

I have rsync 2.6.2 on both servers. Both are RedHat Linux 9.1 on i-686 hardware
platforms.

I issue the command "time rsync -avv --rsh=rsh --stats /test/Kibbutz/Kbup_1
fedor://test/Kibbutz". The sync takes ~5 minutes of real time and ~2
minutes of user time, and it finds 1087632 bytes of match data.

I copy the 29 Gig full backup back into fedor://test/Kibbutz and issue the
command "time rsync -avv --rsh=rsh --stats --block-size=90636
/test/Kibbutz/Kbup_1 fedor://test/Kibbutz". It takes ~5 minutes of
real time and ~2 minutes of user time, and it finds 90636 bytes of match data.
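
A quick arithmetic check on the two match figures reported so far, written as a small Python sketch. The helper name whole_blocks is made up for illustration. The block size rsync chose in the first run is not stated in the post, so dividing that total by 90636 is only an assumption.

```python
def whole_blocks(matched_bytes, block_size):
    """Return (full_blocks, leftover_bytes) for a total of matched bytes."""
    return divmod(matched_bytes, block_size)

# Match totals taken from the two runs described above.
first_run = whole_blocks(1087632, 90636)   # block size is an assumption here
second_run = whole_blocks(90636, 90636)    # --block-size=90636 was given

print("first run :", first_run)
print("second run:", second_run)
```

1087632 is exactly 12 x 90636, so the first total is consistent with twelve matched blocks of that size and the second with a single matched block. Whether the first run really used that block size is not confirmed by the post.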

I copy the 29 Gig full backup back into fedor://test/Kibbutz and issue the
command "time rsync -avv --rsh=rsh --stats --block-size=181272
/test/Kibbutz/Kbup_1 fedor://test/Kibbutz", and it CRAWLS during delta
generation/transmittal at about 1 Megabyte per second.

I have repeated the experiment 3 times; same result each time. 

The only thing that is different is the --block-size= option. The first time, it
isn't specified and I get a predictable answer. The second time, I give it a
block size that is about 1/2 of the square root of 29 Gig, and that is ok. But
when I explicitly give it something that is approximately the square root of
29 Gig, it CRAWLS.
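
To make the square-root comparison concrete, here is a minimal Python sketch. It assumes "29 Gig" means either 29 x 10^9 or 29 x 2^30 bytes, since the post does not say which. The helper name sqrt_block_size is hypothetical and is not an rsync function.

```python
import math

def sqrt_block_size(file_size_bytes):
    """Integer square root of a file size given in bytes."""
    return math.isqrt(file_size_bytes)

# Two readings of "29 Gig"; which one applies is an assumption.
sizes = {
    "29 * 10**9": 29 * 10**9,
    "29 * 2**30": 29 * 2**30,
}

# Block sizes passed via --block-size= in the runs described above.
tested = (90636, 181272)

for label, size in sizes.items():
    root = sqrt_block_size(size)
    print(f"{label}: isqrt = {root}")
    for block in tested:
        print(f"  block {block} is {block / root:.2f} x the square root")
```

Under either reading, 181272 comes out slightly above the square root and 90636 at roughly half of it, which matches the description in the post.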

When I cancel the command, the real time is 86 minutes and the user time is 84
minutes. This is similar to the issue I reported on Friday, where Chris
suggested I remove the --write-batch= option, and that seemed to fix the CRAWL.

Now, it appears that the --block-size= option is the culprit.

Can someone who knows rsync internals explain what is happening??

wally

Craig Barratt

2004-Jun-15 06:03 UTC

stalling during delta processing

"Wallace Matthews" writes:
> I copy the 29 Gig full backup back into fedor//test/Kibbutz and issue
> the command "time rsync -avv --rsh=rsh --stats --block-size=181272
> /test/Kibbutz/Kbup_1 fedor://test/Kibbutz" and it CRAWLS during delta
> generation/transmittal at about 1 Megabyte per second.
>
> I have repeated the experiment 3 times; same result each time.
>
> The only thing that is different is --block-size= option. First,
> time it isnt specified and I get a predictable answer. Second
> time, I give it a block size that is about 1/2 of square root of
> (29 Gig) and that is ok. But, explicitly give it something that
> is approximately the square root of the 29 Gig and it CRAWLS.
>
> When I cancel the command, the real time is 86 minutes and the
> user time is 84 minutes. This is similar to the issue I reported
> on Friday that Chris suggested I remove the --write-batch= option
> and that seemed to fix the CRAWL.

If I understand the code correctly, map_ptr() in fileio.c maintains
a sliding window of data in memory.  The window starts 64K prior
to the desired offset, and the window length is 256K.  So your
block-size of 181272 occupies most of the balance of the window.

Each time you hit the end of the window the data is memmoved
and the balance needed is read.  With such a large block size
there will be a lot of memmoves and small reads.
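
The effect described above can be sketched with a toy model in Python. This is not rsync's code. It only assumes what the description says (a 256K window that starts 64K before the requested offset), and it also assumes that requests of block_size bytes advance one byte at a time through the file. The names simulate, LOOKBEHIND and WINDOW are made up for this sketch.

```python
LOOKBEHIND = 64 * 1024   # assumed: window starts 64K before the offset
WINDOW = 256 * 1024      # assumed: default window length

def simulate(file_size, block_size):
    """Count window refills for block-sized requests sliding byte by byte."""
    win_start = win_end = 0          # current window is [win_start, win_end)
    refills = moved = read = 0
    offset = 0
    last_offset = file_size - block_size
    while offset <= last_offset:
        end = offset + block_size
        if offset >= win_start and end <= win_end:
            # Every request up to this offset fits; jump to the first miss.
            offset = win_end - block_size + 1
            continue
        new_start = max(0, offset - LOOKBEHIND)
        new_end = min(file_size, max(new_start + WINDOW, end))
        if win_start <= new_start < win_end <= new_end:
            kept = win_end - new_start   # bytes reused (the memmove)
        else:
            kept = 0                     # nothing reusable, read it all
        moved += kept
        read += (new_end - new_start) - kept
        refills += 1
        win_start, win_end = new_start, new_end
    return {"refills": refills, "bytes_moved": moved, "bytes_read": read}

file_size = 16 * 1024 * 1024   # small stand-in for the 29 Gig file
for block in (90636, 181272):
    stats = simulate(file_size, block)
    avg_read = stats["bytes_read"] // stats["refills"]
    print(block, stats, "average read per refill:", avg_read)
```

In this model the larger block size produces more refills, each moving more bytes and reading fewer new ones, while the total number of bytes read stays the same. It says nothing about whether that overhead is large enough to account for the slowdown reported.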

I doubt this issue explains the dramatic reduction in speed, but
it might be a factor.  Perhaps there is a bug with large block
sizes?

And, yes, your observation about the number of matching blocks
needs to be explored.

Craig
