I've been trying to figure out why some large files are taking a long time to rsync (an 80GB file). With this file, the match process is taking days. I've set logging to verbose level 4. The output from match.c is at the point where it is writing out the "potential match at" message. In a 9 hour period the match output has moved from:

potential match at 14993337175 i=2976 sum=7c07ae74
potential match at 14993834514 i=3517 sum=0956772e
potential match at 14994673480 i=3232 sum=9be33b55
potential match at 14994912897 i=4739 sum=7b87587a
potential match at 14996877980 i=1453 sum=b7715246
potential match at 14999624225 i=906 sum=d9d831c6
potential match at 14999951039 i=2235 sum=6ca97091
potential match at 15001174331 i=3866 sum=12f966ee
potential match at 15001209073 i=2080 sum=783c7750
potential match at 15001399336 i=4522 sum=87f122e0
potential match at 15001543265 i=1360 sum=85dee02c
potential match at 15001770789 i=1637 sum=c55912e6
potential match at 15002913113 i=2783 sum=3fdbf408
potential match at 15004011466 i=3552 sum=ea7d0f44
potential match at 15005784863 i=2758 sum=cf9e00d6

to:

potential match at 19827231165 i=3880 sum=f0b58ab2
potential match at 19827785238 i=4099 sum=f3338531
potential match at 19827870435 i=1232 sum=6abf175c
potential match at 19829135485 i=4472 sum=1ed3674e
potential match at 19829758278 i=2705 sum=dc796cb7
potential match at 19830224336 i=2959 sum=f0bd8161
potential match at 19830896106 i=3185 sum=6f83947a
potential match at 19832087866 i=1306 sum=14b38acb
potential match at 19832536037 i=1411 sum=3de116db
potential match at 19833817328 i=102 sum=45a8d003
potential match at 19835208508 i=2706 sum=e326d8e4
potential match at 19836927143 i=1591 sum=e357d821
potential match at 19838869812 i=4324 sum=1b113e13
potential match at 19839194857 i=3894 sum=03e116c1
potential match at 19839789868 i=3285 sum=39139716

I believe this means that 4.8GB of the file has been processed in this 9 hour period? The block size is currently set manually to 1149728, 4 times the default value. Any idea why it would be taking so long to get through this portion of the sync process? Rsync version is 3.0.3 on both ends.

Rob
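
P.S. The 4.8GB figure is just arithmetic on the byte offsets in the log lines above, taking the last line of each batch; a rough sketch of the calculation (assuming those offsets are byte positions within the file being matched):

    # Back-of-the-envelope rate estimate from the last "potential match at"
    # offset in each batch, roughly 9 hours apart.
    start = 15005784863              # last offset in the first batch
    end = 19839789868                # last offset in the second batch
    seconds = 9 * 3600

    processed = end - start          # ~4.83 billion bytes
    print("%.2f GB processed" % (processed / 1e9))
    print("%.0f KB/s match rate" % (processed / seconds / 1024))

At that rate the remaining ~60GB of the file would need another four to five days, which is in line with what I'm seeing.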
The files are very similar, a maximum of about 5GB of data differences over 80GB. The CPU usage on both sides is low (3-5 percent) and the memory usage is low (11MB on the client, not sure on the server). The full rsync options are:

-ruvvvvityz --partial --partial-dir=.rsync-partial --links --ignore-case --preallocate --ignore-errors --stats --del --block-size=1149728 -I

I'm using the -I option to force a full sync, since date/time changes on database files are not a reliable indicator of changes. I'll try a block size of 1638400, although I have not seen a big change in moving it from about 287000 (the square-root default) to 1149728.

Rob
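
P.S. For reference, the block sizes I'm comparing and the number of block checksums each implies for this file (just a sketch; the file size is approximate and rsync's own rounding of the default will land slightly differently):

    import math

    filesize = 80 * 1024**3                  # ~80 GiB, approximate
    default_bs = int(math.sqrt(filesize))    # the default is roughly the square
                                             # root, about 287000 in my case
    for bs in (default_bs, 1149728, 1638400):
        blocks = filesize // bs + 1
        print("block size %8d -> %6d block checksums" % (bs, blocks))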
Rob Bosch wrote:
> I've been trying to figure out why some large files are taking a long
> time to rsync (an 80GB file). With this file, the match process is
> taking days. I've set logging to verbose level 4. The output from
> match.c is at the point where it is writing out the "potential match
> at" message. In a 9 hour period the match output has moved from:

Can you tell where the bottleneck is? Is it on the sender's CPU? The receiver's? The network? Local IO on either side?

> I believe this means that 4.8GB of the file has been processed in this
> 9 hour period? The block size is currently set manually to 1149728, 4
> times the default value.

Rsync does have some CPU-inefficient behavior for especially large files. However, it should not show up at the block size you are using (assuming the files are fairly identical). Try increasing it a little further, to 1638400 (80% utilization on the hash table), and see if things are any better. Are the files fairly identical?

Shachar
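
P.S. The 80% figure is just the block count relative to the size of the hash table the sender builds over the block checksums; roughly this, assuming a 16-bit (65536-slot) table:

    filesize = 80 * 1024**3      # ~80 GiB, approximate
    table_slots = 1 << 16        # assumed 65536-slot hash table on the sender
    for bs in (1149728, 1638400):
        blocks = filesize // bs + 1
        print("block size %7d -> %5d blocks, %3.0f%% hash load"
              % (bs, blocks, 100.0 * blocks / table_slots))

1149728 gives slightly more blocks than slots; 1638400 brings the load down to about 80%.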
I believe I've figured out why the process was taking so long... or at least I have a theory. In the end it appears that much of the data was being sent even though the "true" amount of data change was less than 7% of the file size.

Exchange uses a database page size of 4K. Many times a page is deleted and then new data is written to that page (delete a message, a new message arrives). Exchange will try to keep the data file size constant by reusing freed-up space, and it will do online "defragmentation" nightly by default. Defragmentation might be the wrong term, because online defragmentation really "makes additional database space available by detecting and removing database objects that are no longer being used."

Although only 7% of the file is changing, the overall number of changed data pages would approach 1.5 million, and in all likelihood these pages are spread throughout the file. So if the usual approach of making the block size larger is used, rsync actually performs worse, because a change in a single 4K data page (a likely occurrence) causes the entire block containing it to be sent. This is what I was seeing in the earlier tests: increasing the block size decreased performance by sending more data.

When I changed the block size to be close to the default of sqrt(filesize), but rounded down to a multiple of 4K, rsync performance is much better. The 4K-rounded block size performs better than the default (in this case, 262144). I'm continuing to test to find the "best" block size for these types of files. I'm just sending this info for future reference for those using rsync for large Exchange files or other database files.

Rob
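
P.S. For anyone who wants the arithmetic, this is roughly how I'm choosing the block size now (a sketch; the file size is approximate, 4096 is the Exchange page size, and the rounding is simply floor to a 4K multiple):

    import math

    filesize = 80 * 1024**3            # ~80 GiB Exchange database, approximate
    page = 4096                        # Exchange database page size

    # Start near the square-root default, then round down so each rsync
    # block covers a whole number of 4K database pages.
    sqrt_bs = int(math.sqrt(filesize))
    block_size = (sqrt_bs // page) * page

    print(sqrt_bs, "->", block_size,
          "(%d pages per block)" % (block_size // page))

My theory on why the alignment helps: when the block size is a multiple of 4K and the pages change in place, a rewritten page falls inside a single block, whereas with an unaligned block size one changed page can straddle two blocks and force both to be resent.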