I'm having a problem with large files being rsync'd twice because of the
checksum failing.  The rsync appears to complete on the first pass, but is
then done a second time (with the second try successful).  When I added some
debug code to receiver.c, I saw that the checksums for the remote file & the
temp file do not match on the first try, so (as expected) rsync repeats the
transfer & the checksums do match after the retry.  I have reproduced this
behavior with various files ranging from 900 MB to 2.7 GB (though smaller
files in the several-MB range work fine on the first pass).  I just need to
rsync a single large file for this scenario.

The client is rsync 2.5.5 on Solaris 9 & the server is rsync 2.5.5 on AIX
4.3.3.  I see the same results with a Solaris 9 rsync 2.5.2 client & a
Solaris 7 rsync 2.5.2 server, as well as a Solaris 9 rsync 2.5.2 client with
an AIX rsync 2.5.5 server.  I also see the same results with rsync in daemon
mode & over SSH.  I used gcc to compile rsync in all cases.

Is there a different checksum mechanism used on the second pass (e.g., a
different length)?  If so, perhaps there is an issue with large files and
the checksum used by default on the first pass?

Here is the verbose output:

======================
$ rsync -avvv --stats --progress rsyncuser@10.200.1.1::rsync/gb-testfile gb-testfile
opening tcp connection to 10.200.1.1 port 873
Password:
receiving file list ...
recv_file_name(gb-testfile)
received 1 names
1 file to consider
recv_file_list done
get_local_name count=1 gb-testfile
generator starting pid=2500 count=1
recv_generator(gb-testfile,0)
recv_files(1) starting
sending sums for 0
generate_files phase=1
recv_files(gb-testfile)
recv mapped gb-testfile of size 432402669
gb-testfile
   915211191 100%    3.61MB/s    0:04:01
got file_sum
renaming .gb-testfile.nDaO4e to gb-testfile
set modtime of gb-testfile to (1034093089) Tue Oct 8 09:04:49 2002
redoing gb-testfile(0)
recv_generator(gb-testfile,0)
recv_files phase=1
sending sums for 0
generate_files phase=2
recv_files(gb-testfile)
recv mapped gb-testfile of size 915211191
gb-testfile
   915211191 100%   10.23MB/s    0:01:25
got file_sum
renaming .gb-testfile.oDaO4e to gb-testfile
set modtime of gb-testfile to (1034093089) Tue Oct 8 09:04:49 2002
recv_files finished

Number of files: 1
Number of files transferred: 2
Total file size: 915211191 bytes
Total transferred file size: 1830422382 bytes
Literal data: 482837431 bytes
Matched data: 1347584951 bytes
File list size: 46
Total bytes written: 1275722
Total bytes read: 483225574

wrote 1275722 bytes  read 483225574 bytes  1154949.45 bytes/sec
total size is 915211191  speedup is 1.89
_exit_cleanup(code=0, file=main.c, line=925): about to call exit(0)
======================

Thanks.

--
Terry
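For anyone trying to follow what the two passes actually compare: as far as I
understand rsync 2.5.x (this is a conceptual sketch under that assumption,
not rsync source, and the names are illustrative), the answer to the question
above is yes -- the first pass matches blocks on a 32-bit rolling checksum
plus a 2-byte truncation of each block's MD4 sum, the whole-file MD4 sent at
the end catches any falsely matched block, and the "redoing" phase repeats
the transfer with full-length block checksums.

/* Conceptual sketch only -- not rsync source; names are illustrative. */
#include <stdint.h>
#include <string.h>

#define MD4_SUM_LENGTH 16

struct block_sig {
    uint32_t rolling;                   /* weak rolling checksum of the block */
    unsigned char md4[MD4_SUM_LENGTH];  /* strong MD4 checksum of the block   */
};

/* A first-pass match only has to collide in 32 + 8*csum_length = 48 bits;
 * a redo-pass comparison uses all 16 MD4 bytes, so a false match there is
 * implausible. */
int block_matches(const struct block_sig *a, const struct block_sig *b,
                  int csum_length)
{
    return a->rolling == b->rolling &&
           memcmp(a->md4, b->md4, (size_t)csum_length) == 0;
}

int main(void)
{
    struct block_sig a = { 0x12345678, {0} };
    struct block_sig b = a;
    b.md4[5] = 0xAB;    /* differs, but only beyond the first 2 MD4 bytes */

    /* "Matches" under the truncated comparison, correctly rejected by the
     * full-length one -- the distinction that separates the two passes. */
    return (block_matches(&a, &b, 2) && !block_matches(&a, &b, 16)) ? 0 : 1;
}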
> I'm having a problem with large files being rsync'd twice because of the
> checksum failing.

I think this was reported recently.

Please try using the "-c" option ("always checksum") and see if that makes
the problem go away.

This is a high priority bug for me (although I have not yet experienced it).

--Derek
> -----Original Message-----
> From: Derek Simkowiak [mailto:dereks@itsite.com]
> Sent: Friday, October 11, 2002 1:51 PM
> To: Terry Reed
> Cc: 'rsync@lists.samba.org'
> Subject: Re: Problem with checksum failing on large files
>
> > I'm having a problem with large files being rsync'd twice because of
> > the checksum failing.
>
> I think this was reported recently.
>
> Please try using the "-c" option ("always checksum") and see if that
> makes the problem go away.
>
> This is a high priority bug for me (although I have not yet experienced
> it).
>
> --Derek

Using -c helps for the smallest file (900 MB), but it has no effect on the
larger files (e.g., 2.7 GB).  Most of my files are between 1.5 GB & 3 GB.
Any other suggestions?

Thanks.

--
Terry
> -----Original Message-----
> From: Derek Simkowiak [mailto:dereks@itsite.com]
> Sent: Saturday, October 12, 2002 2:14 PM
> To: Craig Barratt
> Cc: Terry Reed; Donovan Baarda; 'rsync@lists.samba.org'
> Subject: Re: Problem with checksum failing on large files
>
> > My theory is that this is expected behavior given the checksum size.
>
> Craig,
> Excellent analysis!
>
> Assuming your hypothesis is correct, I like the adaptive checksum idea.
> But how much extra processor overhead is there with a larger checksum bit
> size?  Is it worth the extra code and testing to use an adaptive
> algorithm?
>
> I'd be more inclined to say "This ain't the 90's anymore", realize that
> overall file sizes have increased (MP3, MS-Office, CD-R .iso, and DV) and
> that people are moving from dialup to DSL/Cable, and then make either the
> default (a) initial checksum size, or (b) block size, a bit larger.
>
> Terry, can you try his test (and also the -c option) and post results?

I tried "--block-size=4096" & "-c --block-size=4096" on 2 files (2.35 GB &
2.71 GB) & still had the same problem: rsync still needed a second pass to
complete successfully.  These tests were between the Solaris client & the
AIX server (both running rsync 2.5.5).  As I mentioned in a previous note,
a 900 MB file worked fine with just "-c" (but required "-c" to work on the
first pass).

I'm willing to try the "fixed md4sum implementation"; what do I need for
this?

I cannot try these tests on a Win32 machine because Cygwin does not support
files > 2 GB & I could only find rsync as part of Cygwin.  I don't have the
time or the patience to try to get rsync to compile using MS VC++ :-)  Is
there a Win32 version of rsync with large-file support available?  I do not
have any Linux boxes available to test large files.

Thanks.

--
Terry
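For what it's worth, here is a back-of-the-envelope version of the
expected-behaviour argument, assuming the checksums act like uniform random
values.  The ~2.7 GB size is one of the files mentioned above; the
16384-byte block size is the b=16384 that shows up in the transfer log later
in the thread; treating every byte offset as searched is a deliberate
over-estimate.  This is a sketch of the model, not rsync source:

/* Back-of-the-envelope estimate, assuming uniformly distributed checksums.
 * File size (~2.7 GB) and block size (b=16384) come from this thread; the
 * offset count is an upper bound (as if every byte offset were searched). */
#include <stdio.h>
#include <math.h>

int main(void)
{
    double file_len  = 2.7e9;    /* roughly the size of the failing files    */
    double block_len = 16384.0;  /* "hash search b=16384" in the later log   */
    int    csum_len  = 2;        /* MD4 bytes kept per block on the 1st pass */

    double blocks   = file_len / block_len;      /* block checksums in the table */
    double offsets  = file_len;                  /* byte offsets tried, at most  */
    double sig_bits = 32.0 + 8.0 * csum_len;     /* rolling + truncated MD4      */
    double expected = offsets * blocks / pow(2.0, sig_bits);

    printf("expected false block matches : %.2f\n", expected);             /* ~1.6 */
    printf("P(>=1 on the first pass)     : %.2f\n", 1.0 - exp(-expected)); /* ~0.8 */
    return 0;
}

On those numbers a multi-gigabyte file is more likely than not to pick up at
least one false block match on the first pass.  Note also that shrinking the
block size to 4096 raises the block count (and hence this estimate) by a
factor of four, which would be consistent with --block-size=4096 not
helping; widening the truncated checksum is what shrinks the estimate.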
> Would you mind trying the following?  Build a new rsync (on both sides,
> of course) with the initial csum_length set to, say, 4 instead of 2?  You
> will need to change it in two places in checksum.c; an untested patch is
> below.  Note that this test version is not compatible with standard
> rsync, so be sure to remove the executables once you try them.
>
> Craig

I changed csum_length=2 to csum_length=4 in checksum.c & this time rsync
worked on the first pass for a 2.7 GB file.  I'm assuming that this change
forced rsync to use a longer checksum length on the first pass; what
checksum was actually used?

Here is the verbose output:

===================
opening connection using ssh 10.200.1.1 -l twr4321 /home/twr4321/rsync-src/rsync-2.5.5-mod/rsync --server --sender -vvvlogDtpr . /rsync.guest/SUBMcopy.txt.7
receiving file list ...
server_sender starting pid=67130
make_file(1,SUBMcopy.txt.7)
expand file_list to 4000 bytes, did move
recv_file_name(SUBMcopy.txt.7)
received 1 names
1 file to consider
recv_file_list done
get_local_name count=1 SUBMcopy.txt.7
recv_files(1) starting
generator starting pid=8128 count=1
recv_generator(SUBMcopy.txt.7,0)
send_file_list done
send_files starting
sending sums for 0
send_files(0,/rsync.guest/SUBMcopy.txt.7)
generate_files phase=1
send_files mapped /rsync.guest/SUBMcopy.txt.7 of size 2715101559
recv_files(SUBMcopy.txt.7)
recv mapped SUBMcopy.txt.7 of size 2710310258
SUBMcopy.txt.7
calling match_sums /rsync.guest/SUBMcopy.txt.7
built hash table
hash search b=16384 len=2715101559
<"match at" lines snipped>
  2715101559 100%    2.31MB/s    0:18:39
done hash search
sending file_sum
got file_sum
renaming .SUBMcopy.txt.7._iaq4p to SUBMcopy.txt.7
set modtime of SUBMcopy.txt.7 to (1032979931) Wed Sep 25 11:52:11 2002
false_alarms=188029 tag_hits=661854315 matches=121690
sender finished /rsync.guest/SUBMcopy.txt.7
recv_files phase=1
send_files phase=1
generate_files phase=2
send files finished
total: matches=121690  tag_hits=661854315  false_alarms=188029  data=721332599
recv_files finished

Number of files: 1
Number of files transferred: 1
Total file size: 2715101559 bytes
Total transferred file size: 2715101559 bytes
Literal data: 721332599 bytes
Matched data: 1993768960 bytes
File list size: 79
Total bytes written: 1323432
Total bytes read: 730229860

wrote 1323432 bytes  read 730229860 bytes  576253.90 bytes/sec
total size is 2715101559  speedup is 3.71
_exit_cleanup(code=255, file=main.c, line=925): about to call exit(255)
===================

Thanks.

--
Terry
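Re-running the same rough estimate with the patched value puts the "what
checksum was actually used" question in collision terms: assuming the same
mechanism as before, the first pass still uses the 32-bit rolling checksum
plus a truncated per-block MD4 sum, but the truncation is now 4 bytes
instead of 2, so a false match needs a 64-bit collision instead of a 48-bit
one.  Plugging the sizes from the log above into the same (assumed-uniform)
model:

/* Same back-of-the-envelope model as before, with csum_length = 4 and the
 * sizes taken from the log above (sender file 2715101559 bytes, receiver
 * basis file 2710310258 bytes, b=16384).  Assumes uniform checksums. */
#include <stdio.h>
#include <math.h>

int main(void)
{
    double sender_len = 2715101559.0;  /* file being sent                       */
    double basis_len  = 2710310258.0;  /* existing file whose blocks are summed */
    double block_len  = 16384.0;       /* b=16384                               */
    int    csum_len   = 4;             /* patched first-pass MD4 truncation     */

    double blocks   = basis_len / block_len;
    double offsets  = sender_len;                /* upper bound, as before */
    double sig_bits = 32.0 + 8.0 * csum_len;
    double expected = offsets * blocks / pow(2.0, sig_bits);

    printf("expected false block matches: %.1e\n", expected);  /* ~2e-05 */
    return 0;
}

On this model the patched build should essentially never need the redo pass
for files of this size, which matches the single-pass result above.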