Hi all,

I am trying to use rsync to synchronize very large files; some are close to 2 GB in size. I copied the files manually to a USB drive and shipped it to the location where the rsync server is running. At the remote location I copied the files to a directory that resides on the rsync server and is pointed to by the module section of rsyncd.conf.

The source files have since changed slightly, and I am now trying to sync them. However, it seems that rsync has to copy the entire file the first time in order to build the index (hash) it needs for more efficient transfers in the future.

How can I get around this? I am pretty sure my problem is the initial transfer of data. Once I have done a complete sync, rsync should in theory only transfer block-level changes based on checksums... right? :)

I look forward to your responses. Thanks in advance.
Eberhard Moenkeberg
2005-Jan-04 01:10 UTC
Transferring Large Files w/ Rsync - Initial Xfer
Hi,

On Mon, 3 Jan 2005, d c wrote:

> I am trying to use rsync to synchronize very large files- some are close to 2gb in size.
> I have copied the files manually to a usb drive and shipped it to the
> location where the rsync server is running.
> At the remote location I copied the files to a directory which resides
> on the rsync server and is being pointed to by the module section
> rsyncd.conf.
> The source files have slightly changed and I now am trying to sync
> them. However, it seems that rsync has to copy the entire file itself
> the first time in order to build the index (hash) needed in order to have
> more efficient transfers in the future.
> How can I get around this? I am pretty sure my problem is the initial
> transfer of data. Once I have done a complete sync then rsync should in
> theory only make block level changes based on checksum... right?? :)

Look at your -vv statistics (you should be running with at least -vv). Look at the time parameters.

Cheers -e
--
Eberhard Moenkeberg (emoenke@gwdg.de, em@kki.org)
On Mon, Jan 03, 2005 at 04:40:03PM -0800, d c wrote:

> The source files have slightly changed and I now am trying to sync
> them. However, it seems that rsync has to copy the entire file
> itself the first time in order to build the index (hash) needed in
> order to have more efficient transfers in the future.

The hash is determined at runtime; nothing is stored between runs. Rsync needs to read the whole file on the receiving side (to generate the hash), the sender then uses that hash data to send any unmatched data (by reading its version of the file), and then the receiving side creates a duplicate file based on its original one and the data from the sender. It then moves this new file over the old one.

So, it's hard to know what you're referring to. There are a few possibilities:

You may just be seeing the new temp file being created (which will always happen unless you use the --inplace option, which is new for 2.6.3, but is only efficient for things like log files that get data appended).

You might be seeing a new copy because you accidentally put the file into the wrong spot on the receiving side (try sending a small file to the same directory as the big file to test this).

Or, the file may be compressed, in which case a small change to the uncompressed data results in a massive change in the compressed data (there is a patch available for gzip that makes its compressed files more rsync friendly, if you need it).

..wayne..
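
For anyone following the thread, the runtime hashing described above can be illustrated with a rough Python sketch. This is not rsync's actual code: it drops the rolling weak checksum and simply re-hashes at every byte offset, and the block size and the names `signatures`, `delta`, and `patch` are all made up for the illustration. It only shows the shape of the exchange: the receiver hashes its existing copy, the sender sends references to matching blocks plus the unmatched bytes, and the receiver builds a new file from both.

```python
import hashlib
import random

BLOCK = 64  # hypothetical block size, chosen only for this illustration


def signatures(old: bytes, block: int = BLOCK) -> dict:
    """Receiver side: hash each fixed-size block of the existing file."""
    sigs = {}
    for offset in range(0, len(old), block):
        digest = hashlib.md5(old[offset:offset + block]).digest()
        sigs.setdefault(digest, offset)
    return sigs


def delta(new: bytes, sigs: dict, block: int = BLOCK) -> list:
    """Sender side: emit ("copy", offset, length) for blocks the receiver
    already has, and ("data", bytes) for everything else."""
    ops, literal, pos = [], bytearray(), 0
    while pos < len(new):
        chunk = new[pos:pos + block]
        offset = sigs.get(hashlib.md5(chunk).digest())
        if offset is not None:
            if literal:  # flush pending unmatched bytes first
                ops.append(("data", bytes(literal)))
                literal = bytearray()
            ops.append(("copy", offset, len(chunk)))
            pos += len(chunk)
        else:
            literal.append(new[pos])  # no match here, slide forward one byte
            pos += 1
    if literal:
        ops.append(("data", bytes(literal)))
    return ops


def patch(old: bytes, ops: list) -> bytes:
    """Receiver side: build the new file from the old copy plus the delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += old[op[1]:op[1] + op[2]]
        else:
            out += op[1]
    return bytes(out)


def literal_bytes(ops: list) -> int:
    """Count the bytes that had to be sent as raw data."""
    return sum(len(op[1]) for op in ops if op[0] == "data")


# Demo: the receiver's copy (e.g. from the shipped drive) vs. a slightly changed source.
rng = random.Random(0)
old = bytes(rng.getrandbits(8) for _ in range(4096))
new = old[:1000] + b"CHANGED" + old[1000:]

ops = delta(new, signatures(old))
print("rebuilt correctly:", patch(old, ops) == new)
print("literal bytes sent:", literal_bytes(ops), "of", len(new))
```

In this sketch only the bytes around the inserted text travel as raw data, even though no previous sync ever took place; the copy that already sits on the receiving side is all that is needed.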