Hi All,

I am using rsync to do a local network copy of 10 ~8gig files.

The source is a NAS Atom rsync server, and the destination is a cygdrive on the same computer that runs the rsync client.

I am using --inplace. In general, the 8gig files have data changed within the file, but the overall structure of the file does not change. They are Firebird databases. (The total daily binary difference, that which is sent over the wire, is usually about 300meg per 8gig file.)

I am looking at ways to improve the performance.

What I have noticed is the following:
1) rsync checks that the file needs updating
2) It runs through the file on the client (8gig read on the client, quite quick)
3) Client and server now start running through the file. But this is slow; the CPU limits this to 10megs/s

Presumably, the CPU is getting thrashed because it is performing hash table lookups.

Is it possible to disable hash table lookups, but still have the MD5 block comparison?
So, in step 3, both the client and server would run through the file, send each other MD5 block comparisons, and simply transfer the whole block if needed, instead of doing a hash table lookup?

I have tried --append and --append-verify, but my data is not strictly append-only.
When I used --append-verify, it confirmed that the Atom server is quite capable of chewing through a large file and generating checksums.

If I am correct, I think it would be a nice feature for rsync to just do block comparison, combined with the '--inplace' feature.
Fantastic stuff.

Cheers
Michael.
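The block comparison proposed above can be sketched in a few lines of Python. This is only an illustration of the idea in the question, not something rsync offers and not how rsync works internally. The function names and the 128 KiB block size are hypothetical, and both files are held in memory for brevity.

```python
import hashlib

BLOCK_SIZE = 128 * 1024  # hypothetical block size, chosen for illustration only


def block_digests(data, block_size=BLOCK_SIZE):
    """Return the MD5 digest of each fixed-offset block of data."""
    return [
        hashlib.md5(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(src, dst, block_size=BLOCK_SIZE):
    """Indices of blocks that differ at the same offset, or are missing in dst."""
    src_digests = block_digests(src, block_size)
    dst_digests = block_digests(dst, block_size)
    return [
        i for i, digest in enumerate(src_digests)
        if i >= len(dst_digests) or digest != dst_digests[i]
    ]


def patch_in_place(src, dst, block_size=BLOCK_SIZE):
    """Overwrite differing blocks of the bytearray dst; return bytes copied."""
    copied = 0
    # The list of changed blocks is computed from a snapshot of dst up front.
    for i in changed_blocks(src, bytes(dst), block_size):
        start = i * block_size
        block = src[start:start + block_size]
        dst[start:start + len(block)] = block
        copied += len(block)
    del dst[len(src):]  # drop any tail that the source no longer has
    return copied
```

Each block is compared only against the block at the same offset, so no table lookup is needed. The trade-off is that data which has shifted position inside the file is no longer recognised and would be sent again in full.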
I accidentally finished the email with a statement. My question still stands:

Is it possible to disable hash table lookups, but still have the MD5 block comparison?

:)

Michael

On Wed, Jul 20, 2011 at 4:40 PM, Michael Lynch <michaellynch511 at gmail.com> wrote:
> Is it possible to disable hash table look ups, but still have the MD5 block
> comparison?
On Mon, Jul 25, 2011 at 1:42 AM, Michael Lynch <michaellynch511 at gmail.com> wrote:
> Is it possible to disable hash table look ups, but still have the MD5 block
> comparison?

You can use --checksum (-c) to check file contents to see if they need to be transferred, and you can turn off the incremental-content updating via --whole-file. There is nothing else beyond that.

..wayne..
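For anyone scripting this, the two options named in the reply could be combined roughly as follows. This is a minimal sketch: the helper name, the NAS module path and the cygdrive destination are made up for illustration, and only the --checksum and --whole-file flags come from the reply.

```python
import shlex


def build_rsync_args(src, dst, checksum=True, whole_file=True):
    """Assemble an rsync command line as an argument list (hypothetical helper)."""
    args = ["rsync"]
    if checksum:
        args.append("--checksum")    # decide by file contents whether to transfer
    if whole_file:
        args.append("--whole-file")  # turn off incremental-content updating
    args += [src, dst]
    return args


if __name__ == "__main__":
    # Source and destination are placeholders, not paths from this thread.
    cmd = build_rsync_args("nas::backup/db.fdb", "/cygdrive/d/backup/")
    print(shlex.join(cmd))
```

Note that with --whole-file a file that differs is sent in full, so in the case described in this thread each changed file would cost the whole ~8gig on the wire rather than the ~300meg difference.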