I can't address the algorithm questions, but I can tell you that we
saw a tremendous improvement in speed when we switched to a newer
version of rsync.
In this case we are using it to rsync our Oracle files to a
separate partition on the system cpu.
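
On the block-size question in the quoted message below, the only part I
can sketch is the arithmetic: how many blocks would need a checksum at a
given block size. This is a rough illustration only. It assumes a fixed
block size, the 700-byte figure is my assumption for the default in old
versions, and block_count is a made-up helper name, not anything from
rsync itself. It says nothing about actual collision rates.

```python
# Hypothetical helper: count the fixed-size blocks needed to cover a file.
def block_count(file_size: int, block_size: int) -> int:
    """Return how many block_size-byte blocks cover file_size bytes."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    # Integer ceiling division, so a partial last block still counts.
    return (file_size + block_size - 1) // block_size


GIB = 1024 ** 3

if __name__ == "__main__":
    # 700 is an assumed old default; 8192 is the "8k" from the question.
    for block_size in (700, 8192):
        for size_gib in (1, 2):
            n = block_count(size_gib * GIB, block_size)
            print(f"{size_gib} GiB at {block_size}-byte blocks: {n} blocks")
```

Fewer blocks means fewer block checksums to compare against, which is the
intuition behind the larger-block suggestion; whether it helps in practice
is something the list would have to confirm.
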
> I'm using rsync to copy some large (>1GB) oracle datafiles. I've noticed
> that sometimes it transfers some of the files twice.
>
> Some earlier posts to this list that I saw in the archives seemed to
> indicate that this is a problem with the rsync algorithm itself when
> dealing with large files. Some of the mails seemed to indicate that this
> can be mitigated by using larger block sizes, though there were some
> caveats that increasing block size without increasing checksum size
> might cause more hash collisions.
>
> My questions:
>
> 1) Can anyone explain the problem to me in layman's terms? Is the
> initial bad transfer due to hash collisions?
>
> 2) If I'm transferring files that are 1-2GB, would increasing the
> block-size parameter to 8k or so help here? Or would I be creating more
> chances for hash collisions since I can't increase the checksum size?
>
> 3) I'm using 2.5.5 (yeah, ancient I know, I'll be upgrading it soon).
> Are later versions better at dealing with this problem?
>
> Any help is appreciated!
>
> Thanks,
> Jeff