Will Smith
2003-Oct-09 01:44 UTC
Rsync for backups - possible major speedup when local files renamed.
I use rsync regularly for backups of large trees, sometimes over low-bandwidth links, and I would like to suggest a potentially major speedup for the case where files or folders are moved or renamed.

The principle is simple: people often rename a file, or move it from one folder to another. The current behaviour (please correct me if I am wrong) is for rsync to see this as a 'delete' and an 'add', and the 'add' results in a complete retransfer of the file. This could be optimized.

For example, if I move the local file 'huge_file' to 'old/huge_file', the 'new' file is retransferred in full to the remote end.

My proposed implementation is that at the outset, during generation of the local and remote file lists, checksums of entire files are computed (this already happens with --checksum, I believe). The local and remote files are then sorted by checksum and compared. If two files (one local, one remote) have the same checksum but a different filename/path on the remote end, the remote end simply moves and renames the file (creating folders etc. as appropriate).

Would this work? Has it been covered before?

Thanks for your thoughts

Will Smith
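A minimal sketch of the matching step described above, not rsync code: it assumes both trees are readable from one machine, uses a whole-file MD5 digest as a stand-in for whatever checksum --checksum computes, ignores collisions and duplicate content, and stops at planning the renames rather than performing them. All names (file_checksum, build_checksum_index, plan_renames, local_tree, remote_tree) are made up for illustration.

#!/usr/bin/env python3
"""Sketch: detect renamed files by matching whole-file checksums
between a local and a remote file list."""

import hashlib
import os


def file_checksum(path, chunk_size=65536):
    """Compute an MD5 digest of a whole file, reading in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_checksum_index(root):
    """Map checksum -> relative path for every regular file under root.
    (Duplicate content keeps only one path; a real tool would keep all.)"""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            index[file_checksum(full)] = os.path.relpath(full, root)
    return index


def plan_renames(local_root, remote_root):
    """Return (remote_old_path, remote_new_path) pairs where the same
    content exists on both sides under different names, so the remote
    side could rename instead of retransferring."""
    local = build_checksum_index(local_root)
    remote = build_checksum_index(remote_root)
    renames = []
    for checksum, local_path in local.items():
        remote_path = remote.get(checksum)
        if remote_path is not None and remote_path != local_path:
            renames.append((remote_path, local_path))
    return renames


if __name__ == "__main__":
    # Print the moves the remote end would perform instead of retransfers.
    for old, new in plan_renames("local_tree", "remote_tree"):
        print(f"mv {old} -> {new}")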
jw schultz
2003-Oct-09 03:04 UTC
Rsync for backups - possible major speedup when local files renamed.
On Wed, Oct 08, 2003 at 11:44:01PM +0800, Will Smith wrote:
> I use rsync regularly for backups of large trees, sometimes
> over low-bandwidth links, and I would like to suggest a potentially
> major speedup for the case where files or folders are moved or renamed.
>
> The principle is simple: people often rename a file, or move
> it from one folder to another. The current behaviour (please correct
> me if I am wrong) is for rsync to see this as a 'delete' and
> an 'add', and the 'add' results in a complete retransfer of the file.
> This could be optimized.
>
> For example, if I move the local file 'huge_file' to 'old/huge_file',
> the 'new' file is retransferred in full to the remote end.
>
> My proposed implementation is that at the outset, during generation
> of the local and remote file lists, checksums of entire files
> are computed (this already happens with --checksum, I believe).
> The local and remote files are then sorted by checksum and
> compared. If two files (one local, one remote) have the same
> checksum but a different filename/path on the remote
> end, the remote end simply moves and renames the file
> (creating folders etc. as appropriate).
>
> Would this work? Has it been covered before?

This has been covered many times. Read the archives.

--
________________________________________________________________
J.W. Schultz            Pegasystems Technologies
email address:          jw@pegasys.ws
        Remember Cernan and Schmitt