Todd Papaioannou
2005-Jul-27 20:51 UTC
Transferring very large files / and restarting failures
Hi,

My situation is that I would like to use rsync to copy very large files within my network/systems. Specifically, these files are on the order of 10-100GB. Needless to say, I would like to be able to restart a transfer if it only partially succeeded, but NOT repeat the work already done.

Currently, I am initiating the transfer with this command:

    rsync --partial --progress theFile /path/to/dest

where both theFile and /path/to/dest are local drives. In the future /path/to/dest will be an NFS mount. This succeeds in writing theFile to the destination as bytes flow. I.e. I get a partial file there, until the full transfer is successful.

Now, say something failed. I want to restart that transfer, and am trying something like:

    rsync -u --no-whole-file --progress theFile /path/to/dest

However, the stats shown during the progress seem to imply that the whole transfer is starting again.

Can someone help me out with the correct options to ensure that if I want to restart a copy I can take advantage of the bytes that have already been transferred?

Many Thanks

Todd
Wayne Davison
2005-Jul-27 23:15 UTC
Transferring very large files / and restarting failures
On Wed, Jul 27, 2005 at 01:50:39PM -0700, Todd Papaioannou wrote:
> where both theFile and /path/to/dest are local drives. [...]
> rsync -u --no-whole-file --progress theFile /path/to/dest

When using local drives, the rsync protocol (--no-whole-file) slows things down, so you don't want to use it. The rsync protocol's purpose is to trade disk I/O and CPU cycles for reduced network bandwidth, so it doesn't help when the transfer bandwidth is very high, as it is in a local copy.

Note also that you're not preserving the file times, which makes rsync less efficient (and forces you to use the -u option to avoid a retransfer). You're usually better off using -t (--times) unless you have some overriding reason to omit it.

> However, the stats shown during the progress seem to imply that the
> whole transfer is starting again.

Yes, that's what rsync does. It retransfers the whole file, but it uses the local data to make the amount of data flowing over the socket (or pipe) smaller. The already-sent data is thus coming from the original, partially-transferred file rather than from the sender (which would lower the network bandwidth used if this were a remote connection).

> In the future /path/to/dest will be an NFS mount.

You don't want to do that unless your network speed is higher than your disk speed. With slower net speeds you are better off rsyncing directly to the remote machine that is the source of the NFS mount, so that rsync can reduce the amount of data it is sending. With higher net speeds you're better off just transferring the data via --whole-file and not using --partial.

One other possibility is the --append option from the patch named patches/append.diff. This implements a more efficient append mode for incremental transfers (I'm considering adding this to the next version of rsync).

..wayne..
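The two cases described above might be written as the following pair of invocations. This is only a sketch: the function names copy_local and copy_remote are hypothetical, as is the host argument, and using --partial in the remote case is an assumption based on the original poster's first command rather than something stated in the reply.

```shell
#!/bin/sh
# Hypothetical wrappers; names and argument order are illustrative only.

# Local disk to local disk (or a fast link): copy the whole file and
# preserve modification times, without the delta-transfer algorithm.
copy_local() {
    rsync -t --whole-file --progress "$1" "$2"
}

# Slower network: talk to the remote machine directly instead of going
# through an NFS mount, keep any partial file so a rerun can reuse it,
# and preserve modification times.
# Usage: copy_remote FILE HOST DEST
copy_remote() {
    rsync -t --partial --progress "$1" "$2:$3"
}
```

For example, `copy_local theFile /path/to/dest` covers the local case, and `copy_remote theFile somehost /path/to/dest` the remote one, where somehost stands in for whichever machine exports the NFS share.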
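To illustrate the idea behind an append-style resume, here is a rough sketch using plain POSIX tools rather than rsync itself. It is not the code from patches/append.diff; the helper name resume_append is hypothetical, and it assumes that whatever is already in the destination is a correct prefix of the source, which it does not verify.

```shell
#!/bin/sh
# Hypothetical helper: resume_append SRC DEST
# Assumption: the bytes already in DEST are an exact prefix of SRC.
resume_append() {
    src=$1
    dest=$2

    # Bytes already present at the destination (0 if it does not exist).
    if [ -f "$dest" ]; then
        have=$(wc -c < "$dest")
    else
        have=0
    fi
    total=$(wc -c < "$src")

    # Normalise: some wc implementations pad their output with spaces.
    have=$((have + 0))
    total=$((total + 0))

    # Nothing left to send.
    if [ "$have" -ge "$total" ]; then
        return 0
    fi

    # tail -c +N starts output at byte N (1-based), so this appends
    # only the bytes the destination does not have yet.
    tail -c +"$((have + 1))" "$src" >> "$dest"
}
```

Calling `resume_append theFile /path/to/dest/theFile` after an interrupted copy would then append only the missing tail. Because the existing prefix is trusted blindly, a checksum comparison afterwards is a sensible extra step.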