I am trying to use rsync to back up from a site we will call "office" to another we will call "home." Both sites have DSL accounts provided by Arachnet. At present, not all of the files being backed up need to be backed up, but OTOH we wish to back up lots more files that aren't being backed up now.

First, we create a local backup on our office machine, which happens to be called "mail." We have this directory structure:

drwxr-xr-x  20 root  4096 May 17 23:06 20040517-1500-mon
drwxr-xr-x  20 root  4096 May 18 23:06 20040518-1500-tue
drwxr-xr-x  20 root  4096 May 19 23:09 20040519-1500-wed
drwxr-xr-x  20 root  4096 May 20 23:09 20040520-1500-thu
drwxr-xr-x  20 root  4096 May 21 23:09 20040521-1500-fri
drwxr-xr-x  20 root  4096 May 22 23:10 20040522-1500-sat
drwxr-xr-x  20 root  4096 May 23 23:09 20040523-1500-sun
drwxr-xr-x  20 root  4096 May 24 23:10 20040524-1500-mon
drwxr-xr-x  20 root  4096 May 25 23:10 20040525-1500-tue
drwxr-xr-x  20 root  4096 May 26 23:10 20040526-1500-wed
drwxr-xr-x  20 root  4096 May 27 23:10 20040527-1500-thu
drwxr-xr-x  20 root  4096 May 28 23:11 20040528-1500-fri
drwxr-xr-x  20 root  4096 May 29 23:11 20040529-1500-sat
drwxr-xr-x  20 root  4096 May 30 23:10 20040530-1500-sun
drwxr-xr-x  20 root  4096 May 31 23:11 20040531-1500-mon
drwxr-xr-x   3 root  4096 Jun  1 14:10 20040601-0603-tue
drwxr-xr-x   3 root  4096 Jun  1 23:07 20040601-1500-tue
drwxr-xr-x   3 root  4096 Jun  2 07:42 20040601-2323-tue
drwxr-xr-x   3 root  4096 Jun  2 23:07 20040602-1500-wed
drwxr-xr-x   3 root  4096 Jun  3 14:04 20040603-0555-thu
drwxr-xr-x   3 root  4096 Jun  3 23:06 20040603-1500-thu
drwxr-xr-x   3 root  4096 Jun  4 23:07 20040604-1500-fri
drwxr-xr-x   3 root  4096 Jun  5 23:08 20040605-1500-sat
drwxr-xr-x   3 root  4096 Jun  7 14:19 20040607-0610-mon
drwxr-xr-x   3 root  4096 Jun  8 05:01 20040607-2054-mon
drwxr-xr-x   3 root  4096 Jun  8 05:35 20040607-2128-mon
drwxr-xr-x  20 root  4096 Jun  1 14:06 latest

The timestamps in the directory names are UTC times.
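(Editor's note: the directory names above follow a YYYYMMDD-HHMM-ddd pattern built from the UTC time. A minimal sketch of how such a name can be generated; the `snapname` variable is illustrative, not part of the poster's actual script:)

```shell
# Build a snapshot directory name like 20040607-2128-mon from the
# current UTC time.  %a gives the abbreviated weekday name, which is
# lower-cased to match the listing above.  LC_ALL=C pins the weekday
# abbreviation to English.
snapname=$(LC_ALL=C date -u +%Y%m%d-%H%M-%a | tr '[:upper:]' '[:lower:]')
echo "$snapname"
```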
We maintain the contents of latest thus:

+ rsync --recursive --links --hard-links --perms --owner --group
  --devices --times --sparse --one-file-system --rsh=/usr/bin/ssh --delete
  --delete-excluded --delete-after --max-delete=80 --relative --stats
  --numeric-ids --exclude-from=/etc/local/backup/system-backup.excludes
  /boot/ / /home/ /var/ /var/local/backups/office//latest

and create the backup-du-jour:

+ cp -rl /var/local/backups/office//latest /var/local/backups/office//20040607-2128-mon

That part works well, and the rsync part generally takes about seven minutes.

To copy office to home we try this:

+ rsync --recursive --links --hard-links --perms --owner --group
  --devices --times --sparse --one-file-system --rsh=/usr/bin/ssh --delete
  --delete-excluded --delete-after --max-delete=80 --relative --stats
  --numeric-ids /var/local/backups 192.168.0.1:/var/local/backups/

Prior to the run that is now in progress, we used home's external host name. I've created a VPN between the two sites (for other reasons) using OpenVPN. All the problems we've had so far occurred with the hostname "home.arach.net.au", as that's the default way Arachnet assigns hostnames. I'm hoping that OpenVPN will provide more robust recovery from network problems.

Problems we've had include:

1. The ADSL connexion at one end or the other dropping for a while. rsync doesn't notice and mostly hangs. I have seen rsync at home still running, but with no relevant files open.

2. rsync uses an enormous amount of virtual memory, with the result that the Linux kernel lashes out at lots of processes, mostly innocent, until it lucks on rsync. This can cause rsync to terminate without a useful message.

2a. Sometimes the rsync that does this is at home. I've alleviated this at office by allocating an unreasonable amount of swap: unreasonable because if it gets used, performance will be truly dreadful.

3. rsync does not detect when its partner has vanished.
I don't understand why this should be so: it seems to me that, at office, it should be able to detect this by the fact that {r,s}sh has terminated, or by timeout; and at home, by timeout.

3a. I'd like to see rsync have the ability to retry in the case where it initiated the transfer. It can take some time to collect the information about what needs to be done: if I retry in its wrapper script, then this work has to be redone, whereas, I surmise, rsync doing the retry would not need to redo it.

4. I've already mentioned this, but as I've had no feedback I'll try again. As you can see from the above, the source directories for the transfer from office to home are chock-full of hard links. As best I can tell, rsync is transferring each copy afresh instead of recognising the hard link before the transfer and getting the destination rsync to make a new hard link. It is so that it _can_ do this that I present the backup directory as a whole and not an individual day's backup. That, and I have hopes that today's unfinished work will be done tomorrow.

This approach seems so far to be problematic, and I am wondering whether I should instead be doing one of these:

A. Create a filesystem image:
     dd if=/dev/zero of=backup ....   # of suitable size
     mke2fs backup
   then mount -o loop, put my backups inside that, and use rsync to sync the image offsite. Presumably this will use much less virtual memory. The question is how quickly it would sync the two images. I imagine my problem with hard links will vanish.

B. Create a filesystem image as above, and use jigdo to keep the images in sync.

C. Use md5sum and some home-grown scripts to decide what to transfer.

I'm not keen on C, as basically it's implementing what I think rsync should be doing.

btw the latest directory contains 1.5 Gbytes of data. The system is still calculating that today's backup contains 1.5 Gbytes, so it seems the startup costs are considerable.
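(Editor's note: option A above can be sketched roughly as follows. This is an illustration, not a tested recipe: the image name, size, and mount point are placeholders, and the mke2fs/mount steps require root, so they are shown as comments.)

```shell
# Create a 2 GB sparse image file.  Seeking past the end writes no
# data, so the file occupies almost no disk space until filesystem
# blocks inside it are actually used.
dd if=/dev/zero of=backup.img bs=1M count=0 seek=2048

# The remaining steps need root and are shown for illustration only:
# mke2fs -F backup.img              # build an ext2 filesystem in the image
# mkdir -p /mnt/backup
# mount -o loop backup.img /mnt/backup
# ...put the backups inside /mnt/backup, then umount, and rsync the
# single backup.img file offsite instead of the many small files.
```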
On Tue, Jun 08, 2004 at 07:37:32AM +0800, John wrote:
> 1. The ADSL connexion at one end or the other dropping for a while. rsync
> doesn't notice and mostly hangs. I have seen rsync at home still
> running, but with no relevant files open.

There are two aspects of this: (1) your remote shell should be set up to time out appropriately (which is why rsync doesn't time out by default) -- see your remote shell's docs for how to do this; (2) you can tell rsync to time out after a certain amount of inactivity (see --timeout).

> 2. rsync uses an enormous amount of virtual memory

Yes, it uses something like 80-100 bytes or so per file in the transferred hierarchy (depending on options), plus a certain base amount of memory. Your options are to (1) copy smaller sections of the hierarchy at a time, (2) add more memory, or (3) help code something better. This is one of the big areas that I've wanted to solve by completely replacing the current rsync protocol with something better (as I did in my rZync testbed protocol project a while back -- it transfers the hierarchy incrementally, so it never has more than a handful of directories in action at any one time). At some point I will get back to working on an rsync-replacement project.

> 3. rsync does not detect when its partner has vanished.

That seems unlikely unless the remote shell is still around. If the shell has terminated, the socket would return an EOF and rsync would exit. So, I'll assume (until shown otherwise) that this is a case of the remote shell still hanging around.

> 3a. I'd like to see rsync have the ability to retry in the case it's
> initiated the transfer.

There has been some talk of this recently. It doesn't seem like it would be too hard to do, but it's not trivial either. If someone wanted to code something up, I'd certainly appreciate the assistance. Or feel free to put an enhancement request into bugzilla.

(BTW: has anyone heard from J.W. Schultz anytime recently?
He seems to have dropped off the net without any explanation about 3 months ago -- I hope he's OK.)

> 4. [...] As best I can tell, rsync is transferring each copy fresh
> instead of recognising the hard link before the transfer and getting
> the destination rsync to make a new hard link.

This should not be the case if you use the -H option. (It also helps to use 2.6.2 on both ends, as the memory consumption was reduced considerably from older releases.) If you're seeing a problem with this, you should provide full details on what command you're running, what versions you're using, and as small a test case as you can that shows the problem.

..wayne..
(I see there's already been an exchange between you and Wayne, but I'll still send this reply that I composed to your original email.)

On Tue, 08 Jun 2004, John <rsync@computerdatasafe.com.au> wrote:
> We maintain the contents of latest thus:
> + rsync --recursive --links --hard-links --perms --owner --group
> --devices --times --sparse --one-file-system --rsh=/usr/bin/ssh --delete
> --delete-excluded --delete-after --max-delete=80 --relative --stats
> --numeric-ids --exclude-from=/etc/local/backup/system-backup.excludes
> /boot/ / /home/ /var/ /var/local/backups/office//latest

Why the double slash before latest?

> and create the backup-du-jour:
> + cp -rl /var/local/backups/office//latest
> /var/local/backups/office//20040607-2128-mon
>
> That part works well, and the rsync part generally takes about seven
> minutes.
>
> To copy office to home we try this:
> + rsync --recursive --links --hard-links --perms --owner --group
> --devices --times --sparse --one-file-system --rsh=/usr/bin/ssh --delete
> --delete-excluded --delete-after --max-delete=80 --relative --stats
> --numeric-ids /var/local/backups 192.168.0.1:/var/local/backups/

I can see where you will have a dreadful number of files to process if you are also processing all the previous backups.

> Problems we've had include
> 1. The ADSL connexion at one end or the other dropping for a while. rsync
> doesn't notice and mostly hangs. I have seen rsync at home still
> running, but with no relevant files open.
>
> 2. rsync uses an enormous amount of virtual memory, with the result that
> the Linux kernel lashes out at lots of processes, mostly innocent, until it
> lucks on rsync. This can cause rsync to terminate without a useful message.
> 2a. Sometimes the rsync that does this is at home.
> I've alleviated this at office by allocating an unreasonable amount of
> swap: unreasonable because if it gets used, performance will be truly
> dreadful.

In neither this nor your previous post have you mentioned the version of rsync or the OSes involved. Versions of rsync prior to 2.6.2 (skipping 2.6.1) have non-optimized hard-link processing that used twice as much memory (!) and sometimes copied hard-linked files when there was already a match on the receiver. If you are not using 2.6.2, install that on both ends and try it again.

> 3. rsync does not detect when its partner has vanished. I don't
> understand why this should be so: it seems to me that, at office, it
> should be able to detect by the fact {r,s}sh has terminated or by
> timeout, and at home by timeout.

There are two timeouts - a relatively short internal socket I/O timeout and a user-controlled client-server communications timeout. If you are not using --timeout and the link goes down at the wrong time, rsync can sit there forever waiting for the next item from the other end. Use --timeout set to some number of seconds that seems long enough to get the job done. If it times out, then either bump it up or try to solve the cause of the timeout.

> 3a. I'd like to see rsync have the ability to retry in the case it's
> initiated the transfer. It can take some time to collect together the
> information as to what needs to be done: if I try in its wrapper script,
> then this has to be redone whereas, I surmise, rsync doing the retry
> would not need to.

You need to avoid the kinds of rsync runs where this becomes a major factor.

> 4. I've already mentioned this, but as I've had no feedback I'll try again.
> As you can see from the above, the source directories for the transfer
> from office to home are chock-full of hard links. As best I can tell,
> rsync is transferring each copy fresh instead of recognising the hard
> link before the transfer and getting the destination rsync to make a new
> hard link.
> It is so that it _can_ do this that I present the backup
> directory as a whole and not the individual day's backup. That, and I
> have hopes that today's unfinished work will be done tomorrow.

2.6.2 has fixes for unnecessary transfers.

> btw the latest directory contains 1.5 Gbytes of data. The system is
> still calculating that today's backup contains 1.5 Gbytes, so it seems
> the startup costs are considerable.

It's not the size of the data that hurts, it's the number of files and directories involved.

Here's what I suggest. Since you have wisely made a static snapshot of the content that you wish to back up, do the office -> home rsync in two steps.

First, rsync only the "latest" directory, using your original rsync arguments with the source and destination as:

  /var/local/backups/latest 192.168.0.1:/var/local/backups/latest/

Unchanged content won't be disturbed. Changed or new content will get transferred.

When that completes successfully, do the second rsync, but do *not* use --delete-excluded. The second rsync should include latest and the new YYYYMMDD-HHMM-ddd directory, and exclude all others. That should be nothing but hardlinks and should go very quickly once the filesystem scan for the two hierarchies is done.

--
    John Van Essen  Univ of MN Alumnus  <vanes002@umn.edu>