Will Smith
2007-Jun-21 07:56 UTC
Rsync to remote host much faster than 10k/sec speeds on local rsync
I run several remote rsyncs off a CentOS 4 server and also back up between two drives on the same server.

Performance is far better on the remote rsyncs and the local rsync only manages 10k/sec despite being on modern hardware.

Here are the stats from a local backup:

Number of files: 979548
Number of files transferred: 2289
Total file size: 24703651523 bytes
Total transferred file size: 20870616 bytes
Literal data: 20870616 bytes
Matched data: 0 bytes
File list size: 43811752
File list generation time: 430.928 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 64787950
Total bytes received: 50426

sent 64787950 bytes received 50426 bytes 10730.39 bytes/sec
total size is 24703651523 speedup is 381.00

Here are the stats from a remote backup:

Number of files: 979318
Number of files transferred: 543
Total file size: 29764063787 bytes
Total transferred file size: 8771412 bytes
Literal data: 8771412 bytes
Matched data: 0 bytes
File list size: 43796536
File list generation time: 179.610 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 52592412
Total bytes received: 10876

sent 52592412 bytes received 10876 bytes 204284.61 bytes/sec
total size is 29764063787 speedup is 565.82

What I have found is that if I empty the destination directory on the local backup disk, so rsync has to start from scratch and copy all files, performance is as to be expected but this isn't feasible for a regular backup routine as there are almost a million files.

I have also noticed that the load on the local rsync is much higher than the remote rsyncs, with 90%+ cpu usage. I have tried --bwlimit 100 in a vague hope it might affect disk io but it didn't help. Rsync version is 2.6.9 1.el4.rf from rpm. The directory size of the destination backup directory is around 30gb. There is one directory where the majority of the files are and they are jpgs between 5k-150kb.
The command I am running is simple:

/usr/bin/rsync -a --bwlimit=100 --stats /src/dir1 /src/dir2 /backup/dir
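
The two --stats summaries report a transfer rate but no elapsed time. As a rough sketch, the elapsed time can be recovered from the figures above, assuming the bytes/sec value is simply (bytes sent + bytes received) divided by the run time. That assumption is not stated in the thread, although it is consistent with the reported "speedup" values, which match total size divided by (sent + received). The `runs` dictionary and the function names are made up for illustration.

```python
# Figures copied from the two --stats summaries in this post.
runs = {
    "local": {
        "sent": 64787950,
        "received": 50426,
        "rate": 10730.39,            # bytes/sec as printed by rsync
        "total_size": 24703651523,
    },
    "remote": {
        "sent": 52592412,
        "received": 10876,
        "rate": 204284.61,
        "total_size": 29764063787,
    },
}


def elapsed_seconds(run):
    # Assumption: rate = (sent + received) / elapsed time.
    return (run["sent"] + run["received"]) / run["rate"]


def speedup(run):
    # Total size of the tree divided by the bytes that crossed the wire.
    return run["total_size"] / (run["sent"] + run["received"])


for name, run in runs.items():
    secs = elapsed_seconds(run)
    print(f"{name}: about {secs / 60:.1f} min, speedup {speedup(run):.2f}")
```

Under that assumption the local run took roughly 100 minutes and the remote run a little over 4 minutes, so the gap is in elapsed time as well as in the bytes/sec figure.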
Paul Slootman
2007-Jun-21 10:52 UTC
Rsync to remote host much faster than 10k/sec speeds on local rsync
On Thu 21 Jun 2007, Will Smith wrote:
> I run several remote rsyncs off a CentOS 4 server and also back up between
> two drives on the same server.
>
> Performance is far better on the remote rsyncs and the local rsync only
> manages 10k/sec despite being on modern hardware.

If the two filesystems you're syncing between are on the same disks, you're causing a lot of disk thrashing during the local sync. That's probably the cause of the slowdown (it has to seek to and fro between the source and the destination).

> What I have found is that if I empty the destination directory on the local
> backup disk, so rsync has to start from scratch and copy all files,
> performance is as to be expected but this isn't feasible for a regular
> backup routine as there are almost a million files.

That's because it doesn't need to check the existing files then. BTW, don't be fooled by the bytes/sec statistic... If nothing has changed, that will be basically 0 excluding any overhead. That doesn't mean it's very slow... More interesting is the wall clock time taken.

> I have also noticed that the load on the local rsync is much higher than the
> remote rsyncs, with 90%+ cpu usage. I have tried --bwlimit 100 in a vague

Is that including or excluding time spent waiting for I/O? That is actually CPU idle time, which could be used for processor-intensive tasks while the disk is seeking.

> hope it might affect disk io but it didn't help. Rsync version is 2.6.9
> 1.el4.rf from rpm. The directory size of the destination backup directory is
> around 30gb. There is one directory where the majority of the files are and
> they are jpgs between 5k-150kb.
>
> The command I am running is simple: /usr/bin/rsync -a --bwlimit=100 --stats
> /src/dir1 /src/dir2 /backup/dir

As you're not preserving hardlinks (-H), you may observe a significant speedup when using version 3.0.0 (currently only available from CVS or as a daily snapshot; it has not been released yet).
Prior versions of rsync first gather all the file metadata from the source, then from the destination, and then start comparing the lists to see what needs to be transferred. 3.0.0 does the comparing as soon as the first couple of directories have been read on both sides, and that can really speed things up. Unfortunately that speedup doesn't work if hardlinks are to be preserved (although I have hope that some day it might :-)

Paul Slootman
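
To make the difference between the two scanning strategies concrete, here is a toy model. It is not rsync's code: the in-memory "trees" (directory name mapped to files with a (size, mtime) pair) and all function names are invented for illustration, and real rsync does far more. It only shows that the incremental approach can report the first file to transfer after reading a single directory, while the batch approach has to build both complete lists first.

```python
def needs_transfer(src_files, dst_files):
    """Names whose (size, mtime) differ from, or are missing at, the destination."""
    return sorted(n for n, meta in src_files.items() if dst_files.get(n) != meta)


def batch_compare(src_tree, dst_tree):
    # Older behaviour as described above: gather everything, then compare.
    src_list = {(d, f): m for d, files in src_tree.items() for f, m in files.items()}
    dst_list = {(d, f): m for d, files in dst_tree.items() for f, m in files.items()}
    return sorted(key for key, meta in src_list.items() if dst_list.get(key) != meta)


def incremental_compare(src_tree, dst_tree):
    # 3.0.0-style behaviour as described above: compare each directory as soon
    # as it has been read on both sides, yielding results along the way.
    for directory in sorted(src_tree):
        dst_files = dst_tree.get(directory, {})
        for name in needs_transfer(src_tree[directory], dst_files):
            yield (directory, name)


# Hypothetical example data: file name -> (size in bytes, mtime).
src = {"a": {"1.jpg": (5000, 100), "2.jpg": (7000, 100)}, "b": {"3.jpg": (9000, 200)}}
dst = {"a": {"1.jpg": (5000, 100)}, "b": {"3.jpg": (9000, 150)}}

first = next(incremental_compare(src, dst))  # available after reading only "a"
print(first, batch_compare(src, dst))
```

Both functions end up with the same list of files; the only difference is when the first entries become available, which matters with close to a million files.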