W Smith
2007-Jul-16 20:36 UTC
Local disk to disk Rsync taking an hour longer than disk to remote
Back in June I posted about the trouble I've been having backing up some local directories and I'm no further ahead than back then.

Link for that discussion:
http://lists.samba.org/archive/rsync/2007-June/017882.html

In summary: I'm copying nearly a million small files from the main disk in a server to another disk in the same machine.

Still on this server, but at a different time, I am backing up the same million files to a remote server which is completed massively quicker and I just can't get my head around why this should be.

Here are some new stats I got whilst using "time" to measure the difference between the two processes.

Local backup
-------------------
Number of files: 1060320
Number of files transferred: 2233
Total file size: 26814206753 bytes
Total transferred file size: 58711290 bytes
Literal data: 58711290 bytes
Matched data: 0 bytes
File list size: 47383393
Total bytes sent: 106196771
Total bytes received: 44680

sent 106196771 bytes  received 44680 bytes  12914.54 bytes/sec
total size is 26814206753  speedup is 252.39

real 137m5.932s
user 0m16.843s
sys 125m35.697s

Remote backup
-----------------------
Number of files: 1060823
Number of files transferred: 1758
Total file size: 28255934663 bytes
Total transferred file size: 63027228 bytes
Literal data: 63027228 bytes
Matched data: 0 bytes
File list size: 47404505
Total bytes sent: 110514244
Total bytes received: 35180

sent 110514244 bytes  received 35180 bytes  89045.05 bytes/sec
total size is 28255934663  speedup is 255.60

real 20m40.908s
user 0m11.344s
sys 0m22.644s

Sure there are a few less files on the remote backup, due to the scripts running at different times but this can't explain the hour difference in runtime can it?

I've studied the rsync man page for hours, trying to find some elusive option to speed up local performance to no avail. I've also tried Unison and rdiff-backup, hoping that they might be better for local copying, but nothing comes close to rsync remote backup speeds.

At this stage I'm close to giving up and am considering forgetting local backups and getting another server to have two remote backups instead, which wouldn't be a bad thing.

My last hope is in the wisdom of this mailing list... :)
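One way to read the two "time" reports above is to compare how much of the wall-clock time is kernel ("sys") time. The following is a minimal sketch in Python, not part of the original post; the helper names `parse_time` and `sys_share` are made up for illustration, and the figures are copied from the stats above.

```python
import re


def parse_time(value: str) -> float:
    """Convert a shell `time` value such as '137m5.932s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value)
    if match is None:
        raise ValueError(f"unrecognized time value: {value!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def sys_share(real: str, sys_time: str) -> float:
    """Fraction of wall-clock time that was spent in the kernel."""
    return parse_time(sys_time) / parse_time(real)


# Values copied from the two runs reported in the post.
runs = {
    "local": {"real": "137m5.932s", "user": "0m16.843s", "sys": "125m35.697s"},
    "remote": {"real": "20m40.908s", "user": "0m11.344s", "sys": "0m22.644s"},
}

for name, t in runs.items():
    share = sys_share(t["real"], t["sys"])
    print(f"{name}: real={parse_time(t['real']):.0f}s  sys share={share:.1%}")
```

With these numbers the local run spends roughly 92% of its elapsed time in system calls, against about 2% for the remote run. That only narrows down where the time goes; it does not by itself identify the cause.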
Tony Abernethy
2007-Jul-16 21:03 UTC
Local disk to disk Rsync taking an hour longer than disk to remote
From an old, old, old-timer:

The first disks that IBM came out with were effectively the same speed as card readers and line printers, for unblocked records.

Disks are NOT asynchronous. They spin at a very predictable rate, and timings are extremely different based on whether the head is in the right place just before or just after the point in time when the data comes by.

The exact parameters of disk caches (all of them) can do all sorts of strange things to the timings. Even where on the disk the files are can matter.

With a bit of experimenting, you should be able to get some very real and violently counter-intuitive results. A setup optimized for known contention can mess up badly when faced with unknown contention (two disks on the same anything).
Aaron W Morris
2007-Jul-16 23:24 UTC
Local disk to disk Rsync taking an hour longer than disk to remote
On 7/16/07, W Smith <digitalmagnets@googlemail.com> wrote:
> I've studied the rsync man page for hours, trying to find some elusive
> option to speed up local performance to no avail. I've also tried
> Unison and rdiff-backup, hoping that they might be better for local
> copying, but nothing comes close to rsync remote backup speeds.
>
> At this stage I'm close to giving up and am considering forgetting
> local backups and getting another server to have two remote backups
> instead, which wouldn't be a bad thing.

The difference could be --whole-file, which is enabled by default when the source and destination are both local paths. You could try disabling it with --no-whole-file.

--
Aaron W Morris (decep)
Matt McCutchen
2007-Jul-17 00:47 UTC
Local disk to disk Rsync taking an hour longer than disk to remote
On 7/16/07, W Smith <digitalmagnets@googlemail.com> wrote:
> Back in June I posted about the trouble I've been having backing up
> some local directories and I'm no further ahead than back then.
>
> Link for that discussion:
> http://lists.samba.org/archive/rsync/2007-June/017882.html
>
> In summary: I'm copying nearly a million small files from the main
> disk in a server to another disk in the same machine.
>
> Still on this server, but at a different time, I am backing up the
> same million files to a remote server which is completed massively
> quicker and I just can't get my head around why this should be.

Let's see if we can solve the problem for good this time...

- First, as a sanity check, please send the command lines for both rsync runs.

- To rule out disk I/O being the bottleneck in the local run, you could try the same run with "cp -a" instead of rsync.

- Is rsync stalling in the same way you described before ( http://lists.samba.org/archive/rsync/2007-June/017885.html )? If so, please send straces for the sender, the receiver, and the generator so we can begin to investigate.

- In the meantime, you might like to try the current CVS rsync both with and without incremental recursion to see if it works any better.

> At this stage I'm close to giving up and am considering forgetting
> local backups and getting another server to have two remote backups
> instead, which wouldn't be a bad thing.

Have you tried doing rsync over ssh to localhost? That is easier and might have the same effect.

Matt
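The comparison suggested above ("cp -a" versus a local rsync versus rsync over ssh to localhost) could be scripted along the following lines. This is a rough sketch, not something posted in the thread: the function names are hypothetical, the rsync options shown (-a, --stats, -e ssh) are an assumption because the poster's actual command lines were never given, and the paths must be supplied by whoever runs it.

```python
import os
import subprocess
import sys
import time


def candidate_commands(src: str, dst: str) -> dict:
    """Build the three command lines to compare (options are assumptions)."""
    return {
        "cp -a": ["cp", "-a", src, dst],
        "rsync local": ["rsync", "-a", "--stats", src, dst],
        "rsync over ssh to localhost": [
            "rsync", "-a", "--stats", "-e", "ssh", src, f"localhost:{dst}",
        ],
    }


def timed_run(argv: list) -> dict:
    """Run one command and report real, user and sys seconds, like `time`.

    user/sys only cover child processes of this script. For the ssh case
    the receiving rsync is started by sshd, so its CPU time is not counted.
    """
    before = os.times()
    start = time.monotonic()
    subprocess.run(argv, check=True)
    elapsed = time.monotonic() - start
    after = os.times()
    return {
        "real": elapsed,
        "user": after.children_user - before.children_user,
        "sys": after.children_system - before.children_system,
    }


if __name__ == "__main__":
    # Harmless smoke run; replace with a command from candidate_commands().
    print(timed_run([sys.executable, "-c", "pass"]))
```

Each candidate should be run against its own fresh destination directory, since a destination already populated by an earlier run leaves the next tool with much less to do. Note also that cp and rsync treat a trailing slash on the source differently, so the paths need adjusting until all three produce the same layout.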
Matt McCutchen
2007-Jul-17 00:59 UTC
Local disk to disk Rsync taking an hour longer than disk to remote
On 7/16/07, Aaron W Morris <aaronwmorris@gmail.com> wrote:
> The difference could be --whole-file which is enabled by default when
> the source and destination are local disks. You could try to disable
> that with --no-whole-file .

--no-whole-file reduces data transfer between the sending and receiving rsync processes. On a local run, it is very unlikely to do any good. In fact, it increases the amount of disk I/O (since the generator reads the basis file), so if disk I/O is the bottleneck, it will actually make rsync even slower.

Matt
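The disk I/O argument above can be put into rough numbers. The sketch below is a deliberately simplified accounting model, not a description of rsync internals beyond what the message states: it assumes the changed source files are read once, the updated files are written once, and that without --whole-file the generator additionally reads each existing destination copy (the basis file). The function name is hypothetical and the basis size in the example is an assumption, since the thread does not report it.

```python
def local_disk_io(source_bytes: int, basis_bytes: int, whole_file: bool) -> dict:
    """Lower-bound estimate of bytes read and written for changed files."""
    reads = source_bytes            # sender reads the changed source files
    if not whole_file:
        reads += basis_bytes        # generator reads the basis files to checksum them
    writes = source_bytes           # receiver writes the updated files
    return {"read": reads, "written": writes}


# 58711290 is "Total transferred file size" from the local run.
# The basis size is ASSUMED equal to it purely for illustration.
changed = 58_711_290
print(local_disk_io(changed, changed, whole_file=True))
print(local_disk_io(changed, changed, whole_file=False))
```

Under that assumption the reads double while the writes stay the same, which is the direction of the effect described above. Files that are new on the destination have no basis file, so the real increase would be smaller than this upper illustration.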