dbonde+forum+rsync.lists.samba.org at gmail.com
2016-Jan-21 08:20 UTC
Why is my rsync transfer slow?
I am running an rsync job that transfers about 45 million files, approximately 1.8 TB of data (a Mac OS X Time Machine backup), over a 100 Mbit connection.

I use rsync 3.1.1 from MacPorts. (I first tried the built-in rsync, version 2.6.9, since it has a Mac OS X specific cache parameter, but it ran out of memory.) I run it with the following parameters:

% rsync -HzvhErlptgoDW --stats --progress --out-format="%t %f %b" /source/ /destination/

The source is an external 3.5" HDD connected with Firewire 800. The destination is a sparse disk image bundle mounted locally (but its "source file" is on network storage). Initially I got good speeds, 7-9 MB/s for reasonably large files, but the longer the operation has been running (I restarted it three days ago, see below), the slower it gets. There are also long pauses when nothing happens, like this:

2011-01-22-070305/Macintosh HD/Library/Application Support/Apple/Mail/Stationery/Apple/Contents/Resources/Photos/Contents/Resources/Bamboo.mailstationery/Contents/Resources/Mask3.png
1.28K 100% 3.26kB/s 0:00:00 (xfr#48406, ir-chk=1050/4166332)

2016/01/16 18:26:48 Volumes/src/Backups.backupdb/mm/2011-01-22-070305/Macintosh HD/Library/Application Support/Apple/Mail/Stationery/Apple/Contents/Resources/Photos/Contents/Resources/Bamboo.mailstationery/Contents/Resources/Mask3.png 313

2011-01-22-070305/Macintosh HD/Library/Application Support/Apple/Mail/Stationery/Apple/Contents/Resources/Photos/Contents/Resources/Bamboo.mailstationery/Contents/Resources/banner-green.jpg
32.26K 100% 0.00kB/s 0:00:00 (xfr#48407, ir-chk=1049/4166332)

2016/01/16 19:17:37 Volumes/2TB/Backups.backupdb/mm/2011-01-22-070305/Macintosh HD/Library/Application Support/Apple/Mail/Stationery/Apple/Contents/Resources/Photos/Contents/Resources/Bamboo.mailstationery/Contents/Resources/banner-green.jpg 31279

As you can see, the first file finished at 18:26 and the second at 19:17: almost an hour for a file that is just 32 kB.

I don't think the transfer is CPU limited. There are some CPU spikes, but CPU load is generally below 10%. The three rsync processes spawned by this operation have, all in all, used almost exactly 5h of CPU time in the 72h the transfer has been running. The computer itself idles 23h a day.

Nor is memory a problem. Memory pressure has been "green" since the operation began.

Kernel task has accumulated quite a bit of CPU time (57h as I write this), but on the other hand the uptime is 25 days, and rsync cannot have consumed all of those 57h.

Some final details:

* I had had this process running for a couple of days when I restarted it three days ago to get better logging. It took nine hours before the first file was transferred.

* I first used Finder to transfer this directory tree from the same source to the same destination. That took 3 days, all in all. Now I have spent 6 days and I don't think I have transferred even a third of the tree.

* I have tried transferring files between the same source and destination outside of this operation, and they go at full speed.
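The pause in the log excerpt above can be located mechanically. The following is a minimal sketch, assuming the log was produced with --out-format="%t %f %b" so that each completed file has a line starting with a "YYYY/MM/DD HH:MM:SS" timestamp. The function name, the threshold and the abbreviated sample paths are illustrative and not part of rsync.

```python
import re
from datetime import datetime

# Matches the "%t" timestamp that --out-format="%t %f %b" puts at the start of a line.
TIMESTAMP = re.compile(r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) ")


def find_pauses(lines, threshold_seconds=600):
    """Return (gap_seconds, previous_line, current_line) for every gap
    between consecutive timestamped lines that reaches the threshold."""
    pauses = []
    previous = None  # (timestamp, line) of the last timestamped line seen
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            continue  # --progress lines carry no timestamp; skip them
        stamp = datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S")
        if previous is not None:
            gap = (stamp - previous[0]).total_seconds()
            if gap >= threshold_seconds:
                pauses.append((gap, previous[1], line))
        previous = (stamp, line)
    return pauses


# Two lines modelled on the excerpt above; the paths are abbreviated for the example.
sample = [
    "2016/01/16 18:26:48 Volumes/src/Mask3.png 313",
    "32.26K 100% 0.00kB/s 0:00:00 (xfr#48407, ir-chk=1049/4166332)",
    "2016/01/16 19:17:37 Volumes/2TB/banner-green.jpg 31279",
]

for gap, before, after in find_pauses(sample):
    print(f"{gap:.0f} s pause before: {after}")
```

Run over a full log, a listing like this shows whether the stalls cluster around particular directories or are spread evenly, which helps separate a slow destination from a slow file-list scan.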
First, don't use -z on a local copy. It will only make rsync slower for no reason at all.

Second, 45 million files means 90 million calls to stat(). This will take a while even if nothing needs copying.

--
Kevin Korb
Systems Administrator, FutureQuest, Inc.
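To put the 90 million stat() calls from the reply above into perspective, here is a rough back-of-the-envelope sketch. The per-call latencies are hypothetical placeholders, not measurements from this setup; real values depend on the disk, the filesystem cache and, for the destination, the network storage behind the disk image.

```python
FILES = 45_000_000
STAT_CALLS = 2 * FILES  # one stat() per file on each side, as counted in the reply above


def hours_for(calls, seconds_per_call):
    """Total wall-clock hours if every call takes seconds_per_call."""
    return calls * seconds_per_call / 3600


# Hypothetical latencies for illustration only; measure your own volumes.
ASSUMED_LATENCIES = [
    ("10 microseconds (metadata already cached)", 10e-6),
    ("1 millisecond", 1e-3),
    ("10 milliseconds (disk seek or network round trip)", 10e-3),
]

for label, latency in ASSUMED_LATENCIES:
    print(f"{label}: {hours_for(STAT_CALLS, latency):.2f} h")
```

At an assumed 1 ms per call, the stat() calls alone add up to 90,000 seconds, or 25 hours, before any file data is copied. At 10 ms per call the figure is 250 hours, so per-file latency on the slower side can dominate a transfer of this size.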
dbonde+forum+rsync.lists.samba.org at gmail.com
2016-Jan-21 19:14 UTC
Why is my rsync transfer slow?
On 2016-01-21 15:00, Kevin Korb wrote:
> First, don't use -z on a local copy. It will only make rsync slower
> for no reason at all.

Thanks. I hadn't thought about that. I just copied most of the switches from the spelled-out "archive" list. But is rsync so "stupid" that it really applies -z to a local transfer?

> Second, 45 million files means 90 million calls to stat(). This will
> take a while even if nothing needs copying.

Hmm, is there a way to benchmark how long a stat() call takes?

And still, why is it so much slower than Finder? Finder is a dog when it comes to file operations. Rsync (and cp) is usually many times faster.
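One way to approach the benchmarking question is to time the calls directly. The following is a minimal sketch using Python's os.lstat(), which wraps the lstat() system call. The function name and the entry limit are illustrative. Note that the directory walk itself may warm the filesystem cache, and a second run over the same tree will usually be much faster than the first, so the result is only a rough indication.

```python
import os
import time


def time_lstat(root, limit=100_000):
    """lstat() up to `limit` entries below `root`.

    Returns (count, elapsed_seconds), where elapsed_seconds covers only
    the lstat() calls and not the directory walk.
    """
    count = 0
    elapsed = 0.0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            start = time.perf_counter()
            try:
                os.lstat(path)
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            elapsed += time.perf_counter() - start
            count += 1
            if count >= limit:
                return count, elapsed
    return count, elapsed


# Sample the current directory; pass a path on the source or destination volume instead.
count, elapsed = time_lstat(".", limit=1000)
if count:
    print(f"{count} lstat() calls, {elapsed / count * 1e6:.1f} microseconds per call on average")
```

Running it once against the source volume and once against the mounted disk image gives two per-call figures that can be compared with each other and multiplied by the file count for a rough total.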
dbonde+forum+rsync.lists.samba.org at gmail.com
2016-Jan-23 16:46 UTC
Why is my rsync transfer slow?
On 2016-01-21 09:20, dbonde+forum+rsync.lists.samba.org at gmail.com wrote:
> I run a rsync job transferring about 45 million files/approximately 1.8
> TB data (a Mac OS X Time Machine backup) over a 100 MBit connection.
>
> I use rsync 3.1.1 from MacPorts (I first tried the built in rsync,
> version 2.6.9, since it has a Mac OS X specific cache parameter, but it
> ran out of memory) with the following parameters
>
> % rsync -HzvhErlptgoDW --stats --progress --out-format="%t %f %b"
> /source/ /destination/

Well, after some examination I found at least one problem with this transfer (which is still running): hard links are not preserved.

This is how a certain file looks on the source, where it is backed up in several locations using hard links:

source volume:

zsh-% ls -i "/…/backups/2011-06-23-040258/Pictures/DSCF0748.JPG"
9236871 /…/backups/2011-06-23-040258/Pictures/DSCF0748.JPG

zsh-% ls -i "/…/backups/2010-12-18-070445/Pictures/DSCF0748.JPG"
9236871 /…/backups/2010-12-18-070445/Pictures/DSCF0748.JPG

destination volume:

zsh-% ls -i "/…/backups/2011-06-23-040258/Pictures/DSCF0748.JPG"
20765913 /…/backups/2011-06-23-040258/Pictures/DSCF0748.JPG

zsh-% ls -i "/…/backups/2010-12-18-070445/Pictures/DSCF0748.JPG"
704428 /…/backups/2010-12-18-070445/Pictures/DSCF0748.JPG

As you can see, the two paths share one inode number on the source volume but have different inode numbers on the destination volume, so the destination holds two separate copies.

Why are my hard links not preserved? I thought the purpose of -H was to transfer the hard links rather than the file itself.
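The manual ls -i comparison above can be automated for a sample of paths. This is a minimal sketch: it groups paths that share a device and inode on the source and reports any group whose members are separate files on the destination. The function names and the demo file names are hypothetical, and the demo runs on throwaway temporary directories rather than on the backup volumes.

```python
import os
import tempfile


def same_file(path_a, path_b):
    """True if both paths refer to the same device and inode."""
    a, b = os.lstat(path_a), os.lstat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


def split_links(source_root, dest_root, relative_paths):
    """Return (first, other) pairs that are hard-linked together on the
    source but are separate files on the destination."""
    groups = {}
    for rel in relative_paths:
        st = os.lstat(os.path.join(source_root, rel))
        groups.setdefault((st.st_dev, st.st_ino), []).append(rel)
    split = []
    for rels in groups.values():
        first = os.path.join(dest_root, rels[0])
        for rel in rels[1:]:
            if not same_file(first, os.path.join(dest_root, rel)):
                split.append((rels[0], rel))
    return split


# Demo: the source has one file with two names; the destination has two independent copies.
with tempfile.TemporaryDirectory() as tmp:
    src, dst = os.path.join(tmp, "src"), os.path.join(tmp, "dst")
    os.mkdir(src)
    os.mkdir(dst)
    with open(os.path.join(src, "a.jpg"), "w") as handle:
        handle.write("x")
    os.link(os.path.join(src, "a.jpg"), os.path.join(src, "b.jpg"))
    for name in ("a.jpg", "b.jpg"):
        with open(os.path.join(dst, name), "w") as handle:
            handle.write("x")
    print(split_links(src, dst, ["a.jpg", "b.jpg"]))  # prints [('a.jpg', 'b.jpg')]
```

An empty list for a sample of known hard-linked paths would suggest the links survived the copy; a non-empty list identifies the pairs that were duplicated.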
It will, assuming it sees both links in the same rsync run.

--
Kevin Korb
Systems Administrator, FutureQuest, Inc.