Joseph L. Brunner
2015-Jan-29 13:05 UTC
[CentOS] network copy performance is poor (rsync) - debugging suggestions?
We routinely have to sync 4TB, which is about 2M files... Rsync never does well for us - it just can't push the line at all.

So, this may or may not work for you - but this is a huge problem, so we tried a whole Excel spreadsheet's worth of combinations and every protocol imaginable to make this happen.

In the end, after a year of constant work on this, we found that if we map a network share from the source server to the destination server, using the CIFS protocol to "map a drive", and then sync, say, /srv/www to /mnt/shadow-www, it worked at 99% of line rate ONLY if we used the cp command to sync the source and destination:

    cd /srv/www
    root@pas01# cp -R -u * /mnt/shadow-www/

Something to consider if you find yourself not getting "line rate".

Our investigation showed that the rsync process, even with all the switches we found, has to "open" the file a bit before it copies it... so rsync sucks for this kind of stuff. With 2 MILLION small files it never gets going: moving millions of small files, it has to keep reading. There's a switch that says don't do that - but it never really helped :)

Cheers

-----Original Message-----
From: centos-bounces at centos.org [mailto:centos-bounces at centos.org] On Behalf Of Gordon Messmer
Sent: Wednesday, January 28, 2015 06:40 PM
To: CentOS mailing list
Subject: Re: [CentOS] network copy performance is poor (rsync) - debugging suggestions?

On 01/23/2015 01:44 AM, Götz Reinicke - IT Koordinator wrote:
> I do have two centos 6.6 servers. With a "performance optimized" rsync
> I get an speed of 15 - 20 MB/s

That *is* pretty slow for sustained writes. Does the same rate hold true for individual large files as it does for lots of small ones? What filesystem are you using on each side?

> rsync -aHAXxv --numeric-ids --progress -e "ssh -T -c arcfour -o
> Compression=no -x"

It's worth noting that -X and -A are going to perform filesystem IO that you don't see on SMB, because it isn't going to preserve/set ACLs and extended attributes (IIRC).

So, one possibility is that you're seeing a difference in rate because you're doing lots of small files and filesystem operations are relatively slow. You might drop those two options and see how that affects the rate. If you determine that those are the cause of the performance difference, you can turn them back on, understanding that there's a cost associated with preserving that data.

> Both servers have plenty of memory and cpu usage looks low.

Define low. If you're using top and press '1' to expand the CPU lines, you'll probably see one CPU with a higher "us" percentage, which is SSH encrypting the data. What percentage is that? Is there a large value in "sy" or "hi" on any CPU? Probably not, since you see good rates using 'dd' and smb copies, but I've seen systems where interrupt processing was a major bottleneck, so I make it a standard check.

_______________________________________________
CentOS mailing list
CentOS at centos.org
http://lists.centos.org/mailman/listinfo/centos
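For anyone who would rather script the per-CPU check than watch top, something along these lines might do. It is only a sketch, assuming a Linux /proc/stat and a POSIX awk; the function name cpu_breakdown and the snapshot paths in the comments are made up for illustration. It prints the share of time each CPU spent in user ("us"), system ("sy") and hardware-interrupt ("hi") mode between two snapshots.

```shell
# Per-CPU "us", "sy" and "hi" percentages between two /proc/stat snapshots.
# Fields on a cpuN line: user nice system idle iowait irq softirq steal ...
# cpu_breakdown is an illustrative name, not a standard tool.
cpu_breakdown() {
    # $1 = earlier snapshot file, $2 = later snapshot file
    awk '
        /^cpu[0-9]+ / {
            total = 0
            for (i = 2; i <= 9; i++) total += $i    # user through steal
            if (FILENAME == ARGV[1]) {
                # first snapshot: remember the baseline counters
                t0[$1] = total; us0[$1] = $2; sy0[$1] = $4; hi0[$1] = $7
            } else {
                # second snapshot: report each share of the elapsed ticks
                dt = total - t0[$1]
                if (dt > 0)
                    printf "%s us=%.1f sy=%.1f hi=%.1f\n", $1,
                        100 * ($2 - us0[$1]) / dt,
                        100 * ($4 - sy0[$1]) / dt,
                        100 * ($7 - hi0[$1]) / dt
            }
        }
    ' "$1" "$2"
}

# Usage (hypothetical paths), while the rsync is running:
#   cat /proc/stat > /tmp/stat.a; sleep 5; cat /proc/stat > /tmp/stat.b
#   cpu_breakdown /tmp/stat.a /tmp/stat.b
```

One CPU with a high "us" value while the rest sit idle would fit the ssh-encryption explanation; a large "sy" or "hi" on any CPU would point at the kernel or interrupt handling instead.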
Les Mikesell
2015-Jan-29 17:09 UTC
[CentOS] network copy performance is poor (rsync) - debugging suggestions?
On Thu, Jan 29, 2015 at 7:05 AM, Joseph L. Brunner <joe at affirmedsystems.com> wrote:
>
> our investigation showed the rsync process even with all switches we found has to "open" the file a bit before it copies it... so rsync sucks for this kind of stuff with 2 MILLION small files - it never gets going moving millions of small files it has to keep reading. There a switch that says don't do that - but never really helped :)

Rsync is going to read the directory tree first, then walk it on both sides comparing timestamps (for incrementals) and block checksums. Pre-3.0 versions would read the entire directory tree before even starting anything else. So there is quite a bit of overhead, the point being to avoid using network bandwidth when the source and destination are already mostly identical. Splitting the work into some sensible directory tree structure might help a lot. Or, if you know it is mostly different, just tar it up and stream it.

--
Les Mikesell
lesmikesell at gmail.com
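The "tar it up and stream it" idea could look roughly like the following. This is a sketch, not something taken from the thread: stream_tree is an illustrative name, and the host name dest in the comment is a placeholder. The function streams a tree between two local directories (for example onto a mounted share); the commented line shows the same pipe sent over ssh.

```shell
# Stream a whole directory tree through a tar pipe, with no per-file
# comparison between source and destination.
# stream_tree is an illustrative name.
stream_tree() {
    # $1 = source directory, $2 = destination directory (created if missing)
    mkdir -p "$2" && tar -C "$1" -cf - . | tar -C "$2" -xpf -
}

# Over the network the same pipe can go through ssh, e.g. (placeholder host):
#   tar -C /srv/www -cf - . | ssh dest 'tar -C /mnt/shadow-www -xpf -'
```

Unlike rsync, this copies everything every time, so it probably only pays off when most of the data is known to differ.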
Gordon Messmer
2015-Jan-29 20:38 UTC
[CentOS] network copy performance is poor (rsync) - debugging suggestions?
On 01/29/2015 09:09 AM, Les Mikesell wrote:
>> our investigation showed the rsync process even with all switches
>> we found has to "open" the file a bit before it copies it
> Rsync is going to read the directory tree first, then walk it on
> both sides comparing timestamps (for incrementals) and block
> checksums.

Note that rsync only opens files and performs block checksums if the source and destination modification times and sizes don't match. You should not see most files opened during an incremental copy, unless one of the filesystems is MS-FAT, where modification time has a 2 second resolution, and it may not be possible to preserve a timestamp exactly during a copy.
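The size-and-mtime test described above can be approximated in a few lines of shell. This is only an illustration of the idea, not rsync's actual code: needs_transfer is a made-up name, and `stat -c` assumes GNU coreutils. The optional third argument plays the role of rsync's --modify-window option, which treats timestamps as equal when they differ by no more than the given number of seconds.

```shell
# Rough model of the quick check: a file is a transfer candidate only if
# the destination is missing, or size or mtime differs.
# needs_transfer is an illustrative name; stat -c requires GNU coreutils.
needs_transfer() {
    # $1 = source file, $2 = destination file,
    # $3 = allowed mtime difference in seconds (default 0)
    src=$1; dst=$2; window=${3:-0}
    [ -e "$dst" ] || return 0                  # missing: must copy
    # positional params become: src_size src_mtime dst_size dst_mtime
    set -- $(stat -c '%s %Y' "$src") $(stat -c '%s %Y' "$dst")
    [ "$1" -eq "$3" ] || return 0              # sizes differ: must copy
    diff=$(( $2 - $4 ))
    [ "$diff" -lt 0 ] && diff=$(( 0 - diff ))  # absolute value
    [ "$diff" -gt "$window" ]                  # copy only if outside the window
}

# Usage (paths are examples):
#   needs_transfer /srv/www/index.html /mnt/shadow-www/index.html && echo copy
#   needs_transfer /srv/www/index.html /mnt/fat/index.html 1 && echo copy
```

With a window of 1, timestamps that were rounded on a FAT filesystem no longer make every file look changed, which is the situation where rsync would otherwise reopen files on each run.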