Jayce Piel
2018-Mar-21 13:49 UTC
rsync very very slow with multiple instances at the same time.
I am starting a new thread because my issue is not quite the same, but I am quoting below the thread that prompted me to join the list.

My problem is not that rsync waits before it starts copying, but a general performance issue, especially when several rsync instances run at the same time.

Here is my situation:
I have multiple clients (around 20) with users, and I want to rsync their home directories to my server to keep a copy of their local files.
On the server, the files are hosted on an iSCSI volume (on a Thecus RAID) where I have never had any performance issue before.

When there is only one client, I have no real performance issue. Even with a very large number of files (some users have up to ), the sync finishes in a few minutes if there are not too many changed files.
But when 3 or more rsync instances run at the same time, they all become very, very slow and can take a few hours to complete.

Here are my options:

    /usr/local/bin/rsync3 --rsync-path=/usr/local/bin/rsync3 -aHXxvE --stats \
        --numeric-ids --delete-excluded --delete-before --human-readable \
        --rsh="ssh -T -c aes128-ctr -o Compression=no -x" -z \
        --skip-compress=gz/bz2/jpg/jpeg/ogg/mp3/mp4/mov/avi/vmdk/vmem --inplace \
        --chmod=u+w --timeout=60 --exclude='Caches' --exclude='SyncService' \
        --exclude='.FileSync' --exclude='IMAP*' --exclude='.Trash' \
        --exclude='Saved Application State' --exclude='Autosave Information' \
        --exclude-from=/Users/pabittan/.UserSync/exclude-list --max-size=1000M \
        /Users/pabittan/ xserve.local.fftir:./

Here is the version I use (self-compiled):

    $ /usr/local/bin/rsync3 --version
    rsync version 3.1.2-jsp  protocol version 31
    Copyright (C) 1996-2015 by Andrew Tridgell, Wayne Davison, and others.
    Web site: http://rsync.samba.org/
    Capabilities:
        64-bit files, 64-bit inums, 64-bit timestamps, 64-bit long ints,
        socketpairs, hardlinks, symlinks, IPv6, batchfiles, inplace,
        append, ACLs, xattrs, iconv, symtimes, no prealloc, file-flags

I had to put a sort of queue in place that allows no more than 4 simultaneous rsync instances, to be sure each one runs at least once a day. Even with the limit at 4, some clients wait hours before their backup starts.

I am open to any help to improve performance. (I have put the whole script that calls rsync on GitHub: https://github.com/jpiel/UserSync )

PS:
I checked: the CPU is not under pressure. Each rsync instance uses between 2 and 5% CPU, and overall CPU usage is 30%.
I also checked the network, and it is not the issue either.
Disk usage does not seem to be under high load either (peak at 300 IO/sec).

> On 20 March 2018 at 13:00, rsync-request at lists.samba.org wrote:
>
> From: Kevin Korb <kmk at sanitarium.net>
> Subject: Re: Very slow to start sync with millions of directories and files
> Date: 19 March 2018 at 15:33:31 UTC+1
> To: rsync at lists.samba.org
>
> The performance of rsync with a huge number of files is greatly
> determined by every option you are using. So, what is your whole
> command line?
>
> On 03/19/2018 09:05 AM, Bráulio Bhavamitra via rsync wrote:
>> Hi all,
>>
>> I'm using rsync 3 to copy all files from one disk to another. The files
>> were written by Minio, an S3 compatible opensource backend.
>>
>> The number of files is dozens of millions, almost each of them within
>> its own directory.
>>
>> Rsync takes a long time, when not several hours, to even start syncing
>> files. I already see a few reasons:
>> - it first creates all directories to put files in, that could be done
>> along with the sync
>> - it needs to generate the list of all files before starting, and cannot
>> start syncing and keep the list generation in a different thread.
>>
>> Cheers,
>> bráulio
>
> --
> Kevin Korb                  Phone:    (407) 252-6853
> Systems Administrator       Internet:
> FutureQuest, Inc.           Kevin at FutureQuest.net  (work)
> Orlando, Florida            kmk at sanitarium.net (personal)
> Web page:                   http://www.sanitarium.net/
> PGP public key available on web site.

--
Jayce Piel -- jayce.piel at gmail.com -- 0616762431
IT Manager, F.F.Tir
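The "queue" of at most 4 simultaneous rsync instances mentioned above can be approximated with atomic lock directories on the server. What follows is only a minimal sketch under stated assumptions: it is not taken from the UserSync script, and the names `acquire_slot`, `release_slot` and the lock directory path are hypothetical. It relies on `mkdir` failing when the directory already exists, so only one caller can claim a given slot.

```shell
# Hypothetical slot limiter: at most MAX callers hold a slot at any time.
# acquire_slot LOCK_ROOT MAX  -> prints the claimed slot path, or returns 1.
acquire_slot() {
    lock_root=$1
    max=$2
    mkdir -p "$lock_root" || return 1
    i=1
    while [ "$i" -le "$max" ]; do
        # mkdir is atomic: it succeeds for exactly one caller per slot.
        if mkdir "$lock_root/slot.$i" 2>/dev/null; then
            echo "$lock_root/slot.$i"
            return 0
        fi
        i=$((i + 1))
    done
    return 1    # all slots are busy
}

# release_slot SLOT_PATH  -> frees a slot claimed by acquire_slot.
release_slot() {
    rmdir "$1"
}

# Usage sketch (paths are assumptions):
#   slot=$(acquire_slot /var/run/usersync-locks 4) || exit 0   # try again later
#   rsync ... ; release_slot "$slot"
```

A caller that crashes without calling `release_slot` leaves a stale slot behind, so a real script would also need some form of cleanup, for example a trap on exit or an age check on the slot directories.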
Kevin Korb
2018-Mar-21 15:39 UTC
rsync very very slow with multiple instances at the same time.
When rsync has a lot of files to look through but not many to actually transfer, most of the work will be gathering information from the stat() function call. You can simulate just the stat call with:

    find /path -type f -ls > /dev/null

You can run one, then a few of those at once, to see if your storage has issues with lots of stats all at once.

Also, why -c aes128-ctr? If your OpenSSH is current then the default of chacha20-poly1305@openssh.com is much faster. If your systems have AES-NI in the CPU then aes128-gcm@openssh.com is much faster. If your OpenSSH is too old for chacha to be the default then aes128-ctr was the default anyway.

--
Kevin Korb                  Phone:    (407) 252-6853
Systems Administrator       Internet:
FutureQuest, Inc.           Kevin at FutureQuest.net  (work)
Orlando, Florida            kmk at sanitarium.net (personal)
Web page:                   http://www.sanitarium.net/
PGP public key available on web site.
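The "run one, then a few" test suggested above can be wrapped in a small helper so that the single and concurrent cases are timed the same way. This is a minimal sketch, not something posted in the thread: `stat_bench` is a hypothetical name, and it reports wall-clock seconds for the whole batch rather than the per-process real/user/sys figures that `time` gives.

```shell
# Hypothetical helper: start JOBS concurrent metadata-only walks of DIR
# and report how long the whole batch took (wall-clock seconds).
# stat_bench DIR [JOBS]
stat_bench() {
    dir=$1
    jobs=${2:-1}
    pids=""
    start=$(date +%s)
    i=0
    while [ "$i" -lt "$jobs" ]; do
        # Same command as suggested in the thread: stat every entry, copy nothing.
        find "$dir" -type f -ls > /dev/null &
        pids="$pids $!"
        i=$((i + 1))
    done
    wait $pids    # unquoted on purpose: one PID per word
    end=$(date +%s)
    echo "$jobs concurrent find(s) on $dir: $((end - start))s"
}

# Usage sketch (the path is a placeholder):
#   stat_bench /path 1
#   stat_bench /path 10
```

If the wall-clock time grows much faster than the number of concurrent walks, the storage is probably struggling with parallel metadata access. The OS may cache metadata between runs, so a second run on the same tree can look faster than the first.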
devzero at web.de
2018-Mar-21 17:17 UTC
Aw: rsync very very slow with multiple instances at the same time.
An HTML attachment was scrubbed... URL: <http://lists.samba.org/pipermail/rsync/attachments/20180321/e336a581/attachment.html>
Jayce Piel
2018-Mar-23 16:52 UTC
rsync very very slow with multiple instances at the same time.
OK, so I ran some tests with:

    find /path -type f -ls > /dev/null

First, on my local SSD (1.9 million files):

    1 find:
        real    2m16.743s
        user    0m7.607s
        sys     0m45.952s

    10 concurrent finds (approximately the same result for each):
        real    4m48.629s
        user    0m11.013s
        sys     2m0.288s

Roughly double the time seems logical.

Now the same test on my server, on the iSCSI disk, with no other activity (2.8 million files):

    1 find:
        real    38m54.964s
        user    0m35.626s
        sys     4m33.593s

    10 concurrent finds:
        real    76m34.781s
        user    0m47.848s
        sys     5m42.034s

The difference between 1 and 10 finds is not crazy. But the find itself takes so much time!
I can now see that I have a real issue on that server. Transfer time is not a problem, but access time seems to be terribly slow.

> On 21 March 2018 at 16:59, Jayce Piel <jayce.piel at gmail.com> wrote:
>
> Thanks for the answer.
> I will test the stat() behaviour at a time when nothing else is running.
>
> For the cipher, I tried to find the lowest common denominator between the clients and the server. The server is the older one for now.
> I used -c arcfour-128 until it was no longer an option.
>
> The 2 ciphers you mention are available on the clients but not on the server, sadly.
> But I will keep this in mind for when I upgrade the server (or move the backup destination).

--
Jayce Piel -- jayce.piel at gmail.com -- 0616762431
IT Manager, F.F.Tir
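To put the timings above on a common scale, the single-find results can be converted to an average cost per file. This is a back-of-the-envelope sketch: `us_per_file` is a hypothetical helper, and it takes the rounded file counts from the message (1.9 and 2.8 million) at face value.

```shell
# Hypothetical helper: average microseconds of wall-clock time per file.
# us_per_file SECONDS FILES
us_per_file() {
    awk -v s="$1" -v n="$2" 'BEGIN { printf "%.0f\n", s / n * 1e6 }'
}

# Local SSD, 1 find: 2m16.743s = 136.743 s over 1.9 million files
us_per_file 136.743 1900000     # prints 72

# iSCSI volume, 1 find: 38m54.964s = 2334.964 s over 2.8 million files
us_per_file 2334.964 2800000    # prints 834
```

On these numbers, a single walk costs roughly 11 to 12 times more per file on the iSCSI volume than on the local SSD, which is consistent with the conclusion that access time, not transfer time, is the bottleneck.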