Bráulio Bhavamitra
2018-Mar-19 13:05 UTC
Very slow to start sync with millions of directories and files
Hi all,

I'm using rsync 3 to copy all files from one disk to another. The files were written by Minio, an S3-compatible open-source backend.

There are tens of millions of files, almost every one of them in its own directory.

Rsync takes a long time, if not several hours, to even start syncing files. I already see a few reasons:
- it first creates all the directories to put files in, which could be done along with the sync
- it needs to generate the list of all files before starting, and cannot start syncing while list generation continues in a separate thread.

Cheers,
bráulio
Kevin Korb
2018-Mar-19 14:33 UTC
Very slow to start sync with millions of directories and files
The performance of rsync with a huge number of files is greatly determined by every option you are using. So, what is your whole command line?
Bráulio Bhavamitra
2018-Mar-20 18:33 UTC
Very slow to start sync with millions of directories and files
On Mon, 19 Mar 2018 at 11:34, Kevin Korb via rsync <rsync at lists.samba.org> wrote:
> The performance of rsync with a huge number of files is greatly
> determined by every option you are using. So, what is your whole
> command line?

rsync -avP /data-old/ /data
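For illustration only, the two ideas from the first message (creating each directory along with the copy, and not waiting for a complete file list) can be sketched in Python. This is a minimal sketch, not how rsync works internally and not a replacement for it: `iter_files` and `streaming_copy` are hypothetical names, the copy is single-threaded, and symlinks, empty directories, and deletions are deliberately ignored.

```python
import os
import shutil
from typing import Iterator


def iter_files(root: str) -> Iterator[str]:
    """Lazily yield regular files under root, as paths relative to root.

    Hypothetical helper: the tree is walked one directory at a time, so the
    caller can act on the first file before the rest has been listed.
    """
    stack = [root]
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    yield os.path.relpath(entry.path, root)
                # Symlinks and special files are skipped in this sketch.


def streaming_copy(src: str, dst: str) -> int:
    """Copy files as they are discovered; return the number of files copied.

    Each destination directory is created on demand, right before the first
    file that needs it, instead of in a separate pass up front. Directories
    that contain no files are therefore not created.
    """
    copied = 0
    for rel in iter_files(src):
        target = os.path.join(dst, rel)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        shutil.copy2(os.path.join(src, rel), target)  # also copies mtime and mode
        copied += 1
    return copied
```

With the paths from the command line above, a call would look like `streaming_copy("/data-old", "/data")`. Because `iter_files` is a generator, memory use stays proportional to the number of pending directories rather than to the total number of files.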