Bráulio Bhavamitra
2018-Mar-20 18:33 UTC
Very slow to start sync with millions of directories and files
On Mon, Mar 19, 2018 at 11:34, Kevin Korb via rsync <rsync at lists.samba.org> wrote:
> The performance of rsync with a huge number of files is greatly
> determined by every option you are using. So, what is your whole
> command line?

rsync -avP /data-old/ /data

> On 03/19/2018 09:05 AM, Bráulio Bhavamitra via rsync wrote:
> > Hi all,
> >
> > I'm using rsync 3 to copy all files from one disk to another. The files
> > were written by Minio, an S3-compatible open-source backend.
> >
> > There are tens of millions of files, almost every one of them in
> > its own directory.
> >
> > Rsync takes a long time, if not several hours, to even start syncing
> > files. I already see a few reasons:
> > - it first creates all the directories to put files in, which could be
> >   done along with the sync
> > - it needs to generate the list of all files before starting, and cannot
> >   start syncing while the list generation continues in a different thread.
> >
> > Cheers,
> > bráulio
Kevin Korb
2018-Mar-20 20:49 UTC
Very slow to start sync with millions of directories and files
Nothing there should be preventing incremental indexing. That means it should start copying as soon as it finds a file that needs to be copied.

On 03/20/2018 02:33 PM, Bráulio Bhavamitra wrote:
> rsync -avP /data-old/ /data

--
Kevin Korb
Systems Administrator, FutureQuest, Inc.
Orlando, Florida
Web page: http://www.sanitarium.net/
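To make the distinction in this reply concrete, here is a small conceptual sketch in Python. It is not rsync's implementation. It only contrasts building the whole file list up front with handing out paths as each directory is scanned, which is the idea behind what rsync 3 calls incremental recursion. The function names and the test layout are made up for illustration. As far as the rsync man page goes, a few options (for example --delete-before, --delete-after, --prune-empty-dirs and --delay-updates) need the full list and so turn incremental recursion off. None of them are in the command line above.

```python
import os
import tempfile
from typing import Iterator, List


def eager_file_list(root: str) -> List[str]:
    """Build the complete file list before returning anything."""
    return [
        os.path.join(dirpath, name)
        for dirpath, _dirs, files in os.walk(root)
        for name in files
    ]


def incremental_file_list(root: str) -> Iterator[str]:
    """Yield each file as soon as its directory has been scanned."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            yield os.path.join(dirpath, name)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        # One file per directory, mimicking the layout described in the thread.
        for i in range(1000):
            d = os.path.join(root, f"obj{i:04d}")
            os.mkdir(d)
            with open(os.path.join(d, "part.1"), "w") as fh:
                fh.write("x")

        # The consumer can act on the first path without waiting for the rest.
        first = next(incremental_file_list(root))
        print("first file available:", first)
        print("eager list length:", len(eager_file_list(root)))
```

With the eager version nothing can be copied until the whole tree has been walked. With the generator the first copy can begin after a single directory has been read, at the cost of not knowing the total number of files in advance.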
Bráulio Bhavamitra
2018-Mar-20 21:24 UTC
Very slow to start sync with millions of directories and files
On Tue, Mar 20, 2018 at 5:49 PM Kevin Korb <kmk at sanitarium.net> wrote:
> Nothing there should be preventing incremental indexing. That means it
> should start copying as soon as it finds a file that needs to be copied.

Doesn't it try to create all the (empty) directories first?
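For reference, the alternative raised in the original post (creating directories along with the sync rather than in a separate first pass) could look roughly like the sketch below. It says nothing about what rsync actually does internally. The helper names walk_files and copy_tree_on_demand are hypothetical, and this simplified version skips empty directories, which rsync -a would preserve.

```python
import os
import shutil
from typing import Iterator, Tuple


def walk_files(src_root: str) -> Iterator[Tuple[str, str]]:
    """Lazily yield (source path, path relative to src_root) for each file."""
    for dirpath, _dirs, files in os.walk(src_root):
        for name in files:
            src = os.path.join(dirpath, name)
            yield src, os.path.relpath(src, src_root)


def copy_tree_on_demand(src_root: str, dst_root: str) -> int:
    """Copy files, creating each destination directory only when needed."""
    copied = 0
    for src, rel in walk_files(src_root):
        dst = os.path.join(dst_root, rel)
        # Create the parent directory right before the first file lands in it.
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also tries to preserve timestamps and mode
        copied += 1
    return copied
```

Calling copy_tree_on_demand("/data-old", "/data") would start copying after the first directory is read. It is a single-threaded sketch with no delta transfer, checksums, or resume support, so it is no replacement for rsync.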
Apparently Analogous Threads
- Very slow to start sync with millions of directories and files
- Very slow to start sync with millions of directories and files
- rsync very very slow with multiple instances at the same time.
- Very slow to start sync with millions of directories and files
- Very slow to start sync with millions of directories and files