Bráulio Bhavamitra
2018-Mar-20 21:24 UTC
Very slow to start sync with millions of directories and files
On Tue, Mar 20, 2018 at 5:49 PM Kevin Korb <kmk at sanitarium.net> wrote:

> Nothing there should be preventing incremental indexing. That means it
> should start copying as soon as it finds a file that needs to be copied.

Doesn't it try to create all (empty) directories first?

> On 03/20/2018 02:33 PM, Bráulio Bhavamitra wrote:
> > On Mon, Mar 19, 2018 at 11:34, Kevin Korb via rsync
> > <rsync at lists.samba.org> wrote:
> > > The performance of rsync with a huge number of files is greatly
> > > determined by every option you are using. So, what is your whole
> > > command line?
> >
> > rsync -avP /data-old/ /data
> >
> > > On 03/19/2018 09:05 AM, Bráulio Bhavamitra via rsync wrote:
> > > > Hi all,
> > > >
> > > > I'm using rsync 3 to copy all files from one disk to another. The
> > > > files were written by Minio, an S3-compatible open-source backend.
> > > >
> > > > There are tens of millions of files, almost every one of them
> > > > in its own directory.
> > > >
> > > > Rsync takes a long time, if not several hours, to even start
> > > > syncing files. I already see a few reasons:
> > > > - it first creates all the directories to put files in, which
> > > >   could be done along with the sync
> > > > - it needs to generate the list of all files before starting, and
> > > >   cannot start syncing while list generation continues in a
> > > >   different thread.
> > > >
> > > > Cheers,
> > > > bráulio
> > >
> > > --
> > > ~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,
> > > Kevin Korb                 Phone:    (407) 252-6853
> > > Systems Administrator      Internet:
> > > FutureQuest, Inc.          Kevin at FutureQuest.net (work)
> > > Orlando, Florida           kmk at sanitarium.net (personal)
> > > Web page:                  http://www.sanitarium.net/
> > > PGP public key available on web site.
> > > ~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,
> > >
> > > --
> > > Please use reply-all for most replies to avoid omitting the mailing list.
> > > To unsubscribe or change options:
> > > https://lists.samba.org/mailman/listinfo/rsync
> > > Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
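One way to check whether incremental recursion is actually in effect is to compare the first line rsync prints with and without it. The sketch below is only an illustration under stated assumptions: the scratch directories and the tiny one-file-per-directory tree are hypothetical stand-ins for /data-old/ and /data, and it assumes rsync 3.x, which normally announces "sending incremental file list" in verbose mode. The --no-inc-recursive option forces the older behaviour of building the complete file list before transferring anything.

```shell
#!/bin/sh
# Skip the demonstration gracefully when rsync is not installed.
command -v rsync >/dev/null 2>&1 || { echo "rsync not found; skipping"; exit 0; }

# Hypothetical scratch directories standing in for /data-old/ and /data.
src=$(mktemp -d)
dst_inc=$(mktemp -d)
dst_full=$(mktemp -d)

# A tiny tree shaped like the one described: one file per directory.
for i in 1 2 3; do
    mkdir -p "$src/bucket/object$i"
    echo "payload $i" > "$src/bucket/object$i/part"
done

# Default behaviour in rsync 3.x: the banner should mention an
# incremental file list.
out_inc=$(rsync -av "$src/" "$dst_inc/")
printf '%s\n' "$out_inc" | head -n 1

# Force the old behaviour: the whole file list is built before any transfer,
# and the banner no longer mentions "incremental".
out_full=$(rsync -av --no-inc-recursive "$src/" "$dst_full/")
printf '%s\n' "$out_full" | head -n 1

# Remove the scratch directories.
rm -rf "$src" "$dst_inc" "$dst_full"
```

If the real command line prints the incremental banner and the first transfer still takes hours to appear, the delay is more likely the time spent walking directories that need no copying than the file list being built up front.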
Kevin Korb
2018-Mar-20 22:26 UTC
Very slow to start sync with millions of directories and files
It creates the directories as it needs them. If you want to watch it
looking through files it doesn't need to copy, you can add -ii (see
--itemize-changes for what the output means).

On 03/20/2018 05:24 PM, Bráulio Bhavamitra wrote:
> On Tue, Mar 20, 2018 at 5:49 PM Kevin Korb <kmk at sanitarium.net> wrote:
> > Nothing there should be preventing incremental indexing. That means it
> > should start copying as soon as it finds a file that needs to be copied.
>
> Doesn't it try to create all (empty) directories first?
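The effect of -ii can be seen on a throwaway tree before running it against millions of files. This is a minimal sketch, not the poster's actual setup: the scratch directories and file names are hypothetical, and the exact number of attribute columns in the itemized output varies between rsync versions.

```shell
#!/bin/sh
# Skip the demonstration gracefully when rsync is not installed.
command -v rsync >/dev/null 2>&1 || { echo "rsync not found; skipping"; exit 0; }

# Hypothetical scratch directories, used only for this demonstration.
src=$(mktemp -d)
dst=$(mktemp -d)
mkdir -p "$src/bucket/object1"
echo "payload" > "$src/bucket/object1/part"

# First run: the file is new, so it is itemized as a transfer
# (a line starting with ">f" followed by "+" signs).
first=$(rsync -a -ii "$src/" "$dst/")
printf '%s\n' "$first"

# Second run: nothing needs copying, yet -ii still prints a line for the
# unchanged file (starting with ".f"), showing what rsync inspected.
second=$(rsync -a -ii "$src/" "$dst/")
printf '%s\n' "$second"
```

On the real data set, the same idea would be `rsync -avP -ii /data-old/ /data`; a steady stream of lines for unchanged entries suggests rsync is busy scanning rather than stalled.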
Bráulio Bhavamitra
2018-Mar-21 18:29 UTC
Very slow to start sync with millions of directories and files
On Tue, Mar 20, 2018 at 19:26, Kevin Korb <kmk at sanitarium.net> wrote:

> It creates the directories as it needs them. If you want to watch it
> looking through files it doesn't need to copy, you can add -ii (see
> --itemize-changes for what the output means).

Nice, this should help with debugging.