Hi Friends,

I am using rsync to copy data from a production file server to a disaster recovery file server, over a 100Mbps link between the two. The folder structure is very deep, with paths like /reports/folder1/date/folder2/file.tx: there are about 1600 directories at the 'folder1' level, each with daily date folders going back a year, and two 'folder2' directories under each date folder, which finally contain the files. The files themselves are not large; it is the layout that is complex. The structure is dictated by the application and we can't change it at the moment. I am running the following command from cron:

rsync -avh --delete --exclude-from 'ex_file.txt' /reports/ 10.10.10.100:/reports/ | tee /tmp/rsync_report.out >> /tmp/rsync_report.out.$today

Initially we ran it every 5 minutes, then moved to every 30 minutes because one instance was not finishing within 5 minutes, and now it runs every 8 hours because of the sheer number of folders. Is there a way I can improve the performance of my rsync?

Regards,
Vijay
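As an aside, $today will not be defined inside cron's environment, so the dated log name likely comes out empty there. A sketch of the 8-hourly crontab entry with the date computed inline (the date format is an assumption about what $today was meant to hold; the rest mirrors the command above):

```
# crontab fragment (sketch): % must be escaped as \% inside crontab
0 */8 * * * rsync -avh --delete --exclude-from 'ex_file.txt' /reports/ 10.10.10.100:/reports/ >> /tmp/rsync_report.out.$(date +\%F) 2>&1
```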
Several suggestions...

Add a lockfile to your cron job so it doesn't run two instances at the same time and you don't have to predict the run time.

Make sure you are running rsync version 3+ on both systems. It has significant performance benefits over version 2.

Run a job manually and add --itemize-changes and --progress. Try to figure out where most of the time is spent: looking for something to transfer, transferring new files, or updating changed files.

If it is mostly looking for something to transfer, then you need filesystem optimizations, such as directory indexing. You didn't specify the OS, but if you are on Linux this is where an ext3-to-ext4 conversion would be helpful.

If it is mostly transferring new files, then look at the network transfer rate. If it is low, then try optimizing the ssh portion: try using -e 'ssh -c arcfour', or try the HPN version of OpenSSH. If encryption isn't important you could also set up rsyncd.

If it is mostly updating existing files, check the itemize output to see if the files really need updating. For instance, if something is screwing with your timestamps, that will create a bunch of extra work for rsync. Also, --inplace might help performance, but be sure to read about it first.

On 04/12/12 14:29, vijay patel wrote:
> [...]
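The lockfile suggestion can be implemented with flock(1) from util-linux, so overlapping cron runs simply exit instead of piling up. A minimal sketch (the lock path and the echo standing in for the real rsync command are placeholders):

```shell
#!/bin/sh
# Take an exclusive, non-blocking lock on fd 9; if another instance of
# this job already holds it, bail out immediately instead of running a
# second rsync over the first.
(
  flock -n 9 || exit 1                              # lock busy: skip this run
  echo "rsync would run here" > /tmp/rsync_demo.out # stand-in for the rsync command
) 9>/tmp/rsync_reports.lock
```

With this in place the cron schedule can go back to every 5 minutes; a run that finds the lock held just becomes a no-op.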
-- 
Kevin Korb, Systems Administrator, FutureQuest, Inc., Orlando, Florida
Phone: (407) 252-6853
Kevin at FutureQuest.net (work) / kmk at sanitarium.net (personal)
Web page: http://www.sanitarium.net/ (PGP public key available on web site)
I've heard lots of good suggestions already. Another thing that I've not seen mentioned: upgrading your kernel may help. Somewhere shortly before kernel 3.0, pathname lookups got noticeably faster.

You could also try an alternative filesystem like XFS. It's supposed to be pretty good at large directories.

On Thu, Apr 12, 2012 at 11:29 AM, vijay patel <catchvjay at hotmail.com> wrote:
> [...]
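Before going down the kernel/filesystem road, two quick checks are worth running (purely illustrative; adjust the path to wherever the reports tree is mounted):

```shell
# Kernel version: the pathname-lookup speedups referred to above landed
# in the 2.6.38-3.0 era, so anything older is a candidate for upgrade.
uname -r

# Filesystem type backing the volume (here the root filesystem, as an
# example path): ext3 vs ext4 vs xfs matters for large-directory scans.
df -T / | tail -n 1
```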
On 12.04.2012 23:59, vijay patel wrote:
> [...]

Your description, and the ones in the other mails, read like something else would be more appropriate: lsyncd

  http://code.google.com/p/lsyncd/

It uses inotify to catch the events of files being created/changed/etc. and then syncs those files/directories (using rsync).

Bis denn
-- 
Real Programmers consider "what you see is what you get" to be just as bad a concept in Text Editors as it is in women. No, the Real Programmer wants a "you asked for it, you got it" text editor -- complicated, cryptic, powerful, unforgiving, dangerous.
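For reference, an lsyncd 2.x configuration along those lines might look like this. This is only a sketch: the paths mirror the original post, and the delay value is an assumption; check the lsyncd manual for the exact options of your version.

```
-- lsyncd config fragment (sketch): watch /reports via inotify and
-- batch the accumulated changes into rsync runs against the DR server.
sync {
    default.rsync,
    source = "/reports",
    target = "10.10.10.100:/reports",
    delay  = 15,  -- collect events for 15 seconds before each rsync run
}
```

The win over plain cron'd rsync is that lsyncd never has to walk all 1600+ directory trees looking for changes; it only syncs the paths the kernel reported as touched.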