Vijay Bellur
2016-Nov-10 15:27 UTC
[Gluster-users] [Gluster-devel] Feedback on DHT option "cluster.readdir-optimize"
On Thu, Nov 10, 2016 at 3:17 AM, Nithya Balachandran <nbalacha at redhat.com> wrote:
>
> On 8 November 2016 at 20:21, Kyle Johnson <kjohnson at gnulnx.net> wrote:
>>
>> Hey there,
>>
>> We have a number of processes which daily walk our entire directory tree
>> and perform operations on the found files.
>>
>> Pre-gluster, these processes were able to complete within 24 hours of
>> starting. After outgrowing that single server and moving to a gluster setup
>> (two bricks, two servers, distribute, 10gig uplink), the processes became
>> unusable.
>>
>> After turning this option on, we were back to normal run times, with the
>> processes completing within 24 hours.
>>
>> Our data is heavily nested in a large number of subfolders under /media/ftp.
>
> Thanks for getting back to us - this is very good information. Can you
> provide a few more details?
>
> How deep is your directory tree and roughly how many directories do you have
> at each level?
> Are all your files in the lowest level dirs or do they exist on several
> levels?
> Would you be willing to provide the gluster volume info output for this
> volume?

I have had a performance improvement with this option when the first level below the root consisted of several thousand directories without any files. IIRC, I was testing this in a 16 x 2 setup.

Regards,
Vijay
Raghavendra G
2016-Nov-10 15:35 UTC
[Gluster-users] [Gluster-devel] Feedback on DHT option "cluster.readdir-optimize"
On Thu, Nov 10, 2016 at 8:57 PM, Vijay Bellur <vbellur at redhat.com> wrote:
> I have had a performance improvement with this option when the first
> level below the root consisted of several thousand directories
> without any files. IIRC, I was testing this in a 16 x 2 setup.

Yes, Vijay, I remember you mentioning it. This option is expected to boost readdir performance only on a directory containing subdirectories. For files it has no effect.
On a similar note, I think we can also skip linkto files in readdirp (on the brick), as dht_readdirp picks the dentry from the subvol containing the data file.

--
Raghavendra G
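
To illustrate why the option helps only for directories, here is a rough toy model in Python. It is not GlusterFS code. It assumes, as a simplification, that every distribute subvolume holds a copy of each subdirectory while each file lives on exactly one subvolume, and that with cluster.readdir-optimize enabled the subvolumes other than the first leave directory entries out of their replies. The names build_subvols and readdir, the round-robin file placement, and the figure of 3000 directories are all made up for the example; 16 matches the distribute count of the 16 x 2 setup mentioned above.

```python
from typing import List, Tuple

Entry = Tuple[str, str]  # (name, kind), where kind is "dir" or "file"


def build_subvols(num_subvols: int, dir_names: List[str],
                  file_names: List[str]) -> List[List[Entry]]:
    """Toy layout: each subvolume has every directory, each file sits on one."""
    subvols: List[List[Entry]] = [
        [(d, "dir") for d in dir_names] for _ in range(num_subvols)
    ]
    for i, f in enumerate(file_names):
        # Round-robin placement is a stand-in for real hash-based placement.
        subvols[i % num_subvols].append((f, "file"))
    return subvols


def readdir(subvols: List[List[Entry]], optimize: bool) -> Tuple[List[str], int]:
    """Return (merged listing, number of entries sent by all subvolumes)."""
    seen = set()
    sent = 0
    for index, entries in enumerate(subvols):
        for name, kind in entries:
            # Assumed behaviour of the option: subvolumes other than the
            # first drop directory entries before replying.
            if optimize and index > 0 and kind == "dir":
                continue
            sent += 1
            seen.add(name)  # the client keeps one copy of each name
    return sorted(seen), sent


if __name__ == "__main__":
    dirs = [f"dir{i}" for i in range(3000)]   # hypothetical directory count
    subvols = build_subvols(16, dirs, [])     # 16 distribute subvolumes
    _, sent_plain = readdir(subvols, optimize=False)
    _, sent_opt = readdir(subvols, optimize=True)
    print("entries sent without the option:", sent_plain)
    print("entries sent with the option:   ", sent_opt)
```

In this model the merged listing is identical either way. Only the number of entries sent changes: it grows with the subvolume count when the option is off and stays at the directory count when it is on. A directory holding only files gives the same count in both modes, which matches the "no effect for files" remark.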