Raghavendra G
2016-Nov-10 07:21 UTC
[Gluster-users] [Gluster-devel] Feedback on DHT option "cluster.readdir-optimize"
Kyle,

Thanks for your response :). This really helps. From 13s to 0.23s seems like a huge improvement.

regards,
Raghavendra

On Tue, Nov 8, 2016 at 8:21 PM, Kyle Johnson <kjohnson at gnulnx.net> wrote:

> Hey there,
>
> We have a number of processes which daily walk our entire directory tree
> and perform operations on the found files.
>
> Pre-gluster, these processes were able to complete within 24 hours of
> starting. After outgrowing that single server and moving to a gluster
> setup (two bricks, two servers, distribute, 10gig uplink), the processes
> became unusable.
>
> After turning this option on, we were back to normal run times, with the
> processes completing within 24 hours.
>
> Our data is heavily nested in a large number of subfolders under /media/ftp.
>
> A subset of our data:
>
> 15T of files in 48163 directories under /media/ftp/dig_dis.
>
> Without readdir-optimize:
>
>     [root@colossus dig_dis]# time ls | wc -l
>     48163
>
>     real 13m1.582s
>     user 0m0.294s
>     sys  0m0.205s
>
> With readdir-optimize:
>
>     [root@colossus dig_dis]# time ls | wc -l
>     48163
>
>     real 0m23.785s
>     user 0m0.296s
>     sys  0m0.108s
>
> Long story short - this option is super important to me as it resolved an
> issue that would have otherwise made me move my data off of gluster.
>
> Thank you for all of your work,
>
> Kyle
>
> On 11/07/2016 10:07 PM, Raghavendra Gowdappa wrote:
>
>> Hi all,
>>
>> We have an option in DHT called "cluster.readdir-optimize" which alters
>> the behavior of readdirp in DHT. This value affects how storage/posix
>> treats dentries corresponding to directories (not files).
>>
>> When this value is on,
>> * DHT asks only one subvol/brick to return dentries corresponding to
>>   directories.
>> * Other subvols/bricks filter out dentries corresponding to directories
>>   and send only dentries corresponding to files.
>>
>> When this value is off (this is the default value),
>> * All subvols return all dentries stored on them. IOW, bricks don't
>>   filter any dentries.
>> * Since a directory has one dentry representing it on each subvol, dht
>>   (loaded on the client) picks up the dentry only from the hashed subvol.
>>
>> Note that irrespective of the value of this option, _all_ subvols return
>> the dentries corresponding to files which are stored on them.
>>
>> This option was introduced to boost the performance of readdir: when it
>> is set to on, filtering of dentries happens on the bricks and hence
>> there is a reduction in:
>> 1. network traffic (the redundant dentry information is filtered out
>>    before it is sent)
>> 2. the number of readdir calls between client and server for the same
>>    number of dentries returned to the application (if filtering happens
>>    on the client, each result carries fewer usable dentries and hence
>>    more readdir calls are needed. IOW, the result buffer is not filled
>>    to maximum capacity).
>>
>> We want to hear from you whether you've used this option and, if yes,
>> 1. Did it really boost readdir performance?
>> 2. Do you have any performance data showing the percentage of
>>    improvement (or deterioration)?
>> 3. What data set did you have (number of files, directories and
>>    organisation of directories)?
>>
>> If we find out that this option is really helping you, we can spend our
>> energies on fixing the issues that arise when this option is set to on.
>> One common issue with turning this option on is that some directories
>> might not show up in the directory listing [1]. The reason for this is
>> that:
>> 1. If a directory can be created on its hashed subvol, mkdir (the result
>>    returned to the application) will be successful, irrespective of the
>>    result of mkdir on the rest of the subvols.
>> 2. So, the subvol we pick to give us the dentries corresponding to
>>    directories need not contain all the directories, and we might miss
>>    those directories in the listing.
>>
>> Your feedback is important for us and will help us to prioritize and
>> improve things.
>>
>> [1] https://www.gluster.org/pipermail/gluster-users/2016-October/028703.html
>>
>> regards,
>> Raghavendra

--
Raghavendra G
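The behaviour described in the quoted message can be sketched as a toy model. This is only an illustration under stated assumptions, not Gluster's implementation: the `readdir` and `hashed_subvol` functions, the brick dictionaries and the CRC-based hash are all hypothetical stand-ins (real DHT hashes names into per-directory layout ranges). The sketch counts how many dentries the bricks send with the option off and on, and then reproduces the missing-directory case where mkdir succeeded only on the hashed subvol.

```python
from zlib import crc32

def hashed_subvol(name, n_subvols):
    # Hypothetical stand-in for DHT's hash; not the real layout-based lookup.
    return crc32(name.encode()) % n_subvols

def readdir(bricks, optimize, dir_source=0):
    """Return (listing seen by the application, dentries sent by all bricks)."""
    listing, sent = set(), 0
    for i, brick in enumerate(bricks):
        if optimize:
            # Option on: only one subvol returns directory dentries,
            # the others filter them out on the brick side.
            reply_dirs = brick["dirs"] if i == dir_source else set()
            kept_dirs = reply_dirs
        else:
            # Option off: bricks send everything; the client keeps a
            # directory's dentry only from that directory's hashed subvol.
            reply_dirs = brick["dirs"]
            kept_dirs = {d for d in reply_dirs
                         if hashed_subvol(d, len(bricks)) == i}
        sent += len(brick["files"]) + len(reply_dirs)
        listing |= brick["files"] | kept_dirs
    return listing, sent

# Two bricks: every directory exists on both, files are distributed.
dirs = {f"d{i}" for i in range(10)}
bricks = [
    {"files": {f"f{i}" for i in range(0, 5)}, "dirs": set(dirs)},
    {"files": {f"f{i}" for i in range(5, 10)}, "dirs": set(dirs)},
]

listing_off, sent_off = readdir(bricks, optimize=False)
listing_on, sent_on = readdir(bricks, optimize=True)
print(sent_off, sent_on)  # 30 20

# mkdir succeeded only on the hashed subvol of "partial".
h = hashed_subvol("partial", len(bricks))
bricks[h]["dirs"].add("partial")
listing_on_partial, _ = readdir(bricks, optimize=True, dir_source=1 - h)
listing_off_partial, _ = readdir(bricks, optimize=False)
print("partial" in listing_on_partial, "partial" in listing_off_partial)  # False True
```

In this model the listing is identical in both modes as long as every directory exists on every brick; the option only changes how many dentries cross the network. The last two calls show why a directory can vanish from the listing when the subvol chosen to return directories is not the one that holds it.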
Gandalf Corvotempesta
2016-Nov-10 07:28 UTC
[Gluster-users] [Gluster-devel] Feedback on DHT option "cluster.readdir-optimize"
On 10 Nov 2016 08:22, Raghavendra <raghavendra at gluster.com> wrote:

> Kyle,
>
> Thanks for your response :). This really helps. From 13s to 0.23s seems like a huge improvement.

From 13 minutes to 23 seconds, not from 13 seconds :)
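For reference, the two `real` values from Kyle's `time ls | wc -l` runs can be converted to seconds and compared directly. The helper name `parse_real` is made up for this sketch; the two input strings are copied from the output quoted earlier in the thread.

```python
import re

def parse_real(value):
    """Convert a `time` value such as '13m1.582s' to seconds."""
    m = re.fullmatch(r"(\d+)m([\d.]+)s", value)
    if m is None:
        raise ValueError(f"unrecognised time value: {value!r}")
    return int(m.group(1)) * 60 + float(m.group(2))

before = parse_real("13m1.582s")   # without readdir-optimize
after = parse_real("0m23.785s")    # with readdir-optimize
speedup = before / after

print(f"{before:.3f}s -> {after:.3f}s, about {speedup:.1f}x faster")
# 781.582s -> 23.785s, about 32.9x faster
```

So the listing of 48163 directories went from roughly 13 minutes to under 24 seconds, a reduction of about 97% in wall-clock time.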