Hi Kashif,

Thank you for your feedback! Do you have some data on the nature of the
performance improvement observed with 3.11 in the new setup?

Adding Raghavendra and Poornima for validation of the configuration and for
help with identifying why certain files disappeared from the mount point
after enabling readdir-optimize.

Regards,
Vijay

On 07/11/2017 11:06 AM, mohammad kashif wrote:
> Hi Vijay and Experts
>
> I didn't want to experiment with my production setup, so I started a
> parallel system with two servers and around 80TB of storage. I first
> configured it with gluster 3.8 and saw the same lookup performance issue.
> I then upgraded to 3.11 as you suggested and it made a huge improvement
> in lookup time. I also did some more optimization as suggested in other
> threads.
> Now I am going to update my production servers. I am planning to use the
> following optimization options; it would be very useful if you could
> point out any inconsistency or suggest other options. My production
> setup has 5 servers with 400TB of storage and around 80 million files of
> varying sizes.
>
> Options Reconfigured:
> server.event-threads: 4
> client.event-threads: 4
> cluster.lookup-optimize: on
> cluster.readdir-optimize: off
> performance.client-io-threads: on
> performance.cache-size: 1GB
> performance.parallel-readdir: on
> performance.md-cache-timeout: 600
> performance.cache-invalidation: on
> performance.stat-prefetch: on
> features.cache-invalidation-timeout: 600
> features.cache-invalidation: on
> nfs.disable: on
> performance.readdir-ahead: on
> transport.address-family: inet
> auth.allow: 163.1.136.*
> diagnostics.latency-measurement: on
> diagnostics.count-fop-hits: on
>
> I found that setting cluster.readdir-optimize to 'on' made some files
> disappear from the client!
>
> Thanks
>
> Kashif
>
> On Sun, Jun 18, 2017 at 4:57 PM, Vijay Bellur <vbellur at redhat.com> wrote:
>
> Hi Mohammad,
>
> A lot of time is being spent in addressing metadata calls, as expected.
> Can you consider testing with 3.11, which has the md-cache [1] and
> readdirp [2] improvements?
>
> Adding Poornima and Raghavendra, who worked on these enhancements, to
> help out further.
>
> Thanks,
> Vijay
>
> [1] https://gluster.readthedocs.io/en/latest/release-notes/3.9.0/
> [2] https://github.com/gluster/glusterfs/issues/166
>
> On Fri, Jun 16, 2017 at 2:49 PM, mohammad kashif <kashif.alig at gmail.com> wrote:
>
> Hi Vijay
>
> Did you manage to look into the gluster profile logs?
>
> Thanks
> Kashif
>
> On Mon, Jun 12, 2017 at 11:40 AM, mohammad kashif <kashif.alig at gmail.com> wrote:
>
> Hi Vijay
>
> I have enabled client profiling and used this script
> https://github.com/bengland2/gluster-profile-analysis/blob/master/gvp-client.sh
> to extract the data. I am attaching the output files. I don't have any
> reference data to compare my output with; hopefully you can make some
> sense of it.
>
> On Sat, Jun 10, 2017 at 10:47 AM, Vijay Bellur <vbellur at redhat.com> wrote:
>
> Would it be possible for you to turn on client profiling and then run
> du? Instructions for turning on client profiling can be found at [1].
> Providing the client profile information can help us figure out where
> the latency could be stemming from.
>
> Regards,
> Vijay
>
> [1] https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Performance%20Testing/#client-side-profiling
>
> On Fri, Jun 9, 2017 at 7:22 PM, mohammad kashif <kashif.alig at gmail.com> wrote:
>
> Hi Vijay
>
> Thanks for your quick response. I am using gluster 3.8.11 on CentOS 7
> servers:
> glusterfs-3.8.11-1.el7.x86_64
>
> The clients are CentOS 6, but I tested with a CentOS 7 client as well
> and the results didn't change.
>
> gluster volume info
> Volume Name: atlasglust
> Type: Distribute
> Volume ID: fbf0ebb8-deab-4388-9d8a-f722618a624b
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 5
> Transport-type: tcp
> Bricks:
> Brick1: pplxgluster01.x.y.z:/glusteratlas/brick001/gv0
> Brick2: pplxgluster02.x.y.z:/glusteratlas/brick002/gv0
> Brick3: pplxgluster03.x.y.z:/glusteratlas/brick003/gv0
> Brick4: pplxgluster04.x.y.z:/glusteratlas/brick004/gv0
> Brick5: pplxgluster05.x.y.z:/glusteratlas/brick005/gv0
> Options Reconfigured:
> nfs.disable: on
> performance.readdir-ahead: on
> transport.address-family: inet
> auth.allow: x.y.z
>
> I am not using directory quota.
>
> Please let me know if you require any more info.
>
> Thanks
> Kashif
>
> On Fri, Jun 9, 2017 at 2:34 PM, Vijay Bellur <vbellur at redhat.com> wrote:
>
> Can you please provide more details about your volume configuration and
> the version of gluster that you are using?
>
> Regards,
> Vijay
>
> On Fri, Jun 9, 2017 at 5:35 PM, mohammad kashif <kashif.alig at gmail.com> wrote:
>
> Hi
>
> I have just moved our 400 TB HPC storage from Lustre to Gluster. It is
> part of a research institute, and users have files ranging from very
> small to big (a few KB to 20GB). Our setup consists of 5 servers, each
> with 96TB of RAID 6 disks. All servers are connected through 10G
> Ethernet, but not all clients are. Gluster volumes are distributed
> without any replication. There are approximately 80 million files in
> the file system.
> I am mounting using glusterfs on the clients.
>
> I have copied everything from Lustre to Gluster, but the old file
> system still exists, so I can compare.
>
> The problem I am facing is an extremely slow du even on a small
> directory. Also, the time taken differs substantially each time. I
> tried du from the same client on a particular directory twice and got
> these results:
>
> time du -sh /data/aa/bb/cc
> 3.7G /data/aa/bb/cc
> real 7m29.243s
> user 0m1.448s
> sys 0m7.067s
>
> time du -sh /data/aa/bb/cc
> 3.7G /data/aa/bb/cc
> real 16m43.735s
> user 0m1.097s
> sys 0m5.802s
>
> 16m and 7m is too long for a 3.7G directory. I must mention that the
> directory contains a huge number of files (208736).
>
> But running du on the same directory on the old file system gives this
> result:
>
> time du -sh /olddata/aa/bb/cc
> 4.0G /olddata/aa/bb/cc
> real 3m1.255s
> user 0m0.755s
> sys 0m38.099s
>
> It is much better if I run the same command again:
>
> time du -sh /olddata/aa/bb/cc
> 4.0G /olddata/aa/bb/cc
> real 0m8.309s
> user 0m0.313s
> sys 0m7.755s
>
> Is there anything I can do to improve this performance? I would also
> like to hear from someone who is running the same kind of setup.
>
> Thanks
> Kashif
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
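[Editor's note: the tuning options Kashif lists are applied one at a time with `gluster volume set`. A minimal dry-run sketch of doing that for a subset of them, using the `atlasglust` volume name from the volume info above; the script only prints the commands, so remove the `echo` to actually apply them:]

```shell
# Dry-run sketch: print `gluster volume set` commands for some of the
# tuning options discussed in this thread. VOL is taken from the
# `gluster volume info` output quoted above.
VOL=atlasglust
set -- \
  server.event-threads=4 \
  client.event-threads=4 \
  cluster.lookup-optimize=on \
  performance.parallel-readdir=on \
  performance.md-cache-timeout=600 \
  features.cache-invalidation=on \
  features.cache-invalidation-timeout=600
for kv in "$@"; do
  key=${kv%%=*}
  val=${kv#*=}
  # Remove the leading `echo` to execute for real on a gluster node.
  echo gluster volume set "$VOL" "$key" "$val"
done
```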
Hi,

I also noticed files disappearing with the combination of certain settings.
If you use cluster.readdir-optimize without some of the other settings,
they don't disappear. Unfortunately, I can't remember which setting was
conflicting...

Performance-wise, I don't see a difference between 3.10 and 3.11 over here.
I didn't test recently with 3.8.

Regards,
Jo

-----Original message-----
From: Vijay Bellur <vbellur at redhat.com>
Sent: Tue 11-07-2017 17:22
Subject: Re: [Gluster-users] Extremely slow du
To: mohammad kashif <kashif.alig at gmail.com>; Raghavendra Gowdappa <rgowdapp at redhat.com>; Poornima Gurusiddaiah <pgurusid at redhat.com>;
CC: gluster-users Discussion List <Gluster-users at gluster.org>;

> Hi Kashif,
>
> Thank you for your feedback! Do you have some data on the nature of the
> performance improvement observed with 3.11 in the new setup?
>
> Adding Raghavendra and Poornima for validation of the configuration and
> help with identifying why certain files disappeared from the mount point
> after enabling readdir-optimize.
>
> Regards,
> Vijay
>
> [snip]
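[Editor's note: on the disappearing-files symptom, one way to verify whether files that exist on a brick are invisible through the client mount is to diff the two listings, skipping Gluster's internal `.glusterfs` directory on the brick. The sketch below simulates the brick and mount with temporary directories so it is self-contained; on a real volume the two paths would be a brick directory and the FUSE mount point, and the check would be repeated for every brick:]

```shell
# Simulated brick and mount; on a real system these would be e.g.
# /glusteratlas/brick001/gv0 (on a server) and the client mount point.
BRICK=$(mktemp -d)
MNT=$(mktemp -d)
touch "$BRICK/a" "$BRICK/b" "$MNT/a"   # 'b' exists only on the brick

# Sorted file listing, pruning the internal .glusterfs tree.
list() { (cd "$1" && find . -name .glusterfs -prune -o -type f -print | sort); }

b=$(mktemp); m=$(mktemp)
list "$BRICK" > "$b"
list "$MNT"  > "$m"

# Lines present in the brick listing but absent from the mount listing.
missing=$(comm -23 "$b" "$m")
echo "$missing"
```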
Hi Vijay

Thanks. It would be great if someone could go through the configuration
options. Is there any reference document where all these options are
described in detail?

I was mainly worried about the very slow lookup, so I only ran du on a
certain directory which has a lot of small files (200K). The lookup time
improved dramatically. I didn't do any proper benchmarking.

Gluster 3.8 without any optimization:

time du -ksh binno/
3.7G binno/
real 117m45.733s
user 0m1.635s
sys 0m6.430s

Gluster 3.11 with optimization:

time du -ksh binno/
3.7G binno/
real 2m5.595s
user 0m0.767s
sys 0m4.437s

I have also enabled profiling.

Before the update:

Fop         Call Count   Avg-Latency   Min-Latency    Max-Latency
---         ----------   -----------   -----------    -----------
STAT               153      90.72 us       5.00 us      666.00 us
STATFS               3     677.67 us     620.00 us      709.00 us
OPENDIR            149    1213.81 us     519.00 us    28777.00 us
LOOKUP             552    8493.01 us       3.00 us    79689.00 us
READDIRP          3518    5351.76 us      11.00 us   341877.00 us
FORGET        10050351          0 us          0 us           0 us
RELEASE        9062130          0 us          0 us           0 us
RELEASEDIR        5395          0 us          0 us           0 us

After the update:

Interval 8 Stats:
%-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls   Fop
---------   -----------   -----------   -----------   ------------   ----
     0.00       0.00 us       0.00 us       0.00 us              2   RELEASEDIR
     0.08     118.00 us     113.00 us     123.00 us              2   STATFS
     0.13     190.00 us     189.00 us     191.00 us              2   LOOKUP
     0.29     422.00 us     422.00 us     422.00 us              2   OPENDIR
    99.49   28539.60 us    1698.00 us   48655.00 us             10   READDIRP
     0.00       0.00 us       0.00 us       0.00 us           5217   UPCALL
     0.00       0.00 us       0.00 us       0.00 us           5217   CI_FORGET

Duration: 22 seconds
Data Read: 0 bytes
Data Written: 0 bytes

I am not sure about the profiling result, as I don't understand it fully.

Thanks
Kashif

On Tue, Jul 11, 2017 at 4:22 PM, Vijay Bellur <vbellur at redhat.com> wrote:
> Hi Kashif,
>
> Thank you for your feedback! Do you have some data on the nature of the
> performance improvement observed with 3.11 in the new setup?
>
> Adding Raghavendra and Poornima for validation of the configuration and
> help with identifying why certain files disappeared from the mount point
> after enabling readdir-optimize.
>
> Regards,
> Vijay
>
> [snip]
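[Editor's note: a quick way to read the "After the update" interval above is to rank the per-fop rows by their %-latency column; in that sample READDIRP accounts for ~99.5% of the time, which suggests the du run is dominated by directory reads rather than lookups. A small sketch, with the sample rows copied from the profile output above:]

```shell
# Rank profile rows by %-latency (first column) and report the fop that
# dominates. Rows are the "After the update" interval quoted above.
f=$(mktemp)
cat > "$f" <<'EOF'
0.00 0.00 us 0.00 us 0.00 us 2 RELEASEDIR
0.08 118.00 us 113.00 us 123.00 us 2 STATFS
0.13 190.00 us 189.00 us 191.00 us 2 LOOKUP
0.29 422.00 us 422.00 us 422.00 us 2 OPENDIR
99.49 28539.60 us 1698.00 us 48655.00 us 10 READDIRP
EOF
# Numeric descending sort on the first column, take the top row's fop name.
top=$(sort -rn "$f" | head -1 | awk '{print $NF}')
echo "$top"
```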