Hi Vijay

I have enabled client profiling and used this script
https://github.com/bengland2/gluster-profile-analysis/blob/master/gvp-client.sh
to extract data. I am attaching the output files. I don't have any reference
data to compare my output with, but hopefully you can make some sense of it.

On Sat, Jun 10, 2017 at 10:47 AM, Vijay Bellur <vbellur at redhat.com> wrote:
> Would it be possible for you to turn on client profiling and then run du?
> Instructions for turning on client profiling can be found at [1]. Providing
> the client profile information can help us figure out where the latency
> could be stemming from.
>
> Regards,
> Vijay
>
> [1] https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Performance%20Testing/#client-side-profiling
>
> On Fri, Jun 9, 2017 at 7:22 PM, mohammad kashif <kashif.alig at gmail.com> wrote:
>> Hi Vijay
>>
>> Thanks for your quick response. I am using gluster 3.8.11 on CentOS 7
>> servers:
>> glusterfs-3.8.11-1.el7.x86_64
>>
>> The clients are CentOS 6, but I tested with a CentOS 7 client as well and
>> the results didn't change.
>>
>> gluster volume info
>>
>> Volume Name: atlasglust
>> Type: Distribute
>> Volume ID: fbf0ebb8-deab-4388-9d8a-f722618a624b
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 5
>> Transport-type: tcp
>> Bricks:
>> Brick1: pplxgluster01.x.y.z:/glusteratlas/brick001/gv0
>> Brick2: pplxgluster02.x.y.z:/glusteratlas/brick002/gv0
>> Brick3: pplxgluster03.x.y.z:/glusteratlas/brick003/gv0
>> Brick4: pplxgluster04.x.y.z:/glusteratlas/brick004/gv0
>> Brick5: pplxgluster05.x.y.z:/glusteratlas/brick005/gv0
>> Options Reconfigured:
>> nfs.disable: on
>> performance.readdir-ahead: on
>> transport.address-family: inet
>> auth.allow: x.y.z
>>
>> I am not using directory quota.
>>
>> Please let me know if you require any more information.
>>
>> Thanks
>>
>> Kashif
>>
>> On Fri, Jun 9, 2017 at 2:34 PM, Vijay Bellur <vbellur at redhat.com> wrote:
>>> Can you please provide more details about your volume configuration and
>>> the version of gluster that you are using?
>>>
>>> Regards,
>>> Vijay
>>>
>>> On Fri, Jun 9, 2017 at 5:35 PM, mohammad kashif <kashif.alig at gmail.com> wrote:
>>>> Hi
>>>>
>>>> I have just moved our 400 TB HPC storage from Lustre to Gluster. It is
>>>> part of a research institute and users have files ranging from very
>>>> small to big (a few KB to 20 GB). Our setup consists of 5 servers, each
>>>> with 96 TB of RAID 6 disks. All servers are connected through 10G
>>>> Ethernet, but not all clients are. The Gluster volumes are distributed
>>>> without any replication. There are approximately 80 million files in
>>>> the file system. I am mounting using the glusterfs (FUSE) client on the
>>>> clients.
>>>>
>>>> I have copied everything from Lustre to Gluster, but the old file
>>>> system still exists so I can compare.
>>>>
>>>> The problem I am facing is an extremely slow du, even on a small
>>>> directory. The time taken is also substantially different on each run.
>>>> I tried du from the same client on a particular directory twice and got
>>>> these results:
>>>>
>>>> time du -sh /data/aa/bb/cc
>>>> 3.7G    /data/aa/bb/cc
>>>> real    7m29.243s
>>>> user    0m1.448s
>>>> sys     0m7.067s
>>>>
>>>> time du -sh /data/aa/bb/cc
>>>> 3.7G    /data/aa/bb/cc
>>>> real    16m43.735s
>>>> user    0m1.097s
>>>> sys     0m5.802s
>>>>
>>>> 16m and 7m are too long for a 3.7 GB directory. I must mention that the
>>>> directory contains a huge number of files (208736).
>>>>
>>>> Running du on the same directory on the old file system gives this
>>>> result:
>>>>
>>>> time du -sh /olddata/aa/bb/cc
>>>> 4.0G    /olddata/aa/bb/cc
>>>> real    3m1.255s
>>>> user    0m0.755s
>>>> sys     0m38.099s
>>>>
>>>> and it is much better if I run the same command again:
>>>>
>>>> time du -sh /olddata/aa/bb/cc
>>>> 4.0G    /olddata/aa/bb/cc
>>>> real    0m8.309s
>>>> user    0m0.313s
>>>> sys     0m7.755s
>>>>
>>>> Is there anything I can do to improve this performance? I would also
>>>> like to hear from someone who is running the same kind of setup.
>>>>
>>>> Thanks
>>>>
>>>> Kashif

[Attachments: gluster_profile.log (5026269 bytes), gvp.log (125737 bytes)]
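For readers following along, the client-side profiling described in [1] above boils down to roughly the steps below. This is only a sketch: the volume name and mount point are taken from this thread, the dump file name is illustrative, and the location where the io-stats dump ends up varies between gluster versions.

    # On one of the servers: turn on the io-stats counters for the volume
    gluster volume profile atlasglust start

    # On the client: run the workload being measured over the FUSE mount
    time du -sh /data/aa/bb/cc

    # Still on the client: ask the mount's io-stats translator to dump its
    # counters by setting a virtual xattr on the mount point
    # (dump file name/location is version-dependent)
    setfattr -n trusted.io-stats-dump -v gvp-client-dump.txt /data

The gvp-client.sh script linked above wraps steps like these; the sketch is only meant to show the underlying mechanism.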
Hi Vijay

Did you manage to look into the gluster profile logs?

Thanks

Kashif

On Mon, Jun 12, 2017 at 11:40 AM, mohammad kashif <kashif.alig at gmail.com> wrote:
> Hi Vijay
>
> I have enabled client profiling and used this script
> https://github.com/bengland2/gluster-profile-analysis/blob/master/gvp-client.sh
> to extract data. I am attaching the output files. I don't have any reference
> data to compare my output with, but hopefully you can make some sense of it.
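One detail in the timings quoted earlier in the thread is worth keeping in mind when comparing the two file systems: on the old copy the second du is fast largely because the client already has the dentries and inodes cached, whereas on the gluster FUSE mount each run still goes back over the network for the stat calls. A fairer cold-cache comparison can be made along these lines (a sketch only; the paths are the ones from the thread and the commands must be run as root on the client):

    # Drop page cache, dentries and inodes on the client, then time du cold
    sync
    echo 3 > /proc/sys/vm/drop_caches
    time du -sh /olddata/aa/bb/cc    # old (Lustre) copy, cold client cache

    sync
    echo 3 > /proc/sys/vm/drop_caches
    time du -sh /data/aa/bb/cc       # gluster copy, cold client cache

Note that this only clears caches on the client; server-side caches on the bricks (and on the old file system's servers) are unaffected.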
Hi Mohammad,

A lot of time is being spent in addressing metadata calls, as expected. Can
you consider testing out 3.11 with the md-cache [1] and readdirp [2]
improvements?

Adding Poornima and Raghavendra, who worked on these enhancements, to help
out further.

Thanks,
Vijay

[1] https://gluster.readthedocs.io/en/latest/release-notes/3.9.0/
[2] https://github.com/gluster/glusterfs/issues/166

On Fri, Jun 16, 2017 at 2:49 PM, mohammad kashif <kashif.alig at gmail.com> wrote:
> Hi Vijay
>
> Did you manage to look into the gluster profile logs?
>
> Thanks
>
> Kashif
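For anyone who wants to try the md-cache work referenced in [1] above, it is usually enabled with volume options along the following lines. This is a sketch, not a recommendation for this particular volume: the option names come from the 3.9+ md-cache/upcall cache-invalidation work, the timeout values are only examples, and they should be checked against the release notes of the version actually deployed.

    # Enable upcall-based cache invalidation on the bricks
    gluster volume set atlasglust features.cache-invalidation on
    gluster volume set atlasglust features.cache-invalidation-timeout 600

    # Let the client-side md-cache hold metadata longer and rely on upcall
    # invalidation instead of expiring almost immediately
    gluster volume set atlasglust performance.stat-prefetch on
    gluster volume set atlasglust performance.cache-invalidation on
    gluster volume set atlasglust performance.md-cache-timeout 600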