Hi Vijay

Thanks for your quick response. I am using Gluster 3.8.11 on CentOS 7
servers: glusterfs-3.8.11-1.el7.x86_64.

Clients are CentOS 6, but I tested with a CentOS 7 client as well and the
results didn't change.

gluster volume info

Volume Name: atlasglust
Type: Distribute
Volume ID: fbf0ebb8-deab-4388-9d8a-f722618a624b
Status: Started
Snapshot Count: 0
Number of Bricks: 5
Transport-type: tcp
Bricks:
Brick1: pplxgluster01.x.y.z:/glusteratlas/brick001/gv0
Brick2: pplxgluster02.x.y.z:/glusteratlas/brick002/gv0
Brick3: pplxgluster03.x.y.z:/glusteratlas/brick003/gv0
Brick4: pplxgluster04.x.y.z:/glusteratlas/brick004/gv0
Brick5: pplxgluster05.x.y.z:/glusteratlas/brick005/gv0
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet
auth.allow: x.y.z

I am not using directory quota.

Please let me know if you require more info.

Thanks

Kashif

On Fri, Jun 9, 2017 at 2:34 PM, Vijay Bellur <vbellur at redhat.com> wrote:

> Can you please provide more details about your volume configuration and
> the version of gluster that you are using?
>
> Regards,
> Vijay
>
> On Fri, Jun 9, 2017 at 5:35 PM, mohammad kashif <kashif.alig at gmail.com>
> wrote:
>
>> Hi
>>
>> I have just moved our 400 TB HPC storage from Lustre to Gluster. It is
>> part of a research institute, and users have files ranging from a few
>> KB to 20 GB. Our setup consists of 5 servers, each with 96 TB of
>> RAID 6 disks. All servers are connected through 10G Ethernet, but not
>> all clients are. Gluster volumes are distributed without any
>> replication. There are approximately 80 million files in the file
>> system. I am mounting on the clients with the glusterfs client.
>>
>> I have copied everything from Lustre to Gluster, but the old file
>> system still exists, so I can compare the two.
>>
>> The problem I am facing is an extremely slow du, even on a small
>> directory. The time taken also differs substantially from run to run.
>> I ran du from the same client on a particular directory twice and got
>> these results:
>>
>> time du -sh /data/aa/bb/cc
>> 3.7G /data/aa/bb/cc
>> real 7m29.243s
>> user 0m1.448s
>> sys 0m7.067s
>>
>> time du -sh /data/aa/bb/cc
>> 3.7G /data/aa/bb/cc
>> real 16m43.735s
>> user 0m1.097s
>> sys 0m5.802s
>>
>> 16m and 7m is too long for a 3.7 GB directory. I must mention that the
>> directory contains a huge number of files (208736).
>>
>> Running du on the same directory on the old file system gives this
>> result:
>>
>> time du -sh /olddata/aa/bb/cc
>> 4.0G /olddata/aa/bb/cc
>> real 3m1.255s
>> user 0m0.755s
>> sys 0m38.099s
>>
>> and it is much better if I run the same command again:
>>
>> time du -sh /olddata/aa/bb/cc
>> 4.0G /olddata/aa/bb/cc
>> real 0m8.309s
>> user 0m0.313s
>> sys 0m7.755s
>>
>> Is there anything I can do to improve this performance? I would also
>> like to hear from someone who is running the same kind of setup.
>>
>> Thanks
>>
>> Kashif
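[A note on the du timings above: the fast second run on the old file
system (8s versus 3m) is consistent with the kernel's page/dentry cache
being warm, so cold runs are the fairer comparison between the two file
systems. A minimal measurement sketch, assuming root on the client and
the mount points named in the thread:

  # Flush dirty pages, then drop the page cache plus dentries and inodes
  # (mode 3) so each timed run starts cold.
  sync
  echo 3 > /proc/sys/vm/drop_caches
  time du -sh /data/aa/bb/cc

Repeating the same sequence against /olddata gives a like-for-like
cold-cache comparison.]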
Would it be possible for you to turn on client profiling and then run du?
Instructions for turning on client profiling can be found at [1].
Providing the client profile information can help us figure out where the
latency could be stemming from.

Regards,
Vijay

[1] https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Performance%20Testing/#client-side-profiling

On Fri, Jun 9, 2017 at 7:22 PM, mohammad kashif <kashif.alig at gmail.com>
wrote:

> [...]
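[For reference, the client-side profiling procedure described at [1]
comes down to a few commands. A minimal sketch, assuming the volume name
atlasglust and a client mount at /data from earlier in the thread; the
dump path is illustrative:

  # On any server: enable latency measurement and FOP counting for the volume.
  gluster volume set atlasglust diagnostics.latency-measurement on
  gluster volume set atlasglust diagnostics.count-fop-hits on

  # On the client: run the workload, then dump the accumulated io-stats
  # by writing a virtual extended attribute on the mount point.
  time du -sh /data/aa/bb/cc
  setfattr -n trusted.io-stats-dump -v /tmp/io-stats-du.log /data

The resulting dump breaks latency down per file operation, which is what
shows whether LOOKUP/STAT calls dominate a du run.]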
Hi Vijay

I have enabled client profiling and used this script
https://github.com/bengland2/gluster-profile-analysis/blob/master/gvp-client.sh
to extract data. I am attaching the output files (gluster_profile.log and
gvp.log). I don't have any reference data to compare my output against,
but hopefully you can make some sense out of it.

On Sat, Jun 10, 2017 at 10:47 AM, Vijay Bellur <vbellur at redhat.com>
wrote:

> [...]
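[The exact command-line interface of gvp-client.sh is not shown in this
thread, so consult the script's README before running it; the loop below
only sketches the documented io-stats dump mechanism that such
client-profile collection builds on. The output directory, interval, and
the mount point /data are illustrative assumptions:

  # Dump client io-stats every 5 seconds while du runs in another shell;
  # dump files like these are what the analysis script summarizes.
  mkdir -p /tmp/gvp
  for i in $(seq 1 60); do
      setfattr -n trusted.io-stats-dump -v /tmp/gvp/io-stats-${i}.log /data
      sleep 5
  done
]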