Raghavendra Gowdappa
2018-Jul-15 01:59 UTC
[Gluster-users] Slow write times to gluster disk
On 6/30/18, Raghavendra Gowdappa <rgowdapp at redhat.com> wrote:
> On Fri, Jun 29, 2018 at 10:38 PM, Pat Haley <phaley at mit.edu> wrote:
>
>> Hi Raghavendra,
>>
>> We ran the tests (write tests) and I copied the log files for both the
>> server and the client to
>> http://mseas.mit.edu/download/phaley/GlusterUsers/2018/Jun29/ .
>> Is there any additional trace information you need? (If so, where
>> should I look for it?)
>
> Nothing for now. I can see from logs that workaround is not helping. fstat
> requests are not absorbed by md-cache and read-ahead is witnessing them and
> flushing its read-ahead cache. I am investigating more on md-cache (It also
> seems to be invalidating inodes quite frequently which actually might be
> the root cause of seeing so many fstat requests from kernel). Will post
> when I find anything relevant.

+Poornima.

@Poornima, can you investigate why fstats sent by the kernel are not absorbed by md-cache in sequential read tests? Note that md-cache doesn't flush its metadata cache on reads (which can be a bug for applications requiring strict atime consistency). So I would expect the fstats to have been absorbed by it.
regards,
Raghavendra

>> Also the volume information you requested
>>
>> [root@mseas-data2 ~]# gluster volume info data-volume
>>
>> Volume Name: data-volume
>> Type: Distribute
>> Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
>> Status: Started
>> Number of Bricks: 2
>> Transport-type: tcp
>> Bricks:
>> Brick1: mseas-data2:/mnt/brick1
>> Brick2: mseas-data2:/mnt/brick2
>> Options Reconfigured:
>> diagnostics.client-log-level: TRACE
>> network.inode-lru-limit: 50000
>> performance.md-cache-timeout: 60
>> performance.open-behind: off
>> disperse.eager-lock: off
>> auth.allow: *
>> server.allow-insecure: on
>> nfs.exports-auth-enable: on
>> diagnostics.brick-sys-log-level: WARNING
>> performance.readdir-ahead: on
>> nfs.disable: on
>> nfs.export-volumes: off
>> [root@mseas-data2 ~]#
>>
>> On 06/29/2018 12:28 PM, Raghavendra Gowdappa wrote:
>>
>> On Fri, Jun 29, 2018 at 8:24 PM, Pat Haley <phaley at mit.edu> wrote:
>>
>>> Hi Raghavendra,
>>>
>>> Our technician was able to try the manual setting today. He found that
>>> our upper limit for performance.md-cache-timeout was 60 not 600, so he
>>> used that value, along with the network.inode-lru-limit=50000.
>>>
>>> The result was another small (~1%) increase in speed. Does this suggest
>>> some additional tests/changes we could try?
>>
>> Can you set gluster option diagnostics.client-log-level to TRACE and run
>> sequential read tests again (with md-cache-timeout value of 60)?
>>
>> # gluster volume set <volname> diagnostics.client-log-level TRACE
>>
>> Also are you sure that open-behind was turned off?
>> Can you give the output of,
>>
>> # gluster volume info <volname>
>>
>>> Thanks
>>>
>>> Pat
>>>
>>> On 06/25/2018 09:39 PM, Raghavendra Gowdappa wrote:
>>>
>>> On Tue, Jun 26, 2018 at 3:21 AM, Pat Haley <phaley at mit.edu> wrote:
>>>
>>>> Hi Raghavendra,
>>>>
>>>> Setting performance.write-behind off gave a small improvement in the
>>>> write speed (~3%).
>>>>
>>>> We were unable to turn on "group metadata-cache". When we try, we get
>>>> errors like
>>>>
>>>> # gluster volume set data-volume group metadata-cache
>>>> '/var/lib/glusterd/groups/metadata-cache' file format not valid.
>>>>
>>>> Was metadata-cache available for gluster 3.7.11? We ask because the
>>>> release notes for 3.11 mention "Feature for metadata-caching/small
>>>> file performance is production ready."
>>>> (https://gluster.readthedocs.io/en/latest/release-notes/3.11.0/).
>>>>
>>>> Do any of these results suggest anything? If not, what further tests
>>>> would be useful?
>>>
>>> Group metadata-cache is just a bunch of options one sets on a volume.
>>> So, you can set them manually using gluster cli. Following are the
>>> options and their values:
>>>
>>> performance.md-cache-timeout=600
>>> network.inode-lru-limit=50000
>>>
>>>> Thanks
>>>>
>>>> Pat
>>>>
>>>> On 06/22/2018 07:51 AM, Raghavendra Gowdappa wrote:
>>>>
>>>> On Thu, Jun 21, 2018 at 8:41 PM, Pat Haley <phaley at mit.edu> wrote:
>>>>
>>>>> Hi Raghavendra,
>>>>>
>>>>> Thanks for the suggestions. Our technician will be in on Monday.
>>>>> We'll test then and let you know the results.
>>>>>
>>>>> One question I have, is the "group metadata-cache" option supposed to
>>>>> directly impact the performance or is it to help collect data? If the
>>>>> latter, where will the data be located?
>>>>
>>>> It impacts performance.
>>>>
>>>>> Thanks again.
>>>>>
>>>>> Pat
>>>>>
>>>>> On 06/21/2018 01:01 AM, Raghavendra Gowdappa wrote:
>>>>>
>>>>> On Thu, Jun 21, 2018 at 10:24 AM, Raghavendra Gowdappa <
>>>>> rgowdapp at redhat.com> wrote:
>>>>>
>>>>>> For the case of writes to glusterfs mount,
>>>>>>
>>>>>> I saw in earlier conversations that there are too many lookups, but
>>>>>> a small number of writes. Since writes cached in write-behind would
>>>>>> invalidate the metadata cache, lookups won't be absorbed by md-cache.
>>>>>> I am wondering what the results would look like if we turn off
>>>>>> performance.write-behind.
>>>>>>
>>>>>> @Pat,
>>>>>>
>>>>>> Can you set,
>>>>>>
>>>>>> # gluster volume set <volname> performance.write-behind off
>>>>>
>>>>> Please turn on "group metadata-cache" for write tests too.
>>>>>
>>>>>> and redo the tests writing to glusterfs mount? Let us know about the
>>>>>> results you see.
>>>>>>
>>>>>> regards,
>>>>>> Raghavendra
>>>>>>
>>>>>> On Thu, Jun 21, 2018 at 8:33 AM, Raghavendra Gowdappa <
>>>>>> rgowdapp at redhat.com> wrote:
>>>>>>
>>>>>>> On Thu, Jun 21, 2018 at 8:32 AM, Raghavendra Gowdappa <
>>>>>>> rgowdapp at redhat.com> wrote:
>>>>>>>
>>>>>>>> For the case of reading from Glusterfs mount, read-ahead should
>>>>>>>> help. However, we have known issues with read-ahead[1][2]. To work
>>>>>>>> around these, can you try with,
>>>>>>>>
>>>>>>>> 1. Turn off performance.open-behind
>>>>>>>> # gluster volume set <volname> performance.open-behind off
>>>>>>>>
>>>>>>>> 2. Enable group metadata-cache
>>>>>>>> # gluster volume set <volname> group metadata-cache
>>>>>>>
>>>>>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1084508
>>>>>>> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1214489
>>>>>>>
>>>>>>>> On Thu, Jun 21, 2018 at 5:00 AM, Pat Haley <phaley at mit.edu> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> We were recently revisiting our problems with the slowness of
>>>>>>>>> gluster writes
>>>>>>>>> (http://lists.gluster.org/pipermail/gluster-users/2017-April/030529.html).
>>>>>>>>> Specifically we were testing the suggestions in a recent post
>>>>>>>>> (http://lists.gluster.org/pipermail/gluster-users/2018-March/033699.html).
>>>>>>>>> The first two suggestions (specifying a negative-timeout in the
>>>>>>>>> mount settings or adding rpc-auth-allow-insecure to
>>>>>>>>> glusterd.vol) did not improve our performance, while setting
>>>>>>>>> "disperse.eager-lock off" provided a tiny (5%) speed-up.
>>>>>>>>>
>>>>>>>>> Some of the various tests we have tried earlier can be seen in the
>>>>>>>>> links below. Do any of the above observations suggest what we
>>>>>>>>> could try next to either improve the speed or debug the issue?
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>> http://lists.gluster.org/pipermail/gluster-users/2017-June/031565.html
>>>>>>>>> http://lists.gluster.org/pipermail/gluster-users/2017-May/030937.html
>>>>>>>>>
>>>>>>>>> Pat
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>>
>>>>>>>>> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>>>>>>>>> Pat Haley                          Email: phaley at mit.edu
>>>>>>>>> Center for Ocean Engineering       Phone: (617) 253-6824
>>>>>>>>> Dept. of Mechanical Engineering    Fax: (617) 253-8125
>>>>>>>>> MIT, Room 5-213                    http://web.mit.edu/phaley/www/
>>>>>>>>> 77 Massachusetts Avenue
>>>>>>>>> Cambridge, MA 02139-4301
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> Gluster-users mailing list
>>>>>>>>> Gluster-users at gluster.org
>>>>>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users
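
For anyone retracing the steps in this thread, the options discussed above (the manual equivalent of "group metadata-cache", open-behind off, and TRACE client logging) can be applied in one pass. What follows is only a rough sketch and was not posted by anyone in the thread: the helper names run_cmd and apply_thread_options and the DRY_RUN variable are made up for illustration, and the default timeout of 60 is the upper limit Pat reported for gluster 3.7.11 (newer releases may accept 600). The option names and the "gluster volume set" / "gluster volume info" commands are the ones quoted in the messages above.

```shell
#!/bin/sh
# Sketch only: helper names and DRY_RUN are hypothetical, not part of gluster.

# Print the command when DRY_RUN=1, otherwise execute it.
run_cmd() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Apply the options tried in this thread to one volume.
#   $1 = volume name (required)
#   $2 = md-cache timeout in seconds (assumed default: 60, the 3.7.11 limit)
apply_thread_options() {
    volname="${1:-}"
    timeout="${2:-60}"
    if [ -z "$volname" ]; then
        echo "usage: apply_thread_options <volname> [md-cache-timeout]" >&2
        return 2
    fi

    # Manual equivalent of "group metadata-cache" as listed in the thread.
    run_cmd gluster volume set "$volname" performance.md-cache-timeout "$timeout" || return 1
    run_cmd gluster volume set "$volname" network.inode-lru-limit 50000 || return 1

    # Workaround suggested for the known read-ahead issues.
    run_cmd gluster volume set "$volname" performance.open-behind off || return 1

    # Verbose client logging for the sequential read tests.
    run_cmd gluster volume set "$volname" diagnostics.client-log-level TRACE || return 1

    # Show the resulting configuration so it can be checked and posted.
    run_cmd gluster volume info "$volname"
}
```

Running "DRY_RUN=1 apply_thread_options data-volume" only prints the five commands, which makes it easy to review them before touching a live volume. TRACE logging is very verbose, so it is probably worth setting diagnostics.client-log-level back to its previous value once the test logs have been collected.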