thr3ads.net - Gluster users - [Gluster-users] Slow write times to gluster disk [Jul 2018]

Raghavendra Gowdappa
2018-Jul-15 01:59 UTC
[Gluster-users] Slow write times to gluster disk

On 6/30/18, Raghavendra Gowdappa <rgowdapp at redhat.com> wrote:
> On Fri, Jun 29, 2018 at 10:38 PM, Pat Haley <phaley at mit.edu> wrote:
>
>>
>> Hi Raghavendra,
>>
>> We ran the tests (write tests) and I copied the log files for both the
>> server and the client to
>> http://mseas.mit.edu/download/phaley/GlusterUsers/2018/Jun29/ .  Is there
>> any additional trace information you need?  (If so, where should I look
>> for it?)
>>
>
> Nothing for now. I can see from the logs that the workaround is not
> helping. fstat requests are not absorbed by md-cache, so read-ahead sees
> them and flushes its read-ahead cache. I am investigating md-cache further
> (it also seems to be invalidating inodes quite frequently, which might
> actually be the root cause of seeing so many fstat requests from the
> kernel). I will post when I find anything relevant.
+Poornima.

@Poornima,

Can you investigate why fstats sent by the kernel are not absorbed by
md-cache in sequential read tests? Note that md-cache doesn't flush
its metadata cache on reads (which can be a bug for applications
requiring strict atime consistency). So I expected the fstats to be
absorbed by it.

regards,
Raghavendra
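
A rough way to quantify how many fstat and lookup calls get past the client-side caches is to profile the volume while a read test runs. The sketch below is illustrative only: `profile_fops` is a hypothetical helper name, and it assumes the `gluster volume profile` sub-commands (`start`, `info`, `stop`) are available on the installed release and that the `info` output lists one FOP name per row. Profiling counts operations that reach the bricks, so requests served from md-cache on the client should not appear in it.

```shell
#!/bin/sh
# Sketch: count FSTAT/LOOKUP/READ operations that reach the bricks while a
# test command runs. GLUSTER can be overridden to point at another binary.
GLUSTER="${GLUSTER:-gluster}"

# profile_fops VOLNAME COMMAND [ARGS...]   (hypothetical helper)
profile_fops() {
    if [ "$#" -lt 2 ] || [ -z "$1" ]; then
        echo "usage: profile_fops VOLNAME COMMAND [ARGS...]" >&2
        return 1
    fi
    volname="$1"
    shift

    # Start collecting per-brick statistics.
    "$GLUSTER" volume profile "$volname" start || return 1

    # Run the test command, remembering its exit status.
    status=0
    "$@" || status=$?

    # Keep only the rows for the operations of interest. Assumption: each
    # row of the info output carries the FOP name as a separate word.
    # Finding no matching rows is not treated as an error.
    "$GLUSTER" volume profile "$volname" info | grep -Ew 'FSTAT|LOOKUP|READ' || true

    # Stop profiling so it does not add overhead to later runs.
    "$GLUSTER" volume profile "$volname" stop
    return "$status"
}
```

For example, `profile_fops data-volume dd if=/path/to/large/file of=/dev/null bs=1M`, where the file path is a placeholder for a file on the glusterfs mount. A large FSTAT count relative to READ would be consistent with fstats not being absorbed by md-cache.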
>
>
>> Also the volume information you requested
>>
>> [root@mseas-data2 ~]# gluster volume info data-volume
>>
>> Volume Name: data-volume
>> Type: Distribute
>> Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
>> Status: Started
>> Number of Bricks: 2
>> Transport-type: tcp
>> Bricks:
>> Brick1: mseas-data2:/mnt/brick1
>> Brick2: mseas-data2:/mnt/brick2
>> Options Reconfigured:
>> diagnostics.client-log-level: TRACE
>> network.inode-lru-limit: 50000
>> performance.md-cache-timeout: 60
>> performance.open-behind: off
>> disperse.eager-lock: off
>> auth.allow: *
>> server.allow-insecure: on
>> nfs.exports-auth-enable: on
>> diagnostics.brick-sys-log-level: WARNING
>> performance.readdir-ahead: on
>> nfs.disable: on
>> nfs.export-volumes: off
>> [root@mseas-data2 ~]#
>>
>>
>> On 06/29/2018 12:28 PM, Raghavendra Gowdappa wrote:
>>
>>
>>
>> On Fri, Jun 29, 2018 at 8:24 PM, Pat Haley <phaley at mit.edu> wrote:
>>
>>>
>>> Hi Raghavendra,
>>>
>>> Our technician was able to try the manual setting today.  He found that
>>> our upper limit for performance.md-cache-timeout was 60, not 600, so he
>>> used that value, along with network.inode-lru-limit=50000.
>>>
>>> The result was another small (~1%) increase in speed.  Does this suggest
>>> some additional tests/changes we could try?
>>>
>>
>> Can you set the gluster option diagnostics.client-log-level to TRACE and
>> run the sequential read tests again (with an md-cache-timeout value of 60)?
>>
>> # gluster volume set <volname> diagnostics.client-log-level TRACE
>>
>> Also, are you sure that open-behind was turned off? Can you give the
>> output of
>>
>> # gluster volume info <volname>
>>
>>
>>> Thanks
>>>
>>> Pat
>>>
>>>
>>>
>>>
>>> On 06/25/2018 09:39 PM, Raghavendra Gowdappa wrote:
>>>
>>>
>>>
>>> On Tue, Jun 26, 2018 at 3:21 AM, Pat Haley <phaley at mit.edu> wrote:
>>>
>>>>
>>>> Hi Raghavendra,
>>>>
>>>> Setting performance.write-behind off gave a small improvement in the
>>>> write speed (~3%).
>>>>
>>>> We were unable to turn on "group metadata-cache".  When we try, we get
>>>> errors like
>>>>
>>>> # gluster volume set data-volume group metadata-cache
>>>> '/var/lib/glusterd/groups/metadata-cache' file format not valid.
>>>>
>>>> Was metadata-cache available for gluster 3.7.11? We ask because the
>>>> release notes for 3.11 mention "Feature for metadata-caching/small file
>>>> performance is production ready."
>>>> (https://gluster.readthedocs.io/en/latest/release-notes/3.11.0/).
>>>>
>>>> Do any of these results suggest anything?  If not, what further tests
>>>> would be useful?
>>>>
>>>
>>> Group metadata-cache is just a set of options applied to a volume, so
>>> you can set them manually using the gluster CLI. The options and their
>>> values are:
>>>
>>> performance.md-cache-timeout=600
>>> network.inode-lru-limit=50000
>>>
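
Setting the two options above by hand could look like the sketch below. `apply_md_cache_options` is a hypothetical helper name, not something gluster ships; it only wraps the `gluster volume set` command already used in this thread. The timeout defaults to the 600 quoted above, but a later message in this thread reports that the 3.7.11 installation accepted at most 60, so the value is left as a parameter.

```shell
#!/bin/sh
# Sketch: apply the two metadata-cache options manually.
# GLUSTER can be overridden (e.g. GLUSTER=echo) to preview the commands
# without changing the volume.
GLUSTER="${GLUSTER:-gluster}"

# apply_md_cache_options VOLNAME [TIMEOUT]   (hypothetical helper)
# TIMEOUT defaults to 600, the value quoted in the thread.
apply_md_cache_options() {
    volname="${1:-}"
    timeout="${2:-600}"
    if [ -z "$volname" ]; then
        echo "usage: apply_md_cache_options VOLNAME [TIMEOUT]" >&2
        return 1
    fi
    # Stop at the first option the server rejects.
    "$GLUSTER" volume set "$volname" performance.md-cache-timeout "$timeout" &&
    "$GLUSTER" volume set "$volname" network.inode-lru-limit 50000
}
```

Running `GLUSTER=echo` before `apply_md_cache_options data-volume 60` prints the two commands instead of executing them, which makes it easy to check the values first.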
>>>
>>>
>>>> Thanks
>>>>
>>>> Pat
>>>>
>>>>
>>>>
>>>>
>>>> On 06/22/2018 07:51 AM, Raghavendra Gowdappa wrote:
>>>>
>>>>
>>>>
>>>> On Thu, Jun 21, 2018 at 8:41 PM, Pat Haley <phaley at mit.edu> wrote:
>>>>
>>>>>
>>>>> Hi Raghavendra,
>>>>>
>>>>> Thanks for the suggestions.  Our technician will be in on Monday.
>>>>> We'll test then and let you know the results.
>>>>>
>>>>> One question I have: is the "group metadata-cache" option supposed to
>>>>> directly impact the performance, or is it to help collect data?  If the
>>>>> latter, where will the data be located?
>>>>>
>>>>
>>>> It impacts performance.
>>>>
>>>>
>>>>> Thanks again.
>>>>>
>>>>> Pat
>>>>>
>>>>>
>>>>>
>>>>> On 06/21/2018 01:01 AM, Raghavendra Gowdappa wrote:
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Jun 21, 2018 at 10:24 AM, Raghavendra Gowdappa <rgowdapp at redhat.com> wrote:
>>>>>
>>>>>> For the case of writes to glusterfs mount,
>>>>>>
>>>>>> I saw in earlier conversations that there are too many lookups but
>>>>>> only a small number of writes. Since writes cached in write-behind
>>>>>> would invalidate the metadata cache, lookups won't be absorbed by
>>>>>> md-cache. I am wondering what the results would look like if we
>>>>>> turn off performance.write-behind.
>>>>>>
>>>>>> @Pat,
>>>>>>
>>>>>> Can you set,
>>>>>>
>>>>>> # gluster volume set <volname> performance.write-behind off
>>>>>>
>>>>>
>>>>> Please turn on "group metadata-cache" for write tests too.
>>>>>
>>>>>
>>>>>> and redo the tests writing to the glusterfs mount? Let us know about
>>>>>> the results you see.
>>>>>>
>>>>>> regards,
>>>>>> Raghavendra
>>>>>>
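
The thread does not say which benchmark was used for the write tests. As a minimal sketch, assuming GNU `dd` is available, a sequential write can be timed as follows. `write_test` is a hypothetical helper name, and the directory passed to it is whatever path the volume is mounted on.

```shell
#!/bin/sh
# Sketch: time a sequential write of COUNT_MB mebibytes into DIR.
# dd reports the elapsed time and throughput on stderr.

# write_test DIR [COUNT_MB]   (hypothetical helper)
write_test() {
    dir="${1:-}"
    count_mb="${2:-1024}"
    if [ ! -d "$dir" ]; then
        echo "usage: write_test DIR [COUNT_MB]" >&2
        return 1
    fi
    target="$dir/write_test.$$"

    # conv=fsync makes dd flush the file to storage before it reports the
    # rate, so client-side caching does not inflate the number.
    # bs=1M is GNU dd syntax.
    status=0
    dd if=/dev/zero of="$target" bs=1M count="$count_mb" conv=fsync || status=$?

    # Remove the scratch file whether or not dd succeeded.
    rm -f "$target"
    return "$status"
}
```

Running `write_test /path/to/glusterfs/mount` before and after changing an option such as performance.write-behind gives directly comparable figures; the mount path here is a placeholder.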
>>>>>> On Thu, Jun 21, 2018 at 8:33 AM, Raghavendra Gowdappa <rgowdapp at redhat.com> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Jun 21, 2018 at 8:32 AM, Raghavendra Gowdappa <rgowdapp at redhat.com> wrote:
>>>>>>>
>>>>>>>> For the case of reading from a glusterfs mount, read-ahead should
>>>>>>>> help. However, we have known issues with read-ahead [1][2]. To work
>>>>>>>> around these, can you try the following:
>>>>>>>>
>>>>>>>> 1. Turn off performance.open-behind
>>>>>>>> # gluster volume set <volname> performance.open-behind off
>>>>>>>>
>>>>>>>> 2. Enable group metadata-cache
>>>>>>>> # gluster volume set <volname> group metadata-cache
>>>>>>>>
>>>>>>>
>>>>>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1084508
>>>>>>> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1214489
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Jun 21, 2018 at 5:00 AM, Pat Haley <phaley at mit.edu> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> We were recently revisiting our problems with the slowness of
>>>>>>>>> gluster writes
>>>>>>>>> (http://lists.gluster.org/pipermail/gluster-users/2017-April/030529.html).
>>>>>>>>> Specifically, we were testing the suggestions in a recent post
>>>>>>>>> (http://lists.gluster.org/pipermail/gluster-users/2018-March/033699.html).
>>>>>>>>> The first two suggestions (specifying a negative-timeout in the
>>>>>>>>> mount settings or adding rpc-auth-allow-insecure to glusterd.vol)
>>>>>>>>> did not improve our performance, while setting
>>>>>>>>> "disperse.eager-lock off" provided a tiny (5%) speed-up.
>>>>>>>>>
>>>>>>>>> Some of the various tests we have tried earlier can be seen in the
>>>>>>>>> links below.  Do any of the above observations suggest what we
>>>>>>>>> could try next to either improve the speed or debug the issue?
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>> http://lists.gluster.org/pipermail/gluster-users/2017-June/031565.html
>>>>>>>>> http://lists.gluster.org/pipermail/gluster-users/2017-May/030937.html
>>>>>>>>>
>>>>>>>>> Pat
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>>>>>>>>> Pat Haley                          Email:  phaley at mit.edu
>>>>>>>>> Center for Ocean Engineering       Phone:  (617) 253-6824
>>>>>>>>> Dept. of Mechanical Engineering    Fax:    (617) 253-8125
>>>>>>>>> MIT, Room 5-213                    http://web.mit.edu/phaley/www/
>>>>>>>>> 77 Massachusetts Avenue
>>>>>>>>> Cambridge, MA  02139-4301
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> Gluster-users mailing list
>>>>>>>>> Gluster-users at gluster.org
>>>>>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users