thr3ads.net - Gluster users - [Gluster-users] File system very slow [May 2020]

Karthik Subrahmanya

2020-May-27 07:27 UTC

[Gluster-users] File system very slow

Hi,

Please provide the following information to understand the setup and debug
this further:
- Which version of gluster are you using?
- 'gluster volume status atlassian' to confirm whether both bricks and shds
  are up
- Complete output of 'gluster volume profile atlassian info' before running
  'du' and during 'du'. Redirect this output to separate files and attach
  them here
- Get the client side profile as well by following
  https://docs.gluster.org/en/latest/Administrator%20Guide/Performance%20Testing/
- 'gluster volume heal atlassian info' to check whether there are any
  pending heals and whether client side heal is contributing to this
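
The requests above could be collected in one pass with something like the following sketch. It assumes the volume name 'atlassian' and the path /data/shared used in this thread, and that profiling has already been started on the volume. The function name collect_diag, the GLUSTER override, the DELAY variable and the output directory are made up for illustration.

```shell
#!/bin/sh
# Sketch only: gather the server-side data requested above into one directory.
# VOL and MNT come from this thread; OUT, DELAY and collect_diag are
# illustrative names. Assumes profiling was enabled earlier with:
#   gluster volume profile <VOL> start

collect_diag() {
    vol="${VOL:-atlassian}"
    mnt="${MNT:-/data/shared}"
    out="${OUT:-/tmp/gluster-diag}"
    gluster_cmd="${GLUSTER:-gluster}"   # override with a stub for a dry run

    mkdir -p "$out" || return 1

    $gluster_cmd --version               > "$out/version.txt" 2>&1
    $gluster_cmd volume status "$vol"    > "$out/volume-status.txt" 2>&1
    $gluster_cmd volume heal "$vol" info > "$out/heal-info.txt" 2>&1

    # Profile snapshot before the workload.
    $gluster_cmd volume profile "$vol" info > "$out/profile-pre-du.txt" 2>&1

    # Run du in the background, then take a second snapshot while it runs.
    du -sh "$mnt" > "$out/du.txt" 2>&1 &
    du_pid=$!
    sleep "${DELAY:-60}"
    $gluster_cmd volume profile "$vol" info > "$out/profile-during-du.txt" 2>&1

    wait "$du_pid"
}

# Usage (as root on one of the servers):
#   collect_diag
```

Keeping each command's output in its own file makes it easy to attach the pre-du and during-du profiles separately, as asked.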

Regards,
Karthik

On Wed, May 27, 2020 at 1:06 AM <vadud3 at gmail.com> wrote:
> I had a parsing error. It is Volume Name: atlassian
>
> On Tue, May 26, 2020 at 3:12 PM <vadud3 at gmail.com> wrote:
>
>> # gluster volume info
>>
>> Volume Name: myvol
>> Type: Replicate
>> Volume ID: cbdef65c-79ea-496e-b777-b6a2981b29cf
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x 2 = 2
>> Transport-type: tcp
>> Bricks:
>> Brick1: node1:/data/foo/gluster
>> Brick2: node2:/data/foo/gluster
>> Options Reconfigured:
>> client.event-threads: 4
>> server.event-threads: 4
>> performance.stat-prefetch: on
>> network.inode-lru-limit: 16384
>> performance.md-cache-timeout: 1
>> performance.cache-invalidation: false
>> performance.cache-samba-metadata: false
>> features.cache-invalidation-timeout: 600
>> features.cache-invalidation: on
>> performance.io-thread-count: 16
>> performance.cache-refresh-timeout: 5
>> performance.write-behind-window-size: 5MB
>> performance.cache-size: 1GB
>> transport.address-family: inet
>> storage.fips-mode-rchecksum: on
>> nfs.disable: on
>> performance.client-io-threads: off
>> diagnostics.latency-measurement: on
>> diagnostics.count-fop-hits: on
>>
>> On Tue, May 26, 2020 at 3:06 PM Sunil Kumar Heggodu Gopala Acharya <
>> sheggodu at redhat.com> wrote:
>>
>>> Hi,
>>>
>>> Please share the gluster volume information.
>>>
>>> # gluster vol info
>>>
>>>
>>> Regards,
>>>
>>> Sunil kumar Acharya
>>>
>>>
>>> On Wed, May 27, 2020 at 12:30 AM <vadud3 at gmail.com> wrote:
>>>
>>>> I made the following changes for small file performance as suggested by
>>>> http://blog.gluster.org/gluster-tiering-and-small-file-performance/
>>>>
>>>> I am still seeing du -sh /data/shared taking 39 minutes.
>>>>
>>>> Any other tuning I can do? Most of my files are 15K. Here is a sample of
>>>> small files with size and number of occurrences:
>>>>
>>>> FileSize   # of occurrences
>>>> ========   ================
>>>> 1.1K 1122
>>>> 1.1M 1040
>>>> 1.2K 1281
>>>> 1.2M 1357
>>>> 1.3K 1149
>>>> 1.3M 1098
>>>> 1.4K 1119
>>>> 1.5K 1189
>>>> 1.6K 1036
>>>> 1.7K 1169
>>>> 11K 2157
>>>> 12K 2398
>>>> 13K 2402
>>>> 14K 2406
>>>> *15K 2426*
>>>> 16K 2386
>>>> 17K 1986
>>>> 18K 2037
>>>> 19K 1829
>>>> 2.0K 1027
>>>> 2.1K 1048
>>>> 2.4K 1013
>>>> 20K 1585
>>>> 21K 1713
>>>> 22K 1590
>>>> 23K 1371
>>>> 24K 1428
>>>> 25K 1444
>>>> 26K 1391
>>>> 27K 1217
>>>> 28K 1485
>>>> 29K 1282
>>>> 30K 1303
>>>> 31K 1275
>>>> 32K 1296
>>>> 33K 1058
>>>> 36K 1023
>>>> 37K 1107
>>>> 39K 1092
>>>> 41K 1034
>>>> 42K 1187
>>>> 46K 1030
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, May 25, 2020 at 5:30 PM <vadud3 at gmail.com> wrote:
>>>>
>>>>> time du -sh /data/shared
>>>>>
>>>>> 431G    /data/shared
>>>>>
>>>>> real    45m49.992s
>>>>> user    0m20.043s
>>>>> sys    2m32.456s
>>>>>
>>>>>
>>>>> gluster fs is extremely slow
>>>>>
>>>>> Any suggestions on what settings to change to improve it?
>>>>>
>>>>>
>>>>> --
>>>>> Asif Iqbal
>>>>> PGP Key: 0xE62693C5 KeyServer: pgp.mit.edu
>>>>> A: Because it messes up the order in which people normally read text.
>>>>> Q: Why is top-posting such a bad thing?
>>>>>
>>>>>
>>>>
>>>> ________
>>>>
>>>>
>>>>
>>>> Community Meeting Calendar:
>>>>
>>>> Schedule -
>>>> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
>>>> Bridge: https://bluejeans.com/441850968
>>>>
>>>> Gluster-users mailing list
>>>> Gluster-users at gluster.org
>>>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>>>
>>>
>>

Strahil Nikolov

2020-May-27 12:14 UTC

[Gluster-users] File system very slow

Also,

can you provide a ping between the nodes, so we get an idea of the latency
between them?
Also, I'm interested in how much time it takes to run 'du' directly on the
bricks.
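
Both measurements could be taken with a sketch like the one below. The peer name node2, the brick path /data/atlassian/gluster and the path /data/shared are taken from elsewhere in this thread; time_du and ping_peer are hypothetical helper names. Comparing the brick-local time with the time through the mount gives a rough idea of how much of the runtime is spent in Gluster and the network rather than on the disks. Note that a brick also holds Gluster's internal .glusterfs directory, so the brick-local walk touches more entries than the mount does.

```shell
#!/bin/sh
# Sketch only: measure inter-node latency and time du on a directory.
# time_du and ping_peer are illustrative helper names, not gluster tools.

# Print the wall-clock seconds du needs for one directory.
time_du() {
    start=$(date +%s)
    du -sh "$1" > /dev/null 2>&1
    end=$(date +%s)
    echo "$1: $((end - start))s"
}

# Send a few ICMP echo requests; the last output line summarises the rtt.
ping_peer() {
    ping -c "${2:-10}" "$1" | tail -n 1
}

# Usage, on node1:
#   ping_peer node2
#   time_du /data/atlassian/gluster   # directly on the brick
#   time_du /data/shared              # through the gluster mount
```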

Best Regards,
Strahil Nikolov


vadud3 at gmail.com

2020-May-27 14:02 UTC

[Gluster-users] File system very slow

- # gluster --version
glusterfs 7.5

- # gluster volume status atlassian
Status of volume: atlassian
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick node1:/data/atlassian/gluster         49152     0          Y       1791
Brick node2:/data/atlassian/gluster         49152     0          Y       1773
Self-heal Daemon on localhost               N/A       N/A        Y       1807
Self-heal Daemon on node1.example.c
example.net                                 N/A       N/A        Y       1778

Task Status of Volume atlassian
------------------------------------------------------------------------------
There are no active volume tasks

- # attached pre-du and during-du log from server


- I do not have a remote client. When I tried to run these steps,
'gluster volume profile your-volume start' says it is already started, since
I am on the server.
'# setfattr -n trusted.io-stats-dump -v /tmp/io-stats-pre.txt /mnt' runs, but
there is no output in /tmp/io-stats-pre.txt


- # gluster volume heal atlassian info
Brick node1:/data/atlassian/gluster
Status: Connected
Number of entries: 0

Brick node2:/data/atlassian/gluster
Status: Connected
Number of entries: 0
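
For the client-side profile step above, one thing worth ruling out is whether the path passed to setfattr is really the Gluster FUSE mount. The attribute only triggers a dump when it is set through a Gluster mount; on an ordinary local directory, setfattr run as root would simply store the attribute and succeed silently. A minimal sketch, assuming a Linux host with /proc/mounts; is_gluster_mount is a hypothetical helper, and /data/shared is only a guess at the mount point based on the 'du' commands in this thread.

```shell
#!/bin/sh
# Sketch only: check that a path is a glusterfs FUSE mount point before
# requesting an io-stats dump on it. is_gluster_mount is an illustrative name.

is_gluster_mount() {
    # /proc/mounts fields: device, mount point, fs type, options, ...
    awk -v p="$1" '$2 == p && $3 == "fuse.glusterfs" { found = 1 }
                   END { exit !found }' /proc/mounts
}

# Usage (the mount point here is an assumption, adjust to the real one):
#   if is_gluster_mount /data/shared; then
#       setfattr -n trusted.io-stats-dump -v /tmp/io-stats-pre.txt /data/shared
#   else
#       echo "/data/shared is not a glusterfs mount" >&2
#   fi
```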

Let me know if you need anything else. Appreciate your help


-- 
Asif Iqbal
PGP Key: 0xE62693C5 KeyServer: pgp.mit.edu
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gvp.pre-du.log
Type: application/octet-stream
Size: 13336 bytes
Desc: not available
URL:
<http://lists.gluster.org/pipermail/gluster-users/attachments/20200527/a2b6fd8f/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gvp-during-du.log
Type: application/octet-stream
Size: 144470 bytes
Desc: not available
URL:
<http://lists.gluster.org/pipermail/gluster-users/attachments/20200527/a2b6fd8f/attachment-0001.obj>
