Hi Krutika,

I already have a preallocated disk on the VM.
Now I am checking performance with dd on the hypervisors which have the gluster volume configured.

I also tried several values of shard-block-size and I keep getting the same low values on write performance.
Enabling client-io-threads also did not have any effect.

The version of gluster I am using is glusterfs 3.8.12, built on May 11 2017 18:46:20.
The setup is a set of 3 CentOS 7.3 servers with oVirt 4.1, using gluster as storage.

Below are the current settings:

Volume Name: vms
Type: Replicate
Volume ID: 4513340d-7919-498b-bfe0-d836b5cea40b
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: gluster0:/gluster/vms/brick
Brick2: gluster1:/gluster/vms/brick
Brick3: gluster2:/gluster/vms/brick (arbiter)
Options Reconfigured:
server.event-threads: 4
client.event-threads: 4
performance.client-io-threads: on
features.shard-block-size: 512MB
cluster.granular-entry-heal: enable
performance.strict-o-direct: on
network.ping-timeout: 30
storage.owner-gid: 36
storage.owner-uid: 36
user.cifs: off
features.shard: on
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
cluster.server-quorum-type: server
cluster.quorum-type: auto
cluster.eager-lock: enable
network.remote-dio: off
performance.low-prio-threads: 32
performance.stat-prefetch: on
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
nfs.export-volumes: on

I observed that when testing with dd if=/dev/zero of=testfile bs=1G count=1 I get 65 MB/s on the vms gluster volume (and the network traffic between the servers reaches ~500 Mbps), while when testing with dd if=/dev/zero of=testfile bs=1G count=1 oflag=direct I get a consistent 10 MB/s, with the network traffic hardly reaching 100 Mbps.

Any other things one can do?

On Tue, Sep 5, 2017 at 5:57 AM, Krutika Dhananjay <kdhananj at redhat.com> wrote:

> I'm assuming you are using this volume to store VM images, because I see
> shard in the options list.
>
> Speaking from the shard translator's POV, one thing you can do to improve
> performance is to use preallocated images. This will at least eliminate
> the need for shard to perform multiple steps as part of the writes - such
> as creating the shard, then writing to it, and then updating the
> aggregated file size - all of which require one network call each, which
> further get blown up once they reach AFR (replicate) into many more
> network calls.
>
> Second, I'm assuming you're using the default shard block size of 4MB
> (you can confirm this using `gluster volume get <VOL> shard-block-size`).
> In our tests, we've found that larger shard sizes perform better. So
> maybe change the shard-block-size to 64MB (`gluster volume set <VOL>
> shard-block-size 64MB`).
>
> Third, keep stat-prefetch enabled. We've found that qemu sends quite a
> lot of [f]stats which can be served from the (md)cache to improve
> performance. So enable that.
>
> Also, could you also enable client-io-threads and see if that improves
> performance?
>
> Which version of gluster are you using, BTW?
>
> -Krutika
>
> On Tue, Sep 5, 2017 at 4:32 AM, Abi Askushi <rightkicktech at gmail.com>
> wrote:
>
>> Hi all,
>>
>> I have a gluster volume used to host several VMs (managed through
>> oVirt). The volume is a replica 3 with arbiter, and the 3 servers use a
>> 1 Gbit network for the storage.
>>
>> When testing with dd (dd if=/dev/zero of=testfile bs=1G count=1
>> oflag=direct) outside the volume (e.g. writing to /root/), the
>> performance of dd is reported to be ~700 MB/s, which is quite decent.
>> When testing dd on the gluster volume I get ~43 MB/s, which is way
>> lower than the previous figure. When testing the gluster volume with
>> dd, the network traffic was not exceeding 450 Mbps on the network
>> interface. I would expect to reach near 900 Mbps considering that there
>> is 1 Gbit of bandwidth available. This results in VMs with very slow
>> performance (especially on their write operations).
>>
>> The full details of the volume are below. Any advice on what can be
>> tweaked will be highly appreciated.
>>
>> Volume Name: vms
>> Type: Replicate
>> Volume ID: 4513340d-7919-498b-bfe0-d836b5cea40b
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x (2 + 1) = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: gluster0:/gluster/vms/brick
>> Brick2: gluster1:/gluster/vms/brick
>> Brick3: gluster2:/gluster/vms/brick (arbiter)
>> Options Reconfigured:
>> cluster.granular-entry-heal: enable
>> performance.strict-o-direct: on
>> network.ping-timeout: 30
>> storage.owner-gid: 36
>> storage.owner-uid: 36
>> user.cifs: off
>> features.shard: on
>> cluster.shd-wait-qlength: 10000
>> cluster.shd-max-threads: 8
>> cluster.locking-scheme: granular
>> cluster.data-self-heal-algorithm: full
>> cluster.server-quorum-type: server
>> cluster.quorum-type: auto
>> cluster.eager-lock: enable
>> network.remote-dio: off
>> performance.low-prio-threads: 32
>> performance.stat-prefetch: off
>> performance.io-cache: off
>> performance.read-ahead: off
>> performance.quick-read: off
>> transport.address-family: inet
>> performance.readdir-ahead: on
>> nfs.disable: on
>> nfs.export-volumes: on
>>
>> Thanx,
>> Alex
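The tuning suggestions quoted above correspond roughly to the following commands. This is a minimal sketch for the vms volume; the 64MB figure is only the example value from the thread, and (as far as I understand) a changed shard-block-size applies only to images created after the change, so existing files keep the shard size they were created with.

    # confirm the currently effective shard block size
    gluster volume get vms features.shard-block-size
    # switch to the larger shard size suggested above (example value)
    gluster volume set vms features.shard-block-size 64MB
    # keep stat-prefetch and client-io-threads enabled
    gluster volume set vms performance.stat-prefetch on
    gluster volume set vms performance.client-io-threads on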
Krutika Dhananjay
2017-Sep-05 09:48 UTC
[Gluster-users] Slow performance of gluster volume
OK, my understanding is that with preallocated disks the performance with and without shard will be the same.

In any case, please attach the volume profile [1], so we can see what else is slowing things down.

-Krutika

[1] https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Monitoring%20Workload/#running-glusterfs-volume-profile-command
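The profiling workflow described in the guide linked above boils down to roughly the following. A minimal sketch, assuming it is run on one of the gluster servers against the vms volume:

    # enable per-brick FOP statistics collection
    gluster volume profile vms start
    # ...run the workload to be measured (e.g. the dd tests)...
    # dump the collected statistics (cumulative and per-interval)
    gluster volume profile vms info
    # disable profiling once done
    gluster volume profile vms stop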
Hi Krutika,

I have attached the profile stats. I enabled profiling and then ran some dd tests. Also, 3 Windows VMs are running on top of this volume, but I did not do any stress testing on them. I have left profiling enabled in case more time is needed for useful stats.

Thanx
Brick: gluster0:/gluster/vms/brick
----------------------------------
Cumulative Stats:
Block Size:          32b+      256b+      512b+
No. of Reads:           0       7093      79384
No. of Writes:         12        134      16639
Block Size:        1024b+     2048b+     4096b+
No. of Reads:       76171      88973     408733
No. of Writes:     128548     129482     622604
Block Size:        8192b+    16384b+    32768b+
No. of Reads:      562933     175791     164097
No. of Writes:     379782     132651      93864
Block Size:       65536b+   131072b+
No. of Reads:      151097     551006
No. of Writes:     121271     670952
%-latency  Avg-latency  Min-Latency  Max-Latency  No.
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 411 FORGET 0.00 0.00 us 0.00 us 0.00 us 188771 RELEASE 0.00 0.00 us 0.00 us 0.00 us 37310 RELEASEDIR 0.00 56.58 us 44.00 us 88.00 us 12 SETXATTR 0.00 89.00 us 64.00 us 160.00 us 12 RMDIR 0.00 278.80 us 135.00 us 520.00 us 5 TRUNCATE 0.00 110.71 us 34.00 us 164.00 us 14 XATTROP 0.00 113.64 us 9.00 us 383.00 us 14 READDIR 0.00 127.00 us 96.00 us 163.00 us 24 RENAME 0.00 265.09 us 80.00 us 1892.00 us 22 UNLINK 0.00 359.81 us 8.00 us 6695.00 us 43 GETXATTR 0.00 9722.50 us 7233.00 us 14112.00 us 4 MKNOD 0.00 8439.92 us 128.00 us 17393.00 us 12 MKDIR 0.00 1566.25 us 44.00 us 21348.00 us 104 REMOVEXATTR 0.00 1802.21 us 43.00 us 47897.00 us 111 SETATTR 0.00 10117.71 us 110.00 us 62262.00 us 24 CREATE 0.00 3599.83 us 10.00 us 15143.00 us 70 FLUSH 0.00 2881.88 us 31.00 us 100555.00 us 194 OPEN 0.00 1528.12 us 20.00 us 37185.00 us 472 READDIRP 0.00 2272.81 us 25.00 us 94470.00 us 328 FSTAT 0.00 1658.86 us 1.00 us 27155.00 us 567 OPENDIR 0.00 542.75 us 15.00 us 91928.00 us 1740 STAT 0.01 1699.46 us 11.00 us 45324.00 us 780 STATFS 0.05 28118.57 us 10.00 us 587648.00 us 399 ENTRYLK 0.13 25935.31 us 10.00 us 1693505.00 us 986 INODELK 0.47 9175.02 us 11.00 us 642079.00 us 10369 FSYNC 0.65 8647.81 us 10.00 us 64883650.00 us 15345 LOOKUP 1.23 9379.15 us 16.00 us 7110244.00 us 26708 FXATTROP 3.01 7832.97 us 17.00 us 799409.00 us 78472 READ 44.22 261211.86 us 51.00 us 1015501.00 us 34569 WRITE 50.22 182315.78 us 8.00 us 1584450.00 us 56251 FINODELK Duration: 67635 seconds Data Read: 100176806862 bytes Data Written: 112553284400 bytes Interval 9 Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 2 2 1 No. of Writes: 0 1 0 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 2 104 20 No. of Writes: 0 161 77 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 342 353 165 No. of Writes: 46 14 11 Block Size: 131072b+ No. of Reads: 16 No. of Writes: 930 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 23 FORGET 0.00 0.00 us 0.00 us 0.00 us 4 RELEASE 0.00 0.00 us 0.00 us 0.00 us 3 RELEASEDIR 0.00 76.00 us 76.00 us 76.00 us 1 FSTAT 0.00 43.00 us 17.00 us 69.00 us 2 FLUSH 0.00 36.50 us 28.00 us 41.00 us 4 OPENDIR 0.00 81.50 us 50.00 us 120.00 us 4 OPEN 0.00 46.60 us 23.00 us 93.00 us 10 STATFS 0.00 156.50 us 23.00 us 336.00 us 6 READDIRP 0.00 499.03 us 42.00 us 6709.00 us 33 STAT 0.03 1201.35 us 49.00 us 47975.00 us 214 LOOKUP 0.19 1265.42 us 20.00 us 163429.00 us 1276 FXATTROP 0.35 8882.97 us 14.00 us 418304.00 us 344 FSYNC 0.59 4796.28 us 22.00 us 188365.00 us 1063 READ 10.98 76545.35 us 3614.00 us 430475.00 us 1246 WRITE 87.86 244152.22 us 11.00 us 1306226.00 us 3126 FINODELK Duration: 16 seconds Data Read: 31334024 bytes Data Written: 125681664 bytes Brick: gluster2:/gluster/vms/brick ---------------------------------- Cumulative Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 2481551 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 577 FORGET 0.00 0.00 us 0.00 us 0.00 us 266910 RELEASE 0.00 0.00 us 0.00 us 0.00 us 38816 RELEASEDIR 0.00 77.60 us 49.00 us 95.00 us 5 TRUNCATE 0.00 80.67 us 44.00 us 126.00 us 12 SETXATTR 0.00 106.83 us 66.00 us 153.00 us 12 RMDIR 0.00 129.50 us 42.00 us 257.00 us 14 XATTROP 0.00 134.93 us 11.00 us 508.00 us 14 READDIR 0.00 110.82 us 65.00 us 241.00 us 22 UNLINK 0.00 58.95 us 9.00 us 160.00 us 42 GETXATTR 0.00 87.30 us 45.00 us 176.00 us 33 FSTAT 0.00 42.63 us 11.00 us 94.00 us 70 FLUSH 0.01 81.05 us 38.00 us 208.00 us 104 REMOVEXATTR 0.01 89.28 us 37.00 us 172.00 us 111 SETATTR 0.02 71.49 us 31.00 us 160.00 us 194 OPEN 0.02 42.48 us 9.00 us 1465.00 us 400 ENTRYLK 0.05 56.30 us 1.00 us 183.00 us 567 OPENDIR 0.05 8110.50 us 2480.00 us 11210.00 us 4 MKNOD 0.06 42.69 us 9.00 us 214.00 us 986 INODELK 0.14 4015.42 us 102.00 us 16841.00 us 24 RENAME 0.31 9026.62 us 95.00 us 102332.00 us 24 CREATE 0.37 21182.58 us 164.00 us 103452.00 us 12 MKDIR 2.71 54.37 us 11.00 us 1930.00 us 34568 WRITE 9.09 236.01 us 16.00 us 193113.00 us 26709 FXATTROP 10.01 452.32 us 9.00 us 127407.00 us 15347 LOOKUP 17.85 209.26 us 8.00 us 119349.00 us 59169 FINODELK 59.28 3964.37 us 12.00 us 231606.00 us 10372 FSYNC Duration: 71894 seconds Data Read: 0 bytes Data Written: 2481551 bytes Interval 9 Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 1239 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 34 FORGET 0.00 0.00 us 0.00 us 0.00 us 4 RELEASE 0.00 0.00 us 0.00 us 0.00 us 3 RELEASEDIR 0.00 37.50 us 30.00 us 45.00 us 2 FLUSH 0.01 75.75 us 58.00 us 113.00 us 4 OPEN 0.01 82.50 us 55.00 us 141.00 us 4 OPENDIR 2.02 52.36 us 13.00 us 143.00 us 1239 WRITE 2.45 371.39 us 46.00 us 8633.00 us 212 LOOKUP 7.23 182.03 us 19.00 us 122613.00 us 1275 FXATTROP 43.53 4071.27 us 14.00 us 116265.00 us 343 FSYNC 44.74 423.54 us 9.00 us 92407.00 us 3389 FINODELK Duration: 16 seconds Data Read: 0 bytes Data Written: 1239 bytes Brick: gluster1:/gluster/vms/brick ---------------------------------- Cumulative Stats: Block Size: 32b+ 256b+ 512b+ No. of Reads: 0 11694 2839 No. of Writes: 8 90 13871 Block Size: 1024b+ 2048b+ 4096b+ No. of Reads: 5962 4739 46620 No. of Writes: 97317 94324 478065 Block Size: 8192b+ 16384b+ 32768b+ No. of Reads: 18976 20815 26327 No. of Writes: 261476 108447 73657 Block Size: 65536b+ 131072b+ No. of Reads: 23025 37767 No. of Writes: 91901 666916 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 175 FORGET 0.00 0.00 us 0.00 us 0.00 us 25105 RELEASE 0.00 0.00 us 0.00 us 0.00 us 32798 RELEASEDIR 0.00 76.75 us 53.00 us 110.00 us 12 SETXATTR 0.00 95.83 us 58.00 us 141.00 us 12 RMDIR 0.00 291.40 us 226.00 us 446.00 us 5 TRUNCATE 0.00 134.50 us 61.00 us 278.00 us 14 XATTROP 0.00 128.46 us 82.00 us 209.00 us 24 RENAME 0.00 173.65 us 10.00 us 451.00 us 20 READDIR 0.00 161.91 us 82.00 us 662.00 us 22 UNLINK 0.00 8810.25 us 5320.00 us 11953.00 us 4 MKNOD 0.00 6320.17 us 183.00 us 13998.00 us 12 MKDIR 0.00 838.83 us 39.00 us 12448.00 us 104 REMOVEXATTR 0.00 934.96 us 39.00 us 29283.00 us 111 SETATTR 0.00 2144.17 us 11.00 us 41256.00 us 70 FLUSH 0.00 6324.54 us 112.00 us 19306.00 us 24 CREATE 0.00 936.09 us 22.00 us 23659.00 us 187 FSTAT 0.01 1918.45 us 27.00 us 104525.00 us 194 OPEN 0.01 835.49 us 1.00 us 66949.00 us 570 OPENDIR 0.01 906.71 us 17.00 us 24395.00 us 635 READDIRP 0.01 814.12 us 11.00 us 22800.00 us 782 STATFS 0.01 1638.38 us 7.00 us 27751.00 us 399 ENTRYLK 0.02 676.51 us 23.00 us 60154.00 us 1341 STAT 0.02 12337.72 us 9.00 us 1093769.00 us 90 GETXATTR 0.06 3416.72 us 10.00 us 2658066.00 us 986 INODELK 0.57 1937.28 us 12.00 us 264094.00 us 15322 LOOKUP 1.42 1290.09 us 7.00 us 357697.00 us 57354 FINODELK 1.87 9391.45 us 12.00 us 808993.00 us 10372 FSYNC 1.88 3670.51 us 15.00 us 1453184.00 us 26708 FXATTROP 7.69 6216.58 us 27.00 us 571432.00 us 64311 READ 86.39 130052.64 us 47.00 us 849076.00 us 34555 WRITE Duration: 58560 seconds Data Read: 8469686670 bytes Data Written: 106142019052 bytes Interval 9 Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 4 6 8 No. of Writes: 0 1 0 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 9 308 128 No. of Writes: 0 161 77 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 334 391 89 No. of Writes: 46 14 11 Block Size: 131072b+ No. of Reads: 12 No. of Writes: 929 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 4 RELEASE 0.00 0.00 us 0.00 us 0.00 us 3 RELEASEDIR 0.00 200.00 us 200.00 us 200.00 us 1 READDIRP 0.00 1328.50 us 31.00 us 2626.00 us 2 FSTAT 0.00 1357.75 us 189.00 us 2867.00 us 4 OPENDIR 0.00 2471.75 us 42.00 us 8373.00 us 4 OPEN 0.01 12659.00 us 9422.00 us 15896.00 us 2 FLUSH 0.01 3924.20 us 19.00 us 10299.00 us 10 STATFS 0.06 3631.49 us 38.00 us 22434.00 us 47 STAT 0.50 6662.97 us 58.00 us 48168.00 us 213 LOOKUP 1.37 11380.10 us 13.00 us 143645.00 us 343 FSYNC 4.33 3938.49 us 7.00 us 237791.00 us 3142 FINODELK 4.49 10040.88 us 18.00 us 193886.00 us 1276 FXATTROP 6.13 13579.99 us 89.00 us 251439.00 us 1289 READ 83.10 191342.48 us 6995.00 us 736901.00 us 1240 WRITE Duration: 16 seconds Data Read: 29314832 bytes Data Written: 125550592 bytes
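The dd runs mentioned above are easier to compare when the same total amount of data is written with and without direct I/O. A rough sketch; /mnt/vms is only a placeholder for the actual mount point of the volume, and the block sizes are example values:

    # buffered write: largely measures the page cache / write-behind, flushed at the end
    dd if=/dev/zero of=/mnt/vms/testfile bs=1M count=1024 conv=fsync
    # direct write of the same amount, bypassing the client page cache
    dd if=/dev/zero of=/mnt/vms/testfile bs=1M count=1024 oflag=direct
    # direct write again, with a larger request size per call
    dd if=/dev/zero of=/mnt/vms/testfile bs=64M count=16 oflag=direct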
----- Original Message -----
> From: "Abi Askushi" <rightkicktech at gmail.com>
> To: "Krutika Dhananjay" <kdhananj at redhat.com>
> Cc: "gluster-user" <gluster-users at gluster.org>
> Sent: Tuesday, September 5, 2017 5:02:46 AM
> Subject: Re: [Gluster-users] Slow performance of gluster volume
>
> I observed that when testing with dd if=/dev/zero of=testfile bs=1G
> count=1 I get 65 MB/s on the vms gluster volume (and the network traffic
> between the servers reaches ~500 Mbps), while when testing with dd
> if=/dev/zero of=testfile bs=1G count=1 oflag=direct I get a consistent
> 10 MB/s, with the network traffic hardly reaching 100 Mbps.

I have a replica 3 volume on which I was seeing ~65 MB/sec in my VMs. I ended up upgrading to a newer version and now I get closer to 150-180 MB/sec writes. Since you are using an arbiter I would expect faster writes for you. What gluster version are you running? What OS?

-b
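A minimal sketch of collecting the version and OS details asked about here, run on each node; the package query and release file are typical for CentOS and may differ on other distributions:

    gluster --version
    rpm -qa | grep -i gluster
    cat /etc/redhat-release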