Hi Krutika,

I already have a preallocated disk on the VM.
Now I am checking performance with dd on the hypervisors which have the gluster volume configured.

I also tried several values of shard-block-size and I keep getting the same low values on write performance.
Enabling client-io-threads also did not have any effect.

The version of gluster I am using is glusterfs 3.8.12, built on May 11 2017 18:46:20.
The setup is a set of 3 CentOS 7.3 servers with oVirt 4.1, using gluster as storage.

Below are the current settings:

Volume Name: vms
Type: Replicate
Volume ID: 4513340d-7919-498b-bfe0-d836b5cea40b
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: gluster0:/gluster/vms/brick
Brick2: gluster1:/gluster/vms/brick
Brick3: gluster2:/gluster/vms/brick (arbiter)
Options Reconfigured:
server.event-threads: 4
client.event-threads: 4
performance.client-io-threads: on
features.shard-block-size: 512MB
cluster.granular-entry-heal: enable
performance.strict-o-direct: on
network.ping-timeout: 30
storage.owner-gid: 36
storage.owner-uid: 36
user.cifs: off
features.shard: on
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
cluster.server-quorum-type: server
cluster.quorum-type: auto
cluster.eager-lock: enable
network.remote-dio: off
performance.low-prio-threads: 32
performance.stat-prefetch: on
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
nfs.export-volumes: on

I observed that when testing with dd if=/dev/zero of=testfile bs=1G count=1 I get 65 MB/s on the vms gluster volume (and the network traffic between the servers reaches ~500 Mbps), while when testing with dd if=/dev/zero of=testfile bs=1G count=1 oflag=direct I get a consistent 10 MB/s, with the network traffic hardly reaching 100 Mbps.

Any other things one can do?

On Tue, Sep 5, 2017 at 5:57 AM, Krutika Dhananjay <kdhananj at redhat.com> wrote:

> I'm assuming you are using this volume to store VM images, because I see
> shard in the options list.
>
> Speaking from the shard translator's POV, one thing you can do to improve
> performance is to use preallocated images. This will at least eliminate
> the need for shard to perform multiple steps as part of the writes - such
> as creating the shard, then writing to it, and then updating the
> aggregated file size - all of which require one network call each, which
> further get blown up once they reach AFR (replicate) into many more
> network calls.
>
> Second, I'm assuming you're using the default shard block size of 4MB
> (you can confirm this using `gluster volume get <VOL> shard-block-size`).
> In our tests, we've found that larger shard sizes perform better. So
> maybe change the shard-block-size to 64MB (`gluster volume set <VOL>
> shard-block-size 64MB`).
>
> Third, keep stat-prefetch enabled. We've found that qemu sends quite a
> lot of [f]stats which can be served from the (md)cache to improve
> performance. So enable that.
>
> Also, could you also enable client-io-threads and see if that improves
> performance?
>
> Which version of gluster are you using, BTW?
>
> -Krutika
>
> On Tue, Sep 5, 2017 at 4:32 AM, Abi Askushi <rightkicktech at gmail.com>
> wrote:
>
>> Hi all,
>>
>> I have a gluster volume used to host several VMs (managed through
>> oVirt). The volume is a replica 3 with arbiter, and the 3 servers use a
>> 1 Gbit network for the storage.
>>
>> When testing with dd (dd if=/dev/zero of=testfile bs=1G count=1
>> oflag=direct) outside the volume (e.g. writing to /root/), the
>> performance of dd is reported to be ~700 MB/s, which is quite decent.
>> When testing dd on the gluster volume I get ~43 MB/s, which is way
>> lower than the previous figure. When testing the gluster volume with
>> dd, the network traffic was not exceeding 450 Mbps on the network
>> interface. I would expect to reach near 900 Mbps considering that there
>> is 1 Gbit of bandwidth available. This results in VMs with very slow
>> performance (especially on their write operations).
>>
>> The full details of the volume are below. Any advice on what can be
>> tweaked will be highly appreciated.
>>
>> Volume Name: vms
>> Type: Replicate
>> Volume ID: 4513340d-7919-498b-bfe0-d836b5cea40b
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x (2 + 1) = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: gluster0:/gluster/vms/brick
>> Brick2: gluster1:/gluster/vms/brick
>> Brick3: gluster2:/gluster/vms/brick (arbiter)
>> Options Reconfigured:
>> cluster.granular-entry-heal: enable
>> performance.strict-o-direct: on
>> network.ping-timeout: 30
>> storage.owner-gid: 36
>> storage.owner-uid: 36
>> user.cifs: off
>> features.shard: on
>> cluster.shd-wait-qlength: 10000
>> cluster.shd-max-threads: 8
>> cluster.locking-scheme: granular
>> cluster.data-self-heal-algorithm: full
>> cluster.server-quorum-type: server
>> cluster.quorum-type: auto
>> cluster.eager-lock: enable
>> network.remote-dio: off
>> performance.low-prio-threads: 32
>> performance.stat-prefetch: off
>> performance.io-cache: off
>> performance.read-ahead: off
>> performance.quick-read: off
>> transport.address-family: inet
>> performance.readdir-ahead: on
>> nfs.disable: on
>> nfs.export-volumes: on
>>
>> Thanx,
>> Alex
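The tuning suggestions quoted above correspond roughly to the following commands. This is a minimal sketch for the vms volume; the 64MB figure is only the example value from the thread, and (as far as I understand) a changed shard-block-size applies only to images created after the change, so existing files keep the shard size they were created with.

    # confirm the currently effective shard block size
    gluster volume get vms features.shard-block-size
    # switch to the larger shard size suggested above (example value)
    gluster volume set vms features.shard-block-size 64MB
    # keep stat-prefetch and client-io-threads enabled
    gluster volume set vms performance.stat-prefetch on
    gluster volume set vms performance.client-io-threads on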
Krutika Dhananjay
2017-Sep-05 09:48 UTC
[Gluster-users] Slow performance of gluster volume
OK, my understanding is that with preallocated disks the performance with and without shard will be the same.

In any case, please attach the volume profile [1], so we can see what else is slowing things down.

-Krutika

[1] https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Monitoring%20Workload/#running-glusterfs-volume-profile-command
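The profiling workflow described in the guide linked above boils down to roughly the following. A minimal sketch, assuming it is run on one of the gluster servers against the vms volume:

    # enable per-brick FOP statistics collection
    gluster volume profile vms start
    # ...run the workload to be measured (e.g. the dd tests)...
    # dump the collected statistics (cumulative and per-interval)
    gluster volume profile vms info
    # disable profiling once done
    gluster volume profile vms stop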
Hi Krutika,

I have attached the profile stats. I enabled profiling and then ran some dd tests. Also, 3 Windows VMs are running on top of this volume, but I did not do any stress testing on them. I have left profiling enabled in case more time is needed for useful stats.

Thanx
Brick: gluster0:/gluster/vms/brick
----------------------------------
Cumulative Stats:
Block Size:          32b+      256b+      512b+
No. of Reads:           0       7093      79384
No. of Writes:         12        134      16639
Block Size:        1024b+     2048b+     4096b+
No. of Reads:       76171      88973     408733
No. of Writes:     128548     129482     622604
Block Size:        8192b+    16384b+    32768b+
No. of Reads:      562933     175791     164097
No. of Writes:     379782     132651      93864
Block Size:       65536b+   131072b+
No. of Reads:      151097     551006
No. of Writes:     121271     670952
%-latency  Avg-latency  Min-Latency  Max-Latency  No.
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 411 FORGET 0.00 0.00 us 0.00 us 0.00 us 188771 RELEASE 0.00 0.00 us 0.00 us 0.00 us 37310 RELEASEDIR 0.00 56.58 us 44.00 us 88.00 us 12 SETXATTR 0.00 89.00 us 64.00 us 160.00 us 12 RMDIR 0.00 278.80 us 135.00 us 520.00 us 5 TRUNCATE 0.00 110.71 us 34.00 us 164.00 us 14 XATTROP 0.00 113.64 us 9.00 us 383.00 us 14 READDIR 0.00 127.00 us 96.00 us 163.00 us 24 RENAME 0.00 265.09 us 80.00 us 1892.00 us 22 UNLINK 0.00 359.81 us 8.00 us 6695.00 us 43 GETXATTR 0.00 9722.50 us 7233.00 us 14112.00 us 4 MKNOD 0.00 8439.92 us 128.00 us 17393.00 us 12 MKDIR 0.00 1566.25 us 44.00 us 21348.00 us 104 REMOVEXATTR 0.00 1802.21 us 43.00 us 47897.00 us 111 SETATTR 0.00 10117.71 us 110.00 us 62262.00 us 24 CREATE 0.00 3599.83 us 10.00 us 15143.00 us 70 FLUSH 0.00 2881.88 us 31.00 us 100555.00 us 194 OPEN 0.00 1528.12 us 20.00 us 37185.00 us 472 READDIRP 0.00 2272.81 us 25.00 us 94470.00 us 328 FSTAT 0.00 1658.86 us 1.00 us 27155.00 us 567 OPENDIR 0.00 542.75 us 15.00 us 91928.00 us 1740 STAT 0.01 1699.46 us 11.00 us 45324.00 us 780 STATFS 0.05 28118.57 us 10.00 us 587648.00 us 399 ENTRYLK 0.13 25935.31 us 10.00 us 1693505.00 us 986 INODELK 0.47 9175.02 us 11.00 us 642079.00 us 10369 FSYNC 0.65 8647.81 us 10.00 us 64883650.00 us 15345 LOOKUP 1.23 9379.15 us 16.00 us 7110244.00 us 26708 FXATTROP 3.01 7832.97 us 17.00 us 799409.00 us 78472 READ 44.22 261211.86 us 51.00 us 1015501.00 us 34569 WRITE 50.22 182315.78 us 8.00 us 1584450.00 us 56251 FINODELK Duration: 67635 seconds Data Read: 100176806862 bytes Data Written: 112553284400 bytes Interval 9 Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 2 2 1 No. of Writes: 0 1 0 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 2 104 20 No. of Writes: 0 161 77 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 342 353 165 No. of Writes: 46 14 11 Block Size: 131072b+ No. of Reads: 16 No. of Writes: 930 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 23 FORGET 0.00 0.00 us 0.00 us 0.00 us 4 RELEASE 0.00 0.00 us 0.00 us 0.00 us 3 RELEASEDIR 0.00 76.00 us 76.00 us 76.00 us 1 FSTAT 0.00 43.00 us 17.00 us 69.00 us 2 FLUSH 0.00 36.50 us 28.00 us 41.00 us 4 OPENDIR 0.00 81.50 us 50.00 us 120.00 us 4 OPEN 0.00 46.60 us 23.00 us 93.00 us 10 STATFS 0.00 156.50 us 23.00 us 336.00 us 6 READDIRP 0.00 499.03 us 42.00 us 6709.00 us 33 STAT 0.03 1201.35 us 49.00 us 47975.00 us 214 LOOKUP 0.19 1265.42 us 20.00 us 163429.00 us 1276 FXATTROP 0.35 8882.97 us 14.00 us 418304.00 us 344 FSYNC 0.59 4796.28 us 22.00 us 188365.00 us 1063 READ 10.98 76545.35 us 3614.00 us 430475.00 us 1246 WRITE 87.86 244152.22 us 11.00 us 1306226.00 us 3126 FINODELK Duration: 16 seconds Data Read: 31334024 bytes Data Written: 125681664 bytes Brick: gluster2:/gluster/vms/brick ---------------------------------- Cumulative Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 2481551 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 577 FORGET 0.00 0.00 us 0.00 us 0.00 us 266910 RELEASE 0.00 0.00 us 0.00 us 0.00 us 38816 RELEASEDIR 0.00 77.60 us 49.00 us 95.00 us 5 TRUNCATE 0.00 80.67 us 44.00 us 126.00 us 12 SETXATTR 0.00 106.83 us 66.00 us 153.00 us 12 RMDIR 0.00 129.50 us 42.00 us 257.00 us 14 XATTROP 0.00 134.93 us 11.00 us 508.00 us 14 READDIR 0.00 110.82 us 65.00 us 241.00 us 22 UNLINK 0.00 58.95 us 9.00 us 160.00 us 42 GETXATTR 0.00 87.30 us 45.00 us 176.00 us 33 FSTAT 0.00 42.63 us 11.00 us 94.00 us 70 FLUSH 0.01 81.05 us 38.00 us 208.00 us 104 REMOVEXATTR 0.01 89.28 us 37.00 us 172.00 us 111 SETATTR 0.02 71.49 us 31.00 us 160.00 us 194 OPEN 0.02 42.48 us 9.00 us 1465.00 us 400 ENTRYLK 0.05 56.30 us 1.00 us 183.00 us 567 OPENDIR 0.05 8110.50 us 2480.00 us 11210.00 us 4 MKNOD 0.06 42.69 us 9.00 us 214.00 us 986 INODELK 0.14 4015.42 us 102.00 us 16841.00 us 24 RENAME 0.31 9026.62 us 95.00 us 102332.00 us 24 CREATE 0.37 21182.58 us 164.00 us 103452.00 us 12 MKDIR 2.71 54.37 us 11.00 us 1930.00 us 34568 WRITE 9.09 236.01 us 16.00 us 193113.00 us 26709 FXATTROP 10.01 452.32 us 9.00 us 127407.00 us 15347 LOOKUP 17.85 209.26 us 8.00 us 119349.00 us 59169 FINODELK 59.28 3964.37 us 12.00 us 231606.00 us 10372 FSYNC Duration: 71894 seconds Data Read: 0 bytes Data Written: 2481551 bytes Interval 9 Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 1239 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 34 FORGET 0.00 0.00 us 0.00 us 0.00 us 4 RELEASE 0.00 0.00 us 0.00 us 0.00 us 3 RELEASEDIR 0.00 37.50 us 30.00 us 45.00 us 2 FLUSH 0.01 75.75 us 58.00 us 113.00 us 4 OPEN 0.01 82.50 us 55.00 us 141.00 us 4 OPENDIR 2.02 52.36 us 13.00 us 143.00 us 1239 WRITE 2.45 371.39 us 46.00 us 8633.00 us 212 LOOKUP 7.23 182.03 us 19.00 us 122613.00 us 1275 FXATTROP 43.53 4071.27 us 14.00 us 116265.00 us 343 FSYNC 44.74 423.54 us 9.00 us 92407.00 us 3389 FINODELK Duration: 16 seconds Data Read: 0 bytes Data Written: 1239 bytes Brick: gluster1:/gluster/vms/brick ---------------------------------- Cumulative Stats: Block Size: 32b+ 256b+ 512b+ No. of Reads: 0 11694 2839 No. of Writes: 8 90 13871 Block Size: 1024b+ 2048b+ 4096b+ No. of Reads: 5962 4739 46620 No. of Writes: 97317 94324 478065 Block Size: 8192b+ 16384b+ 32768b+ No. of Reads: 18976 20815 26327 No. of Writes: 261476 108447 73657 Block Size: 65536b+ 131072b+ No. of Reads: 23025 37767 No. of Writes: 91901 666916 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 175 FORGET 0.00 0.00 us 0.00 us 0.00 us 25105 RELEASE 0.00 0.00 us 0.00 us 0.00 us 32798 RELEASEDIR 0.00 76.75 us 53.00 us 110.00 us 12 SETXATTR 0.00 95.83 us 58.00 us 141.00 us 12 RMDIR 0.00 291.40 us 226.00 us 446.00 us 5 TRUNCATE 0.00 134.50 us 61.00 us 278.00 us 14 XATTROP 0.00 128.46 us 82.00 us 209.00 us 24 RENAME 0.00 173.65 us 10.00 us 451.00 us 20 READDIR 0.00 161.91 us 82.00 us 662.00 us 22 UNLINK 0.00 8810.25 us 5320.00 us 11953.00 us 4 MKNOD 0.00 6320.17 us 183.00 us 13998.00 us 12 MKDIR 0.00 838.83 us 39.00 us 12448.00 us 104 REMOVEXATTR 0.00 934.96 us 39.00 us 29283.00 us 111 SETATTR 0.00 2144.17 us 11.00 us 41256.00 us 70 FLUSH 0.00 6324.54 us 112.00 us 19306.00 us 24 CREATE 0.00 936.09 us 22.00 us 23659.00 us 187 FSTAT 0.01 1918.45 us 27.00 us 104525.00 us 194 OPEN 0.01 835.49 us 1.00 us 66949.00 us 570 OPENDIR 0.01 906.71 us 17.00 us 24395.00 us 635 READDIRP 0.01 814.12 us 11.00 us 22800.00 us 782 STATFS 0.01 1638.38 us 7.00 us 27751.00 us 399 ENTRYLK 0.02 676.51 us 23.00 us 60154.00 us 1341 STAT 0.02 12337.72 us 9.00 us 1093769.00 us 90 GETXATTR 0.06 3416.72 us 10.00 us 2658066.00 us 986 INODELK 0.57 1937.28 us 12.00 us 264094.00 us 15322 LOOKUP 1.42 1290.09 us 7.00 us 357697.00 us 57354 FINODELK 1.87 9391.45 us 12.00 us 808993.00 us 10372 FSYNC 1.88 3670.51 us 15.00 us 1453184.00 us 26708 FXATTROP 7.69 6216.58 us 27.00 us 571432.00 us 64311 READ 86.39 130052.64 us 47.00 us 849076.00 us 34555 WRITE Duration: 58560 seconds Data Read: 8469686670 bytes Data Written: 106142019052 bytes Interval 9 Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 4 6 8 No. of Writes: 0 1 0 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 9 308 128 No. of Writes: 0 161 77 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 334 391 89 No. of Writes: 46 14 11 Block Size: 131072b+ No. of Reads: 12 No. of Writes: 929 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 4 RELEASE 0.00 0.00 us 0.00 us 0.00 us 3 RELEASEDIR 0.00 200.00 us 200.00 us 200.00 us 1 READDIRP 0.00 1328.50 us 31.00 us 2626.00 us 2 FSTAT 0.00 1357.75 us 189.00 us 2867.00 us 4 OPENDIR 0.00 2471.75 us 42.00 us 8373.00 us 4 OPEN 0.01 12659.00 us 9422.00 us 15896.00 us 2 FLUSH 0.01 3924.20 us 19.00 us 10299.00 us 10 STATFS 0.06 3631.49 us 38.00 us 22434.00 us 47 STAT 0.50 6662.97 us 58.00 us 48168.00 us 213 LOOKUP 1.37 11380.10 us 13.00 us 143645.00 us 343 FSYNC 4.33 3938.49 us 7.00 us 237791.00 us 3142 FINODELK 4.49 10040.88 us 18.00 us 193886.00 us 1276 FXATTROP 6.13 13579.99 us 89.00 us 251439.00 us 1289 READ 83.10 191342.48 us 6995.00 us 736901.00 us 1240 WRITE Duration: 16 seconds Data Read: 29314832 bytes Data Written: 125550592 bytes
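The dd runs mentioned above are easier to compare when the same total amount of data is written with and without direct I/O. A rough sketch; /mnt/vms is only a placeholder for the actual mount point of the volume, and the block sizes are example values:

    # buffered write: largely measures the page cache / write-behind, flushed at the end
    dd if=/dev/zero of=/mnt/vms/testfile bs=1M count=1024 conv=fsync
    # direct write of the same amount, bypassing the client page cache
    dd if=/dev/zero of=/mnt/vms/testfile bs=1M count=1024 oflag=direct
    # direct write again, with a larger request size per call
    dd if=/dev/zero of=/mnt/vms/testfile bs=64M count=16 oflag=direct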
----- Original Message -----
> From: "Abi Askushi" <rightkicktech at gmail.com>
> To: "Krutika Dhananjay" <kdhananj at redhat.com>
> Cc: "gluster-user" <gluster-users at gluster.org>
> Sent: Tuesday, September 5, 2017 5:02:46 AM
> Subject: Re: [Gluster-users] Slow performance of gluster volume
>
> I observed that when testing with dd if=/dev/zero of=testfile bs=1G
> count=1 I get 65 MB/s on the vms gluster volume (and the network traffic
> between the servers reaches ~500 Mbps), while when testing with dd
> if=/dev/zero of=testfile bs=1G count=1 oflag=direct I get a consistent
> 10 MB/s, with the network traffic hardly reaching 100 Mbps.

I have a replica 3 volume on which I was seeing ~65 MB/sec in my VMs. I ended up upgrading to a newer version and now I get closer to 150-180 MB/sec writes. Since you are using an arbiter I would expect faster writes for you. What gluster version are you running? What OS?

-b
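A minimal sketch of collecting the version and OS details asked about here, run on each node; the package query and release file are typical for CentOS and may differ on other distributions:

    gluster --version
    rpm -qa | grep -i gluster
    cat /etc/redhat-release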