Vladimir Melnik
2019-Jul-03 08:39 UTC
[Gluster-users] Extremely low performance - am I doing something wrong?
Dear colleagues,

I have a lab with a bunch of virtual machines (the virtualization is provided by KVM) running on the same physical host. 4 of these VMs are working as a GlusterFS cluster and there's one more VM that works as a client. I'll specify all the packages' versions at the end of this message.

I created 2 volumes - one of type "Distributed-Replicate" and another of type "Distribute". The problem is that both volumes are showing really poor performance.

Here's what I see on the client:

$ mount | grep gluster
10.13.1.16:storage1 on /mnt/glusterfs1 type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
10.13.1.16:storage2 on /mnt/glusterfs2 type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)

$ for i in {1..5}; do { dd if=/dev/zero of=/mnt/glusterfs1/test.tmp bs=1M count=10 oflag=sync; rm -f /mnt/glusterfs1/test.tmp; } done
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 1.47936 s, 7.1 MB/s
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 1.62546 s, 6.5 MB/s
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 1.71229 s, 6.1 MB/s
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 1.68607 s, 6.2 MB/s
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 1.82204 s, 5.8 MB/s

$ for i in {1..5}; do { dd if=/dev/zero of=/mnt/glusterfs2/test.tmp bs=1M count=10 oflag=sync; rm -f /mnt/glusterfs2/test.tmp; } done
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 1.15739 s, 9.1 MB/s
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.978528 s, 10.7 MB/s
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.910642 s, 11.5 MB/s
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.998249 s, 10.5 MB/s
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 1.03377 s, 10.1 MB/s

The distributed one shows a bit better performance than the distributed-replicated one, but it's still poor. :-(

The disk storage itself is OK; here's what I see on each of the 4 GlusterFS servers:

$ for i in {1..5}; do { dd if=/dev/zero of=/mnt/storage1/test.tmp bs=1M count=10 oflag=sync; rm -f /mnt/storage1/test.tmp; } done
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.0656698 s, 160 MB/s
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.0476927 s, 220 MB/s
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.036526 s, 287 MB/s
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.0329145 s, 319 MB/s
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.0403988 s, 260 MB/s

The network between all 5 VMs is OK; they all run on the same physical host.

I can't understand what I'm doing wrong. :-(
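For what it's worth, the numbers themselves seem to point at latency rather than throughput: ten 1 MiB oflag=sync writes in ~1.5 s works out to roughly 150 ms per write. To separate the two effects I'm also planning to try a run that syncs only once at the end instead of after every write (a sketch, not yet measured):

$ dd if=/dev/zero of=/mnt/glusterfs1/test.tmp bs=1M count=100 conv=fdatasync
$ rm -f /mnt/glusterfs1/test.tmp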
Here's the detailed info about the volumes:

Volume Name: storage1
Type: Distributed-Replicate
Volume ID: a42e2554-99e5-4331-bcc4-0900d002ae32
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x (2 + 1) = 6
Transport-type: tcp
Bricks:
Brick1: gluster1.k8s.maitre-d.tucha.ua:/mnt/storage1/brick1
Brick2: gluster2.k8s.maitre-d.tucha.ua:/mnt/storage1/brick2
Brick3: gluster3.k8s.maitre-d.tucha.ua:/mnt/storage1/brick_arbiter (arbiter)
Brick4: gluster3.k8s.maitre-d.tucha.ua:/mnt/storage1/brick3
Brick5: gluster4.k8s.maitre-d.tucha.ua:/mnt/storage1/brick4
Brick6: gluster1.k8s.maitre-d.tucha.ua:/mnt/storage1/brick_arbiter (arbiter)
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off

Volume Name: storage2
Type: Distribute
Volume ID: df4d8096-ad03-493e-9e0e-586ce21fb067
Status: Started
Snapshot Count: 0
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: gluster1.k8s.maitre-d.tucha.ua:/mnt/storage2
Brick2: gluster2.k8s.maitre-d.tucha.ua:/mnt/storage2
Brick3: gluster3.k8s.maitre-d.tucha.ua:/mnt/storage2
Brick4: gluster4.k8s.maitre-d.tucha.ua:/mnt/storage2
Options Reconfigured:
transport.address-family: inet
nfs.disable: on

The OS is CentOS Linux release 7.6.1810. The packages I'm using are:
glusterfs-6.3-1.el7.x86_64
glusterfs-api-6.3-1.el7.x86_64
glusterfs-cli-6.3-1.el7.x86_64
glusterfs-client-xlators-6.3-1.el7.x86_64
glusterfs-fuse-6.3-1.el7.x86_64
glusterfs-libs-6.3-1.el7.x86_64
glusterfs-server-6.3-1.el7.x86_64
kernel-3.10.0-327.el7.x86_64
kernel-3.10.0-514.2.2.el7.x86_64
kernel-3.10.0-957.12.1.el7.x86_64
kernel-3.10.0-957.12.2.el7.x86_64
kernel-3.10.0-957.21.3.el7.x86_64
kernel-tools-3.10.0-957.21.3.el7.x86_64
kernel-tools-libs-3.10.0-957.21.3.el7.x86_64

Please be so kind as to help me understand: did I set something up wrong, or is this quite normal performance for GlusterFS?

Thanks in advance!
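P.S. I've also collected a few volume options that are sometimes suggested for this kind of workload and plan to test them one at a time (a sketch based on general tuning advice; I haven't verified whether any of them actually help on this version):

$ gluster volume set storage1 performance.client-io-threads on
$ gluster volume set storage1 client.event-threads 4
$ gluster volume set storage1 server.event-threads 4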
Vladimir Melnik
2019-Jul-03 10:20 UTC
[Gluster-users] Extremely low performance - am I doing something wrong?
Just to be exact, here are the results of iperf3 measurements between the client and one of the servers:

$ iperf3 -c gluster1
Connecting to host gluster1, port 5201
[  4] local 10.13.16.1 port 33156 connected to 10.13.1.16 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec  77.8 MBytes   652 Mbits/sec    3    337 KBytes
[  4]   1.00-2.00   sec  89.7 MBytes   752 Mbits/sec    0    505 KBytes
[  4]   2.00-3.00   sec   103 MBytes   862 Mbits/sec    2    631 KBytes
[  4]   3.00-4.00   sec   104 MBytes   870 Mbits/sec    1    741 KBytes
[  4]   4.00-5.00   sec  98.8 MBytes   828 Mbits/sec    1    834 KBytes
[  4]   5.00-6.00   sec   101 MBytes   849 Mbits/sec    0    923 KBytes
[  4]   6.00-7.00   sec   102 MBytes   860 Mbits/sec    0   1005 KBytes
[  4]   7.00-8.00   sec   106 MBytes   890 Mbits/sec    0   1.06 MBytes
[  4]   8.00-9.00   sec   109 MBytes   913 Mbits/sec    0   1.13 MBytes
[  4]   9.00-10.00  sec   109 MBytes   912 Mbits/sec    0   1.20 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec  1000 MBytes   839 Mbits/sec    7             sender
[  4]   0.00-10.00  sec   998 MBytes   837 Mbits/sec                  receiver

iperf Done.

$ iperf3 -c gluster1 -R
Connecting to host gluster1, port 5201
Reverse mode, remote host gluster1 is sending
[  4] local 10.13.16.1 port 33160 connected to 10.13.1.16 port 5201
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-1.00   sec  58.8 MBytes   492 Mbits/sec
[  4]   1.00-2.00   sec  80.1 MBytes   673 Mbits/sec
[  4]   2.00-3.00   sec  83.8 MBytes   703 Mbits/sec
[  4]   3.00-4.00   sec  95.6 MBytes   800 Mbits/sec
[  4]   4.00-5.00   sec   102 MBytes   858 Mbits/sec
[  4]   5.00-6.00   sec   101 MBytes   850 Mbits/sec
[  4]   6.00-7.00   sec   102 MBytes   860 Mbits/sec
[  4]   7.00-8.00   sec   107 MBytes   898 Mbits/sec
[  4]   8.00-9.00   sec   106 MBytes   893 Mbits/sec
[  4]   9.00-10.00  sec   108 MBytes   904 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec   949 MBytes   796 Mbits/sec    6             sender
[  4]   0.00-10.00  sec   946 MBytes   794 Mbits/sec                  receiver

iperf Done.

So, the bandwidth seems to be OK too.
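That said, since the dd tests use oflag=sync, per-operation round-trip latency probably matters more than bandwidth here. If it helps, I can also check the latency and where the time is spent on the bricks, along these lines (a sketch; profiling has to be started before re-running the dd test on the client):

$ ping -c 10 -q gluster1
$ gluster volume profile storage1 start
$ for i in {1..5}; do { dd if=/dev/zero of=/mnt/glusterfs1/test.tmp bs=1M count=10 oflag=sync; rm -f /mnt/glusterfs1/test.tmp; } done
$ gluster volume profile storage1 info
$ gluster volume profile storage1 stop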
On Wed, Jul 03, 2019 at 11:39:41AM +0300, Vladimir Melnik wrote:
> Dear colleagues,
>
> [...]

--
V.Melnik