Dmitry Antipov
2020-Nov-25 16:08 UTC
[Gluster-users] Poor performance on a server-class system vs. desktop
I'm trying to investigate the poor I/O performance observed on a
server-class system vs. a desktop-class one.
The latter is an 8-core notebook with an NVMe disk. According to
fio --name test --filename=XXX --bs=4k --rw=randwrite --ioengine=libaio --direct=1 \
    --iodepth=128 --numjobs=1 --runtime=60 --time_based=1
this disk is able to perform 4K random writes at ~100K IOPS. When I create a
glusterfs volume using the same disk as the backing store:
Volume Name: test1
Type: Replicate
Volume ID: 87bad2a9-7a4a-43fc-94d2-de72965b63d6
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 192.168.1.112:/glusterfs/test1-000
Brick2: 192.168.1.112:/glusterfs/test1-001
Brick3: 192.168.1.112:/glusterfs/test1-002
Options Reconfigured:
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
and run:
[global]
name=ref-write
filename=testfile
ioengine=gfapi_async
volume=test1
brick=localhost
create_on_open=1
rw=randwrite
direct=1
numjobs=1
time_based=1
runtime=60
[test-4-kbytes]
bs=4k
size=1G
iodepth=128
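For reference, a volume like the one above can be created roughly like this (all
three bricks sit on one host, so 'force' is needed), and the job file -- saved here
as ref-write.fio, the name is arbitrary -- is run with an fio build that includes
gfapi support for the gfapi_async engine:

# create a 3-way replicated volume with all bricks on a single node
gluster volume create test1 replica 3 \
    192.168.1.112:/glusterfs/test1-000 \
    192.168.1.112:/glusterfs/test1-001 \
    192.168.1.112:/glusterfs/test1-002 force
gluster volume start test1
# run the job file shown above (file name is illustrative)
fio ref-write.fio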
I'm seeing ~10K IOPS. So adding an extra layer (glusterfs :-) between an I/O client
(fio in this case) and the NVMe disk introduces ~10x overhead. Maybe worse than
expected, but things get even worse when I switch to the server.
The server is a 32-core machine with an NVMe disk capable of serving the same I/O
pattern at ~200K IOPS. I expected something close to linear scaling, i.e. ~20K IOPS
when running the same fio workload on a gluster volume, but surprisingly I got
almost exactly the same ~10K IOPS as on the desktop-class machine.
So here the overhead is ~20x vs. ~10x on the desktop.
The OSes are different (Fedora 33 on the notebook and a relatively old Debian 9 on
the server), but both systems run fairly recent 5.9.x kernels (without heavy sysctl
or similar tuning) and glusterfs 8.2, with XFS as the filesystem under the bricks.
I would greatly appreciate any ideas on debugging this.
Dmitry
Strahil Nikolov
2020-Nov-25 17:42 UTC
[Gluster-users] Poor performance on a server-class system vs. desktop
Having the same performance on two very fast disks indicates that you are hitting a limit.
You can start with this article:
https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3/html/administration_guide/small_file_performance_enhancements
Most probably increasing performance.io-thread-count could help.

Best Regards,
Strahil Nikolov
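For reference, on gluster 8.x that option can be changed and checked back with the
stock CLI; the value 32 below is only an illustrative starting point, not a number
taken from the article:

# raise the io-threads pool for volume test1 (illustrative value)
gluster volume set test1 performance.io-thread-count 32
# confirm the option now in effect
gluster volume get test1 performance.io-thread-count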