Ewen Chan
2020-Nov-26 03:33 UTC
[Gluster-users] Poor performance on a server-class system vs. desktop
Dmitry:

Is there a way to check and see if the GlusterFS write requests are being routed through the network interface?

I am asking this because of your bricks/host definition as you showed below.

Thanks.

Sincerely,
Ewen

________________________________
From: gluster-users-bounces at gluster.org <gluster-users-bounces at gluster.org> on behalf of Strahil Nikolov <hunter86_bg at yahoo.com>
Sent: November 25, 2020 12:42 PM
To: Dmitry Antipov <dmantipov at yandex.ru>
Cc: gluster-users <gluster-users at gluster.org>
Subject: Re: [Gluster-users] Poor performance on a server-class system vs. desktop

Having the same performance on 2 very fast disks indicates that you are hitting a limit.

You can start with this article:
https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3/html/administration_guide/small_file_performance_enhancements

Most probably increasing performance.io-thread-count could help.

Best Regards,
Strahil Nikolov

On 25.11.2020 at 19:08 +0300, Dmitry Antipov wrote:
> I'm trying to investigate the poor I/O performance results
> observed on a server-class system vs. the desktop-class one.
>
> The second one is an 8-core notebook with an NVMe disk. According to
>
> fio --name test --filename=XXX --bs=4k --rw=randwrite --ioengine=libaio --direct=1 \
>     --iodepth=128 --numjobs=1 --runtime=60 --time_based=1
>
> this disk is able to perform 4K random writes at ~100K IOPS. When I create the
> glusterfs volume using the same disk as backing store:
>
> Volume Name: test1
> Type: Replicate
> Volume ID: 87bad2a9-7a4a-43fc-94d2-de72965b63d6
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: 192.168.1.112:/glusterfs/test1-000
> Brick2: 192.168.1.112:/glusterfs/test1-001
> Brick3: 192.168.1.112:/glusterfs/test1-002
> Options Reconfigured:
> storage.fips-mode-rchecksum: on
> transport.address-family: inet
> nfs.disable: on
> performance.client-io-threads: off
>
> and run:
>
> [global]
> name=ref-write
> filename=testfile
> ioengine=gfapi_async
> volume=test1
> brick=localhost
> create_on_open=1
> rw=randwrite
> direct=1
> numjobs=1
> time_based=1
> runtime=60
>
> [test-4-kbytes]
> bs=4k
> size=1G
> iodepth=128
>
> I'm seeing ~10K IOPS. So adding an extra layer (glusterfs :-) between an I/O
> client (fio in this case) and the NVMe disk introduces ~10x overhead. Maybe
> worse than expected, but things get even worse when I switch to the server.
>
> The server is a 32-core machine with an NVMe disk capable of serving the same
> I/O pattern at ~200K IOPS. I expected something close to linear scalability,
> i.e. ~20K IOPS when running the same fio workload on a gluster volume. But I
> surprisingly got something very close to the same ~10K IOPS as seen on the
> desktop-class machine. So here it is ~20x overhead vs. the ~10x one on the
> desktop.
>
> The OSes are different (Fedora Core 33 on the notebook and a relatively old
> Debian 9 on the server), but both systems run fairly recent 5.9.x kernels
> (without massive tricky tuning via sysctl or similar methods) and glusterfs
> 8.2, using XFS as the filesystem under the bricks.
>
> I would greatly appreciate any ideas on debugging this.
>
> Dmitry
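For reference, a minimal sketch of the tuning Strahil suggests, applied to
the test1 volume from the quoted setup; the value 32 is an assumption for
illustration, not a measured recommendation:

    # raise the number of server-side I/O threads on the volume
    gluster volume set test1 performance.io-thread-count 32
    # confirm the option took effect
    gluster volume get test1 performance.io-thread-count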
Dmitry Antipov
2020-Nov-26 05:14 UTC
[Gluster-users] Poor performance on a server-class system vs. desktop
On 11/26/20 6:33 AM, Ewen Chan wrote:
> Dmitry:
>
> Is there a way to check and see if the GlusterFS write requests are being routed through the network interface?
>
> I am asking this because of your bricks/host definition as you showed below.

In my test setup, all bricks and the client workload (fio) are running on
the same host, so all network traffic should be routed through the loopback
interface, which is CPU-bound. Since the server is 32-core and has plenty of
RAM, loopback should be faster than even 10GbE.

Dmitry
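For completeness, one minimal way to confirm that the fio traffic really
stays on loopback is to compare the lo byte counters before and after a run;
the job file name below is an assumption standing in for the fio config
quoted in the earlier message:

    # snapshot loopback TX bytes, run the workload, snapshot again;
    # a delta roughly matching the fio write volume confirms the routing
    before=$(cat /sys/class/net/lo/statistics/tx_bytes)
    fio gluster-randwrite.fio
    after=$(cat /sys/class/net/lo/statistics/tx_bytes)
    echo "loopback TX during run: $((after - before)) bytes"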