Strahil Nikolov
2022-Jan-24 12:32 UTC
[Gluster-users] Is such level of performance degradation to be expected?
Hi Sam,

with the provided information it will be hard to find the reason behind those results. Let me try to guide you to some extent...

Usually synthetic benchmarks do not show anything, because Gluster has to be tuned to your real workload and not to a synthetic one. Also, RH recommends disks of 3-4 TB each in a HW RAID of 10-12 disks with a stripe size between 1M and 2M. Next, you need to ensure that hardware alignment is properly done.

Also, you can experiment with various I/O schedulers for your workload. Sometimes deadline is better, sometimes noop/none is. In your case (you are testing mainly writing) the second one should be more performant if the HW RAID card can cope with it.

As you test on a single node, network latency is out of scope. You can tune the tuned daemon for sequential or random I/O (which changes the dirty sysctl settings). Then the hard part comes, as there are several predefined groups of settings based on your workload that alter the behaviour totally. You are not required to use any of them, but they are a good start.

Keep in mind that Gluster performs better the more Gluster nodes you have and the more clients connect to the volume. A single client to a single node is not optimal, and even if you had enough RAM (for RAMFS), your server CPU would be your bottleneck.

What kind of workload do you have? I think that a lot of users would be able to help with recommendations. Also, share:
- HW RAID cache & settings
- OS
- Mount options
- Volume options
- Workload

Best Regards,
Strahil Nikolov

On Sun, Jan 23, 2022 at 7:46, Sam <mygluster22 at eml.cc> wrote:

Hello Everyone, I am just starting up with Gluster so pardon my ignorance if I am doing something incorrectly. In order to test the efficiency of GlusterFS, I wanted to compare its performance with the native file system on which it resides, and thus I kept both Gluster server & client on localhost to discount the role of the network. "/data" is the XFS mount point of my 36 spinning disks in a RAID10 array.
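[Editor's note] Strahil's point about hardware alignment can be made concrete for a RAID10 array like the one Sam describes. The numbers below are purely illustrative (the thread never states the per-disk stripe unit); the point is how the XFS su/sw values fall out of the RAID geometry:

```shell
# Illustrative only: assume a 256 KiB per-disk stripe unit (NOT stated in the thread).
# RAID10 mirrors disks in pairs, so of 36 disks only 18 carry unique data,
# and a full stripe spans su * 18.
STRIPE_UNIT_KB=256
DISKS=36
DATA_DISKS=$((DISKS / 2))                        # mirrored pairs -> 18 data-bearing disks
FULL_STRIPE_KB=$((STRIPE_UNIT_KB * DATA_DISKS))  # 256 * 18 = 4608 KiB
echo "mkfs.xfs -d su=${STRIPE_UNIT_KB}k,sw=${DATA_DISKS} ..."
echo "full stripe = ${FULL_STRIPE_KB} KiB"
```

On an existing file system, `xfs_info /data` shows the sunit/swidth actually in effect, which is one quick way to check whether the alignment Strahil mentions was done.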
I got the following results when I ran a fio-based bench script directly on "/data".

fio Disk Speed Tests (Mixed R/W 50/50):
---------------------------------------
Block Size | 4k            (IOPS) | 64k           (IOPS)
  ------   | ---            ----  | ----           ----
Read       | 302.17 MB/s  (75.5k) | 1.68 GB/s    (26.3k)
Write      | 302.97 MB/s  (75.7k) | 1.69 GB/s    (26.4k)
Total      | 605.15 MB/s (151.2k) | 3.38 GB/s    (52.8k)
           |                      |
Block Size | 512k          (IOPS) | 1m            (IOPS)
  ------   | ---            ----  | ----           ----
Read       | 1.73 GB/s     (3.3k) | 3.24 GB/s     (3.1k)
Write      | 1.82 GB/s     (3.5k) | 3.46 GB/s     (3.3k)
Total      | 3.56 GB/s     (6.9k) | 6.71 GB/s     (6.5k)

I then created a simple gluster volume "test" under "/data" after pointing "server" to 127.0.0.1 in "/etc/hosts" and mounted it on the same server at "/mnt":

# mkdir /data/gluster
# gluster volume create test server:/data/gluster
# gluster volume start test
# mount -t glusterfs server:test /mnt

Now when I run the same bench script on "/mnt", I am getting abysmally poor results:

fio Disk Speed Tests (Mixed R/W 50/50):
---------------------------------------
Block Size | 4k            (IOPS) | 64k           (IOPS)
  ------   | ---            ----  | ----           ----
Read       | 27.96 MB/s    (6.9k) | 13.96 MB/s     (218)
Write      | 27.98 MB/s    (6.9k) | 14.59 MB/s     (228)
Total      | 55.94 MB/s   (13.9k) | 28.55 MB/s     (446)
           |                      |
Block Size | 512k          (IOPS) | 1m            (IOPS)
  ------   | ---            ----  | ----           ----
Read       | 41.21 MB/s      (80) | 126.50 MB/s    (123)
Write      | 42.97 MB/s      (83) | 134.92 MB/s    (131)
Total      | 84.18 MB/s     (163) | 261.42 MB/s    (254)

There is plenty of free CPU & RAM available while the above test is running (70-80%) and negligible I/O wait (2-3%), so where is the bottleneck then?
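[Editor's note] One way to read the 64k row above: with CPU and I/O wait both low, converting the IOPS into an implied per-operation service time points at per-request overhead (FUSE context switches and the client/brick round trip, even on localhost) rather than raw disk throughput. A back-of-the-envelope conversion, assuming roughly one outstanding request at a time (an assumption; the script's actual iodepth is not shown in the thread):

```shell
# Convert the Gluster-mount IOPS figures into an implied ms-per-operation,
# assuming a single outstanding request (an assumption; the bench script's
# iodepth is not shown in the thread).
for iops in 6900 218; do
  awk -v n="$iops" 'BEGIN { printf "%d IOPS -> %.2f ms/op\n", n, 1000 / n }'
done
```

At 218 IOPS that is roughly 4.6 ms per 64k request. The raw XFS run sustained 26.3k IOPS at the same block size, so those extra milliseconds are being added above the file system, which is why per-op software overhead is the usual suspect.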
Or is this kind of performance degradation to be expected? Will really appreciate any insights.

Thanks,
Sam

________
Community Meeting Calendar:
Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users at gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users
Sam
2022-Jan-24 13:29 UTC
[Gluster-users] Is such level of performance degradation to be expected?
Thanks for your response Strahil.

> Usually synthetic benchmarks do not show anything, because gluster has to be tuned to your real workload and not to a synth.

I understand that they do not paint the real picture. But shouldn't the same benchmark, run against a set of file systems on the same server, still produce results that can be compared?

> Also, RH recommends disks of 3-4TB each in a HW raid of 10-12 disks with a stripe size between 1M and 2M. Next, you need to ensure that hardware alignment is properly done.

Gluster isn't interacting with the underlying RAID device here, so that shouldn't matter. If the XFS layer just below Gluster is giving me 3.5 GB/s random reads and writes (--rw=randrw --direct=1), why is Gluster above it struggling at 130 MB/s on the same RAID setup? That is 27 times slower.

I understand that a Gluster volume may perform better when its bricks are distributed across different nodes, but the fact that its performance penalty, compared to the file system it resides on, is so high doesn't inspire much confidence. I may be wrong here, but system settings, cache settings, RAID cache etc. shouldn't come into play, as the parent file system performs perfectly fine with the default settings.

- Sam
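[Editor's note] For anyone wanting to reproduce the comparison, Sam's --rw=randrw --direct=1 flags suggest a fio job along these lines. This is a hypothetical reconstruction: the actual bench script is not shown in the thread, and the iodepth/numjobs/size values are guesses, not Sam's settings.

```ini
; hypothetical mixed-4k.fio -- approximates the "Mixed R/W 50/50" 4k row
[global]
ioengine=libaio
direct=1
rw=randrw
rwmixread=50
time_based=1
runtime=30
directory=/mnt        ; use /data for the baseline XFS run

[mixed-4k]
bs=4k
size=1g
iodepth=64
numjobs=2
```

Run with `fio mixed-4k.fio`; repeating with bs=64k/512k/1m gives the other rows. Keeping iodepth and numjobs identical between the /data and /mnt runs matters for a fair comparison, since queue depth strongly affects how much per-op latency is hidden.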