Strahil Nikolov
2022-Jan-24 12:32 UTC
[Gluster-users] Is such level of performance degradation to be expected?
Hi Sam,

with the provided information it will be hard to find the reason behind those results. Let me try to guide you to some extent...

Usually synthetic benchmarks do not show anything, because Gluster has to be tuned to your real workload and not to a synthetic one. Also, RH recommends disks of 3-4 TB each in a HW RAID of 10-12 disks with a stripe size between 1M and 2M. Next, you need to ensure that hardware alignment is properly done.

Also, you can experiment with various I/O schedulers for your workload. Sometimes deadline is better, sometimes noop/none is. In your case (you are testing mainly writing) the second one should be more performant if the HW RAID card can cope with it.

As you test on a single node, network latency is out of scope. You can tune the tuned daemon for sequential or random I/O (which changes the dirty sysctl settings). Then the hard part comes, as there are several predefined groups of settings based on your workload that alter the behaviour totally. You are not required to use any of them, but they are a good start.

Keep in mind that Gluster performs better the more Gluster nodes you have and the more clients connect to the volume. A single client to a single node is not optimal, and even if you had enough RAM (for RAMFS), your server CPU would be your bottleneck.

What kind of workload do you have? I think that a lot of users would be able to help with recommendations. Also, share:
- HW RAID cache & settings
- OS
- Mount options
- Volume options
- Workload

Best Regards,
Strahil Nikolov

On Sun, Jan 23, 2022 at 7:46, Sam <mygluster22 at eml.cc> wrote:

Hello Everyone, I am just starting up with Gluster so pardon my ignorance if I am doing something incorrectly. In order to test the efficiency of GlusterFS, I wanted to compare its performance with the native file system on which it resides, and thus I kept both Gluster server & client on localhost to discount the role of the network. "/data" is the XFS mount point of my 36 spinning disks in a RAID10 array.
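[Editor's note] Strahil's point about hardware alignment can be made concrete for a RAID10 array like the one Sam describes. The numbers below are purely illustrative (the thread never states the per-disk stripe unit); the point is how the XFS su/sw values fall out of the RAID geometry:

```shell
# Illustrative only: assume a 256 KiB per-disk stripe unit (NOT stated in the thread).
# RAID10 mirrors disks in pairs, so of 36 disks only 18 carry unique data,
# and a full stripe spans su * 18.
STRIPE_UNIT_KB=256
DISKS=36
DATA_DISKS=$((DISKS / 2))                        # mirrored pairs -> 18 data-bearing disks
FULL_STRIPE_KB=$((STRIPE_UNIT_KB * DATA_DISKS))  # 256 * 18 = 4608 KiB
echo "mkfs.xfs -d su=${STRIPE_UNIT_KB}k,sw=${DATA_DISKS} ..."
echo "full stripe = ${FULL_STRIPE_KB} KiB"
```

On an existing file system, `xfs_info /data` shows the sunit/swidth actually in effect, which is one quick way to check whether the alignment Strahil mentions was done.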
I got the following results when I ran a fio-based bench script directly on "/data".

fio Disk Speed Tests (Mixed R/W 50/50):
---------------------------------------
Block Size | 4k            (IOPS) | 64k           (IOPS)
  ------   | ---            ----  | ----           ----
Read       | 302.17 MB/s  (75.5k) | 1.68 GB/s    (26.3k)
Write      | 302.97 MB/s  (75.7k) | 1.69 GB/s    (26.4k)
Total      | 605.15 MB/s (151.2k) | 3.38 GB/s    (52.8k)
           |                      |
Block Size | 512k          (IOPS) | 1m            (IOPS)
  ------   | ---            ----  | ----           ----
Read       | 1.73 GB/s     (3.3k) | 3.24 GB/s     (3.1k)
Write      | 1.82 GB/s     (3.5k) | 3.46 GB/s     (3.3k)
Total      | 3.56 GB/s     (6.9k) | 6.71 GB/s     (6.5k)

I then created a simple gluster volume "test" under "/data" after pointing "server" to 127.0.0.1 in "/etc/hosts" and mounted it on the same server at "/mnt":

# mkdir /data/gluster
# gluster volume create test server:/data/gluster
# gluster volume start test
# mount -t glusterfs server:test /mnt

Now when I run the same bench script on "/mnt", I am getting abysmally poor results:

fio Disk Speed Tests (Mixed R/W 50/50):
---------------------------------------
Block Size | 4k            (IOPS) | 64k           (IOPS)
  ------   | ---            ----  | ----           ----
Read       | 27.96 MB/s    (6.9k) | 13.96 MB/s     (218)
Write      | 27.98 MB/s    (6.9k) | 14.59 MB/s     (228)
Total      | 55.94 MB/s   (13.9k) | 28.55 MB/s     (446)
           |                      |
Block Size | 512k          (IOPS) | 1m            (IOPS)
  ------   | ---            ----  | ----           ----
Read       | 41.21 MB/s      (80) | 126.50 MB/s    (123)
Write      | 42.97 MB/s      (83) | 134.92 MB/s    (131)
Total      | 84.18 MB/s     (163) | 261.42 MB/s    (254)

There is plenty of free CPU & RAM available while the above test is running (70-80%) and negligible I/O wait (2-3%), so where is the bottleneck then?
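[Editor's note] One way to read the 64k row above: with CPU and I/O wait both low, converting the IOPS into an implied per-operation service time points at per-request overhead (FUSE context switches and the client/brick round trip, even on localhost) rather than raw disk throughput. A back-of-the-envelope conversion, assuming roughly one outstanding request at a time (an assumption; the script's actual iodepth is not shown in the thread):

```shell
# Convert the Gluster-mount IOPS figures into an implied ms-per-operation,
# assuming a single outstanding request (an assumption; the bench script's
# iodepth is not shown in the thread).
for iops in 6900 218; do
  awk -v n="$iops" 'BEGIN { printf "%d IOPS -> %.2f ms/op\n", n, 1000 / n }'
done
```

At 218 IOPS that is roughly 4.6 ms per 64k request. The raw XFS run sustained 26.3k IOPS at the same block size, so those extra milliseconds are being added above the file system, which is why per-op software overhead is the usual suspect.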
Or is this kind of performance degradation to be expected? Will really appreciate any insights.

Thanks,
Sam

________
Community Meeting Calendar:
Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users at gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users
Sam
2022-Jan-24 13:29 UTC
[Gluster-users] Is such level of performance degradation to be expected?
Thanks for your response Strahil.

> Usually synthetic benchmarks do not show anything, because gluster has to be tuned to your real workload and not to a synth.

I understand that they do not paint the real picture. But shouldn't the same benchmark, run against a set of file systems on the same server, still produce results that can be compared?

> Also, RH recommends disks of 3-4TB each in a HW raid of 10-12 disks with a stripe size between 1M and 2M. Next, you need to ensure that hardware alignment is properly done.

Gluster isn't interacting with the underlying RAID device here, so that shouldn't matter. If the XFS layer just below Gluster is giving me 3.5 GB/s random reads and writes (--rw=randrw --direct=1), why is Gluster above it struggling at 130 MB/s on the same RAID setup? That is 27 times slower.

I understand that a Gluster volume may perform better when its bricks are distributed across different nodes, but the fact that its performance penalty, compared to the file system it resides on, is so high doesn't inspire much confidence. I may be wrong here, but system settings, cache settings, RAID cache etc. shouldn't come into play, as the parent file system performs perfectly fine with the default settings.

- Sam
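[Editor's note] For anyone wanting to reproduce the comparison, Sam's --rw=randrw --direct=1 flags suggest a fio job along these lines. This is a hypothetical reconstruction: the actual bench script is not shown in the thread, and the iodepth/numjobs/size values are guesses, not Sam's settings.

```ini
; hypothetical mixed-4k.fio -- approximates the "Mixed R/W 50/50" 4k row
[global]
ioengine=libaio
direct=1
rw=randrw
rwmixread=50
time_based=1
runtime=30
directory=/mnt        ; use /data for the baseline XFS run

[mixed-4k]
bs=4k
size=1g
iodepth=64
numjobs=2
```

Run with `fio mixed-4k.fio`; repeating with bs=64k/512k/1m gives the other rows. Keeping iodepth and numjobs identical between the /data and /mnt runs matters for a fair comparison, since queue depth strongly affects how much per-op latency is hidden.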