Danny
2023-Dec-12 19:29 UTC
[Gluster-users] Gluster Performance - 12 Gbps SSDs and 10 Gbps NIC
Nope, not a caching thing. I've tried multiple different types of fio tests, and all produce the same results: Gbps when hitting the disks locally, slow MB/s when hitting the Gluster FUSE mount.

I've been reading up on gluster-ganesha, and will give that a try.

On Tue, Dec 12, 2023 at 1:58 PM Ramon Selga <ramon.selga at gmail.com> wrote:

> Dismiss my first question: you have SAS 12Gbps SSDs. Sorry!
>
> On 12/12/23 at 19:52, Ramon Selga wrote:
>
> May I ask which kind of disks you have in this setup? Rotational, SSD
> SAS/SATA, NVMe?
>
> Is there a RAID controller with writeback caching?
>
> It seems to me your fio test on the local brick has an unclear result due to
> some caching.
>
> Try something like this (consider increasing the test file size depending
> on your cache memory):
>
> fio --size=16G --name=test --filename=/gluster/data/brick/wow --bs=1M --nrfiles=1 --direct=1 --sync=0 --randrepeat=0 --rw=write --refill_buffers --end_fsync=1 --iodepth=200 --ioengine=libaio
>
> Also remember that a replica 3 arbiter 1 volume writes synchronously to two
> data bricks, halving the throughput of your network backend.
>
> Try a similar fio run on the gluster mount, but I hardly ever see more than
> 300MB/s writing sequentially on only one FUSE mount, even with an NVMe
> backend. On the other hand, with 4 to 6 clients, you can easily reach
> 1.5GB/s of aggregate throughput.
>
> To start, I think it is better to try with default parameters for your
> replica volume.
>
> Best regards!
>
> Ramon
>
> On 12/12/23 at 19:10, Danny wrote:
>
> Sorry, I noticed that too after I posted, so I instantly upgraded to 10.
> Issue remains.
>
> On Tue, Dec 12, 2023 at 1:09 PM Gilberto Ferreira <
> gilberto.nunes32 at gmail.com> wrote:
>
>> I strongly suggest you update to version 10 or higher.
>> It comes with significant performance improvements.
>> ---
>> Gilberto Nunes Ferreira
>> (47) 99676-7530 - Whatsapp / Telegram
>>
>> On Tue, Dec 12, 2023 at 13:03, Danny <dbray925+gluster at gmail.com>
>> wrote:
>>
>>> MTU is already 9000, and as you can see from the IPERF results, I've got
>>> a nice, fast connection between the nodes.
>>>
>>> On Tue, Dec 12, 2023 at 9:49 AM Strahil Nikolov <hunter86_bg at yahoo.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> Let's try the simple things:
>>>>
>>>> Check if you can use MTU 9000 and, if it's possible, set it on the Bond
>>>> Slaves and the bond devices:
>>>> ping GLUSTER_PEER -c 10 -M do -s 8972
>>>>
>>>> Then try to follow the recommendations from
>>>> https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.5/html/administration_guide/chap-configuring_red_hat_storage_for_enhancing_performance
>>>>
>>>> Best Regards,
>>>> Strahil Nikolov
>>>>
>>>> On Monday, December 11, 2023, 3:32 PM, Danny <
>>>> dbray925+gluster at gmail.com> wrote:
>>>>
>>>> Hello list, I'm hoping someone can let me know what setting I missed.
>>>>
>>>> Hardware:
>>>> Dell R650 servers, Dual 24 Core Xeon 2.8 GHz, 1 TB RAM
>>>> 8x SSDs, Negotiated Speed 12 Gbps
>>>> PERC H755 Controller - RAID 6
>>>> Created virtual "data" disk from the above 8 SSD drives, for a ~20 TB
>>>> /dev/sdb
>>>>
>>>> OS:
>>>> CentOS Stream
>>>> kernel-4.18.0-526.el8.x86_64
>>>> glusterfs-7.9-1.el8.x86_64
>>>>
>>>> IPERF Test between nodes:
>>>> [ ID] Interval           Transfer     Bitrate         Retr
>>>> [  5]   0.00-10.00  sec  11.5 GBytes  9.90 Gbits/sec    0    sender
>>>> [  5]   0.00-10.04  sec  11.5 GBytes  9.86 Gbits/sec         receiver
>>>>
>>>> All good there. ~10 Gbps, as expected.
>>>>
>>>> LVM Install:
>>>> export DISK="/dev/sdb"
>>>> sudo parted --script $DISK "mklabel gpt"
>>>> sudo parted --script $DISK "mkpart primary 0% 100%"
>>>> sudo parted --script $DISK "set 1 lvm on"
>>>> sudo pvcreate --dataalignment 128K /dev/sdb1
>>>> sudo vgcreate --physicalextentsize 128K gfs_vg /dev/sdb1
>>>> sudo lvcreate -L 16G -n gfs_pool_meta gfs_vg
>>>> sudo lvcreate -l 95%FREE -n gfs_pool gfs_vg
>>>> sudo lvconvert --chunksize 1280K --thinpool gfs_vg/gfs_pool --poolmetadata gfs_vg/gfs_pool_meta
>>>> sudo lvchange --zero n gfs_vg/gfs_pool
>>>> sudo lvcreate -V 19.5TiB --thinpool gfs_vg/gfs_pool -n gfs_lv
>>>> sudo mkfs.xfs -f -i size=512 -n size=8192 -d su=128k,sw=10 /dev/mapper/gfs_vg-gfs_lv
>>>> sudo vim /etc/fstab
>>>> /dev/mapper/gfs_vg-gfs_lv   /gluster/data/brick   xfs   rw,inode64,noatime,nouuid 0 0
>>>>
>>>> sudo systemctl daemon-reload && sudo mount -a
>>>> fio --name=test --filename=/gluster/data/brick/wow --size=1G --readwrite=write
>>>>
>>>> Run status group 0 (all jobs):
>>>>   WRITE: bw=2081MiB/s (2182MB/s), 2081MiB/s-2081MiB/s (2182MB/s-2182MB/s), io=1024MiB (1074MB), run=492-492msec
>>>>
>>>> All good there. 2182MB/s =~ 17.5 Gbps. Nice!
>>>>
>>>> Gluster install:
>>>> export NODE1='10.54.95.123'
>>>> export NODE2='10.54.95.124'
>>>> export NODE3='10.54.95.125'
>>>> sudo gluster peer probe $NODE2
>>>> sudo gluster peer probe $NODE3
>>>> sudo gluster volume create data replica 3 arbiter 1 $NODE1:/gluster/data/brick $NODE2:/gluster/data/brick $NODE3:/gluster/data/brick force
>>>> sudo gluster volume set data network.ping-timeout 5
>>>> sudo gluster volume set data performance.client-io-threads on
>>>> sudo gluster volume set data group metadata-cache
>>>> sudo gluster volume start data
>>>> sudo gluster volume info all
>>>>
>>>> Volume Name: data
>>>> Type: Replicate
>>>> Volume ID: b52b5212-82c8-4b1a-8db3-52468bc0226e
>>>> Status: Started
>>>> Snapshot Count: 0
>>>> Number of Bricks: 1 x (2 + 1) = 3
>>>> Transport-type: tcp
>>>> Bricks:
>>>> Brick1: 10.54.95.123:/gluster/data/brick
>>>> Brick2: 10.54.95.124:/gluster/data/brick
>>>> Brick3: 10.54.95.125:/gluster/data/brick (arbiter)
>>>> Options Reconfigured:
>>>> network.inode-lru-limit: 200000
>>>> performance.md-cache-timeout: 600
>>>> performance.cache-invalidation: on
>>>> performance.stat-prefetch: on
>>>> features.cache-invalidation-timeout: 600
>>>> features.cache-invalidation: on
>>>> network.ping-timeout: 5
>>>> transport.address-family: inet
>>>> storage.fips-mode-rchecksum: on
>>>> nfs.disable: on
>>>> performance.client-io-threads: on
>>>>
>>>> sudo vim /etc/fstab
>>>> localhost:/data   /data   glusterfs   defaults,_netdev 0 0
>>>>
>>>> sudo systemctl daemon-reload && sudo mount -a
>>>> fio --name=test --filename=/data/wow --size=1G --readwrite=write
>>>>
>>>> Run status group 0 (all jobs):
>>>>   WRITE: bw=109MiB/s (115MB/s), 109MiB/s-109MiB/s (115MB/s-115MB/s), io=1024MiB (1074MB), run=9366-9366msec
>>>>
>>>> Oh no, what's wrong? From 2182MB/s down to only 115MB/s? What am I
>>>> missing? I'm not expecting the above ~17 Gbps, but I'm thinking it should
>>>> at least be close(r) to ~10 Gbps.
>>>>
>>>> Any suggestions?
>>>> ________
>>>>
>>>> Community Meeting Calendar:
>>>>
>>>> Schedule -
>>>> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
>>>> Bridge: https://meet.google.com/cpu-eiue-hvk
>>>> Gluster-users mailing list
>>>> Gluster-users at gluster.org
>>>> https://lists.gluster.org/mailman/listinfo/gluster-users
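
The figures quoted in this thread mix MB/s and Gbps, so it may help to put them on one scale. What follows is only a rough sketch: the function names are made up for illustration, and the ceiling calculation assumes a remote client that sends both data copies of a replica 3 arbiter 1 volume through a single NIC, ignoring protocol overhead and the arbiter's metadata traffic. In the setup above the mount is on a server that also holds a brick, so the real ceiling is likely somewhat different.

```shell
#!/bin/sh
# Convert a throughput in MB/s (decimal megabytes, as fio prints in
# parentheses) to Gbit/s.
mbs_to_gbps() {
    awk -v mbs="$1" 'BEGIN { printf "%.2f\n", mbs * 8 / 1000 }'
}

# Rough per-client write ceiling in MB/s when each write goes to `copies`
# data bricks over one link of `link_gbps` Gbit/s.
# Assumption (not from the thread): all copies cross the same NIC.
replica_write_ceiling_mbs() {
    awk -v g="$1" -v c="$2" 'BEGIN { printf "%.0f\n", g * 1000 / 8 / c }'
}

mbs_to_gbps 2182                 # local brick result -> 17.46
mbs_to_gbps 115                  # FUSE mount result  -> 0.92
replica_write_ceiling_mbs 10 2   # 10 Gbps link, two data bricks -> 625
```

Under these assumptions, the 115MB/s seen on the FUSE mount is well below even the halved network ceiling, which is consistent with the bottleneck being somewhere other than raw link bandwidth.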
Gilberto Ferreira
2023-Dec-12 19:59 UTC
[Gluster-users] Gluster Performance - 12 Gbps SSDs and 10 Gbps NIC
FUSE adds some overhead. Take a look at libgfapi:
https://staged-gluster-docs.readthedocs.io/en/release3.7.0beta1/Features/libgfapi/

I know this doc is somewhat out of date, but it could be a hint.
---
Gilberto Nunes Ferreira
(47) 99676-7530 - Whatsapp / Telegram
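
One way to try libgfapi without writing any code may be fio's `gfapi` ioengine, which talks to the volume directly instead of going through the FUSE mount. The sketch below only prints a candidate command line; it does not run it. It assumes that the local fio build includes gfapi support (check the list printed by `fio --enghelp`), that `--filename` is interpreted relative to the volume root, and it reuses the volume name and node address from the thread. The helper name `gfapi_fio_cmd` is hypothetical.

```shell
#!/bin/sh
# Print (not run) a fio command line that targets a Gluster volume through
# libgfapi rather than the FUSE mount.
# Assumption: fio was built with the gfapi engine; verify with `fio --enghelp`.
gfapi_fio_cmd() {
    volume="$1"   # Gluster volume name, e.g. "data" from the thread
    brick="$2"    # hostname or IP of one server in the trusted pool
    printf 'fio --name=test --ioengine=gfapi --volume=%s --brick=%s --create_on_open=1 --filename=wow --size=1G --bs=1M --rw=write --end_fsync=1\n' \
        "$volume" "$brick"
}

gfapi_fio_cmd data 10.54.95.123
```

Comparing the result of such a run with the FUSE-mount number would give a rough idea of how much of the gap is FUSE overhead and how much comes from replication itself.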