Hi,

I am looking into a new gluster deployment to replace an ancient one.

For this deployment I will be using some repurposed servers I already have in stock. The disk specs are 12 * 3 TB SATA disks, no HW RAID controller. They also have some SSDs which would be nice to leverage as cache or similar to improve performance, since they are already there. Advice on how to leverage the SSDs would be greatly appreciated.

One of the design choices I have to make is using 3 nodes for a replica-3 with JBOD, or using 2 nodes with a replica-2 and SW RAID 6 for the disks, maybe adding a 3rd node with a smaller amount of disk as a metadata node for the replica set. I would love to hear advice on the pros and cons of each setup from the gluster experts.

The data will be accessed from 4 to 6 systems with native gluster; not sure if that makes any difference.

The amount of data I have to store there is currently 20 TB, with moderate growth. IOPS are quite low, so high performance is not an issue. The data will fit in either of the two setups.

Thanks in advance for your advice!

--
Eduardo Mayoral Jimeno
Systems engineer, platform department. Arsys Internet.
emayoral at arsys.es - +34 941 620 105 - ext 2153
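Roughly, the two layouts under discussion would be created along these lines (a minimal sketch only; hostnames, brick paths and the reduced disk count are placeholders, not the actual servers):

# Option A: 3 nodes, JBOD, one brick per disk, replica 3
# (bricks are listed in replica sets of 3: one brick per node for each disk)
gluster volume create vol0 replica 3 \
  node1:/bricks/disk1/brick node2:/bricks/disk1/brick node3:/bricks/disk1/brick \
  node1:/bricks/disk2/brick node2:/bricks/disk2/brick node3:/bricks/disk2/brick
# ...repeat the pattern for the remaining disks

# Option B: 2 data nodes with one SW RAID 6 brick each, plus a small third
# node that holds only metadata (an arbiter brick)
gluster volume create vol0 replica 3 arbiter 1 \
  node1:/bricks/raid6/brick node2:/bricks/raid6/brick node3:/bricks/arbiter/brick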
Hi Eduardo,

> I am looking into a new gluster deployment to replace an ancient one.
>
> For this deployment I will be using some repurposed servers I already
> have in stock. The disk specs are 12 * 3 TB SATA disks. No HW RAID
> controller. They also have some SSD which would be nice to leverage as
> cache or similar to improve performance, since it is already there.
> Advice on how to leverage the SSDs would be greatly appreciated.

Gluster tiering was dropped in favour of LVM cache. Keep in mind that in RHEL/CentOS 7 you should be careful with the migration_threshold value, as it is sometimes smaller than the chunk size. For details check: https://bugzilla.redhat.com/show_bug.cgi?id=1668163

> One of the design choices I have to make is using 3 nodes for a
> replica-3 with JBOD, or using 2 nodes with a replica-2 and using SW RAID
> 6 for the disks, maybe adding a 3rd node with a smaller amount of disk
> as metadata node for the replica set. I would love to hear advice on the
> pros and cons of each setup from the gluster experts.

If you go with replica 3, your reads will come from 3 servers, thus higher speeds.
If you choose replica 2, you will eventually end up in a split brain (not a good one).
If you choose replica 2 arbiter 1 (the old "replica 3 arbiter 1"), you will read from only 2 servers, but save bandwidth. Keep in mind that you need high-bandwidth NICs, as bonding/teaming balances based on MAC, IP and port, which in your case will all be the same.
Another option is to use GlusterD2 with replica 2 and a remote arbiter (for example in the cloud or somewhere far away). This setup does not require the arbiter to respond in a timely manner, and the arbiter is only used if 1 data brick is down.

> The data will be accessed from 4 to 6 systems with native gluster,
> not sure if that makes any difference.
>
> The amount of data I have to store there is currently 20 TB, with
> moderate growth. iops are quite low so high performance is not an issue.
> The data will fit in any of the two setups.

I would go with replica 3 if the NICs are 10 Gbit/s or bigger, and replica 2 arbiter 1 if the NICs are smaller. GlusterD2 is still new and might be too risky for production (Gluster devs can correct me here).

My current setup is Gluster v6.1 on oVirt in a replica 2 arbiter 1, with 6 x 1 Gbit/s NIC ports (consumer grade). In order to overcome the load-balancing issue, I'm using multiple thin LVs on top of a single NVMe - each LV is a gluster brick. Each gluster volume has a separate TCP port, and thus the teaming device load-balances the traffic onto another NIC. This allows me to stripe my data at the VM level, but this setup is only OK for labs.

Best Regards,
Strahil Nikolov
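A minimal sketch of the LVM cache layout mentioned above, for a single brick (all device, VG and LV names and the sizes are placeholders; the migration_threshold value is only an example and should be checked against the chunk size as per the bug report above):

# /dev/sdb = one SATA data disk, /dev/sdk = the SSD (placeholders)
pvcreate /dev/sdb /dev/sdk
vgcreate vg_brick1 /dev/sdb /dev/sdk
lvcreate -n data -l 100%PVS vg_brick1 /dev/sdb                    # slow data LV on the SATA disk
lvcreate --type cache-pool -n cache -L 100G vg_brick1 /dev/sdk    # cache pool on the SSD
lvconvert --type cache --cachepool vg_brick1/cache vg_brick1/data
# verify that migration_threshold is not smaller than the chunk size
lvs -o +chunk_size,cache_settings,cache_policy vg_brick1/data
lvchange --cachesettings 'migration_threshold=16384' vg_brick1/data
mkfs.xfs -i size=512 /dev/vg_brick1/data                          # then mount it and use it as a brick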
Good morning,

my comment won't help you directly, but I thought I'd send it anyway...

Our first glusterfs setup had 3 servers with 4 disks (10 TB, JBOD) each, one brick per disk. It was running fine in the beginning, but then 1 disk failed. The following heal took ~1 month, with bad performance (quite high IO). Shortly after the heal had finished, another disk failed -> same problems again. Not funny.

For our new system we decided to use 3 servers with 10 disks (10 TB) each, but now the 10 disks are in a SW RAID 10 (well, we split the 10 disks into 2 SW RAID 10 arrays; each of them is a brick, and we have 2 gluster volumes). A lot of disk space is "wasted" with this type of SW RAID and a replica-3 setup, but we wanted to avoid the "healing takes a long time with bad performance" problem. Now mdadm takes care of replicating data, and glusterfs should always see "good" bricks.

The decision may also depend on what kind of data you have. Many small files, like tens of millions? Or not that much, but bigger files? I once watched a video (I think it was this one: https://www.youtube.com/watch?v=61HDVwttNYI). The recommendation there: RAID 6 or 10 for small files, for big files... well, it is already 2 years "old" ;-)

As I said, this won't help you directly. You have to identify what's most important for your scenario; as you said, high performance is not an issue - if this is still true even when you have slight performance issues after a disk failure, then OK. My experience so far: the bigger and slower the disks are and the more data you have -> healing will hurt -> try to avoid it. If the disks are small and fast (SSDs), healing will be faster -> JBOD is an option.

hth,
Hubert
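For reference, a minimal sketch of one of the two arrays in the setup described above (device names are placeholders; each array ends up as one brick of a replica-3 volume):

# 5 of the 10 disks go into the first array (md RAID 10 also accepts an odd device count)
mdadm --create /dev/md0 --level=10 --raid-devices=5 /dev/sd[b-f]
mkfs.xfs -i size=512 /dev/md0
mkdir -p /bricks/md0
mount /dev/md0 /bricks/md0
# same again for the other 5 disks as /dev/md1, then on each of the 3 servers
# the mounted arrays are used as the bricks of the two replica-3 volumes, e.g.:
# gluster volume create vol1 replica 3 serverA:/bricks/md0/brick serverB:/bricks/md0/brick serverC:/bricks/md0/brick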