Ernie Dunbar
2018-Jul-31 18:38 UTC
[Gluster-users] Need advice about optimal configuration for VM hosting.
Hi everyone. I need some sage advice for upcoming upgrades we're planning for our Gluster array.

I'll start by describing our server cluster. We currently have 3 Proxmox nodes. Two of them are the workhorses, running 12 of our production VMs and a handful of dev VMs that don't see the heavy use the production VMs do. The third is a slower machine that mostly exists to help Proxmox avoid split-brain issues, but it also runs the third Galera node for our database. Galera actually runs on the local disks of each Proxmox node because of the performance issues we've been having with the Gluster array.

Under that, we have our Gluster array, which currently consists of two nodes in a Replicated volume type. Each node has two bricks: one for VMs and one for our mail. The problem is that it's a bit overloaded, mostly thanks to the VM traffic.

I've got a couple of new nodes to throw into that array, but in the past, adding a third node killed performance so badly that even after the array rebuild, we were looking at a decrease in disk performance of about 50%. I'm thinking that adding a fourth node might improve things, but that still means waiting for the Gluster array to rebuild, suffering the performance degradation of the third node, then adding the fourth, suffering the performance degradation of the rebuild again, and then... what? See what happens and add more nodes to hopefully improve things further? That doesn't sound like a good strategy to me. There are too many open questions, and too much hoping things get better.

Another strategy I was thinking of is to build a whole new Gluster array with the latest version of Gluster. Then, instead of suffering the long array rebuild times, we can move VMs one at a time to the new Gluster array. It's more work for me, but probably less pain for our customers.
I was thinking that the new array should be 4 nodes in a Distributed-Replicated volume type, and that after we've moved our data off the old array, we'd nuke the old nodes and add them to the new array. My understanding is that we'd have to add new nodes two at a time to ensure the replication makes sense to Gluster.
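
To illustrate the two-at-a-time point: in a distributed-replicated volume with a replica count of 2, Gluster groups bricks into replica sets in the order they are listed, so the brick count has to stay a multiple of the replica count. The Python sketch below is only a toy model of that grouping, not Gluster's own code. The host names and brick paths are hypothetical, and `replica_sets` is a helper made up for this example.

```python
def replica_sets(bricks, replica=2):
    """Group an ordered brick list into replica sets of size `replica`.

    Toy model: consecutive bricks in the list mirror each other, and
    files are then distributed across the resulting sets.
    """
    if replica < 1:
        raise ValueError("replica count must be at least 1")
    if len(bricks) % replica != 0:
        raise ValueError(
            f"{len(bricks)} bricks cannot be split into replica sets of {replica}"
        )
    return [bricks[i:i + replica] for i in range(0, len(bricks), replica)]


# Hypothetical host names and brick paths for the proposed 4-node array.
new_array = [f"node{n}:/data/vm-brick" for n in range(1, 5)]
sets = replica_sets(new_array, replica=2)

for idx, members in enumerate(sets):
    print(f"replica set {idx}: " + " <-> ".join(members))

# Adding the two old nodes later keeps the layout valid (three sets of two).
expanded = replica_sets(new_array + ["old1:/data/vm-brick", "old2:/data/vm-brick"])

# Adding a single node does not: there is nothing for its brick to mirror.
try:
    replica_sets(new_array + ["old1:/data/vm-brick"])
except ValueError as err:
    print("rejected:", err)
```

In this model, going from 4 to 6 bricks adds a third replica set for files to be distributed across, while a lone fifth brick is rejected because it would have no partner.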