Ingo Fischer
2019-Apr-03 06:48 UTC
[Gluster-users] Is "replica 4 arbiter 1" allowed to tweak client-quorum?
Hi All,

I had a replica 2 cluster to host the VM images for my Proxmox cluster. I worked around split-brain scenarios to some extent by using "nufa" to make sure the files are located on the host where the machine normally runs. So in fact one replica could fail and the VM kept working.

Then I thought about doing better and decided to add a node to increase the replica count, and I decided against the arbiter approach. During this I also decided to move away from nufa to make it a more standard setup.

But by adding the third replica and removing nufa I have not really gained availability, only a lower chance of split-brain. Still only one node is allowed to fail, because otherwise the now-active client quorum is no longer met and the FS goes read-only (which is in fact not really better than failing completely, as it was before).

So I thought about adding arbiter bricks as a "kind of 4th replica" (but without the space needs) ... but then I read in the docs that only "replica 3 arbiter 1" is allowed as a combination. Is this still true?

If the docs are right: why is arbiter not allowed for higher replica counts? In my understanding it would allow improving the client quorum.

Thank you for your opinion and/or facts :-)

Ingo

-- 
Ingo Fischer
Technical Director of Platform
Gameforge 4D GmbH
Albert-Nestler-Straße 8
76131 Karlsruhe
Germany
Tel. +49 721 354 808-2269
ingo.fischer at gameforge.com
http://www.gameforge.com
Amtsgericht Mannheim, Handelsregisternummer 718029
USt-IdNr.: DE814330106
Geschäftsführer Alexander Rösner, Jeffrey Brown
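Editorial sketch of the quorum arithmetic behind the "only one node is allowed to fail" observation above. It assumes client quorum means a strict majority of the bricks in a replica set; the real behaviour depends on the volume's quorum options, and the function names here are hypothetical, not part of Gluster.

```python
# Assumption: quorum is a strict majority of the bricks in one replica set.
# This is plain arithmetic, not a model of Gluster's actual quorum options.

def quorum_met(bricks_up: int, bricks_total: int) -> bool:
    """Return True if more than half of the bricks are reachable."""
    return 2 * bricks_up > bricks_total


def failures_tolerated(bricks_total: int) -> int:
    """Largest number of bricks that can be down while a strict majority remains."""
    return (bricks_total - 1) // 2


if __name__ == "__main__":
    # Walk a 3-brick replica set from all bricks up to none up.
    for up in range(3, -1, -1):
        state = "writable" if quorum_met(up, 3) else "read-only"
        print(f"replica 3, {up} brick(s) up: {state}")
```

Under that assumption a 3-brick set stays writable with two bricks up and turns read-only once a second brick is lost.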
Ravishankar N
2019-Apr-03 07:38 UTC
[Gluster-users] Is "replica 4 arbiter 1" allowed to tweak client-quorum?
On 03/04/19 12:18 PM, Ingo Fischer wrote:
> Hi All,
>
> I had a replica 2 cluster to host my VM images from my Proxmox cluster.
> I got a bit around split brain scenarios by using "nufa" to make sure
> the files are located on the host where the machine also runs normally.
> So in fact one replica could fail and I still had the VM working.
>
> But then I thought about doing better and decided to add a node to
> increase replica and I decided against arbiter approach. During this I
> also decided to go away from nufa to make it a more normal approach.
>
> But in fact by adding the third replica and removing nufa I'm not really
> better on availability - only split-brain-chance. I'm still at the point
> that only one node is allowed to fail because else the now active client
> quorum is no longer met and FS goes read only (which in fact is not
> really better than failing completely as it was before).
>
> So I thought about adding arbiter bricks as "kind of 4th replica (but
> without space needs) ... but then I read in docs that only "replica 3
> arbiter 1" is allowed as combination. Is this still true?

Yes, this is still true. Slightly off-topic: 'replica 3 arbiter 1' was supposed to mean there are 3 bricks, out of which 1 is an arbiter. This supposedly caused some confusion, where people thought there were 4 bricks involved. The CLI syntax was changed in the newer releases to 'replica 2 arbiter 1' to mean there are 2 data bricks and 1 arbiter brick. For backward compatibility, the older syntax still works though. The documentation needs to be updated.
:-)

> If docs are true: Why arbiter is not allowed for higher replica counts?

The main motivation for the arbiter feature was to solve a specific case: people who wanted to avoid the split-brains associated with replica 2 but, for cost reasons, did not want to add another full-blown data brick to make it replica 3.

> It would allow to improve on client quorum in my understanding.

Agreed, but the current implementation is only for a 2+1 configuration. Perhaps it is something we could work on in the future to make it generic, like you say.

> Thank you for your opinion and/or facts :-)

I don't think NUFA is being worked on or tested actively. If you can afford a 3rd data brick, making it replica 3 is definitely better than a 2+1 arbiter, since there is more availability by virtue of the 3rd brick also storing data. Both of them prevent split-brains and are used successfully in oVirt / VM storage / hyperconvergence use cases. Even without NUFA, AFR serves reads from the local copy anyway (writes still need to go to all bricks).

Regards,
Ravi

> Ingo
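Editorial sketch of the two points in Ravi's reply: how the older and newer CLI spellings describe the same 2+1 layout, and why a full replica 3 keeps more data copies than a 2+1 arbiter setup. The `Layout` type, the `old_syntax` flag, and both functions are hypothetical illustrations, not Gluster options or APIs.

```python
from typing import NamedTuple


class Layout(NamedTuple):
    data_bricks: int      # bricks that store full file data
    arbiter_bricks: int   # bricks that take no data space (per the thread)


def parse_layout(replica: int, arbiter: int, old_syntax: bool) -> Layout:
    """Interpret replica/arbiter counts as described in the thread.

    old_syntax=True:  'replica' counts all bricks, arbiter included.
    old_syntax=False: 'replica' counts data bricks only.
    """
    data = replica - arbiter if old_syntax else replica
    return Layout(data_bricks=data, arbiter_bricks=arbiter)


def data_copies_after_failure(layout: Layout, failed_data_bricks: int) -> int:
    """Number of full data copies left after some data bricks fail."""
    return max(layout.data_bricks - failed_data_bricks, 0)


if __name__ == "__main__":
    older = parse_layout(3, 1, old_syntax=True)    # 'replica 3 arbiter 1'
    newer = parse_layout(2, 1, old_syntax=False)   # 'replica 2 arbiter 1'
    print(older == newer)                          # same 2 data + 1 arbiter layout

    full_replica_3 = Layout(data_bricks=3, arbiter_bricks=0)
    print(data_copies_after_failure(full_replica_3, 1))
    print(data_copies_after_failure(newer, 1))
```

Losing one data brick leaves two full copies with replica 3 but only one with the 2+1 arbiter layout, which is the availability difference Ravi points to.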