On Feb 12, 2016, at 8:34 AM, Ravishankar N <ravishankar at redhat.com> wrote:
> Consistency, availability, tolerance to network partitions. You get to pick any two.

I wanted the first two. I did not get them. By default, we get split brain. This means no consistency. To cure that, we choose quorums. But when the first of a replica 2 pair goes away, you then lose write access. Without writes, we lose availability. So, if you think it is possible, let me know how to reconfigure my array and I will tell you if it worked. If you could update the docs to explain how you get the first two, that would be nice. If you could update the docs to state that the array goes into a partial read-only state if a replica pair goes away, that would be nice.

I'm fine with running in a degraded state when a server goes away. When it comes back, I want it to suck down all the new changes from the authoritative replica pair known to the quorum, and once it has all the data, it can be marked as not-degraded and resume normal operation.

I want each node to notice a down server, and when it is part of a 51% partition, I want the remaining replica members of that server to become a degraded replica N-1 set. When the server comes back up, I want it to repair back into a replica N state.
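For concreteness, the trade-off described above maps onto gluster's client-quorum setting. A minimal sketch, assuming a replica-2 volume named gv0 (the volume name is a placeholder, not from this thread):

    # Enforce client quorum: no split-brain, but writes fail when the
    # first of the two replicas is unreachable (availability lost).
    gluster volume set gv0 cluster.quorum-type auto

    # Relax quorum: writes survive the loss of either replica, but a
    # network partition can now split-brain files (consistency lost).
    gluster volume set gv0 cluster.quorum-type none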
On Feb 12, 2016, at 11:32 AM, Mike Stump <mikestump at comcast.net> wrote:
> [...]

I don't understand why you want to set quorum with only two servers. It doesn't make sense at all! Simply create a distributed replicated volume with no quorum on your two-node cluster; when one node goes down, the volume will still be RW without any problems. Why would you want quorum on a two-node cluster? To use quorum, add at least a third node, so that when one node fails, the other two still serve the volume RW.

The definition of quorum is as follows: the quorum configuration in a failover cluster determines the number of failures that the cluster can sustain. If an additional failure occurs, the cluster must stop running. The relevant failures in this context are failures of nodes or, in some cases, of a disk witness (which contains a copy of the cluster configuration) or a file share witness. It is essential that the cluster stop running if too many failures occur or if there is a problem with communication between the cluster nodes.

Hopefully this makes quorum clearer: in a cluster of only two nodes, a network partition and a loss of quorum are indistinguishable. That's why I said what you are trying to do doesn't make sense.

-- Bishoy
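A minimal sketch of the third-node approach described above, assuming a plain (non-distributed) replica-2 volume named gv0 and a new host server3; all names and brick paths are placeholders:

    # Add the new peer and grow the replica set from 2 to 3.
    gluster peer probe server3
    gluster volume add-brick gv0 replica 3 server3:/export/brick1

    # With three replicas there is a real majority, so client quorum
    # keeps the volume writable when any single node fails.
    gluster volume set gv0 cluster.quorum-type auto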
On 02/13/2016 01:02 AM, Mike Stump wrote:
> On Feb 12, 2016, at 8:34 AM, Ravishankar N <ravishankar at redhat.com> wrote:
>> Consistency, availability, tolerance to network partitions. You get to pick any two.
> I wanted the first two. I did not get them. By default, we get split brain. This means no consistency.

Consistency means the client always gets back the same data it wrote to the volume. For replication, if, say, a write succeeded on only one brick, then further reads will be served from the healthy brick and not accidentally from the stale one. It also means that if one client updated a file, other clients see the same update when they access it.

> To cure that, we choose quorums. But when the first of a replica 2 pair goes away, you then lose write access. Without writes, we lose availability. So, if you think it is possible, let me know how to reconfigure my array and I will tell you if it worked. If you could update the docs to explain how you get the first two, that would be nice. If you could update the docs to state that the array goes into a partial read-only state if a replica pair goes away, that would be nice.

Like Bishoy said in another thread, quorum does not really make sense with 2 replicas because there is no notion of majority. If you use a 3-way replica with client quorum enabled, you get more availability than with a 2-way replica. If preventing split-brain is your major concern but you do not want to use 3x replication, you can try arbiter volumes (https://gluster.readthedocs.org/en/release-3.7.0/Features/afr-arbiter-volumes/).

> I'm fine with running in a degraded state when a server goes away. When it comes back, I want it to suck down all the new changes from the authoritative replica pair known to the quorum, and once it has all the data, it can be marked as not-degraded and resume normal operation.
>
> I want each node to notice a down server, and when it is part of a 51% partition, I want the remaining replica members of that server to become a degraded replica N-1 set. When the server comes back up, I want it to repair back into a replica N state.

AFR does all of this, but in a distributed synchronous replication system, no matter what the replication factor is, at some point *preventing* split-brain means failing further writes if the current write would make the only true copy no longer true. This fencing is done until the other copies are back in sync (i.e. healed). That *will* mean a loss of availability (for writes) for the duration of the heal.

About the docs: could you list the links for client and server quorum where you found the details to be inadequate? I can't seem to find anything myself on readthedocs. :( I'm anyway planning to do a detailed write-up on arbiter volumes, split-brain, and client and server quorums, which can serve as a ready reckoner.

HTH,
Ravi
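A minimal sketch of the arbiter configuration Ravi points to, with placeholder host and brick names; the third brick holds only file metadata, so it arbitrates quorum without storing a third full copy of the data:

    # Replica 3 volume whose third brick is a metadata-only arbiter
    # (supported from GlusterFS 3.7 onward).
    gluster volume create gv0 replica 3 arbiter 1 \
        server1:/bricks/b1 server2:/bricks/b2 server3:/bricks/arb
    gluster volume start gv0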