Rendered at 16:38:39 GMT+0000 (Coordinated Universal Time) with Cloudflare Workers.
synthesis 1 hours ago [-]
This reminds me of a paper on Byzantine quorum systems "Asymmetric Distributed Trust" from 2020 by Christian Cachin and Björn Tackmann https://doi.org/10.4230/LIPIcs.OPODIS.2019.7.
After reading said paper, I was contemplating the benefits of an assymetric quorum system for (non-byzantine) consensus. I concluded that I couldn't find any benefits in doing so.
If I remember correctly, say that one quorum consists of 3 out of 7 nodes, then you no longer tolerate 3 failures for a 7 node cluster (something you would tolerate with the common majority-quorum approach). So there is no benefit with this. In terms of performance, the leader will synchronize with 6 other nodes. In a majority-quorum system that can tolerate two fails, a leader will synchronize with 4 other nodes. So there is also a performance drawback.
That being said, there may be performance benefits that can be gained by weighting nodes differently and asymmetrically.
rubiquity 1 days ago [-]
This was already known to be true by Heidi Howard’s research that yielded Flexible Paxos[0], Relaxed Paxos[1], and her more general thesis on Distributed consensus[2] as a whole
I don't think this is correct. Heidi's work made a different observation: That you can smear quorum intersection across phases of paxos, whereas the blog post in this submission is observing that you can do bog-standard quorum intersection in a way other than just thinking about majority intersection, via algebraic/geometric structures. I believe these are generally orthogonal observations.
(Heidi's work is both deeper and more practical; this post is just a really cute observation that there's something mathematically deeper underlying the idea of intersecting quora.)
senderista 1 days ago [-]
Research on quorum systems (such as the finite projective planes described in the article) dates back to the 80s.
mjb 23 hours ago [-]
The 70s, if you want to be pedantic (e.g. Gifford's "Weighted Voting for Replicated Data" or Thomas's "A Majority Consensus Approach to Concurrency Control for Multiple Copy Databases", both from '79).
danbruc 1 days ago [-]
The key correctness insight is this: any two majorities of nodes must overlap in at least one node. So between any two consecutive global state changes [...] at least one node participated in both. This single overlapping node carries forward the knowledge of what was previously committed, preventing conflicts and ensuring consistency.
There is another side to this, it must not be possible for two »majorities« to coexist, otherwise they could independently move on in case of a split cluster. This also rules out allowing consensus by majority in addition to majority by a bloc. In the seven node example, there could be a { 1, 2, 3 } and { 4, 5, 6, 7 } split, the first partition being a bloc and the second one being a majority but not containing a bloc.
matthewaveryusa 23 hours ago [-]
Ah thanks for the insight. I was wondering why not ‘block || majority’ to get around scenario 5. Quorum has nothing to do with majority and everything to do with trust structure. My mind is blown.
danbruc 9 hours ago [-]
I would not call it trust because in this kind of consensus protocols, namely permissioned consensus protocols, you assume trust between participants, you assume you know the participants and they are not malicious. Otherwise you have to go down a road more like Bitcoin, where you do not know the participants and do not trust them.
What you need are rules that define a safe behavior in face of arbitrary communication breakdowns. You need a default behavior that guarantees that you are still doing the right thing no matter which other nodes you can no longer talk to. And safe behavior means two things. First, you need continuity, you must not lose any commits, so at least one node in the current round must have participated in the latest round. Second, you have to ensure that no two groups of nodes independently decide that they should continue to make commits, otherwise the state would diverge.
And from this comes the tensions, you want many nodes in the quorum to make continuity easy, you want few nodes in the quorum to tolerate many failures, you kind of want many nodes in the quorum to prevent multiple quorums.
MathiasPius 1 days ago [-]
I really enjoy when it when someone injects a dose of "wacky" into something that is taken more or less for granted (Raft) to challenge the standard way of thinking about it.
This article flipped my understanding of split-brain or network partitions on its head: You don't actually have to have a majority to ensure progress, you just have to design your quorum selection criteria in such a way that no other partition believes they are authoritative, and these finite projection planes are an interesting way of proving that (with caveats).
Really enjoyed this, and learning a little bit of combinatorics at the same time.
As danbruc mentions below we also would really like our networks to only ever split into sets such that there is at most one set which could include a leader; otherwise we might have a more durable consensus split.
That said, algebraic structures are a tool for working with consensus problems, but there’s also process. Together we get consensus protocols. So, for example, you could have a healing process step that privileges the larger group and forces a merge even if at some moment you had two candidates that believed they were a valid leader for their own split network view.
danbruc 1 days ago [-]
Just to be clear, this is not a problem with this construction. As any two blocs overlap, there can be no split with a bloc on each side. But that is also the problem, a subset containing a bloc is relatively rare property. So while at first it seems that this is all great because you only need a few live nodes to potentially form a bloc, it turns out that it is just too rare for a random set of nodes to contain a bloc to buy you much if anything. In the worst case you could have 99 of 100 nodes live but not have a bloc in case you choose your blocs naively.
And for the merging, if you can do that, then why bother with consensus to begin with? The problem is that things that got committed are usually not just sitting in a database, they get read and acted upon. Webservice calls made, credit card transaction processed, parcel shipped, ... You can merge and undo commits in one database easily, controlling the ripple effects of those changes in other systems and the real world becomes impossible quickly.
ryanshrott 1 days ago [-]
This is neat, but I wonder how it handles asymmetric partitions. The Fano plane math assumes clean splits. Real networks don’t cooperate like that.
smallerize 1 days ago [-]
Asymmetric meaning that a node can receive messages, but can't send them?
meatmanek 22 hours ago [-]
Presumably asymmetric splits meaning A and B can talk, B and C can talk, but A and C can't talk.
oa335 1 days ago [-]
excellent article.
> The key correctness insight is this: any two majorities of nodes must overlap in at least one node. So between any two consecutive global state changes — whether two commits, two leader elections, or one of each — at least one node participated in both.
intuitively makes sense, but would be nice to see this result explicitly derived or illustrated the same way the fano planes were.
mjb 23 hours ago [-]
This is cool, and a really fun reminder that "majority" isn't required for quorum systems (it just happens to be the simplest way of thinking about it, and optimal in some senses). Moving from majorities to some other definition of quorum isn't super practical all that often, but is an interesting tool when you think about systems that don't have a uniform probability of failure or disconnection. That's not infrequent - large scale networks have very variable amounts of redundancy depending on geography and distance.
The idea of non-MDS erasure codes isn't quite the same, but they're related in the way that MDS codes are the easiest to think about, and non-MDS codes come with interesting complexities while opening up some cool new options for system design and recovery.
> our modified protocol is not guaranteed to make progress whenever a majority of nodes is active
That seems like horrible tradeoff
teraflop 23 hours ago [-]
This isn't being proposed as a serious, useful version of Raft. It's just a thought experiment.
The sentence you quote is inevitably going to be true for any type of Raft quorum that can reach consensus with a minority of nodes. You don't even need to get into the specifics of the math.
Suppose you have a quorum Q. Then its complement Q' must not also be able to form a quorum; if it did, a network partition between Q and Q' would create a split-brain. So if Q is a minority subset, then Q' is a majority that cannot reach consensus on its own.
throwaway27448 1 days ago [-]
Private capital, undermining democracy as is typical.
0 - https://fpaxos.github.io/
1 - https://dl.acm.org/doi/10.1145/3517209.3524040
2 - https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-935.pdf
(Heidi's work is both deeper and more practical; this post is just a really cute observation that there's something mathematically deeper underlying the idea of intersecting quora.)
There is another side to this, it must not be possible for two »majorities« to coexist, otherwise they could independently move on in case of a split cluster. This also rules out allowing consensus by majority in addition to majority by a bloc. In the seven node example, there could be a { 1, 2, 3 } and { 4, 5, 6, 7 } split, the first partition being a bloc and the second one being a majority but not containing a bloc.
What you need are rules that define a safe behavior in face of arbitrary communication breakdowns. You need a default behavior that guarantees that you are still doing the right thing no matter which other nodes you can no longer talk to. And safe behavior means two things. First, you need continuity, you must not lose any commits, so at least one node in the current round must have participated in the latest round. Second, you have to ensure that no two groups of nodes independently decide that they should continue to make commits, otherwise the state would diverge.
And from this comes the tensions, you want many nodes in the quorum to make continuity easy, you want few nodes in the quorum to tolerate many failures, you kind of want many nodes in the quorum to prevent multiple quorums.
This article flipped my understanding of split-brain or network partitions on its head: You don't actually have to have a majority to ensure progress, you just have to design your quorum selection criteria in such a way that no other partition believes they are authoritative, and these finite projection planes are an interesting way of proving that (with caveats).
https://vukolic.com/QuorumsOrigin.pdf
https://link.springer.com/book/10.1007/978-3-031-02007-0
https://dl.acm.org/doi/10.1145/3447865.3457962
As danbruc mentions below we also would really like our networks to only ever split into sets such that there is at most one set which could include a leader; otherwise we might have a more durable consensus split.
That said, algebraic structures are a tool for working with consensus problems, but there’s also process. Together we get consensus protocols. So, for example, you could have a healing process step that privileges the larger group and forces a merge even if at some moment you had two candidates that believed they were a valid leader for their own split network view.
And for the merging, if you can do that, then why bother with consensus to begin with? The problem is that things that got committed are usually not just sitting in a database, they get read and acted upon. Webservice calls made, credit card transaction processed, parcel shipped, ... You can merge and undo commits in one database easily, controlling the ripple effects of those changes in other systems and the real world becomes impossible quickly.
> The key correctness insight is this: any two majorities of nodes must overlap in at least one node. So between any two consecutive global state changes — whether two commits, two leader elections, or one of each — at least one node participated in both.
intuitively makes sense, but would be nice to see this result explicitly derived or illustrated the same way the fano planes were.
The idea of non-MDS erasure codes isn't quite the same, but they're related in the way that MDS codes are the easiest to think about, and non-MDS codes come with interesting complexities while opening up some cool new options for system design and recovery.
Using "majority" as the criterion has been around for a long time (e.g. Gifford in '79 https://pages.cs.wisc.edu/~remzi/Classes/739/Fall2015/Papers..., and Thomas also in '79 https://dl.acm.org/doi/10.1145/320071.320076). Also related is the idea of weighted voting (e.g. Peleg and Wool in '95 https://www.sciencedirect.com/science/article/pii/S089054018...).
That seems like horrible tradeoff
The sentence you quote is inevitably going to be true for any type of Raft quorum that can reach consensus with a minority of nodes. You don't even need to get into the specifics of the math.
Suppose you have a quorum Q. Then its complement Q' must not also be able to form a quorum; if it did, a network partition between Q and Q' would create a split-brain. So if Q is a minority subset, then Q' is a majority that cannot reach consensus on its own.