Isn’t that what we need with Raft, in order to guarantee reaching eventual consistency after reconciling any pending changes when a division into multiple partitions is healed?
For example, suppose we have two nodes and they are placed into separate partitions. If we allow an exact 50% share to count as a “majority” then both nodes can potentially continue to accept and acknowledge updates during the division, because each is capable of updating a majority of the nodes in the cluster (in this case, just itself, but the same argument applies if we have larger numbers of nodes in each half). However, when the division is healed, simply checking which leader node has the higher term and letting its log take precedence is no longer sufficient.
The problem is that we have violated the State Machine Safety guarantee described in the Raft paper (that is, different nodes can now have different committed entries at the same index in their respective logs). We have no way to resolve the conflict, because those updates were already committed by some of the nodes in the cluster and acknowledged by their respective leader nodes. We can’t just append any extra committed log entries from one node to another node’s log to get back to a consistent state.
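To make the two-node scenario concrete, here is a minimal sketch of it. This is a toy model rather than Raft itself: `can_commit`, `simulate_two_node_split`, and the `allow_exact_half` flag are hypothetical names made up for illustration, and each isolated node is assumed to only ever get an acknowledgement from itself.

```python
from typing import Dict, List


def can_commit(acks: int, cluster_size: int, allow_exact_half: bool) -> bool:
    """Decide whether an entry with `acks` acknowledgements may be committed.

    allow_exact_half=True models the flawed rule where 50% counts as a
    "majority"; False models the strict more-than-half rule.
    """
    if allow_exact_half:
        return 2 * acks >= cluster_size
    return 2 * acks > cluster_size


def simulate_two_node_split(allow_exact_half: bool) -> Dict[str, List[str]]:
    """Two nodes, A and B, each isolated in its own partition.

    Each node tries to commit a different entry at log index 0 and can
    only collect one acknowledgement: its own.
    """
    cluster_size = 2
    logs: Dict[str, List[str]] = {"A": [], "B": []}
    for node, entry in (("A", "x=1"), ("B", "x=2")):
        if can_commit(acks=1, cluster_size=cluster_size,
                      allow_exact_half=allow_exact_half):
            logs[node].append(entry)
    return logs


if __name__ == "__main__":
    # With the flawed rule, both sides commit different entries at index 0.
    print(simulate_two_node_split(allow_exact_half=True))
    # With a strict majority, neither isolated node can commit anything.
    print(simulate_two_node_split(allow_exact_half=False))
```

With the 50% rule both logs end up with a different committed entry at index 0, which is exactly the conflict described above. With the strict rule both logs stay empty until the partition heals.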
> My first thought for solving that problem would be to always have an odd number of nodes, but I'm interested in how Raft actually handles it, too.
You don't need an odd number of nodes. Raft requires that a majority of the cluster be able to communicate in order to make forward progress. In the event that a majority can't communicate, no changes to the cluster can be made. This isn't unique to Raft; any distributed consensus algorithm that wishes to maintain consistency has this constraint: if your cluster is split in such a way that a majority can't communicate, you can't both accept writes and prevent the cluster from becoming "split-brained", i.e., having two states, one on each side of the split. If a majority of nodes can communicate, you know that your side of the split is unique in this regard, and can thus keep going. All other sides of the split, not having a majority, will not be able to accept writes.
A note about majorities: The definition of majority is "_More than half_ (50%) of some group"; in a cluster of size 5, this is at least 3. In a cluster of size 6, this is at least _4_. Because it is more than half (and not just half), even-sized clusters are just fine in Raft: in a 3/3 split in a size 6 cluster, neither side has a majority, by definition.
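The arithmetic is small enough to write down. This is just a sketch of the "more than half" rule; `quorum_size` and `has_majority` are names I made up for illustration, not anything from the Raft paper or a particular implementation.

```python
def quorum_size(cluster_size: int) -> int:
    """Smallest node count that is strictly more than half the cluster."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return cluster_size // 2 + 1


def has_majority(reachable: int, cluster_size: int) -> bool:
    """True if `reachable` nodes (including yourself) form a majority."""
    return reachable >= quorum_size(cluster_size)


if __name__ == "__main__":
    for n in (5, 6):
        print(n, quorum_size(n))
    # A 3/3 split of a 6-node cluster: neither side qualifies.
    print(has_majority(3, 6))
```

Note that going from 5 to 6 nodes raises the quorum from 3 to 4, so both cluster sizes tolerate the loss of exactly two nodes.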
Yeah, so that's the problem we're talking about: the presentation implied that a split network can still accept writes, because one side will have a majority. But what if neither side does? Is the cluster just in read-only mode?
But, as I rephrase the problem, I think I observe that splitting the network in two is just one special case of partitioning. If we partition each node to be on its own, then of course nobody has a majority, so an odd number of nodes doesn't solve the general partition problem.
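A quick way to check this for arbitrary partitions is sketched below, assuming every node in the cluster ends up in exactly one partition. `writable_partition` is a hypothetical helper, not part of any Raft library.

```python
from typing import Optional, Sequence


def writable_partition(partition_sizes: Sequence[int]) -> Optional[int]:
    """Return the index of the partition holding a strict majority, if any.

    The cluster size is taken to be the sum of all partition sizes. At most
    one partition can hold more than half the nodes, so the first match is
    the only match.
    """
    cluster_size = sum(partition_sizes)
    for index, size in enumerate(partition_sizes):
        if 2 * size > cluster_size:
            return index
    return None


if __name__ == "__main__":
    print(writable_partition([3, 2]))           # odd cluster, two-way split
    print(writable_partition([3, 3]))           # even cluster, even split
    print(writable_partition([2, 2, 1]))        # odd cluster, three-way split
    print(writable_partition([1, 1, 1, 1, 1]))  # every node on its own
```

The `[2, 2, 1]` and `[1, 1, 1, 1, 1]` cases are the point here: with an odd number of nodes a two-way split always leaves one side with a majority, but a split into three or more pieces may leave no side with one.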