How to beat the CAP theorem (2011) (nathanmarz.com)
28 points by gautamsomani on Sept 11, 2023 | 19 comments


Discussed at the time:

How to beat the CAP theorem - https://news.ycombinator.com/item?id=3108087 - Oct 2011 (77 comments)


And reposting the main point from ye olde discussion:

.. nathanmarz on Oct 13, 2011

I never said anywhere that it provides strong consistency.

..

Click-bait then, same now.


Google Spanner beat the CAP theorem years ago. It focused on C+P, put atomic clocks in every datacenter for synchronization, and then simply accepted the supposition that Google's global network never goes down (is always Available). Definitely at least five nines.

C+P + Google-implied A. (Or C+A + chance of P approaching zero.)

https://static.googleusercontent.com/media/research.google.c...
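The atomic clocks do real work here: TrueTime gives every node a bounded clock-uncertainty interval, and Spanner waits out that uncertainty before acknowledging a commit, which is what buys external consistency. A toy sketch of the idea in Python (the API shape and the 7 ms bound are my assumptions, not Google's actual interface):

    import time

    EPSILON = 0.007  # assumed clock-uncertainty bound from GPS/atomic clocks

    def tt_now():
        # TrueTime-style clock: an interval guaranteed to contain true time.
        t = time.time()
        return (t - EPSILON, t + EPSILON)

    def commit(apply_txn):
        commit_ts = tt_now()[1]            # a timestamp no clock can still call "future"
        apply_txn(commit_ts)
        while tt_now()[0] <= commit_ts:    # "commit wait": roughly 2 * EPSILON
            time.sleep(0.001)
        return commit_ts                   # now safe to acknowledge to the client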


Fauna[0], if I recall correctly, also upends the CAP theorem. They implemented Calvin[1], which differs from Spanner.

[0]: https://fauna.com/

[1]: https://fauna.com/blog/distributed-consistency-at-scale-span...


wow.

> https://www.cs.umd.edu/~abadi/

> He is best-known for the development of the storage and query execution engines of the C-Store (column-oriented database) prototype, which was commercialized by Vertica and eventually acquired by Hewlett-Packard in 2011, for his HadoopDB research on fault tolerant scalable analytical database systems which was commercialized by Hadapt and acquired by Teradata in 2014, and deterministic, scalable, transactional, distributed systems such as Calvin which is currently being commercialized by Fauna.

That is as impressive a resume as I have ever seen. Dude spawned 3 successful commercial database offerings.


Really not sure why this got downvoted. Not usually one to ask, but I did link out to their information and papers.


The post could use a 2011 in the title.


The NoSQL movement overstated how important the A in CAP was. Relatively few systems need _all_ nodes to be available, compared to those that benefit from strong consistency, especially those that live primarily in datacenters/the cloud.

But this strict notion of Availability (all nodes must be available) was conflated with being available at all, leading to CP systems being disfavored.

When we introduced Fauna, it took quite some time (and a Jepsen report) to convince others that building a CP system without exotic hardware was possible, and that in practice, multi-region strongly consistent replication delivers a far better availability experience than the single-region deployment topology that is still most common today.


CAP's A isn't a very useful thing at all. There's nothing stopping a database from being both Consistent and (by the common-sense definition) Available on the majority side of a network partition. It just can't be available on both sides, which is unfortunately the definition used in Gilbert and Lynch's CAP proof. The proof isn't wrong, but it is extremely misleading if you go into it expecting the common-sense definitions.

There's also nothing stopping systems from accepting commutative writes, and hence being available for writes on both sides. Similarly, if the system accepts no writes, it can serve reads on both sides without breaking the law.
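For a concrete example of a commutative write, here's a grow-only counter, roughly the "hello world" of CRDTs (a minimal sketch; the class and names are illustrative, not from any particular system):

    class GCounter:
        # Each node increments only its own slot, so writes commute.
        def __init__(self, node_id, n_nodes):
            self.node_id = node_id
            self.slots = [0] * n_nodes

        def increment(self):               # write: always available locally
            self.slots[self.node_id] += 1

        def value(self):                   # read: sum across all slots
            return sum(self.slots)

        def merge(self, other):            # partition heals: pointwise max
            self.slots = [max(a, b) for a, b in zip(self.slots, other.slots)]

    a, b = GCounter(0, 2), GCounter(1, 2)  # two sides of a partition
    a.increment(); a.increment()
    b.increment()
    a.merge(b); b.merge(a)                 # merging in either order converges
    assert a.value() == b.value() == 3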


I don't agree with that characterization of availability. This is exactly what Spanner is designed for: replication can be cross-region for very good reasons (to improve latency, save on bandwidth, improve e2e availability), and you actually do want to maintain availability in the case of a network partition between regions (which is a thing that actually happens).


Spanner does not give you availability in the case of a network partition between regions.


I’m pretty sure it does give you read-availability, just not write availability?

I could be wrong, but I think it's this: to serve a read from a replica in region X, where the write replica is in region Y such that there is a partition between X and Y, in the normal case X does not need to wait for Y to tell it to let the read go. Instead, Y locks X on write to implement consistency. So in the case of a partition, X still does not wait for Y to execute a read, but writes become unavailable.
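A toy model of that read path (my assumptions, not Spanner's actual protocol): the follower in region X serves reads locally against a "safe timestamp" that only the leader in region Y advances, so during a partition, reads at already-replicated timestamps stay available while fresher reads and all writes do not:

    class FollowerReplica:
        # Region X replica; the write leader lives in region Y.
        def __init__(self):
            self.safe_ts = 0      # latest timestamp the leader has confirmed
            self.versions = {}    # key -> {timestamp: value}

        def apply_from_leader(self, ts, key, value):
            # Only happens while X can talk to Y.
            self.versions.setdefault(key, {})[ts] = value
            self.safe_ts = max(self.safe_ts, ts)

        def read(self, key, ts):
            if ts > self.safe_ts:
                raise TimeoutError("partitioned from leader; read is too fresh")
            older = [t for t in self.versions.get(key, {}) if t <= ts]
            return self.versions[key][max(older)] if older else None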


This only covers OLAP, which is way, way simpler to handle than implementing OLTP. Like, sure, if you don't need mutability but do need large scale, go for OLAP. That's not actually a solution for many use cases, though, and it's certainly an exaggeration to say it lets you beat the CAP theorem, when by far the hardest problem with the CAP theorem is handling the consistency and availability of replicated mutable data.


Good OLAP needs very similar properties to OLTP; systems just don't provide them, as it's extra hard.


This is less about the CAP theorem and more about immutable data stores.

It does make me wonder about append-only log stores. Why is it a linear log at all? That is one of the problems OP is talking about: when the partition heals, you need to merge two disparate logs.

But why? Why not use an order-independent data structure like a set? Adding facts to two independent sets and then taking the union of the two sets results in the same set either way. I suspect the reason is that we haven't found a good way to store sets; we've only figured out how to store things linearly.
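A tiny illustration of the union point (the fact triples here are made up):

    # Two partitions accumulate facts independently; union is commutative,
    # associative, and idempotent, so the healing order doesn't matter.
    side_a = {("user:1", "email", "a@example.com")}
    side_b = {("user:1", "name", "Ann"), ("user:1", "email", "a@example.com")}

    assert side_a | side_b == side_b | side_a             # commutative
    assert (side_a | side_b) | side_a == side_a | side_b  # idempotent

Contrast that with merging two linear logs, where you're forced to pick an interleaving.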


> It does make me wonder about append-only log stores. Why is it a linear log at all? That is one of the problems OP is talking about: when the partition heals, you need to merge two disparate logs.

> But why? Why not use an order-independent data structure like a set? Adding facts to two independent sets and then taking the union of the two sets results in the same set either way. I suspect the reason is that we haven't found a good way to store sets; we've only figured out how to store things linearly.

No, the reason is the opposite. Storing sets is easy, but encoding everything we might want to store as a set is difficult. E.g. it's very common to want to store a list and want to be able to append items to the list and retrieve it in order. Doing that on top of a linear log store is trivial; doing it on top of a set store is hard if not impossible.
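To make the difficulty concrete, here's one obvious encoding, a set of (position, item) facts, and where it breaks (a sketch under my own assumptions):

    def append(facts, item):
        # Read-before-write: you must know the current length to append.
        pos = max((p for p, _ in facts), default=-1) + 1
        facts.add((pos, item))

    def as_list(facts):
        return [item for _, item in sorted(facts)]

    facts = set()
    append(facts, "a"); append(facts, "b")
    assert as_list(facts) == ["a", "b"]
    # Two writers on opposite sides of a partition would both compute
    # pos == 2, and the merged set {(2, "c"), (2, "d")} has no single order.

The log store avoids this because the storage layer itself hands out the ordering.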


The system described here (if I understand correctly) could store sets if care is taken to make the set writes commutative. The CRDT literature contains several solutions to this problem.
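One such solution is the observed-remove set (OR-Set), sketched here from the CRDT literature (illustrative; not Fauna's or any particular system's implementation):

    import uuid

    class ORSet:
        def __init__(self):
            self.adds = set()     # (element, unique_tag) pairs
            self.removes = set()  # tags observed at remove time

        def add(self, element):
            self.adds.add((element, uuid.uuid4().hex))

        def remove(self, element):
            # Removes only the adds this replica has observed, so a
            # concurrent add on the other side of the partition survives.
            self.removes |= {tag for (e, tag) in self.adds if e == element}

        def merge(self, other):   # union both components; order-independent
            self.adds |= other.adds
            self.removes |= other.removes

        def contents(self):
            return {e for (e, tag) in self.adds if tag not in self.removes}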


Partitions don't actually exist, only arbitrarily high latency and arbitrarily low bandwidth.


Eventual consistency? Try arbitrary availability instead!



