I don’t know. People spend a lot of time thinking about network partitions. What’s the difference between a network partition and a network that is very slow? You could communicate via carrier pigeon. Practically, you can easily communicate via people in different regions, who already “know” which partition is correct. Networks are never really partitioned, and the choice of denominator when calculating their speed means they’re never really transferring at 0 bits per second either. Eventually data will start transferring again.
So a pretty simple application design can deal with all of this, and that’s the world we live in, and people deal with delays and move on with life, and they might bellyache about delays for the purpose of getting discounts from their vendors but they don’t really want to switch vendors. If you’re a bank in the highly competitive business of taking 2.95% fees (that should be 0.1%) on transactions, maybe this stuff matters. But like many things in life, like drug prices, that opportunity only exists in the US, it isn’t really relevant anywhere else in the world, and it’s certainly not a math problem or intrinsic as you’re making it sound. That’s just the mythology the Stripe and bank people have kind of layered on top of their shit, which was Mongo at the end of the day.
This is exactly the right way to think about it. One should be happy when the network works, and not expect any packet to make it somewhere on any particular schedule. Ideally. In practice it's too expensive to plan this way for most systems, so we compensate by pretending the failure modes don't exist.
I don't know what the formalisms here are, but as someone intermittently (thankfully, at this point, rarely) responsible for some fairly large-scale distributed systems: the "network that is really slow" case is much scarier to me than the "total network partition". Routine partial failure is a big part of what makes distributed systems hard to reason about.
You've got to at least consider what is going to happen.
The whole point of the CAP theorem is that you can't have a one-size-fits-all design. Network partitions exist, and it's not always a full partition where you have two (or more) distinct groups of interconnected nodes. Sometimes everybody can talk to everybody, except node A can't reach node B. It can even be unidirectional, where node A can't send to B, but B can send to A. That's just how it is --- stuff breaks.
There are designs that prioritize consistency in the face of partitions: if you must have the most recent write, you need a majority read or a source of truth, and if you can't contact that source of truth or a majority of copies in a reasonable time (or at all), the read fails. Likewise, you can't confirm a write unless you can contact the source of truth or complete a majority write. As a consequence, you're more likely to hit situations where you reach a healthy frontend but can't do work because it can't reach a healthy backend; otoh, that's something you should plan for anyway.
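To make the majority-read/majority-write idea concrete, here's a minimal sketch. Each replica is modeled as a plain dict mapping key -> (version, value), and "reachable" is the set of replica indices we can currently contact; all names are illustrative, not from any real system.

```python
class QuorumError(Exception):
    pass

def quorum_write(replicas, reachable, key, version, value):
    """Confirm the write only if a majority of replicas accepted it."""
    majority = len(replicas) // 2 + 1
    acks = 0
    for i, rep in enumerate(replicas):
        if i in reachable:
            rep[key] = (version, value)
            acks += 1
    if acks < majority:
        raise QuorumError("no majority reachable; write unconfirmed")

def quorum_read(replicas, reachable, key):
    """Return the freshest (version, value) seen across a majority."""
    majority = len(replicas) // 2 + 1
    if len(reachable) < majority:
        raise QuorumError("no majority reachable; can't guarantee freshness")
    responses = [replicas[i].get(key) for i in reachable
                 if replicas[i].get(key) is not None]
    return max(responses) if responses else None  # highest version wins
```

The failure mode described above falls out directly: with only a minority of replicas reachable, both calls raise, so you can be standing at a perfectly healthy frontend and still be unable to do work.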
There are designs that prioritize availability in the face of partitions: if you must have a read and you can reach a node with the data, go for it. This can be extended to writes too: if you trust a node to persist data locally, you can use a journal entry system rather than a state update system, and reconcile when the partition ends. You'll lose the data if the node's storage fails before the partition ends, of course. And you may end up reconciling into an undesirable state. In banking, it's common to pick availability --- you want customers to be able to use their cards even when the bank systems are offline for periodic maintenance / close of day / close of month processing, or unexpected network issues --- but when the systems come back online, some customers will have total transaction amounts that would have been denied if the system was online. Or you can do last-state-update-wins --- sometimes that's appropriate.
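The journal-entry approach can be sketched like this: each node appends timestamped debit/credit entries locally during the partition, and when it heals you replay every journal in order to rebuild balances. Field names here are made up for illustration.

```python
def reconcile(journals):
    """Merge per-node journals into final balances, oldest entry first.

    Accounts that an online system would have denied a transaction for
    get flagged rather than silently dropped.
    """
    entries = sorted((e for j in journals for e in j), key=lambda e: e["ts"])
    balances = {}
    overdrawn = set()
    for e in entries:
        acct = e["account"]
        balances[acct] = balances.get(acct, 0) + e["amount"]
        if balances[acct] < 0:
            # Would have been denied online; flag it for follow-up.
            overdrawn.add(acct)
    return balances, overdrawn
```

This is exactly the banking scenario above: a withdrawal journaled in one region can exceed a balance that a deposit journaled in another region no longer covers, and the reconciled state tells you which customers ended up somewhere an online system would never have allowed.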
Of course, there's the underlying horror of all distributed systems. Information takes time to be delivered, so there's no way for a node to know the current status of anything; all information from remote nodes is delayed. This means a node never knows if it can contact another node, it only knows if it was recently able to or not. This also means unless you do some form of locking read, it's not unreasonable for the value you've read to be out of date before you receive it.
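That "it only knows if it was recently able to" point is worth making concrete: the best any node can do is record when it last heard from a peer and answer "probably", never "yes". A tiny sketch, with invented names:

```python
import time

class PeerView:
    """Tracks when we last heard from each peer; can never know 'now'."""

    def __init__(self, timeout=5.0):
        self.timeout = timeout
        self.last_heard = {}  # peer -> timestamp of last received message

    def heard_from(self, peer, now=None):
        self.last_heard[peer] = time.monotonic() if now is None else now

    def probably_alive(self, peer, now=None):
        # The honest answer: "was reachable recently", not "is reachable".
        now = time.monotonic() if now is None else now
        t = self.last_heard.get(peer)
        return t is not None and (now - t) <= self.timeout
```

Every read you do against a remote node has the same caveat baked in: by the time the value arrives, it describes the past.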
Then there are even further horrors in that even a single node is actually a distributed system, although there's significantly less likelihood of a network partition between CPU cores on a single chip.
There is also one more level of horror people often forget: sometimes history unwinds itself. Some outages will require restoration from a backup, and some transactions may be lost. It does not happen often, but it may happen. Such situations may lead to, e.g., sequentially allocated IDs being reused, or originally valid pointers to records now pointing nowhere, and so on.
Everything you're saying is true. It's also true that you can ignore these things, have some problems with some customers, and if you're still good, they'll still shop with you or whatever, and life will go on.
I am really saying that I don't buy into the mythology that someone at AWS or Stripe or whatever knows more about the theoretical stuff than anyone else. It's a cloud of farts. Nobody needs these really detailed explanations. People do them because they're intellectually edifying and self-aggrandizing, not because they're useful.
Suppose we all live under socialism and capitalist rules no longer apply. Only physics.
You run a vacation home exchange/booking site in 3 instances (America, Europe, and Asia) because you found that people are happier when the site is snappy, without those pesky 50ms delays.
Now suppose it's the middle of the night in one of those 3 regions, so no big deal, but the rollout of a new version brings down the database that's holding the bookings for two hours before it's fixed. Yeah, it was just the simplest solution to have just one database with the bookings, because you wouldn't have to worry about all that pesky CAP stuff.
But people then start to ask if it would be possible to book from regions that are not experiencing the outage, while double-bookings would still not happen. So you give it some consideration, and then you figure out a brilliant, easy solution. You just...
Incidentally, I'm highly suspicious of the claim that those pesky 50ms delays matter much. Checking DOMContentLoaded timings on a couple sites:
Reddit: 1.31 s
Amazon: 1.51 s
Google: 577 ms
CNN: 671 ms
Facebook: 857 ms
Logging into Facebook: 8.37 s
As long as you have TLS termination close to the end user, and your proxy maintains connections to the backend so that extra round-trips aren't needed, the amount of time large popular sites take to load suggests that people wildly overstate how much anyone cares about a fraction of a blink of an eye. A lot of these sites have a ~20 ms time to connect for me. If it were so important, I'd expect to see page loads on popular sites take <100 ms (a lot of that being TCP+TLS handshake), not 800 ms or 8 seconds.
It’s hard to say. It’s common sense that it doesn’t matter that much for rational outcomes. Maybe it matters a lot if you are selling something psychological - like in a sense, before Amazon, people with shopping addiction were poorly served, and maybe 50ms matters to them.
You can produce a document that says these load times matter - Google famously observed that SERP load times have a huge impact on a thing they were measuring. You will never convince those analytics people that the pesky 50ms doesn’t matter, because they operate at a scale where they could probably observe a way that it does.
The database people are usually sincere. But are the retail people? If you work for AWS, you’re working for a retailer. Like so what if saving 50ms makes more money off shopping addicts? The AWS people will never litigate this. But hey, they don’t want their kids using Juul right? They don’t want their kids buying Robux. Part of the mythology I hate about these AWS and Stripe posts is, “The only valid real world application of my abstract math is the real world application that pays my salary.” “The only application that we should have a strictly technical conversation about is my application.”
Nobody would care about CAP at Amazon if it didn’t extract more money from shopping addicts! AWS doesn’t exist without shopping addicts!
What I mean is that Amazon seems to think it doesn't matter that much if they're taking hundreds of ms (or over 1 s) to show me a basic page. The TCP+TLS handshake takes like 60 ms, plus another RTT for the request is 80 ms total. If it takes 10 ms to generate the page (in fact it seems to take over 100), that should still be under 100 ms total on a fresh connection. But instead they send me off to fetch scripts from multiple other subdomains (more TCP+TLS handshakes). That alone completely ruins that 50 ms savings.
So the observation is that apparently Amazon does not think 50 ms is very important. If they did, their page could be loading about 5-10x faster. Likewise with e.g. reddit; I don't know if that site has ever managed to load a page in under 1 s. New reddit is even worse. At one point new reddit was so slow that I could tap the URL bar on my phone, scroll to the left, and change www to old in less time than it took for the page to load. In that context, I find people talking about globally distributed systems/"data at the edge" to save 50 ms to be rather comical.
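For what it's worth, the cold-connection budget from a couple comments up works out like this, assuming a 20 ms RTT and a TLS 1.2 full handshake (all numbers are illustrative back-of-envelope figures, not measurements):

```python
RTT_MS = 20

tcp_handshake = 1 * RTT_MS   # SYN / SYN-ACK before we can send data
tls_handshake = 2 * RTT_MS   # TLS 1.2 full handshake; TLS 1.3 needs 1 RTT
request       = 1 * RTT_MS   # GET goes out, first byte of response comes back
server_time   = 10           # generous budget to generate the page

total = tcp_handshake + tls_handshake + request + server_time
print(total)  # 90 ms, comfortably under 100 ms on a fresh connection
```

Every extra subdomain a page pulls scripts from repeats the handshake lines of that budget, which is how a theoretical sub-100 ms load turns into the second-plus loads listed above.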