
> The rest of the features postgres (and friends) give easily (ACID etc) are very tricky to get right in a distributed system.

But that's just a calculated tradeoff on the part of Postgres (and several other CP databases): trading Availability for Consistency.



Probably less calculated and more "that's what's available to offer stably right now that we can feasibly deliver, so that's what we'll do." Distributed RDBMS were not exactly cheap or common in open source a couple decades back. I don't think there was much of a choice to make.


I mean it is a trade off though. You cannot beat the speed of light. The further apart your database servers are, the more lag you get between them.

If you want a transactional, consistent datastore you are gonna have to put a lock on something while writes happen. And if you want consistency it means those locks need to be held on all systems in the cluster. And the entire cluster needs to hold that lock until the transaction ends. If your DBs are 100ms apart… that is a pretty large, non-negotiable overhead on all transactions.
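Here's a rough back-of-the-envelope sketch of that overhead (Python, purely illustrative: the 100ms figure and the two-round-trip commit are assumptions, not any particular database's protocol):

    # A strongly consistent commit has to coordinate with the other nodes
    # before acknowledging, so it pays cross-node round trips no matter
    # how fast the local work is.
    ONE_WAY_LATENCY_MS = 100      # assumed distance between DB nodes
    LOCAL_WORK_MS = 1             # assumed time to apply the write locally
    ROUND_TRIPS_PER_COMMIT = 2    # e.g. a prepare phase plus a commit phase

    commit_latency_ms = LOCAL_WORK_MS + ROUND_TRIPS_PER_COMMIT * 2 * ONE_WAY_LATENCY_MS
    print(commit_latency_ms)      # 401 ms: the network dominates, not the database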

If you toss out being fully consistent as a requirement, things get much easier in replication-land. In that case you just fucking write locally and let that change propagate out. The complexity then becomes sorting out what happens when writes to the same record on different nodes conflict… but that is a solvable problem. There will be trade offs in the solution, but it isn’t going against the laws of physics.
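As a concrete (toy) example of one such resolution strategy, here's last-writer-wins in Python; the timestamps and tie-breaking by node id are assumptions for illustration, and real systems may instead use vector clocks, CRDTs, or application-level merges:

    # Minimal last-writer-wins merge for two replicas that accepted
    # conflicting writes to the same record while out of contact.
    def merge(local, remote):
        """Each version is (timestamp, node_id, value); keep the newest write."""
        return max(local, remote)  # tuple comparison: timestamp first, then node_id

    node_a = (1700000005, "a", {"email": "old@example.com"})
    node_b = (1700000009, "b", {"email": "new@example.com"})
    print(merge(node_a, node_b))   # node_b's later write wins on both replicas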


> If you want a transactional, consistent datastore you are gonna have to put a lock on something while writes happen. And if you want consistency it means those locks need to be on all systems in the cluster.

FWIW, it's not as bad as that sounds. There are traditional locks, and there is optimistic locking. If there are two conflicting transactions, a traditional lock detects the conflict before it happens (by insisting a lock is obtained before any updates are done), and if there is any chance of conflict the updates are run serially (meaning one is stopped while the other runs).
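As a toy illustration of the traditional (pessimistic) flavour, here the lock is taken before the update, so conflicting updates simply run one after the other (the in-memory "row" and the withdraw function are made up for the example):

    import threading

    # Toy pessimistic locking: every writer takes the row's lock first,
    # so two conflicting updates are forced to run serially.
    balance = 100
    row_lock = threading.Lock()

    def withdraw(amount):
        global balance
        with row_lock:              # blocks until the other transaction finishes
            current = balance
            balance = current - amount

    t1 = threading.Thread(target=withdraw, args=(30,))
    t2 = threading.Thread(target=withdraw, args=(50,))
    t1.start(); t2.start(); t1.join(); t2.join()
    print(balance)                  # always 20: the lock prevented a lost update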

Optimistic locks let updates run without any locking or blocking at all, but then at the end they check whether the data they depended on (i.e., the data that would have been locked by the traditional mechanism) has changed. If it has, they throw it all away. (Well, perhaps not quite - they may apply one of the conflicting updates to ensure forward progress is made.) The upside is that if there are no conflicting updates, everything runs at full speed, because there is no expensive communication about who has what lock. The downside is that a lot of work may be thrown away by what amounts to speculative execution.
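A toy version of that optimistic scheme (the version counter and retry loop are illustrative assumptions, not how any specific database implements it):

    # Toy optimistic concurrency: work without locks, then validate at commit.
    row = {"version": 1, "balance": 100}

    def try_commit(read_version, new_balance):
        """Succeeds only if nobody else committed since we read the row."""
        if row["version"] != read_version:
            return False            # conflict: our speculative work is thrown away
        row.update(version=read_version + 1, balance=new_balance)
        return True

    # Transaction A reads the row...
    a_version, a_balance = row["version"], row["balance"]
    # ...but transaction B reads the same row and commits first.
    b_version, b_balance = row["version"], row["balance"]
    assert try_commit(b_version, b_balance - 50)      # B wins
    assert not try_commit(a_version, a_balance - 30)  # A detects the conflict...
    a_version, a_balance = row["version"], row["balance"]
    assert try_commit(a_version, a_balance - 30)      # ...re-reads, and retries
    print(row)                      # {'version': 3, 'balance': 20}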

Most monolithic databases use traditional locking. Two CPUs in the same data centre (or more likely on the same board) can rapidly decide who owns what lock, but cycles and I/O on a high-end server are precious. Distributed ACID databases like Spanner, CockroachDB and YugabyteDB favour optimistic locking, because sending messages halfway across the planet to decide who owns what lock before allowing things to proceed takes a lot of time, whereas the CPU cycles and I/Os on the low-end replicated hardware are cheap.

While optimistic locks allow an almost unlimited number of non-conflicting updates to happen concurrently, their clients still have to pay a time penalty. The decision about whether there was a conflicting update still has to be made, it still requires packets to cross the planet, and while all this happens the client can't be sure their data has been committed. But unlike the traditional model, they are never blocked by what any other client is doing - provided it doesn't conflict.


Yes, but my point was there wasn't really a choice to make at that time, therefore no trade off.

Even if I won $100 million in the lotto today and had the money in hand, I wouldn't describe my choice of which house to buy years ago as a calculated trade off between what I bought and some $10 million mansion. That wasn't a feasible choice at that time. Neither was making a distributed RDBMS as an open source project decades ago, IMO.


Wasn’t MySQL (pre-Oracle) an open source distributed RDBMS decades ago? At least I remember running it with replication in the early 2000s.


MySQL replication isn't really what I would consider a distributed RDBMS in the sense we're talking about, though it is in some senses. The main distinction is that you couldn't just use it through a regular SQL interface: you had to set up a primary and secondaries, and a secondary couldn't accept writes (if you ran dual primaries you had to be very careful about updates), etc. Mainly, you had to put rules and procedures in place for how it was used in your particular environment to allow for sharding or multiple masters, because the underlying SQL system wasn't deterministic otherwise (also, the only replication available then was statement-based replication, IIRC).

More closely matching would be MySQL's NDB clustered storage engine, which was released in late 2004.[1] Given that Postgres and MySQL both started about 1996, that's quite a while after their initial releases.

I spent a while in the early to mid 2000s as a consultant researching and implementing dual master/slave or DRBD-backed MySQL HA systems, and the options available were very limited from what I remember. There's also probably a lot better tooling these days for developers to make use of separated read/write environments, whereas it seemed fairly limited back then.
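To illustrate what that read/write separation amounts to, here's a minimal sketch in Python with placeholder host names; real setups use proxies or ORM-level routers rather than hand-rolled code like this:

    import random

    # Toy read/write splitting: writes go to the primary, reads are spread
    # across the replicas. The host names are placeholders.
    PRIMARY = "primary.db.internal"
    REPLICAS = ["replica1.db.internal", "replica2.db.internal"]

    def pick_host(sql):
        """Route statements by type; anything that mutates goes to the primary."""
        is_read = sql.lstrip().lower().startswith("select")
        return random.choice(REPLICAS) if is_read else PRIMARY

    print(pick_host("SELECT * FROM users WHERE id = 1"))          # a replica
    print(pick_host("UPDATE users SET name = 'x' WHERE id = 1"))  # the primary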

1: https://en.wikipedia.org/wiki/MySQL_Cluster



