Yes, again the common refrain - just throw hardware at it. I/we of course know this, and all the systems I'm referring to did that first, until they couldn't. But you're kind of missing my point - I'm saying that by the time you notice the scaling pain points, it's often too late. Too late insofar as your system has likely grown so much in breadth (complexity, features, subsystems, lines of code, services, etc.) that everything depends on this one db. All of that vast amount of stuff was written assuming every table is accessible to everyone. It becomes a tangled web of data access patterns and tables that is very hard to break apart.
Never mind the other aspect the pat advice doesn't mention - managing a massive single RDBMS is a goddamn nightmare. At very large scale they are fragile, temperamental beasts. Backups, restores, and upgrades all become hard. Migrations become a dark art, often taking down the db despite your best understanding. Errant queries stall the whole server; tiny subtleties in index semantics do the same. Yes, it's all solvable with a lot of skill, but it ain't a free lunch, that's for sure. And it tends to become a HUGE drag on innovation, as any change to the db becomes risky.
To your other point: yes, replicating data "like for like" into another RDBMS can be cheap. But in my experience this domain data extraction is often taken as an opportunity to move it onto a non-RDBMS data store that gives you specific advantages matching that domain, so you don't hit scaling problems again. That takes significantly longer. But yes, I am perhaps unfairly including all the domain separation and "datastore flavor change" work in those numbers.
I have managed and written tooling for RDBMS from dinky GB-sized up to the multi-thousand-shard PB-scale. What you're saying is absolutely true. What a small team with vision can do when they see the ramp coming pays off 100-fold just a year or two in the future.
I think this kind of anticipation was part of Pinterest's early success, for example. They got ahead of their database scaling early and were able to focus on the product and UX.
I think we are basically in agreement about everything, but coming from different perspectives. There is no "right" answer, but premature optimization is almost always the wrong answer.
Unfortunately I have no direct experience with anything I would consider a direct replacement for the generic utility of Postgres (or any RDBMS). Mostly I have been involved in moving specific domains onto storage technology whose opinions work well with the problem at hand.
E.g. if it looks key-value-ish, or key + timestamp (e.g. a user transaction table), DynamoDB is incredible. Scales forever, and you never have to think about operations. But it's not generally queryable like pg.
If it looks event-ish or log-ish, offload it to Redshift/Snowflake/BigQuery. But those are append-only and eventually consistent.
If you really need distributed global mutations, and are willing to pay for it in latency, Spanner is great.
If you can cleanly tenant or shard your data and there's little-to-no cross-shard querying, then Vitess or some other RDBMS shard-automation layer can work.
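The routing half of what a shard layer does can be sketched in a few lines - a toy illustration under assumed names and a fixed shard count, not how Vitess actually implements it:

```python
import hashlib

NUM_SHARDS = 4  # assumed fixed shard count for this sketch

def shard_for(tenant_id: str) -> int:
    """Deterministically map a tenant/shard key to a shard number."""
    digest = hashlib.sha256(tenant_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

def route(sql: str, tenant_id: str) -> tuple[int, str]:
    """A query carrying its tenant key goes to exactly one shard.

    A query WITHOUT a tenant key would have to scatter-gather across all
    NUM_SHARDS shards - which is the cross-shard querying you want
    little-to-none of before this approach pays off.
    """
    return shard_for(tenant_id), sql

# Every query for the same tenant lands on the same shard:
print(route("SELECT * FROM orders", "tenant-42"))
```

The hard parts the sketch skips - resharding, cross-shard transactions, schema changes across shards - are precisely what the automation layer exists to manage.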
There are a few "Postgres, but distributed" dbs maturing now, like CockroachDB - I haven't personally used them at a scale where I could tell you whether they actually hold up. AFAIU these systems still have tradeoffs around table layout and access patterns that you have to think about.