> By its nature, K8s is designed to host stateless apps. It is fantastic at doing this, to be clear. I love K8s. But a system where the nodes can (and should, if you're taking advantage of spot pricing) disappear with a few minutes' warning is not a great host for an RDBMS.
Why is this a problem? A typical deployment will have multiple replicas, with (hopefully) small replication lag. Those should be able to be promoted to be the new primary within a minute.
> A typical deployment will have multiple replicas, with (hopefully) small replication lag. Those should be able to be promoted to be the new primary within a minute.
What happens within that minute to database writes?
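To put rough numbers on that question, here is a back-of-the-envelope sketch. It assumes asynchronous replication, a constant write rate, and that the old primary is gone for good. The function and field names are made up for illustration, not taken from any real tool.

```python
from dataclasses import dataclass


@dataclass
class FailoverEstimate:
    lost_writes: int      # acknowledged by the old primary, never reached the replica
    rejected_writes: int  # attempted while no primary was accepting writes


def estimate_failover_impact(writes_per_sec: float,
                             replication_lag_sec: float,
                             promotion_time_sec: float) -> FailoverEstimate:
    """Rough estimate only: assumes async replication and a steady write rate."""
    if min(writes_per_sec, replication_lag_sec, promotion_time_sec) < 0:
        raise ValueError("all inputs must be non-negative")
    # Writes committed on the primary during the lag window never got replicated.
    lost = round(writes_per_sec * replication_lag_sec)
    # Writes arriving during promotion have nowhere to go unless the app queues or retries.
    rejected = round(writes_per_sec * promotion_time_sec)
    return FailoverEstimate(lost_writes=lost, rejected_writes=rejected)


estimate = estimate_failover_impact(writes_per_sec=200,
                                    replication_lag_sec=2,
                                    promotion_time_sec=60)
print(estimate)
```

With these made-up inputs (200 writes/s, 2 s of lag, a 60 s promotion), that is about 400 acknowledged writes gone and about 12,000 that fail or have to be retried by the application.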
Well, the problem is a little bit more complicated than just having replicas.
You cannot build operational procedures based on “hope”.
High replication lag occurs for many, many reasons, and it is neither a rare event nor something you can prevent. The same goes for network partitions.
Replication and binary logs can get corrupted, there can be deadlocks, duplicated row errors, etc.
The thing is that database administration is a broad and complicated topic. A small mistake, or a lack of understanding of how these systems work, can easily lead to huge data losses.
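As a sketch of what "not relying on hope" could look like, here is a promotion guard that refuses to pick a replica whose replication is broken or whose lag is above a threshold. `ReplicaStatus`, its fields, and the threshold are hypothetical; a real setup would fill them in from whatever the database and monitoring actually report.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class ReplicaStatus:
    name: str
    replication_running: bool      # False if replication stopped on an error
    lag_seconds: Optional[float]   # None if the lag could not be measured


def choose_promotion_candidate(replicas: Sequence[ReplicaStatus],
                               max_lag_seconds: float) -> Optional[str]:
    """Return the name of the least-lagged healthy replica, or None if none qualifies."""
    healthy = [
        r for r in replicas
        if r.replication_running
        and r.lag_seconds is not None
        and r.lag_seconds <= max_lag_seconds
    ]
    if not healthy:
        # No safe candidate: stop and page a human rather than promote blindly.
        return None
    return min(healthy, key=lambda r: r.lag_seconds).name


replicas = [
    ReplicaStatus("replica-a", replication_running=True, lag_seconds=45.0),
    ReplicaStatus("replica-b", replication_running=False, lag_seconds=0.5),
    ReplicaStatus("replica-c", replication_running=True, lag_seconds=1.2),
]
print(choose_promotion_candidate(replicas, max_lag_seconds=5.0))
```

Returning `None` is the important case here: when every replica is lagging or broken, automatic promotion trades an outage for data loss, and someone has to make that call deliberately.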
Ah yes, HN. You know there are billions of sites (mostly WordPress), LoB apps, etc. that run on a single MySQL/Postgres/etc. instance, right? Replicas are not typical; they are a tiny minority.
OP was talking about 'typical database setup' being a replicated db. It's not typical. Nor is the use of k8s for stuff outside HN and massive companies. Not that I mentioned k8s anyway.