1. Know what it takes to run a database (including storage, backup, upgrade, lifecycle, failure modes)
2. Know how a containerized cluster manager manages process, storage, lifecycle, and failure modes
If you know both of those, running databases on Kubernetes (can't speak for swarm or mesos) is not hard or surprising, and you can get the benefits of both. If you don't know both of those in detail, you don't have any business running databases in containers.
The intersection of folks who know both is still small. And the risk of problems when you understand only one is still high.
>"If you know both of those, running databases on Kubernetes (can't speak for swarm or mesos) is not hard or surprising,"
Are you speaking from experience when you say it is not hard? Could you elaborate on what databases you are currently running on Kubernetes and how they are configured? Also, are these in production?
If I know number 1 and number 2, does that mean that I automatically understand all of the potential failure modes I might experience from combining 1 and 2? I certainly wouldn't think so.
I'm one of the engineers on OpenShift, and there have been different production databases (SQL and NoSQL alike) running on OpenShift at very large companies for almost 2 years now, as well as many databases in staging and test configurations.
Your point about 1/2 is fair; I was trying to convey that Kube follows certain rules w.r.t. process termination, storage, and safety that can be relied on once you internalize them. What's lacking today is a single doc that walks people through the tradeoffs and is easily approachable (although the StatefulSet docs do a pretty good job of it). In addition, we've made an increasing effort to ensure that behavior is predictable (which is why StatefulSets exist, along with the changes in 1.5 to ensure terminating pods remain even if the node goes down).
Storage continues to be the most important part of stateful apps in general. On AWS/GCE/Azure you get safe semantics for fencing storage (as long as you don't bend the rules). On metal you'll need a lot more care - the variety of NAS storage comes with lots of tradeoffs, and safe use assumes a level of sophistication that I wouldn't expect unless folks have made an investment in storage infrastructure. I expect that to continue to improve, with things like Ceph and Gluster's direct integration, VMware storage, and NetApp / other serious NFS integration.
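To make the StatefulSet behavior above concrete, here's a minimal sketch of what one looks like (all names, the image, and the storage size are illustrative, not from this thread): each replica gets a stable identity and its own PersistentVolumeClaim via a volumeClaimTemplate, and on a cloud provider the ReadWriteOnce volume can only be attached to one node at a time, which is what gives you the fencing semantics.

```yaml
# Hypothetical sketch of a 3-replica database StatefulSet.
# Names, image, and sizes are made up for illustration.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  serviceName: db        # headless Service providing stable per-pod DNS (db-0.db, db-1.db, ...)
  replicas: 3
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      terminationGracePeriodSeconds: 60   # give the database time to shut down cleanly
      containers:
      - name: db
        image: postgres:9.6               # example only
        volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
  - metadata:
      name: data                          # yields one PVC per pod: data-db-0, data-db-1, ...
    spec:
      accessModes: ["ReadWriteOnce"]      # single-node attach; the provider fences the volume
      resources:
        requests:
          storage: 10Gi
```

The PVCs outlive the pods, so if db-1 is rescheduled to another node it reattaches the same volume rather than starting empty.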
And it's always possible to treat nodes like pets on the cloud and leverage their local storage if you have good backups - at scale that can be fairly effective, but when doing one-off DBs using RDS and Aurora and others is hard to beat.
I am not very clear on the differences between running Kubernetes via OpenShift vs. on metal or a cloud provider. I just looked at the RH page and it still wasn't that clear to me. Can you elaborate? Is there a different story for stateful things like running datastores on K8s + OpenShift?