Hacker News

I work in a big corp myself. If you worked on Cloud Foundry you probably worked with some of my colleagues. ;-) Please don't mention the name though, so I can speak openly.

I see this in my corp every day. Very smart people, super interested in actual feedback, and super interested in really providing great software. But in the end they still fail. I'm not 100% sure about the reasons, but I have a guess.

Usually, people who move up the ranks into positions where they can make decisions tend to build relationships with other such people. So more and more they shift from "why the heck is DNS not working on server-13 anymore?" to "this is the big problem the market has, let's attack that". And slowly the thought creeps in that you should only think about the big-picture problems and not worry about the nitty-gritty details of everyday getting-things-to-run.

And that's probably the problem. While it's clear that you can't solve all the details and work on big-picture problems at the same time, you should not devalue the details. Great solutions solve big-picture problems BY SOLVING THE DETAILS.

That's why K8S, for instance, is a mediocre solution. Sure, they have great big-picture ideas, but usually people spend their time with having DNS not working, having servers hanging, having etcd nodes not talking to each other without explaining why, having deployments say "SUCCESS" while actually not running. K8S provides a standardized API for every detail and every use case, cool. But in the end you need to be an expert in everything to make it run around 50% of the time (outside of AWS and GCP). Nobody really wants to develop on k8s when the underlying platform is only available 50% of the time. And that's the current experience.

So please, if you are one of these people who is really interested in creating great software, consider finding solutions to the detailed low-level problems an important part of the goals and tasks you define. Usually there aren't many core low-level problems. For instance, with k8s they are usually related to networking, dynamic storage, or security. Have one person on your team really become an expert in one of these areas (by checking out problems and solutions that exist, not just by making up PowerPoint slides with plans that have nothing to do with the real world) and allow them to influence your task planning.



> And that's the current experience.

Maybe that's your experience? I haven't had any of those problems in moderately sized on-prem k8s clusters.


I think I've had that experience. I wouldn't generalize and say it's "the current experience," and certainly not 50% of the time, but I can say I have had this experience at least once on every cloud provider that I've used Kubernetes with to any degree of depth. Something goes wrong, and it's outside of your control, you just have to wait for them to fix it (or sometimes I guess, you can just pick a different AZ and try again.)

It's almost enough to make you wish for a hybrid cloud. If you don't have Kubernetes experts on staff, you shouldn't be trying to manage your own Kubernetes on-prem. It wouldn't be surprising if this were the experience 50% of the time in that case; honestly, that has to be why managed services for Kubernetes are evidently becoming so popular.

Managed services have these issues some of the time too, and if your managed service has those issues often enough that you'd want to talk about them, you'd better find a different managed service. I think AKS and GKE have had those issues, but they are rare. I don't know personally about EKS.

I think some people interpret the promise of managed K8s services to be that you no longer need someone on staff who is the expert in how things might go sideways on Kubernetes.

On the contrary, you still need that person to be able to take advantage of Kubernetes with confidence, but maybe now thanks to (whatever vendor you chose for Kubernetes) you simply don't need to have an actual department of 6-12 of those experts focused on only doing that (managing Kubernetes.)

A result of using a real, stable managed K8s service should be that on a day-to-day basis, those people won't actually need to run around with their hair on fire doing ops things just to keep the business going. Automation.

With on-prem, maybe just don't expect it to come like that out-of-the-box; if your whole team is still new at this they're going to need training and planning and ramp-up to get it to that stage. This is exactly how you get "it might be better for us to start with a managed k8s service."


I was replying to a specific list of problems:

> usually people spend their time with having DNS not working, having servers hanging, having etcd nodes not talking to each other without explaining why, having deployments say "SUCCESS" while actually not running

I'm not saying k8s is free of problems.


Not disagreeing with you, and those are all "noob problems" from my perspective as I've done all of those things wrong before myself, and I myself am a noob. But they are some actually very complex problems (and you can even have them on your managed platforms once in a while, too.)

Self-healing is great when it works, and those items you listed are all real problems. But they are not problems that you should expect to encounter on a managed service, at least hopefully not more than once. (YMMV, amiright?)


> I haven't had any of those problems in moderately sized on-prem k8s clusters.

Just out of curiosity, how are you managing your on-prem k8s clusters? (Is there a toolkit you'd recommend using? Kubespray?)



