We've seen a multitude of issues, like jobs failing to start or getting badly delayed (also the infamous "if your cronjob fails too much it will stop working forever").
This is just not true anymore. I went through the pain of using it early and there were times it felt like that, but it brings far more to the table than it costs these days…
It's fine if you run it at a cloud provider. Setting up a k8s cluster yourself is painful though, and at a cloud provider it costs far more than using just bare metal and/or Docker (we do not use Docker, as it's another thing to manage and very boring). We have had auto-deploy scripts in Perl since the 90s and never had any need for any of this stuff; we now host for less than $100/mo, with millions of users and a lot of profit with very little maintenance. I wonder why IT people like burning money so much, especially here on HN.
Almost no site needs it; sure, Facebook/Google do, but you are not running those, nor is it likely you (not specifically you obviously, me neither) ever will. VPSs are robust these days and have no downtime besides kernel updates. I cannot fathom why you would want to burn humans or money on this kind of complexity. But then again, we like profit (not growth for the sake of growth) and it seems that most here really are not that interested in that. If I cannot be Gates or Musk (and I cannot, nor can you; again, no attack on you, just statistics) then I'd rather have little work or headache with millions of $ in profit/mo coming in instead of 'growth'. Maybe I'm odd, but I have been free for the past 30+ years because of these choices (currently: Common Lisp, Apache, Perl, PHP, MySQL, HAProxy, Redis, WireGuard; hopefully I get this down to just Common Lisp + WireGuard before I pass). We don't use libraries or tech less than 10 years old unless it's really needed, and we contribute to everything we use (so we use very few things, otherwise we have to hire and that's a waste of $); I sleep very well at night knowing nothing is going to happen.
> It's fine if you run it at a cloud provider. Setting up a k8s cluster yourself is painful though and at a cloud provider it costs far more than using just bare metal
I think it's almost exactly the opposite: I'd rather use cloud-specific tooling on clouds but k8s is a Better OpenStack on bare metal. It provides a standardized layer upon which generally-reasonable tools can operate without thinking about it much. There is a cost factor--it doesn't need to be a high one, though, and it's also a forcing function into stuff like "actually thinking about redundancy" ahead of time.
I've deployed in production everything you described and unless I was optimizing, as you are, for cut-to-the-bone opex and personal stress when it breaks bad (which is not a judgment call but it is certainly not the only reasonable decision to make; investing more in operations to have more "bounce" when things go bad is not a bad thing), a reasonably thought-out k8s environment is going to be easier than shell scripts from the 90s once I need to have anyone who isn't me take over a problem.
No stress. Definitely less than most people I have met who spend their lives doing this kind of over-architected nonsense. But hey, if you say it's easy (it is definitely not though; and it does go spectacularly wrong even at big companies where no one knows why, because complex) then do whatever: I am guessing your income depends on complexity for clients/your employer/devops gigs while mine depends on things being simple and never (again; it's been 30 years) breaking. Things don't go bad: there is enough 'bounce' here, we just refuse to spend money or time there as we do not need it. I'd rather work on features than stuff that should be invisible in the first place.
My income depends on no such thing, if anything it depends on reducing complexity where it doesn't provide value, but it's telling that that's the only place your mind goes. And because of that, I think there's not much value in continuing this conversation.
God forbid if you had to think and know about your infrastructure and how it worked, and whether it was as minimal and simple as possible whilst delivering results. Best to just use abstractions upon abstractions you don't know well and hope for the best.
This is the biggest issue; if something truly goes wrong with k8s, your only way out is to destroy everything and redeploy; you will have no clue at all (well, very likely; of course there are people who do, just not very many) what happened. This started with AWS roughly 2 decades ago when they simply said: assume it will break and architect for it; don't try to figure out why things break, just restore and move on. This was absolutely brilliant: now people deploy million$ projects without actually understanding too much of the environment and pay $$$ to make sure they never have to. Well done Werner.
...I do know them pretty well, which kind of puts a hole in this kind of snooty nonsense. Because I know the abstractions and what's under them, I don't have to think about it much, because I've internalized what it's going to do.
I've built systems that exist today both ways. There are reasonable arguments for both. Please don't be weird.
From a technology point of view k8s doesn't do anything better than what Perl scripts used to do 25 years ago.
(Not that Perl scripts are any good. They're crap technology, but unfortunately so is k8s.)
K8s doesn't solve a technical problem. It solves two contradictory social problems:
a) It gives sysadmins a job creation program, full of expensive and opaque stuff that requires expensive sysadmins.
b) It makes sysadmin stuff fungible and replaceable for developers.
Solving both problems is probably an important social issue if you're running a Google scale organization. But it's solving a social and organizational problem, not fulfilling a technological need.
K8s is a fantastic development tool. You can't ask for a better self-service tool to enable developers to ramp up on a platform and develop or test their apps across teams and orgs in a standard, portable, safe way. Its biggest problem arguably is that it's too configurable, and doesn't have enough abstractions to hide the complexity.