True. I certainly admit that I don't know the code well enough to know exactly which lines or files are critical and which aren't, but I think it's getting into the weeds to nitpick the specifics too much (for example, I would argue that while it's probably good to exclude generated files, you shouldn't exclude the tests from the line count since they require real human time to maintain).
The point is that thousands of lines of unique, application-specific code are required to create an "operator" that runs Prometheus within k8s. This is not what most people think of when someone says "a method for packaging".
> What is disingenuous is to call the prometheus operator, that deploys an entire monitoring stack a "normal application"
While Prometheus is certainly a big system in its own right, I don't think that necessarily makes it a bad representative. Many people are planning to port their own complicated systems to Kubernetes.
> Or, people want to use k8s so they can run entire clusters of machines as a single consistent system and take advantage of things like rolling deployments and self healing applications.
It's hard to talk about this because there is so much wrapped up into the gob that is k8s, and of course not all of it is bad. But we've had "rolling deployments" and "self-healing applications" before, without having to write 10k+ lines of code to manage the platform deployment. These aren't a new thing to k8s.
k8s provides a platform that gives them a nomenclature, but it's not always clear that there is a benefit to running on that platform vs. more traditional setups, especially when you consider that you still have to configure and code your (k8s-internal) load balancers, web servers, and applications to handle these things.
There's no free lunch. Kubernetes is a container orchestrator. It automates system-level commands like "docker run ..." and provides a (mostly redundant) fabric for those containers to feed into. That's great and there are some people who really need that, but far too many people read comments like yours and interpret it to mean "If I use Kubernetes I will have self-healing applications". It doesn't work that way.
You just said you don't understand the code well enough to discern which files matter and which don't, then go on to say it takes thousands of lines of code.
If you remove the generated code and discount the tests, it's barely a thousand lines. Much of the total is test code, and Go is fairly verbose for tests.
It generally takes maybe a few hundred lines of code to write simple to moderately complex operators. A lot of it is generated and boilerplate. I would say much of that is due to the lack of generics in Go, but I wouldn't say it's very much code overall. Additionally, the framework being presented here aims to reduce that further by removing the boilerplate and making it easier to express the end goal (e.g. self-healing, auto-rebalancing, etc.) using less code.
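To make that concrete, the thing an end user actually writes against an operator is usually a short custom resource; the operator's (largely generated) Go code does the reconciling behind it. A rough sketch of what such a resource can look like (the API group and fields here are illustrative, not the real Prometheus operator CRD):

    apiVersion: monitoring.example.com/v1   # illustrative API group
    kind: Prometheus
    metadata:
      name: main
    spec:
      replicas: 2        # the operator keeps this many instances running
      retention: 24h     # domain-specific knob the operator turns into config
      storage:
        size: 50Gi       # the operator provisions and attaches the volumes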
It's certainly not much more code to implement an operator than what you would see in a well written Puppet module/Chef cookbook/Ansible playbook, and it does a lot more. You certainly could try to do self-healing using these tools, but it's significantly more difficult in my experience.
I agree that there's no free lunch and that you won't necessarily get self-healing applications just by using Kubernetes. But it's certainly easier to build them when using Kubernetes. The only thing that's really changed is that instead of writing to a particular cloud provider API to handle this, you're able to leverage something more agnostic to the specific cloud or vendor you're using for your infrastructure.
> But we've had "rolling deployments" and "self-healing applications" before, without having to write 10k+ lines of code to manage the platform deployment
You continue to be disingenuous and imply that every application requires 10k lines of code to run on k8s.
I recently used k8s to deploy an application. Configuring 2 services with exterior and interior load balancing, plus health checks for rolling deployments and self-healing, took 115 lines of yaml, maybe 40 of which were specific to my application.
115 lines. Not 10,000+
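For anyone who hasn't seen what that yaml looks like, here's the rough shape of one of the two services. The names, image, and port are placeholders and the real manifests had more to them; this is a sketch, not what I actually deployed:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: myapp                    # placeholder name
    spec:
      replicas: 3
      strategy:
        type: RollingUpdate          # swap pods out gradually on deploys
      selector:
        matchLabels:
          app: myapp
      template:
        metadata:
          labels:
            app: myapp
        spec:
          containers:
          - name: myapp
            image: registry.example.com/myapp:1.0   # placeholder image
            ports:
            - containerPort: 9000
            readinessProbe:          # gates traffic and controls the rollout
              tcpSocket:
                port: 9000
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: myapp
    spec:
      type: LoadBalancer             # the "exterior" one; interior is ClusterIP
      selector:
        app: myapp
      ports:
      - port: 9000
        targetPort: 9000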
Then once things were working I created a 2nd namespace for production and deployed an entire 2nd copy of everything. This took me 10 minutes and 2 kubectl commands.
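Roughly this, assuming the manifests don't hard-code a namespace (the file name is made up):

    kubectl create namespace production
    kubectl apply -f myapp.yaml --namespace production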
> "If I use Kubernetes I will have self-healing applications". It doesn't work that way.
But that's exactly how it worked. I wrote 115 lines of yaml and had multiple environments, load balancers, health checks, and rolling deployments.
I know how to do this using "traditional setups", and I know it takes a lot more than 115 lines of generic yaml.
> You continue to be disingenuous and imply that every application requires 10k lines of code to run on k8s.
Let me clarify and state unambiguously that it won't necessarily take 10k lines of code to run any random application on Kubernetes.
You can, in fact, deploy Prometheus without using the Prometheus operator and you'll technically be "running your monitoring" within k8s. It just isn't likely to be very reliable or useful. :)
> But that's exactly how it worked. I wrote 115 lines of yaml and had multiple environments, load balancers, health checks, and rolling deployments.
If you already had a fully "stateless", self-healing-capable application running on not-k8s, and your layout is as simplistic as "2 services with load balancers", you can probably move to Kubernetes with a comparatively small amount of fuss. If your existing setup was pretty tiny, this may have been a worthwhile project.
If you didn't already have a stateless, self-healing-capable system, and you didn't change your application to accommodate it as part of the port, then regardless of what Kubernetes reports about your pod state, you don't have a self-healing application.
The barrier between application and platform is artificial. They must work together. It's sort of a convenient fantasy that you can try to demarcate these areas. You can't just take any random thing and throw it on Kubernetes and say it's all good now because you can watch k8s cycle your pods.
Maybe you think this is implicit, but as someone who has spent the last 2.5 years building out k8s clusters for software written by average developers, I can assure you that there are a great deal of people who aren't getting this message.
I went full-time freelance about a month ago. One of the last in-house k8s services I deployed, the guy told me, "Oh yeah, we can't run more than one instance of this, or it will delete everything." Yet, these people are very proud of the "crazy scalability" they get from running on Kubernetes. Hope the next guy reads the comments and doesn't nudge that replicas field!
If you already had a non-trivial system that worked well for failover, recovery, self-healing, etc., why'd you replace it with something that is, for example, still just barely learning how to communicate reliably with non-network-attached storage, as a beta feature in 1.10 [0], released last month? There are many things that sysadmins take for granted that don't really work well within k8s.
I accept that at first glance and with superficial projects, it can be easy to throw the thing over the fence and let k8s's defaults deal with everything. This is definitely the model and the demographic that Google has been pursuing. But if you have something more serious going on, you still have to dig into the internals of nginx and haproxy within your k8s cluster. You still have to deal with DNS. You have to deal with all the normal stuff that is used in network operations, but now, you're just dealing with a weirdly-shaped vaguely-YAMLish version of it, within the Great Googly Hall of Mirrors.
Once you do that enough, you say "Well, why am I not just doing this through real DNS, real haproxy, real nginx, like we used to do? Why am I adding this extra layer of complication to everything, including the application code that has to be adapted for Kubernetes-specific restrictions, and for which I must write <INSERT_ACCEPTABLE_LINE_NO_HERE> lines of code as an operator to ensure proper lifecycle behavior?"
Most people aren't willing to give themselves an honest answer to that question, partially because they don't really ask it. They just write some YAML and throw their code over the fence, now naively assured that the system is "self-healing". Then they get on HN and blast anyone who dares to question that experience.
> If your existing setup was pretty tiny, this may have been a worthwhile project.
What existing setup? I wrote and deployed this application to k8s in the span of like 4 days. If I was using "real DNS, real haproxy, real nginx" I'd probably still be trying to work out how to do zero downtime rolling deployments, and then how to clone the whole thing so I could have a separate production environment.
Yeah, so there's the crux. If you're starting from scratch and you design something explicitly to fit within Kubernetes's constraints and demands, and those constraints and demands work well with the specific application you're designing, it will, of course, be a pleasant experience to deploy on the targeted platform. The same is true for anything else.
If you make your goal to "build something that runs great on Platform", it shouldn't be a surprise that the new thing you made runs great on it. I've been talking about Real Things That Already Exist and Run Real Businesses. That's usually what we're talking about when we talk about infrastructure and servers, and that's where we see this dangerous cargo culting where people don't realize "Just use an Operator" means "just write thousands of lines of highly-specific lifecycle management code so that Kubernetes knows how to do your things".
It was a variation of an earlier project that I had deployed to EC2. On EC2 I had a mess of fragile boto/fabric stuff to get the test/prod machines provisioned and the application installed. It "worked", but I had no redundancy and deploys were hard cut-overs; if the new version didn't come up for whatever reason, it was just down.
I didn't do anything in the application itself to design it to run on k8s; I was able to re-use some existing test code to define things like the health checks.
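Roughly this shape, with a made-up port and timings (tcpSocket rather than httpGet, since the app wasn't speaking HTTP):

    livenessProbe:
      tcpSocket:
        port: 9000           # made-up port
      initialDelaySeconds: 5
    readinessProbe:
      tcpSocket:
        port: 9000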
so, 7 lines of yaml and I had self healing and rolling deployments. I could have built this out on EC2.. probably would have taken me a few hundred lines of terraform/ansible/whatever and never worked as well. It's the kind of thing where I could have maybe gotten rolling deployments working, but I would have just ended up with an "ad-hoc, informally-specified, bug-ridden, slow implementation of half of k8s"
I would have been perfectly happy to just run this whole thing on EB/heroku/lambda/whatever but the application was doing low level tcp socket stuff, not http, and almost every PaaS these days wants you to speak http.