And system calls and filesystems and sockets and LVM and...
Sure at some point there are too many layers to count but I wouldn't say any of this is "Kubernetes". What people tend to be hung about is the difficulty of Kubernetes compared to `docker run` or `docker compose up`. That is what I am surprised about.
I never had any issue with kubelet, or kube-proxy, or CSI plugins, or CNI plugins. That is after years of running a multi-tenant cluster in a research institution. I think about those about as much as I think about ext4, runc, or GRUB.
But you just said that you had issues with ceph? How is that not a CSI problem?
And CNI problems are extremely normal. Pretty much anyone that didn't just use weavenet and called it a day has had to spend quiet a bit of time to figure it out. If you already know networking by heart it's obviously going to be easier, but few devs do.
Never had a problem with the CSI plugin, I had problems with the Ceph cluster itself. No, I wouldn't call Ceph part of Kubernetes.
You definitely can run Kubernetes without running Ceph or any storage system, and you already rely on a distributed storage system if you use the cloud whether you use Kubernetes or not. So I wouldn't count this as added complexity from Kubernetes.
I'm not sure I can agree with that interpretation. CSI is basically an interface that has to be implemented.
If you discount issues like that, you can safely say that it's impossible to have any issues with CSI, because it's always going to be with one of it's implementation.
That feels a little disingenuous, but maybe that's just me.
In the context of this discussion, which is about the complexity of the k8s stack: yes.
Youre ultimately gonna have to use a storage of some form unless you're just a stateless service/keep the services with state out of k8s. That's why I'd include it, and the fact that you can use multiple storage backends, each with their own challenges and pitfalls makes k8s indeed quiet complex.
You could argue that multinode PaaS is always going to be complex, and frankly- I'd agree with that. But that was kinda the original point. At least as far as I interpreted it: k8s is not simple and you most likely didn't need it either. But if you do need a distributed PaaS, then it's probably a good idea to use it. Doesn't change the fact that it's a complex system.
So you're comparing Kubernetes to what? Not running services at all? In that case I agree, you're going to have to set up Linux, find a storage solution, etc as part as your setup. Then write your app. It's a lot of work.
But would I say that your entire Linux installation and the cloud it runs on is part of Kubernetes? No.
> So you're comparing Kubernetes to what? Not running services at all?
Surprisingly there were hosted services on the internet prior to kubernetes existing. Hell, I even have reason to believe that the internet may possibly predate Docker
That is my point! If you think "just using SystemD services in a VM" is easy but "Kubernetes is hard", and you say "Kubernetes is hard" is because of Linux, cgroups, cloud storage, mount namespaces, ... Then I can't comprehend that argument, because those are things that exist in both solutions.
Let's be clear on what we're comparing or we can't argue at all. Kubernetes is hard if you have never seen a computer before, I will happily concede that.
ah I apologize for my snark then, I interpreted your sentence as _you_ believing that the only step simpler than using Kubernetes was to not have an application running
I see how you were asking the GP that question now
Various options around for simple alternatives, the simplest is probably just running single node.
Maybe with fail over for high availability.
Even that's fine for most deployments that aren't social media sites, aren't developed by multiple teams of devs and don't have any operations people on payroll.
CSI and CNI do about as much magic as `docker volume` and `docker network`.
People act like their web framework and SQL connection pooler and stuff are so simple, while Kubernetes is complex and totally inscrutable for mortals, and I don't get it. It has a couple of moving parts, but it is probably simpler overall than SystemD.
I was genuinely surprised that k8s turned out to actually be pretty straightforward and very sensible after years of never having anything to do with it and just hearing about it on the net. Turns out opinions are just like after all.
That being said, what people tend to build on top of that foundation is a somewhat different story.
Unfortunately people (cough managers) think k8s is some magic that makes distrusted systems problems go away, and automagically enables unlimited scalability
In reality it just makes the mechanics a little easier and centralized
Getting distributed systems right is usually difficult
I asked chatgpt the other day to explain to me Kubernetes. I still don't understand it. Can you share with me what clicked with you, or resources that helped you?
Controller in charge of a specific type of object watches a database table representing the object type. Database table represents the desired state of things. When entries to the table are CRUD-ed, that represents a change to the desired state of things. Controller interacts with the larger system to bring the state of things into alignment with the new desired state of things.
"The larger system" is more controllers in charge of other object types, doing the same kind of work for its object types
There is an API implemented for CRUD-ing each object type. The API specification (model) represents something important to developers, like a group of containers (Pod), a load balancer with VIP (Service), a network volume (PersistentVolume), and so on.
Hand wave hand wave, Lego-style infrastructure.
None of the above is exactly correct (e.g. the DB is actually a k/v store), but it should be conceptually correct.
No, there are many controllers. Each is in charge of the object types it is in charge of.
>What happens if [it] goes down?
CRUD of the object types it manages have no effect until the controller returns to service.
>If multiple controllers, how do they coordinate ?
The database is the source of truth. If one controller needs to "coordinate" with another, it will CRUD entries of the object types those other controllers are responsible for. e.g. Deployments beget ReplicaSets beget Pods.
The k/v store offers primitives to make that happen, but for non-critical controllers you don't want to deal with things like that they can go down and will be restarted (locally by kubelet/containerd) or rescheduled. Whatever resource they monitor will just not be touched until they get restarted.
What clicked with me is having ChatGPT go line by line through all of the YAML files generated for a simple web app—WordPress on Kubernetes. Doing that, I realized that Kubernetes basically takes a set of instructions on how to run your app and then follows them.
So, take an app like WordPress that you want to make “highly available.” Let’s imagine it’s a very popular blog or a newspaper website that needs to serve millions of pages a day. What would you do without Kubernetes?
Without Kubernetes, you would get yourself a cluster of, let’s say, four servers—one database server, two worker servers running PHP and Apache to handle the WordPress code, and finally, a front-end load balancer/static content host running Nginx (or similar) to take incoming traffic and route it to one of the two worker PHP servers. You would set up all of your servers, network them, install all dependencies, load your database with data, and you’d be ready to rock.
If all of a sudden an article goes viral and you get 10x your usual traffic, you may need to quickly bring online a few more worker PHP nodes. If this happens regularly, you might keep two extra nodes in reserve and spin them up when traffic hits certain limits or your worker nodes’ load exceeds a given threshold. You may even write some custom code to do that automatically. I’ve done all that in the pre-Kubernetes days. It’s not bad, honestly, but Kubernetes just solves a lot of these problems for you in an automated way. Think of it as a framework for your hosting infrastructure.
On Kubernetes, you would take the same WordPress app and split it into the same four functional blocks. Each would become a container. It can be a Docker container or a Containerd container—as long as it’s compatible with the Open Container Initiative, it doesn’t really matter. A container is just a set of files defining a lightweight Linux virtual machine. It’s lightweight because it shares its kernel with the underlying host it eventually runs on, so only the code you are actually running really loads into memory on the host server.
You don’t really care about the kernel your PHP runs on, do you? That’s the idea behind containers—each process runs in its own Linux virtual machine, but it’s relatively efficient because only the code you are actually running is loaded, while the rest is shared with the host. I called these things virtual machines, but in practice they are just jailed and isolated processes running on the host kernel. No actual hardware emulation takes place, which makes it very light on resources.
Just like you don’t care about the kernel your PHP runs on, you don’t really care about much else related to the Linux installation that surrounds your PHP interpreter and your code, as long as it’s secure and it works. To that end, the developer community has created a large set of container templates or images that you can use. For instance, there is a container specifically for running Apache and PHP—it only has those two things loaded and nothing else. So all you have to do is grab that container template, add your code and a few setting changes if needed, and you’re off to the races.
You can make those config changes and tell Kubernetes where to copy and place your code files using YAML files. And that’s really it. If you read the YAML files carefully, line by line, you’ll realize that they are nothing more than a highly specialized way of communicating the same type of instructions you would write to a deployment engineer in an email when telling them how to deploy your code.
It’s basically a set of instructions to take a specific container image, load code into it, apply given settings, spool it up, monitor the load on the cluster, and if the load is too high, add more nodes to the cluster using the same steps. If the load is too low, spool down some nodes to save money.
So, in theory, Kubernetes was supposed to replace an expensive deployment engineer. In practice, it simply shifted the work to an expensive Kubernetes engineer instead. The benefit is automation and the ability to leverage community-standard Linux templates that are (supposedly) secure from the start. The downside is that you are now running several layers of abstraction—all because Unix/Linux in the past had a very unhealthy disdain for statically linked code. Kubernetes is the price we pay for those bad decisions of the 1980s. But isn’t that just how the world works in general? We’re all suffering the consequences of the utter tragedy of the 1980s—but that’s a story for another day.
Well, and the fact that in addition to Kubernetes itself, there are a gazillion adjacent products and options in the cloud-native space. Many/most of which a relatively simple setup may not need. But there's a lot of complexity.
But then there's always always a lot of complexity and abstraction. Certainly, most software people don't need to know everything about what a CPU is doing at the lowest levels.
Sure at some point there are too many layers to count but I wouldn't say any of this is "Kubernetes". What people tend to be hung about is the difficulty of Kubernetes compared to `docker run` or `docker compose up`. That is what I am surprised about.
I never had any issue with kubelet, or kube-proxy, or CSI plugins, or CNI plugins. That is after years of running a multi-tenant cluster in a research institution. I think about those about as much as I think about ext4, runc, or GRUB.