
I keep seeing this opinion and I don't understand it. For various reasons, I recently transitioned from a dev role to running a 60+ node, 14+ PB bare metal cluster. 3 years in, and the only thing ever giving me trouble is Ceph.

Kubernetes is etcd, apiserver, and controllers. That's exactly as many components as your average MVC app. The control-loop thing is interesting, and there are a few "kinds" of resources to get used to, but why is it always presented as this insurmountable complexity?

I ran into a VXLAN checksum offload kernel bug once, but otherwise this thing is just solid. Sure it's a lot of YAML but I don't understand the rep.



“etcd, apiserver, and controllers.”

…and containerd and csi plugins and kubelet and cni plugins and kubectl and kube-proxy and ingresses and load balancers…


And system calls and filesystems and sockets and LVM and...

Sure at some point there are too many layers to count but I wouldn't say any of this is "Kubernetes". What people tend to be hung up on is the difficulty of Kubernetes compared to `docker run` or `docker compose up`. That is what I am surprised about.

I never had any issue with kubelet, or kube-proxy, or CSI plugins, or CNI plugins. That is after years of running a multi-tenant cluster in a research institution. I think about those about as much as I think about ext4, runc, or GRUB.


But you just said that you had issues with ceph? How is that not a CSI problem?

And CNI problems are extremely normal. Pretty much anyone who didn't just use weavenet and call it a day has had to spend quite a bit of time to figure it out. If you already know networking by heart it's obviously going to be easier, but few devs do.


Never had a problem with the CSI plugin, I had problems with the Ceph cluster itself. No, I wouldn't call Ceph part of Kubernetes.

You definitely can run Kubernetes without running Ceph or any storage system, and you already rely on a distributed storage system if you use the cloud whether you use Kubernetes or not. So I wouldn't count this as added complexity from Kubernetes.


I'm not sure I can agree with that interpretation. CSI is basically an interface that has to be implemented.

If you discount issues like that, you can safely say that it's impossible to have any issues with CSI, because it's always going to be with one of its implementations.

That feels a little disingenuous, but maybe that's just me.


So if you run Kubernetes in the cloud, you consider the entire cloud provider's block storage implementation to be part of Kubernetes too?

For example you'd say AWS EBS is part of Kubernetes?


In the context of this discussion, which is about the complexity of the k8s stack: yes.

You're ultimately gonna have to use storage of some form unless you're just a stateless service/keep the services with state out of k8s. That's why I'd include it, and the fact that you can use multiple storage backends, each with their own challenges and pitfalls, makes k8s indeed quite complex.

You could argue that multinode PaaS is always going to be complex, and frankly, I'd agree with that. But that was kinda the original point. At least as far as I interpreted it: k8s is not simple and you most likely didn't need it either. But if you do need a distributed PaaS, then it's probably a good idea to use it. Doesn't change the fact that it's a complex system.


So you're comparing Kubernetes to what? Not running services at all? In that case I agree, you're going to have to set up Linux, find a storage solution, etc. as part of your setup. Then write your app. It's a lot of work.

But would I say that your entire Linux installation and the cloud it runs on is part of Kubernetes? No.


> So you're comparing Kubernetes to what? Not running services at all?

Surprisingly there were hosted services on the internet prior to kubernetes existing. Hell, I even have reason to believe that the internet may possibly predate Docker


That is my point! If you think "just using SystemD services in a VM" is easy but "Kubernetes is hard", and you say Kubernetes is hard because of Linux, cgroups, cloud storage, mount namespaces, ... then I can't comprehend that argument, because those are things that exist in both solutions.

Let's be clear on what we're comparing or we can't argue at all. Kubernetes is hard if you have never seen a computer before, I will happily concede that.


ah I apologize for my snark then, I interpreted your sentence as _you_ believing that the only step simpler than using Kubernetes was to not have an application running

I see how you were asking the GP that question now


Next you’re going to claim the internet existed before Google too.


Various options around for simple alternatives, the simplest is probably just running single node.

Maybe with fail over for high availability.

Even that's fine for most deployments that aren't social media sites, aren't developed by multiple teams of devs and don't have any operations people on payroll.


Because CSI is just a way to connect a volume to a pod.

Ceph is its own cluster of kettles filled with fishes


Very fair, although with managed services which are increasingly available, you don't typically need to think about CSI or CNI.


Hence

> Kubernetes is not the first thing that comes to mind when I think of "understanding where their code is running and what it's doing"...


CSI and CNI do about as much magic as `docker volume` and `docker network`.

People act like their web framework and SQL connection pooler and stuff are so simple, while Kubernetes is complex and totally inscrutable for mortals, and I don't get it. It has a couple of moving parts, but it is probably simpler overall than SystemD.


I was genuinely surprised that k8s turned out to actually be pretty straightforward and very sensible after years of never having anything to do with it and just hearing about it on the net. Turns out opinions are just like after all.

That being said, what people tend to build on top of that foundation is a somewhat different story.


It’s not k8s. It’s distributed systems

Unfortunately people (cough managers) think k8s is some magic that makes distributed systems problems go away, and automagically enables unlimited scalability

In reality it just makes the mechanics a little easier and centralized

Getting distributed systems right is usually difficult


I asked ChatGPT the other day to explain Kubernetes to me. I still don't understand it. Can you share with me what clicked with you, or resources that helped you?


Controller in charge of a specific type of object watches a database table representing the object type. Database table represents the desired state of things. When entries to the table are CRUD-ed, that represents a change to the desired state of things. Controller interacts with the larger system to bring the state of things into alignment with the new desired state of things.

"The larger system" is more controllers in charge of other object types, doing the same kind of work for their object types.

There is an API implemented for CRUD-ing each object type. The API specification (model) represents something important to developers, like a group of containers (Pod), a load balancer with VIP (Service), a network volume (PersistentVolume), and so on.

Hand wave hand wave, Lego-style infrastructure.

None of the above is exactly correct (e.g. the DB is actually a k/v store), but it should be conceptually correct.
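
To make the control-loop idea above a bit more concrete, here is a minimal sketch in Python. It is a toy model, not how Kubernetes is implemented: plain dicts stand in for the k/v store, and `reconcile` and the instance names are made up for illustration.

```python
# Toy model of a control loop. The dicts stand in for the k/v store;
# none of these names correspond to real Kubernetes APIs.

def reconcile(name, desired, actual):
    """One pass of the loop: nudge actual state toward desired state.

    desired: maps an object name to the wanted number of instances.
    actual:  maps an object name to the list of running instance names.
    Returns the list of actions taken, so the caller can see what changed.
    """
    want = desired.get(name, 0)
    running = actual.setdefault(name, [])
    actions = []
    # Too few instances running: start more.
    while len(running) < want:
        instance = f"{name}-{len(running)}"
        running.append(instance)
        actions.append(("start", instance))
    # Too many instances running: stop the newest ones.
    while len(running) > want:
        actions.append(("stop", running.pop()))
    return actions


desired = {"web": 3}
actual = {}
print(reconcile("web", desired, actual))  # starts web-0, web-1, web-2
desired["web"] = 1                        # someone edits the desired state
print(reconcile("web", desired, actual))  # stops web-2, then web-1
print(reconcile("web", desired, actual))  # nothing left to do: []
```

The point of the sketch is that the loop never receives a command like "scale down"; it only compares the two states and acts on the difference, which is why running it again after convergence does nothing.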


Is there only a single controller ? What happens if goes down?

If multiple controllers, how do they coordinate ?


>Is there only a single controller ?

No, there are many controllers. Each is in charge of the object types it is in charge of.

>What happens if [it] goes down?

CRUD operations on the object types it manages have no effect until the controller returns to service.

>If multiple controllers, how do they coordinate ?

The database is the source of truth. If one controller needs to "coordinate" with another, it will CRUD entries of the object types those other controllers are responsible for. e.g. Deployments beget ReplicaSets beget Pods.
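
A rough sketch of that chain, assuming a dict per object type as the "database" and deterministic pod names (real Kubernetes uses generated suffixes and owner references; the function names here are hypothetical):

```python
# Toy "database": one table per object type. Controllers never call each
# other; they only read and write entries in the shared store.
store = {"Deployment": {}, "ReplicaSet": {}, "Pod": {}}


def deployment_controller(store):
    """For every Deployment, make sure a matching ReplicaSet entry exists."""
    for name, spec in store["Deployment"].items():
        store["ReplicaSet"][f"{name}-rs"] = {
            "replicas": spec["replicas"],
            "image": spec["image"],
        }


def replicaset_controller(store):
    """For every ReplicaSet, create missing Pods and delete surplus ones."""
    for rs_name, spec in store["ReplicaSet"].items():
        wanted = {f"{rs_name}-{i}" for i in range(spec["replicas"])}
        owned = {p for p, meta in store["Pod"].items() if meta["owner"] == rs_name}
        for pod in wanted - owned:
            store["Pod"][pod] = {"owner": rs_name, "image": spec["image"]}
        for pod in owned - wanted:
            del store["Pod"][pod]


# A user creates a Deployment; each controller then does its own part.
store["Deployment"]["blog"] = {"replicas": 2, "image": "wordpress"}
deployment_controller(store)
replicaset_controller(store)
print(sorted(store["Pod"]))  # ['blog-rs-0', 'blog-rs-1']
```

If `replicaset_controller` is never called (the controller is "down"), the ReplicaSet entry still gets written but no Pods change, which matches the "no effect until the controller returns to service" behaviour described above.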


The k/v store offers primitives to make that happen, but for non-critical controllers you don't want to deal with things like that: they can go down and will be restarted (locally by kubelet/containerd) or rescheduled. Whatever resource they monitor will just not be touched until they get restarted.


What clicked with me is having ChatGPT go line by line through all of the YAML files generated for a simple web app—WordPress on Kubernetes. Doing that, I realized that Kubernetes basically takes a set of instructions on how to run your app and then follows them.

So, take an app like WordPress that you want to make “highly available.” Let’s imagine it’s a very popular blog or a newspaper website that needs to serve millions of pages a day. What would you do without Kubernetes?

Without Kubernetes, you would get yourself a cluster of, let’s say, four servers—one database server, two worker servers running PHP and Apache to handle the WordPress code, and finally, a front-end load balancer/static content host running Nginx (or similar) to take incoming traffic and route it to one of the two worker PHP servers. You would set up all of your servers, network them, install all dependencies, load your database with data, and you’d be ready to rock.

If all of a sudden an article goes viral and you get 10x your usual traffic, you may need to quickly bring online a few more worker PHP nodes. If this happens regularly, you might keep two extra nodes in reserve and spin them up when traffic hits certain limits or your worker nodes’ load exceeds a given threshold. You may even write some custom code to do that automatically. I’ve done all that in the pre-Kubernetes days. It’s not bad, honestly, but Kubernetes just solves a lot of these problems for you in an automated way. Think of it as a framework for your hosting infrastructure.

On Kubernetes, you would take the same WordPress app and split it into the same four functional blocks. Each would become a container. It can be a Docker container or a Containerd container—as long as it’s compatible with the Open Container Initiative, it doesn’t really matter. A container is just a set of files defining a lightweight Linux virtual machine. It’s lightweight because it shares its kernel with the underlying host it eventually runs on, so only the code you are actually running really loads into memory on the host server.

You don’t really care about the kernel your PHP runs on, do you? That’s the idea behind containers—each process runs in its own Linux virtual machine, but it’s relatively efficient because only the code you are actually running is loaded, while the rest is shared with the host. I called these things virtual machines, but in practice they are just jailed and isolated processes running on the host kernel. No actual hardware emulation takes place, which makes it very light on resources.

Just like you don’t care about the kernel your PHP runs on, you don’t really care about much else related to the Linux installation that surrounds your PHP interpreter and your code, as long as it’s secure and it works. To that end, the developer community has created a large set of container templates or images that you can use. For instance, there is a container specifically for running Apache and PHP—it only has those two things loaded and nothing else. So all you have to do is grab that container template, add your code and a few setting changes if needed, and you’re off to the races.

You can make those config changes and tell Kubernetes where to copy and place your code files using YAML files. And that’s really it. If you read the YAML files carefully, line by line, you’ll realize that they are nothing more than a highly specialized way of communicating the same type of instructions you would write to a deployment engineer in an email when telling them how to deploy your code.

It’s basically a set of instructions to take a specific container image, load code into it, apply given settings, spool it up, monitor the load on the cluster, and if the load is too high, add more nodes to the cluster using the same steps. If the load is too low, spool down some nodes to save money.
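
As a rough illustration of the "load too high, add more; load too low, remove some" rule: the Horizontal Pod Autoscaler documentation describes the core calculation as desired = ceil(current replicas × current metric / target metric). Note that this scales pods, not nodes; adding nodes is handled by a separate autoscaler. The sketch below is a simplified version that ignores the tolerance band and stabilization windows, and the function name and default bounds are made up:

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Simplified HPA-style calculation.

    current_metric and target_metric use the same unit, for example
    average CPU utilisation in percent. The result is clamped to the
    [min_replicas, max_replicas] range.
    """
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# Two PHP workers at 90% CPU with a 50% target: scale up.
print(desired_replicas(2, 90, 50))   # 4
# Four workers idling at 10% CPU: scale down.
print(desired_replicas(4, 10, 50))   # 1
# A viral article pushes load far past the target: capped at max_replicas.
print(desired_replicas(2, 500, 50))  # 10
```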

So, in theory, Kubernetes was supposed to replace an expensive deployment engineer. In practice, it simply shifted the work to an expensive Kubernetes engineer instead. The benefit is automation and the ability to leverage community-standard Linux templates that are (supposedly) secure from the start. The downside is that you are now running several layers of abstraction—all because Unix/Linux in the past had a very unhealthy disdain for statically linked code. Kubernetes is the price we pay for those bad decisions of the 1980s. But isn’t that just how the world works in general? We’re all suffering the consequences of the utter tragedy of the 1980s—but that’s a story for another day.


> People act like their web framework and SQL connection pooler and stuff are so simple

I'm just sitting here wondering why we need 100 billion transistors to move a piece of tape left and right ;)


Well, and the fact that in addition to Kubernetes itself, there are a gazillion adjacent products and options in the cloud-native space. Many/most of which a relatively simple setup may not need. But there's a lot of complexity.

But then there's always a lot of complexity and abstraction. Certainly, most software people don't need to know everything about what a CPU is doing at the lowest levels.


These components are very different in complexity and scope. Let's be real: a seasoned developer is mostly familiar with load balancers and ingress controllers, so this will be mostly about naming and context. I agree though once you learn about k8s it becomes less mysterious but that also means the author hasn't pushed it to the limits. Outages in the control plane could be pretty nasty and it is easy to have them by creating an illusion everything is kind of free in k8s.


A really simple setup for many smaller organisations wouldn't have a load balancer at all.


No load balancer means... entering one node only? Doing DNS RR over all the nodes? If you don't have a load balancer in front, why are you even using Kubernetes? Deploy a single VM and call it a day!

I mean, in my homelab I do have Kubernetes and no LB in front, but it's a homelab for fun and to learn K8s internals. But in a professional environment...


No code at all even - just use excel


typical how to program an owl:

step one: draw a circle

step two: import the rest of the owl


... and kubernetes networking, service mesh, secrets management


You aren't forced to use service mesh and complex secrets management schemes. If you add them to the cluster, it's because you value what they offer you. It's the same thing as Kubernetes itself: I'm not sure what people are complaining about. If you don't need what Kubernetes offers, just don't use it.

Go back to good ol' corosync/pacemaker clusters with XML and custom scripts to migrate IPs and set up firewall rules (and if you have someone writing them for you, why don't you have people managing your k8s clusters?).

Or buy something from a cloud provider that "just works" and eventually go down in flames with their Indian call centers doing their best but with limited access to engineering to understand why service X is misbehaving for you and trashing your customers' data. It's trade-offs all the way.


> …and containerd and csi plugins and kubelet and cni plugins (...)

Do you understand you're referring to optional components and add-ons?

> and kubectl

You mean the command line interface that you optionally use if you choose to do so?

> and kube-proxy and ingresses and load balancers…

Do you understand you're referring to whole classes of applications you run on top of Kubernetes?

I get it that you're trying to make a mountain out of a mole hill. Just understand that you can't argue that something is complex by giving as your best examples a bunch of things that aren't really tied to it.

It's like trying to claim Windows is hard, and then your best example is showing a screenshot of AutoCAD.


How are kubelet and CNI “optional components”? What do you mean by that?


CNI is optional: you can have workloads bind ports on the host rather than use an overlay network (though CNI plugins and kube-proxy are extremely simple and reliable in my experience; they use VXLAN and iptables, which are built into the kernel and which you already use in any organization that might run a cluster, or are the basic building blocks of your cloud provider).

CSI is optional, you can just not use persistent storage (use the S3 API or whatever) or declare persistentvolumes that are bound to a single or group of machines (shared NFS mount or whatever).

I don't know how GP thinks you could run without the other bits though. You do need kubelet and a container runtime.


kubelet isn't, but CNI technically is (or can be abstracted to minimum, I think old network support might have been removed from kubelet nowadays)


Because the root comment is mostly but not quite right: there is indeed a large subset of developers that aren't interested in thinking about infrastructure, but there are many subcategories of those people, and many of them aren't fly.io customers. A large number of people who are in that category aren't happy to let someone else handle their infra. They're not interested in infra in the sense that they don't believe it should be more complicated than "start process on Linux box and set up firewall and log rotation".

For some applications these people are absolutely right, but they've persuaded themselves that that means it's the best way to handle all use cases, which makes them see Kubernetes as way more complex than is necessary, rather than as a roll-your-own ECS for those who would otherwise truly need a cloud provider.


Feels like software engineers are talking past each other a lot about these topics.

I assume everyone wants to be in control of their environment. But with so many ways to compose your infra that means a lot of different things for different people.


I use k8s, wouldn't call it simple, but there are ways to minimize the complexity of your setup. Mostly, what devs see as complexity is that k8s packages a lot of system fundamentals, like networking, storage, name resolution, distributed architectures, etc., and if you mainly spent your career in a single lane, k8s becomes impossible to grasp. Not saying those devs are wrong, not everyone needs to be a networking pro.

K8s is meant to be operated by some class of engineers, and used by another. Just like you have DBAs, sysadmins, etc, maybe your devops should have more system experience besides terraform.


"Kubernetes is etcd, apiserver, and controllers....Sure it's a lot of YAML but I don't understand the rep."

Sir, I upvoted you for your wonderful sense of humour.


I consider a '60+ node' Kubernetes cluster very small. Kubernetes at that scale is genuinely excellent! At 6000, 60000, and 600000 nodes it becomes very different and goes from 'Hey, this is pretty great' to 'What have I done?' The maintenance costs of running more than a hundred clusters are incredibly nontrivial, especially as a lot of folks end up taking something open-source and thinking they can definitely do a lot better (you can.... there's a lot of "but"s there though).


OK but the alternative if you think Kubernetes is too much magic when you want to operate hundreds of clusters with tens of thousands of nodes is?

Some bash and Ansible and EC2? That is usually what Kubernetes haters suggest one does to simplify.


At a certain scale, let's say 100k+ nodes, you magically run into 'it depends.' It can be kubernetes! It can be bash, ansible, and ec2! It can be a custom-built vm scheduler built on libvirt! It can be a monster fleet of Windows hyper-v hosts! Heck, you could even use Mesos, Docker Swarm, Hashicorp Nomad, et al.

The main pain point I personally see is that everyone goes 'just use Kubernetes' and this is an answer, however it is not the answer. It steamrolling all conversations leads to a lot of the frustration around it in my view.


Hashicorp Nomad, Docker Swarm, Apache Mesos, AWS ECS?

I love that the Kubernetes lovers tend to forget that Kubernetes is just one tool, and they believe that the only possible alternative to this coolness is sweaty sysadmins writing bash scripts in a dark room.


I’m absolutely not a Kubernetes lover. Bash and Ansible etc. is just a very common suggestion from haters.

I thought Mesos was kinda dead nowadays, good to hear it’s still kicking. Last time I used it the networking was a bit annoying, not able to provide virtual network interfaces but only ports.

It seems like if you are going to operate these things, picking a solution with a huge community and in active development feels like the smart thing to do.

Nomad is very nice to use from a developer perspective, and it’s nice to hear infrastructure people preferring it. From outside, the reason people pick Kubernetes seems to be the level of control infra and security teams want over things like networking and disk.


Can you describe who a Kubernetes hater is? Or show me an example. It's easy to stigmatise someone as a Kubernetes lover or hater. Then use it to invalidate their arguments.

I would argue against Kubernetes in particular situations, and even recommend Ansible in some cases, where it is a better fit in the given circumstances. Do you consider me a Kubernetes hater?

Point is, Kubernetes is a great tool. In particular situations. Ansible is a great tool. In particular situations. Even bash is a great tool. In particular situations. But Kubernetes could even be the worst tool if you choose unwisely. And Kubernetes is not the ultimate infrastructure tool. There are alternatives, and there will be new ones.


HashiCorp Nomad?


The wheels fall off Kubernetes at around 10k nodes. One of the main limitations is etcd, in my experience. Google recently fixed this problem by making Spanner offer an etcd-compatible API: https://cloud.google.com/blog/products/containers-kubernetes...

Etcd is truly a horrible data store, even the creator thinks so.


At that point you probably need a cluster of k8s clusters, no?

For anyone unfamiliar with this the "official limits" are here, and as of 1.32 it's 5000 nodes, max 300k containers, etc.

https://kubernetes.io/docs/setup/best-practices/cluster-larg...


Yes this is what I'm referring to. :)

Maintaining a lot of clusters is super different than maintaining one cluster.

Also please don't actually try to get near those limits, your etcd cluster will be very sad unless you're _very_ careful (think few deployments, few services, few namespaces, no using etcd events, etc).


Hey fellow k8s+ceph on bare metaler! We only have a 13 machine rack and 350 TB of raw storage. No major issues with Ceph after 16.x and all NVMe storage though.


Genuinely curious about what sort of business stores and processes 14 PB on a 60 node cluster.


Research institution.

The department saw more need for storage than Kubernetes compute so that's what we're growing. Nowadays you can get storage machines with 1 PB in them.


Yeah, that's an interesting question, because it sounds like a ton of data vs not enough compute, but, aside from this all being in a SAN or large storage array:

The larger Supermicro or Quanta storage servers can easily handle 36 HDDs each, or even more.

So with just 16 of those with 36x24TB disks, that meets the ~14PB capacity mark, leaving 44 remaining nodes for other compute tasks, load balancing, NVMe clusters, etc.
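
Spelling out that back-of-the-envelope math as a quick sketch (decimal units, raw capacity only; usable capacity would be lower once replication or erasure coding is taken into account, and the actual hardware mix in that cluster is an assumption here):

```python
# Figures taken from the estimate above; variable names are illustrative.
nodes_total = 60
storage_nodes = 16
drives_per_node = 36
tb_per_drive = 24

raw_tb = storage_nodes * drives_per_node * tb_per_drive
raw_pb = raw_tb / 1000  # decimal petabytes
remaining_nodes = nodes_total - storage_nodes

print(raw_tb)           # 13824
print(raw_pb)           # 13.824
print(remaining_nodes)  # 44
```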


We have boxes with up to 45 drives yes.


Yeah, I'm sure there are tricky details as in anything but the core idea doesn't sound that complicated to me. I've been looking into it a bit after seeing this fun video a while ago where a DOS BBS is run on Kubernetes.

https://youtu.be/wLVHXn79l8M?si=U2FexAMKd3zQVA82



