I keep seeing this opinion and I don't understand it. For various reasons, I rec...

docandrew · 2025-02-15T02:41:20 1739587280

“etcd, apiserver, and controllers.”

…and containerd and csi plugins and kubelet and cni plugins and kubectl and kube-proxy and ingresses and load balancers…

remram · 2025-02-15T04:00:00 1739592000

And system calls and filesystems and sockets and LVM and...

Sure at some point there are too many layers to count but I wouldn't say any of this is "Kubernetes". What people tend to be hung about is the difficulty of Kubernetes compared to `docker run` or `docker compose up`. That is what I am surprised about.

I never had any issue with kubelet, or kube-proxy, or CSI plugins, or CNI plugins. That is after years of running a multi-tenant cluster in a research institution. I think about those about as much as I think about ext4, runc, or GRUB.

ffsm8 · 2025-02-15T04:48:54 1739594934

But you just said that you had issues with ceph? How is that not a CSI problem?

And CNI problems are extremely normal. Pretty much anyone that didn't just use weavenet and called it a day has had to spend quiet a bit of time to figure it out. If you already know networking by heart it's obviously going to be easier, but few devs do.

remram · 2025-02-15T15:17:02 1739632622

Never had a problem with the CSI plugin, I had problems with the Ceph cluster itself. No, I wouldn't call Ceph part of Kubernetes.

You definitely can run Kubernetes without running Ceph or any storage system, and you already rely on a distributed storage system if you use the cloud whether you use Kubernetes or not. So I wouldn't count this as added complexity from Kubernetes.

ffsm8 · 2025-02-15T17:23:20 1739640200

I'm not sure I can agree with that interpretation. CSI is basically an interface that has to be implemented.

If you discount issues like that, you can safely say that it's impossible to have any issues with CSI, because it's always going to be with one of it's implementation.

That feels a little disingenuous, but maybe that's just me.

remram · 2025-02-15T20:18:31 1739650711

So if you run Kubernetes in the cloud, you consider the entire cloud provider's block storage implementation to be part of Kubernetes too?

For example you'd say AWS EBS is part of Kubernetes?

ffsm8 · 2025-02-15T21:13:21 1739654001

In the context of this discussion, which is about the complexity of the k8s stack: yes.

Youre ultimately gonna have to use a storage of some form unless you're just a stateless service/keep the services with state out of k8s. That's why I'd include it, and the fact that you can use multiple storage backends, each with their own challenges and pitfalls makes k8s indeed quiet complex.

You could argue that multinode PaaS is always going to be complex, and frankly- I'd agree with that. But that was kinda the original point. At least as far as I interpreted it: k8s is not simple and you most likely didn't need it either. But if you do need a distributed PaaS, then it's probably a good idea to use it. Doesn't change the fact that it's a complex system.

remram · 2025-02-15T21:34:05 1739655245

So you're comparing Kubernetes to what? Not running services at all? In that case I agree, you're going to have to set up Linux, find a storage solution, etc as part as your setup. Then write your app. It's a lot of work.

But would I say that your entire Linux installation and the cloud it runs on is part of Kubernetes? No.

lovich · 2025-02-16T08:25:50 1739694350

> So you're comparing Kubernetes to what? Not running services at all?

Surprisingly there were hosted services on the internet prior to kubernetes existing. Hell, I even have reason to believe that the internet may possibly predate Docker

remram · 2025-02-16T15:29:36 1739719776

That is my point! If you think "just using SystemD services in a VM" is easy but "Kubernetes is hard", and you say "Kubernetes is hard" is because of Linux, cgroups, cloud storage, mount namespaces, ... Then I can't comprehend that argument, because those are things that exist in both solutions.

Let's be clear on what we're comparing or we can't argue at all. Kubernetes is hard if you have never seen a computer before, I will happily concede that.

lovich · 2025-02-17T06:45:20 1739774720

ah I apologize for my snark then, I interpreted your sentence as _you_ believing that the only step simpler than using Kubernetes was to not have an application running

I see how you were asking the GP that question now

techpression · 2025-02-16T10:47:57 1739702877

Next you’re going to claim the internet existed before Google too.

ffsm8 · 2025-02-15T22:06:17 1739657177

Various options around for simple alternatives, the simplest is probably just running single node.

Maybe with fail over for high availability.

Even that's fine for most deployments that aren't social media sites, aren't developed by multiple teams of devs and don't have any operations people on payroll.

p_l · 2025-02-15T14:38:44 1739630324

Because CSI is just a way to connect a volume to a pod.

Ceph is its own cluster of kettles filled with fishes

freedomben · 2025-02-15T04:54:21 1739595261

Very fair, although with managed services which are increasingly available, you don't typically need to think about CSI or CNI.

cuu508 · 2025-02-15T06:59:33 1739602773

Hence

> Kubernetes is not the first thing that comes to mind when I think of "understanding where their code is running and what it's doing"...

remram · 2025-02-15T15:22:50 1739632970

CSI and CNI do about as much magic as `docker volume` and `docker network`.

People act like their web framework and SQL connection pooler and stuff are so simple, while Kubernetes is complex and totally inscrutable for mortals, and I don't get it. It has a couple of moving parts, but it is probably simpler overall than SystemD.

formerly_proven · 2025-02-15T17:11:51 1739639511

I was genuinely surprised that k8s turned out to actually be pretty straightforward and very sensible after years of never having anything to do with it and just hearing about it on the net. Turns out opinions are just like after all.

That being said, what people tend to build on top of that foundation is a somewhat different story.

DrFalkyn · 2025-02-15T18:15:06 1739643306

it’s not k8s. It’s distrusted systems

Unfortunately people (cough managers) think k8s is some magic that makes distrusted systems problems go away, and automagically enables unlimited scalability

In reality it just makes the mechanics a little easier and centralized

Getting distributed systems right is usually difficult

abustamam · 2025-02-15T17:30:32 1739640632

I asked chatgpt the other day to explain to me Kubernetes. I still don't understand it. Can you share with me what clicked with you, or resources that helped you?

amazingman · 2025-02-15T19:33:55 1739648035

Controller in charge of a specific type of object watches a database table representing the object type. Database table represents the desired state of things. When entries to the table are CRUD-ed, that represents a change to the desired state of things. Controller interacts with the larger system to bring the state of things into alignment with the new desired state of things.

"The larger system" is more controllers in charge of other object types, doing the same kind of work for its object types

There is an API implemented for CRUD-ing each object type. The API specification (model) represents something important to developers, like a group of containers (Pod), a load balancer with VIP (Service), a network volume (PersistentVolume), and so on.

Hand wave hand wave, Lego-style infrastructure.

None of the above is exactly correct (e.g. the DB is actually a k/v store), but it should be conceptually correct.

DrFalkyn · 2025-02-16T03:46:29 1739677589

Is there only a single controller ? What happens if goes down?

If multiple controllers, how do they coordinate ?

amazingman · 2025-02-17T09:07:28 1739783248

>Is there only a single controller ?

No, there are many controllers. Each is in charge of the object types it is in charge of.

>What happens if [it] goes down?

CRUD of the object types it manages have no effect until the controller returns to service.

>If multiple controllers, how do they coordinate ?

The database is the source of truth. If one controller needs to "coordinate" with another, it will CRUD entries of the object types those other controllers are responsible for. e.g. Deployments beget ReplicaSets beget Pods.

chronid · 2025-02-16T12:20:24 1739708424

The k/v store offers primitives to make that happen, but for non-critical controllers you don't want to deal with things like that they can go down and will be restarted (locally by kubelet/containerd) or rescheduled. Whatever resource they monitor will just not be touched until they get restarted.

x0054 · 2025-02-15T23:17:43 1739661463

What clicked with me is having ChatGPT go line by line through all of the YAML files generated for a simple web app—WordPress on Kubernetes. Doing that, I realized that Kubernetes basically takes a set of instructions on how to run your app and then follows them.

So, take an app like WordPress that you want to make “highly available.” Let’s imagine it’s a very popular blog or a newspaper website that needs to serve millions of pages a day. What would you do without Kubernetes?

Without Kubernetes, you would get yourself a cluster of, let’s say, four servers—one database server, two worker servers running PHP and Apache to handle the WordPress code, and finally, a front-end load balancer/static content host running Nginx (or similar) to take incoming traffic and route it to one of the two worker PHP servers. You would set up all of your servers, network them, install all dependencies, load your database with data, and you’d be ready to rock.

If all of a sudden an article goes viral and you get 10x your usual traffic, you may need to quickly bring online a few more worker PHP nodes. If this happens regularly, you might keep two extra nodes in reserve and spin them up when traffic hits certain limits or your worker nodes’ load exceeds a given threshold. You may even write some custom code to do that automatically. I’ve done all that in the pre-Kubernetes days. It’s not bad, honestly, but Kubernetes just solves a lot of these problems for you in an automated way. Think of it as a framework for your hosting infrastructure.

On Kubernetes, you would take the same WordPress app and split it into the same four functional blocks. Each would become a container. It can be a Docker container or a Containerd container—as long as it’s compatible with the Open Container Initiative, it doesn’t really matter. A container is just a set of files defining a lightweight Linux virtual machine. It’s lightweight because it shares its kernel with the underlying host it eventually runs on, so only the code you are actually running really loads into memory on the host server.

You don’t really care about the kernel your PHP runs on, do you? That’s the idea behind containers—each process runs in its own Linux virtual machine, but it’s relatively efficient because only the code you are actually running is loaded, while the rest is shared with the host. I called these things virtual machines, but in practice they are just jailed and isolated processes running on the host kernel. No actual hardware emulation takes place, which makes it very light on resources.

Just like you don’t care about the kernel your PHP runs on, you don’t really care about much else related to the Linux installation that surrounds your PHP interpreter and your code, as long as it’s secure and it works. To that end, the developer community has created a large set of container templates or images that you can use. For instance, there is a container specifically for running Apache and PHP—it only has those two things loaded and nothing else. So all you have to do is grab that container template, add your code and a few setting changes if needed, and you’re off to the races.

You can make those config changes and tell Kubernetes where to copy and place your code files using YAML files. And that’s really it. If you read the YAML files carefully, line by line, you’ll realize that they are nothing more than a highly specialized way of communicating the same type of instructions you would write to a deployment engineer in an email when telling them how to deploy your code.

It’s basically a set of instructions to take a specific container image, load code into it, apply given settings, spool it up, monitor the load on the cluster, and if the load is too high, add more nodes to the cluster using the same steps. If the load is too low, spool down some nodes to save money.

So, in theory, Kubernetes was supposed to replace an expensive deployment engineer. In practice, it simply shifted the work to an expensive Kubernetes engineer instead. The benefit is automation and the ability to leverage community-standard Linux templates that are (supposedly) secure from the start. The downside is that you are now running several layers of abstraction—all because Unix/Linux in the past had a very unhealthy disdain for statically linked code. Kubernetes is the price we pay for those bad decisions of the 1980s. But isn’t that just how the world works in general? We’re all suffering the consequences of the utter tragedy of the 1980s—but that’s a story for another day.

jrockway · 2025-02-15T18:44:09 1739645049

> People act like their web framework and SQL connection pooler and stuff are so simple

I'm just sitting here wondering why we need 100 billion transistors to move a piece of tape left and right ;)

ghaff · 2025-02-15T13:28:29 1739626109

Well, and the fact that in addition to Kubernetes itself, there are a gazillion adjacent products and options in the cloud-native space. Many/most of which a relatively simple setup may not need. But there's a lot of complexity.

But then there's always always a lot of complexity and abstraction. Certainly, most software people don't need to know everything about what a CPU is doing at the lowest levels.

igmor · 2025-02-15T03:39:08 1739590748

These components are very different in complexity and scope. Let's be real: a seasoned developer is mostly familiar with load balancers and ingress controllers, so this will be mostly about naming and context. I agree though once you learn about k8s it becomes less mysterious but that also means the author hasn't pushed it to the limits. Outages in the control plane could be pretty nasty and it is easy to have them by creating an illusion everything is kind of free in k8s.

nicoburns · 2025-02-15T03:47:55 1739591275

A really simple setup for many smaller organisations wouldn't have a load balancer at all.

darkwater · 2025-02-15T07:56:03 1739606163

No load balancer means... entering one node only? Doing DNS RR over all the nodes? If you don't have a load balancer in front, why are you even using Kubernetes? Deploy a single VM and call it a day!

I mean, in my homelab I do have Kubernetes and no LB in front, but it's a homelab for fun and learn K8s internals. But in a professional environment...

dilyevsky · 2025-02-15T17:15:34 1739639734

No code at all even - just use excel

zeroq · 2025-02-15T03:40:41 1739590841

typical how to program an owl:

step one: draw a circle

step two: import the rest of the owl

donutshop · 2025-02-16T07:17:30 1739690250

... and kubernetes networking, service mesh, secrets management

chronid · 2025-02-16T12:30:04 1739709004

You arent' forced to use service mesh and complex secrets management schemes. If you add them to the cluster is because you value what they offer you. It's the same thing as kubernetes itself - I'm not sure what people are complaining about, if you don't need what kubernetes offers, just don't use it.

Go back to good ol' corsync/pacemaker clusters with XML and custom scripts to migrate IPs and set up firewall rules (and if you have someone writing them for you, why don't you have people managing your k8s clusters?).

Or buy something from a cloud provider that "just works" and eventually go down in flames with their indian call centers doing their best but with limited access to engineering to understand why service X is misbehaving for you and trashing your customer's data. It's trade-offs all the way.

motorest · 2025-02-15T08:41:24 1739608884

> …and containerd and csi plugins and kubelet and cni plugins (...)

Do you understand you're referring to optional components and add-ons?

> and kubectl

You mean the command line interface that you optionally use if you choose to do so?

> and kube-proxy and ingresses and load balancers…

Do you understand you're referring to whole classes of applications you run on top of Kubernetes?

I get it that you're trying to make a mountain out of a mole hill. Just understand that you can't argue that something is complex by giving as your best examples a bunch of things that aren't really tied to it.

It's like trying to claim Windows is hard, and then your best example is showing a screenshot of AutoCAD.

allarm · 2025-02-15T11:21:18 1739618478

How’s kubelet and cni are “optional components”? What do you mean by that?

remram · 2025-02-15T15:30:01 1739633401

CNI is optional, you can have workloads bind ports on the host rather than use an overlay network (though CNI plugins and kube-proxy are extremely simple and reliable in my experience, they use VXLAN and iptables which are built into the kernel and that you already use in any organization who might run a cluster, or the basic building blocks of your cloud provider).

CSI is optional, you can just not use persistent storage (use the S3 API or whatever) or declare persistentvolumes that are bound to a single or group of machines (shared NFS mount or whatever).

I don't know how GP thinks you could run without the other bits though. You do need kubelet and a container runtime.

p_l · 2025-02-15T14:39:52 1739630392

kubelet isn't, but CNI technically is (or can be abstracted to minimum, I think old network support might have been removed from kubelet nowadays)

lolinder · 2025-02-15T02:30:06 1739586606

Because the root comment is mostly but not quite right: there are indeed a large subset of developers that aren't interested in thinking about infrastructure, but there are many subcategories of those people, and many of them aren't fly.io customers. A large number of people who are in that category aren't happy to let someone else handle their infra. They're not interested in infra in the sense that they don't believe it should be more complicated than "start process on Linux box and set up firewall and log rotation".

For some applications these people are absolutely right, but they've persuaded themselves that that means it's the best way to handle all use cases, which makes them see Kubernetes as way more complex than is necessary, rather than as a roll-your-own ECS for those who would otherwise truly need a cloud provider.

worldsayshi · 2025-02-15T03:40:05 1739590805

Feels like swe engineers are talking past each other a lot about these topics.

I assume everyone wants to be in control of their environment. But with so many ways to compose your infra that means a lot of different things for different people.

figassis · 2025-02-15T09:57:40 1739613460

I use k8s, wouldn't call it simple, but there are ways to minimize the complexity of your setup. Mostly, what devs see as complexity is k8s packages a lot of system fundamentals, like networking, storage, name resolution, distributed architectures, etc, and if you mainly spent your career in a single lane, k8s becomes impossible to grasp. Not saying those devs are wrong, not everyone needs to be a networking pro.

K8s is meant to be operated by some class of engineers, and used by another. Just like you have DBAs, sysadmins, etc, maybe your devops should have more system experience besides terraform.

lenkite · 2025-02-15T19:51:04 1739649064

"Kubernetes is etcd, apiserver, and controllers....Sure it's a lot of YAML but I don't understand the rep."

Sir, I upvoted you for your wonderful sense of humour.

chucky_z · 2025-02-15T17:00:14 1739638814

I consider a '60+ node' kubernetes cluster is very small. Kubernetes at that scale is genuinely excellent! At 6000, 60000, and 600000 nodes it becomes very different and goes from 'Hey, this is pretty great' to 'What have I done?' The maintenance costs of running more than a hundred clusters is incredibly nontrivial especially as a lot of folks end up taking something open-source and thinking they can definitely do a lot better (you can.... there's a lot of "but"s there though).

fmbb · 2025-02-15T17:10:01 1739639401

OK but the alternative if you think Kubernetes is too much magic when you want to operate hundreds of clusters with tens of thousands of nodes is?

Some bash and Ansible and EC2? That is usually what Kubernetes haters suggest one does to simplify.

chucky_z · 2025-02-16T17:07:11 1739725631

At a certain scale, let's say 100k+ nodes, you magically run into 'it depends.' It can be kubernetes! It can be bash, ansible, and ec2! It can be a custom-built vm scheduler built on libvirt! It can be a monster fleet of Windows hyper-v hosts! Heck, you could even use Mesos, Docker Swarm, Hashicorp Nomad, et al.

The main pain point I personally see is that everyone goes 'just use Kubernetes' and this is an answer, however it is not the answer. It steamrolling all conversations leads to a lot of the frustration around it in my view.

zsoltkacsandi · 2025-02-15T21:09:58 1739653798

Hashicorp Nomad, Docker Swarm, Apache Mesos, AWS ECS?

I love that the Kubernetes lovers tend to forget that Kubernetes is just one tool, and they believe that the only possible alternative to this coolness is that sweaty sysadmins writing bash scripts in a dark room.

fmbb · 2025-02-17T06:03:16 1739772196

I’m absolutely not a Kubernetes lover. Bash and Andible etc. is just a very common suggestion from haters.

I thought Mesos was kinda dead nowadays, good to hear it’s still kicking. Last time I used it it the networking was a bit annoying, not able to provide virtual network interfaces but only ports.

It seems like if you are going to operate these things, picking a solution with a huge community and in active development feels like the smart thing to do.

Nomad is very nice to use from a developer perspective, and it’s nice to hear infrastructure people preferring it. From outside the reason people pick Kubernetes seems to be the level of control of infra and security teams want over things like networking and disk.

zsoltkacsandi · 2025-02-17T10:23:13 1739787793

Can you describe who a Kubernetes hater is? Or show me an example. It's easy to stigmatise someone as a Kubernetes lover or hater. Then use it to invalidate their arguments.

I would argue against Kubernetes in particular situations, and even recommend Ansible in some cases, where it is a better fit in the given circumstances. Do you consider me as a Kubernetes hater?

Point is, Kubernetes is a great tool. In particular situations. Ansible is a great tool. In particular situations. Even bash is a great tool. In particular situations. But Kubernetes even could be the worst tool if you choose unwisely. And Kubernetes is not the ultimate infrastructure tool. There are alternatives, and there will be new ones.

ukuina · 2025-02-15T19:05:28 1739646328

HashiCorp Nomad?

__turbobrew__ · 2025-02-16T18:55:58 1739732158

The wheels fall off kubernetes at around 10k nodes. One of the main limitations is etcd from my experience, google recently fixed this problem by making spanner offer an etcd compatible API: https://cloud.google.com/blog/products/containers-kubernetes...

Etcd is truly a horrible data store, even the creator thinks so.

pas · 2025-02-15T21:04:05 1739653445

At that point you probably need a cluster of k8s clusters, no?

For anyone unfamiliar with this the "official limits" are here, and as of 1.32 it's 5000 nodes, max 300k containers, etc.

https://kubernetes.io/docs/setup/best-practices/cluster-larg...

chucky_z · 2025-02-16T17:09:06 1739725746

Yes this is what I'm referring too. :)

Maintaining a lot of clusters is super different than maintaining one cluster.

Also please don't actually try to get near those limits, your etcd cluster will be very sad unless you're _very_ careful (think few deployments, few services, few namespaces, no using etcd events, etc).

cullenking · 2025-02-15T16:13:21 1739636001

Hey fellow k8s+ceph on bare metaler! We only have a 13 machine rack and 350tb of raw storage. No major issues with ceph after 16.x and all nvme storage though.

spratzt · 2025-02-15T11:18:07 1739618287

Genuinely curious about what sort of business stores and processes 14 PB on a 60 node cluster.

remram · 2025-02-15T15:14:04 1739632444

Research institution.

The department saw more need for storage than Kubernetes compute so that's what we're growing. Nowadays you can get storage machines with 1 PB in them.

superq · 2025-02-15T13:55:18 1739627718

Yeah, that's an interesting question, because it sounds like a ton of data vs not enough compute, but, aside from this all being in a SAN or large storage array:

The larger Supermicro or Quanta storage servers can easily handle 36 HDD's each, or even more.

So with just 16 of those with 36x24TB disks, that meets the ~14PB capacity mark, leaving 44 remaining nodes for other compute task, load balancing, NVME clusters, etc.

remram · 2025-02-15T15:13:11 1739632391

We have boxes with up to 45 drives yes.

jimmaswell · 2025-02-15T02:23:19 1739586199

Yeah, I'm sure there are tricky details as in anything but the core idea doesn't sound that complicated to me. I've been looking into it a bit after seeing this fun video a while ago where a DOS BBS is ran on kubernetes.

https://youtu.be/wLVHXn79l8M?si=U2FexAMKd3zQVA82