
> How does systemd handle a network partition?

The same way K8s does... it doesn't. K8s management nodes just stop scheduling anything until the worker nodes are back. When they're back, the job controller picks up exactly where it left off, monitoring the jobs and correcting any problems.
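That "correcting any problems" step is just the standard controller reconcile loop: compare desired state to observed state and emit whatever actions close the gap. A toy sketch of the idea (all names here are hypothetical, not the real Kubernetes API):

```python
# Toy reconcile loop in the spirit of a K8s job controller.
# Compares desired state to observed state and returns corrective
# actions; running it again after a partition heals converges the
# cluster back to the desired state.

def reconcile(desired_replicas: int, running_pods: list[str]) -> list[str]:
    """Return the actions needed to converge observed state to desired."""
    actions = []
    # Too few pods running: schedule replacements.
    for i in range(len(running_pods), desired_replicas):
        actions.append(f"start pod-{i}")
    # Too many pods running: stop the extras.
    for pod in running_pods[desired_replicas:]:
        actions.append(f"stop {pod}")
    return actions

# After an outage left only one of three pods alive:
print(reconcile(3, ["pod-0"]))  # -> ['start pod-1', 'start pod-2']
```

The controller doesn't need to know *why* state drifted (partition, crash, manual deletion); rerunning the same loop handles all of them.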

> How does systemd mount storage

Systemd has a Mount unit configuration that manages system mounts.
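For instance, a mount unit file looks like this (device and mount point are made up for illustration; the file name must match the mount path, so `/mnt/data` is managed by `mnt-data.mount`):

```ini
# /etc/systemd/system/mnt-data.mount  (hypothetical example)
[Unit]
Description=Data volume

[Mount]
What=/dev/sdb1
Where=/mnt/data
Type=ext4
Options=defaults

[Install]
WantedBy=multi-user.target
```

systemd then handles ordering (e.g. starting services only after the mount succeeds) the same way it does for any other unit dependency.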

> or direct network connections to the right nodes

The same way K8s does... use some other load balancer. Most people point a cloud load balancer at an Ingress controller (e.g. Nginx) or at a NodePort. The former is effectively an Nginx service listening on ports 80/443 and load-balancing to whichever nodes the service is running on, which you can replicate by having services push a dynamic-config update to Nginx when they start.
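The Nginx side of that setup is just an upstream pool; a minimal sketch (addresses and ports are assumptions, and in practice the `upstream` block would be regenerated and reloaded when services register):

```nginx
# Hypothetical upstream pool, rewritten whenever a service
# instance starts or stops and nginx is reloaded.
upstream app_backends {
    server 10.0.0.11:8080;  # node A (assumed address)
    server 10.0.0.12:8080;  # node B (assumed address)
}

server {
    listen 80;
    location / {
        proxy_pass http://app_backends;
    }
}
```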

> How do you migrate physical hosts without downtime?

The same way K8s does... stop the services on the node, remove the node, add a new node, start the services there. The load balancer keeps sending connections to the other nodes until services are responding on the new one.



> The same way K8s does... it doesn't. K8s management nodes just stop scheduling anything until the worker nodes are back

This is not entirely accurate. K8s makes it possible to structure your cluster in such a way that it can tolerate certain kinds of network partitions. In particular, as long as:

* there is a majority of etcd nodes that can talk to each other

* there is at least one instance of each of the other important daemons (e.g. the scheduler) that can talk to the etcd quorum

then the control plane can keep running. So the cluster administrator controls the level of fault tolerance by deciding how many instances of those services to run, and where. For instance, if you spread 3 etcd members and 2 schedulers across different racks, the cluster can continue scheduling new pods even if an entire rack goes down.
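The arithmetic behind "a majority of etcd nodes" is the standard Raft majority rule, which is why etcd clusters are sized at odd numbers:

```python
# Quorum arithmetic for an etcd cluster (standard Raft majority rule,
# not K8s-specific code).

def quorum(members: int) -> int:
    """Votes needed for a majority of `members` etcd nodes."""
    return members // 2 + 1

def tolerated_failures(members: int) -> int:
    """Nodes that can be lost while a majority remains reachable."""
    return members - quorum(members)

for n in (1, 3, 5):
    print(f"{n} members: quorum={quorum(n)}, "
          f"tolerates {tolerated_failures(n)} failure(s)")
```

Note that going from 3 to 4 members raises the quorum from 2 to 3 without tolerating any additional failures, which is why even-sized clusters buy you nothing.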

If you assign the responsibility for your cluster to a single "parent" node, you're inherently introducing a point of failure at that node. To avoid that point of failure, you have to offload the state to a replicated data store -- which is exactly what K8s does, and which leads to many of the other design decisions that people call "complicated".



