Canonical introduces high-availability Micro-Kubernetes (zdnet.com)
210 points by sandGorgon on Oct 15, 2020 | 117 comments



This is very interesting: we're seeing a lot of Kubernetes "flavors" coming out that remove the etcd requirement. It's no secret that etcd is a key part of why Kubernetes is complex--scaling, securing, and recovering etcd is really hard.

I think Kubernetes would do well to make swapping out the storage backend possible without having all of these forks. Kubernetes is too tightly coupled to etcd, and for little benefit. I would wager a lot of customers would trade the guarantees etcd provides for a simpler deployment.
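
For context on how tight that coupling is: the stock kube-apiserver's storage configuration only speaks etcd. A sketch of the relevant flags (endpoints and cert paths are hypothetical):

    # The stock kube-apiserver can only be pointed at etcd endpoints;
    # there is no flag for an alternative datastore.
    kube-apiserver \
      --etcd-servers=https://10.0.0.1:2379,https://10.0.0.2:2379,https://10.0.0.3:2379 \
      --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt \
      --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt \
      --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key \
      --storage-backend=etcd3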


To some extent, the distributed consensus is kind of the important part. If you have a bunch of components that don't know what state they're supposed to be in, you don't really have a cluster. Looking at the marketing documentation (I didn't read the code), they just wrote their own consensus and datastore instead of using etcd. So the underlying fundamental computer science problems still exist, but nobody has been burned by this particular implementation yet. That doesn't inspire much confidence for me; if etcd blows up, at least someone else's cluster has blown up in the same way before. Now if it happens to you, you get to be the first person to debug it. Fun.

In general, I'm not blown away with Canonical's track record. I used microk8s in its early days and it kind of blew up my coworker's computer. Networking stopped working. Making a request to localhost:5000 would return an nginx error page, even though nginx wasn't running anywhere on that machine or the network. It even persisted itself through reboots! We just reinstalled the machine eventually. It was weird stuff and I haven't touched it again. (I prefer kind locally and k3s for small "real" clusters.) Then there's Snap, major breaking changes to Ubuntu for no reason, etc. I just don't trust Canonical much, and I'm not quite ready to take a leap of faith on their new distributed database. But maybe it's great, and we'll all be using this in a year. A stopped clock is right twice a day.


> I just don't trust Canonical much...

They lost me when I tried their Ubuntu Server one lazy day and was greeted with a Landscape advertisement in the MOTD.

Then I installed Armbian's Ubuntu version because the Debian version was not ready, and found out that the MOTD was downloaded from the web every time I logged in.

Add in analytics (now opt-in), forced snaps, and their silent-ish efforts to monopolize the landscape, and I avoid them altogether.


I feel an urge to note that ubuntu server is a decent operating system.

We use it everywhere in my current place of employment and are very satisfied with it.

(maybe it's because we deploy servers using pxe — it makes good ubuntu installs, with no ads in motd & no snapd)


Ubuntu Server is decent, but Debian is better IMHO. I've been using stable for servers and testing for desktops for 15+ years (eh, wow, time flies).

Deploying servers with PXE is fun though. We use PXE and XCAT to manage our fleet. Commissioning ~200 servers in 15 minutes with three commands while sipping coffee is really satisfying.


I work in this space (being purposefully vague): you are getting a customized image of some sort. PXE really isn't related here; it's just a method to get the image onto the destination disk. The official Canonical Ubuntu Cloud Images produced for this use have the MOTD behaviour and snapd in place, based on my exposure to them. https://cloud-images.ubuntu.com/


we are using this thing — http://archive.ubuntu.com/ubuntu/ubuntu/dists/focal/main/ins... , it is loaded with PXE.

It doesn't put a "customized image of some sort" on servers; it literally iterates through the installer steps with the answers provided by a preseed file.

You might say that it is customized since we wrote a preseed file. Well maybe, but it's not a "customized image". Also we didn't write any specific code to remove unwanted packages. It just doesn't install them. No weird packages like byobu :)
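
For anyone unfamiliar with preseeding, a minimal sketch of what such a file might look like (values are hypothetical); the pkgsel/include line is the only place extra packages come from:

    # Minimal preseed sketch (hypothetical values): the netboot installer reads
    # answers like these instead of prompting, and only installs what you list.
    cat > preseed.cfg <<'EOF'
    d-i debian-installer/locale string en_US.UTF-8
    d-i netcfg/get_hostname string unassigned-hostname
    d-i partman-auto/method string lvm
    d-i pkgsel/include string openssh-server
    d-i pkgsel/upgrade select full-upgrade
    EOF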


Nod. I'll go out on a limb and say "I think you may be somewhat in a minority in 2020", as many in my travels have moved over to an image-based deployment methodology. It's not absolute, and of course I've not been in every company; that's just the landscape in general. I wasn't aware Canonical had an "online PXE" flavour that you linked, TIL.

I say all this above, but I'm sure there are still shops out there burning ISOs to CDs (maybe USB now!) and hand-installing everything. If I've learned anything, it's that nobody seems to agree on how to do stuff the same way. :) Each company is its own beautiful unique snowflake, full of its own design and deployment patterns.


You may be in for a shock using Ubuntu Server 20.04 and later as it has switched to using cloud-init. Preseed files no longer work.


You should be able to use the "Legacy" (their new words) ISO installer. I'd be interested in your results if you get a chance to followup here. http://cdimage.ubuntu.com/ubuntu-legacy-server/releases/20.0...


The new format does seem much nicer to my eyes, but I suppose tastes vary. A pain to have to change, but it looks like good work, and a new release usually requires some preseed hackery anyway.


They do work. Old installer is marked as legacy but still works.


Same story with IBM OpenShift and their telemetry.


As the person who has worked on making sure telemetry is useful and valuable for users and customers (helping drive insight into quality of kube and close the loop on fixing persistent issues), can you provide some more details about how we’ve let you down? We tried to be as responsible as possible but obviously we’ve failed along the way - what can be done to improve it?


Make the telemetry opt-in to begin with.

The comment I am replying to is complaining about Ubuntu calling home, IBM OpenShift is the same story.


That's what they call "opinionated default configuration"; then they tie subscription management in with telemetry so that opting out of it isn't an option to begin with: https://docs.openshift.com/container-platform/4.1/telemetry/... Several further tie-ins and the system as a whole becomes barely usable. But then integration is the whole point of OpenShift, isn't it?


Wow, I didn't realize the 4.1 docs didn't get updated - newer doc versions are correct in that disconnected subscriptions are just manually entered via OCM. 4.1 didn't include support for disconnected clusters, and at the time the docs were correct. Newer versions are clearer.

https://docs.openshift.com/container-platform/4.5/support/re...

I will follow up on opt-out being the default for the evaluation version of OpenShift. It is already opt-in for OKD.


Not to be snarky, but that's easy. Just disconnect your cluster from the internet. We have hundreds of customers that are running disconnected clusters.


And that applies to Ubuntu just as well. All I'm saying is that IBM is no better than Canonical in this regard.

Not complaining - if you want me to complain about IBM cluster technology, that would be a whole different story.

And no I'm not gonna ask about disconnected clusters :).


Haha. I still work for Red Hat, and Consulting at that. My pay is determined by how many billable hours I put in and my customer reviews. Not directly by how much you buy.


I thought there was no Red Hat anymore.


Red Hat is a subsidiary of IBM, not acquired and absorbed. It is very much its own thing within the company as a whole.


I really thought it was a merger. Is Red Hat still a separate legal entity?


Yes, separate legal entity, combined financial entity (although I know little about the financial details).


Putting a single commercial, once, on the MOTD file is a sure way to tarnish the reputation of a whole distribution and its derivatives forever.


The worst thing about Linux are its users.


Care to elaborate?


It's more of a last straw than a sole reason, to be honest. There's a lot of stuff Canonical is doing which can be considered embrace, extend, and extinguish.

It's funny when a corporation uses the tactics of another corporation it wants to beat [0].

[0]: https://bugs.launchpad.net/ubuntu/+bug/1


It’s easy enough to configure motd to pull your own notices for a company, and useful too. Nothing there seems particularly weird to me in this day and age, and Ubuntu has a much better track record of actual security maintenance over the long term.


> Ubuntu has a much better track record of actual security maintenance over the long term.

Debian has had 24-hour security fixes for 15+ years. It also supports "oldstable" in terms of security & backports. For some time now it has also had "Long Term Support" teams which support older releases.

Debian is "the original" install-and-forget distro. Ubuntu has some commercial sauce over it but, unless you have a special need, Debian can handle everything you throw at it.

We have so many servers that we sometimes forget some of our background-service servers, and they hum along with all security updates applied.


Microk8s uses dqlite as a distributed database, which uses Raft for consensus. They have their own implementation called C-raft. So if their implementation is decent, it should be fairly comparable to etcd's Raft implementation.


If etcd blows up, your cluster should stay in the same state it was in before etcd blew up.


Presuming, of course, that your cluster is completely isolated from the outside world. Otherwise, the interaction of various controllers, deployment stacks, and application usage conspires to generate endless amounts of cluster state change requests, which can quickly pile up if etcd is unavailable or behaving erratically and cause all manner of havoc. Containers can fail to restart after crashing; jobs won't run; etc. Plus, in k8s etcd is typically used for more than just the barebones state required for basic node and pod status. Metrics, API requests, DNS, etc. often depend on etcd--the same etcd--for storage and communication, which can compound the effect of a flaky etcd and even instigate it.


> In general, I'm not blown away with Canonical's track record.

Yeah, they really wanna be like Apple, but end up being more like Google, starting a lot of projects and abandoning them in a couple of years.


You don't need distributed consensus to have a cluster of reliable services. We've had large-scale clusters of applications for what, 30 years? And distributed consensus has only been generally used, by a few players, for about 10 of those. Etcd is neat, but in no way mandatory for what it was even invented for. 99% of people just don't need it.


I don't know of any clustering system that works without some sort of leader. In many cases, the leader is you. If your computer dies, you can walk to another one and still deploy software. If you disappear, that's a Big Problem, however.

No modern clustering system is much different. Services aren't killed off if a leader can't be elected. It's just that having all your servers alive but uncontrollable is almost as scary as them being down.


Distributed consensus doesn't require a leader. The same process used to elect a leader (i.e. change leader state) can simply be used to change state generally. See, e.g., There Is More Consensus in Egalitarian Parliaments: https://www.cs.cmu.edu/~dga/papers/epaxos-sosp2013.pdf

In fact, for something like what etcd is used for in k8s, as the root and lynchpin of all state, but not direct application I/O (pods use their own data stores, and even for cluster management etcd usually just contains pointers to external data), a leaderless architecture makes the most sense. Protocols to choose a leader are latency and throughput optimizations, but the core state for a cluster (members, pod metadata, etc) shouldn't require very much state and therefore should be relatively low traffic as compared to most database applications. And as shown in the above paper, for certain scenarios (including, arguably, the k8s scenario) a leaderless architecture can have better latency and throughput, not to mention availability.


If you run etcd as a singleton you already get this behavior - consensus is short-circuited and you only have to pay the cost to write to durable storage (which is already heavily batched). And you need durable writes so you can crash-recover (kube has a gaping hole today in that a restore from an earlier point in time breaks many controllers until a reconcile is performed).
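
For concreteness, that restore-from-an-earlier-point scenario is typically hit via etcd's snapshot workflow; a minimal sketch (endpoint and paths are placeholders):

    # Take a point-in-time snapshot of a (possibly single-member) etcd,
    # then restore it into a fresh data directory.
    ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 snapshot save /backup/etcd-snap.db
    ETCDCTL_API=3 etcdctl snapshot restore /backup/etcd-snap.db --data-dir=/var/lib/etcd-restored

Restoring an older snapshot rolls the cluster's desired state back in time, which is exactly what trips up controllers until they reconcile.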

Note that Kubernetes requires a total ordering of writes (which simplifies how hard it is for us puny humans to reason about) AND requires strong consistency in order to provide guarantees like "this pod only runs on one machine at a time" and "PVs aren't released until the pods are really stopped". Leaderless is a simple tradeoff - singleton or three instances. That's the best possible choice in the world, and it's etcd and its relative simplicity that make it possible.

I’ve never seen a production Kube system with HA etcd go down due to non-human error, so I don’t believe single instance is going to give you better availability when single machine faults happen (they happen frequently; about 0.5-1% of the machines in the OpenShift fleet - cloud and on-premise - are down at any one time due to power, software, or hardware issues). Almost all of those clusters tick along fine when they lose that machine.


> I’ve never seen a production Kube system with HA etcd go down due to non-human error

I have, with OpenShift, several times. The worst one took hours to get it partially back online, days to fully recover. At least I got to have one free drink before I had to leave our holiday party to start fixing it. (Ok, technically it was non-prod; luckily prod was ECS... long story)

> I don’t believe single instance is going to give you better availability when single machine faults happen

Depends on your definition and context of availability. It is possibly the simplest thing to destroy any existing nodes (and in the event that a node can't be destroyed, null-route it and/or disable its network port) and bring up a new node. So single instance is fairly easy to recover. But with multiple nodes, for each function of each instance that is required to reach consensus, the likelihood of consensus failure increases; you actually need more nodes and variety in the cluster to resist consensus failure. Yet ironically, the more nodes you have, the higher the probability of failure. In the end, the real-world reliability of the system is based on additional factors besides the network model.


The database in question, dqlite, is Raft-around-SQLite, and that's just about the most widely used SQL database anywhere, albeit pretty thin. Recovering files on disk won't be an issue with dqlite. It would be interesting to see a Jepsen analysis of dqlite to assess resilience in the face of trouble.


fwiw I think k3s also has an option to use canonical's dqlite.

Edit: they deprecated the option https://rancher.com/docs/k3s/latest/en/installation/ha-embed...


Bummer. Would love to know why though.


A long wished for feature, but closed wontfix a long time ago: https://github.com/kubernetes/kubernetes/issues/1957 :-(


That's a pretty soft wontfix though. Might be reopenable at this point.


We talked about it recently, and certainly people want to do this, but I don’t see it happening anytime soon in core. Downstreams are free to carry patches though and many do.

It’s more that the core team doesn’t have the time to support these, and we already have a fairly tight contract with storage, and we still find edge cases that need to be fixed, so the extra cost for the committers to support this is high.


For a single node, sure, but the moment you have some set of jobs requiring a multi-node setup, your willingness to find yourself in a situation where the cluster has essentially bricked itself due to a momentary power outage drops to zero really quickly.

The only place etcd might not pay off is in disposable dev environments or something, but do you really want your prod setup to page you, only for you to discover that a complete cluster rebuild is necessary to resolve the problem?


There are various levels of outages you may encounter with an etcd outage, but it shouldn't take your workloads completely offline. The cluster API will be offline, but the workloads will keep on chugging. I think with the decoupling of etcd and the expectation that this might happen, we'd see more improvements on how to (gracefully) handle these situations.

Also, there was a bug with the kube API where it wouldn't fail over to other etcd members during an outage. So I would say most customers have run Kubernetes with a "single backend" at some point.


If K8S can be made turn-key, it will become possible for some deployments to realize massive cost savings by deploying across clusters of bare metal machines from hosts like OVH, packet.net, datapacket.com, and so on. The cost of bandwidth and processing power is just so much lower.

The only remaining headache would be the database. Setting up reliable mission-critical HA Postgres is kind of a pain, and it's nice to not have to worry about it. If that can also be canned, then this kind of rig would be a really compelling alternative to the managed clouds... unless you need other things like S3, etc.


The one thing I wish upstream K8s would implement is embedded etcd so that the operational complexity becomes a bit more manageable. Luckily k3s has that :)


K8s does have a well-defined storage abstraction that wraps the operations it performs.

Surprisingly, etcd is neither scalable nor high-performance in our small cluster with <10 nodes on GKE. It has been the single most frequent source of prod outages.

But it appears that the k8s community doesn't sanction a list of compatible storage engines that can plug and play with k8s. Or I might have missed some recent development.


k3s recently (v1.19) deprecated built-in dqlite support. (it was replaced with built-in etcd.)

though with kine (kine is not etcd) you can easily use mysql/sqlite/dqlite/postgresql.
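
For example, the kine-backed external datastore is selected with a single flag; the connection string below is a placeholder:

    # k3s with an external SQL datastore via kine instead of embedded etcd/dqlite.
    # The MySQL credentials/host here are examples only.
    k3s server \
      --datastore-endpoint="mysql://k3s:secret@tcp(10.0.0.5:3306)/kubernetes"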


https://github.com/rancher/kine provides a translation layer for etcd that supports various SQL flavors. It is currently used only by k3s afaik


Microk8s just works, and it's a big reason that I am running an Ubuntu derivative rather than RHEL or CentOS for my homelab. It's got the right level of feature enablement with the addons, and supports the whole set out of the box.

etcd is a nice piece of kit, but I am still disappointed that other options haven't displaced it, such as Consul. That said, I am not sure I am ready to trust a new distributed store for this. It's hard to get that right.
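
For reference, the HA story in the article boils down to a couple of commands; a rough sketch (the address and token below are placeholders for whatever add-node prints):

    # On the first node: print a join command/token for a new node.
    microk8s add-node
    # On each additional node: paste the join command that add-node printed,
    # e.g. (address and token are placeholders):
    microk8s join 10.0.0.1:25000/<token>
    # With three or more nodes, the dqlite datastore becomes highly available.
    microk8s status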


I am all for this trend. Microk8s and k3s are both a joy to develop with (though I have run into a few bugs with k3s, so I tend to prefer Microk8s).

How many folks are running self-managed k8s in production though, out of curiosity?

It seems so economical to deploy k3s or use something like Rancher on dirt cheap VPS's from some place like Hetzner -- but what's the ops burden and failure risk like?

Never tried it myself because it seems intimidating. I use managed k8s services.


My employer has over 2000 clusters running k3s (I'm on the team that manages them); it's been... okay. We're on an old version though, so a lot of the k3s-specific issues we're running into are mostly fixed in more recent releases. The issues we run into are more often network or power related than k3s itself. Sometimes it's the OS image the devices are running.


> My employer has over 2000 clusters running k3s

Oh wow.. Why so many clusters, and what kind of resources per cluster? Are you running 100 000s of physical servers?

Or are you just using k3s over three physical nodes as a way to achieve hardware redundancy and rolling hardware replacement?


We're providing a "smart kitchen" solution in restaurants.

Each restaurant has three Intel NUCs, so a little over 6K-7K devices. We're using k3s partly for the hardware redundancy, and partly for the ease of managing the services running at the edge - there's a local OAuth provider, MQTT server, along with some other applications that need to be up a majority of the time.

There's a cloud component to all of the software running at the edge as well, and since that's run on k8s, we wanted something similar but more lightweight at the edge.


Interesting. Do you happen to have a blog post or something discussing how you arrived at that architecture?

Do you frequently see a nuc dying, but the cluster staying up?


Yeah, there are some posts out there. I'll have to find them and edit them in.

Yes, in fact - we've seen clusters still running when two of the NUCs have disappeared from the cluster.


Maybe not the answer you are looking for, but for small and not that critical workloads, I've so far been happy with docker swarm workloads on cheap cloud/VPS.


We run our platform in different cities and they are all on-prem(custom server or managed hypervisor). After hearing horror stories about kubernetes on prem, we are happy that we decided to go with docker swarm.


K8s is the new platform, you're just delaying the inevitable. Get with the program or suffer when your lack of knowledge makes you deficient in the platform uptake.


However, being an early adopter is not usually wise. Better to wait until lots of other people have found the bugs and improved the documentation.


> (though I have run into a few bugs with k3s, so I tend to prefer Microk8s).

I tried both in my homelab and this is my experience as well. microk8s seems to have fewer bugs for me.


Nice to see. Always thought that the API server should just be part of the cluster since it's nothing more than a proxy service over the already distributed etcd datastore.


Anyone know how this compares to k3s? I’ve been using k3s for a while, and there have been a few bugs that made me a bit annoyed


This doesn't answer your question, but I'm piggy-backing with comments on k3s.

For me, using k3s for development (not prod), the killer feature is running it in --docker mode where the node uses the local dockerd to run containers (vs. managing its own containerd instance).

This allows building images locally with `docker build` and immediately using them in kubernetes pods _without_ first pushing to a (possibly local) image repository and having k3s pull the images.
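
Roughly, assuming an image tag of myapp:dev (hypothetical), the workflow looks like this:

    # Start the k3s server against the host's dockerd instead of the bundled containerd.
    k3s server --docker &

    # Build locally; the image is immediately visible to the kubelet, no push needed.
    docker build -t myapp:dev .
    kubectl run myapp --image=myapp:dev --image-pull-policy=Never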

Last time I investigated, none of kind, microk8s, or minikube supported this mode. For large images (gigabyte or more, and I've got a handful of these), it's very space-inefficient to have a copy in my docker and in a local registry _and_ in k3s's containerd at the same time. How is this problem typically solved?

(I note the kubernetes included in Docker Desktop on macOS works in the same way: images built with `docker build` are available to kubernetes without going through a registry.)


[I work on kind amongst other things...]

If you want to turn your host into a node you can do this with --vm-driver=none in minikube or better yet just use `kubeadm init` directly. In KIND we point people to the latter -- the main thing we're doing is running inside a disposable container "node" of which you can have many.

Assuming you don't actually want to turn your host machine into a node managed by Kubernetes, you'll want to stick Kubernetes in a VM or container. If the rest of Kubernetes is in this container or VM, it doesn't make sense to be running containers out on the host, things like mounting volumes won't work, you need a consistent filesystem between kubelet and the container runtime.

With kind it's also important that we simulate multi-node and multi-cluster, which is not viable with a single container runtime instance.

Without actually running Kubernetes against the host's runtime you can't share storage. The way docker desktop does this is to run Kubernetes with docker as the node's container runtime while only supporting a single node/cluster in a VM and expose the same runtime for building.

For our test workloads it's important to have different clean clusters constantly for different tests / projects.

KIND and microk8s have made a bet on containerd, as kubernetes is actively moving away from dockershim towards CRI, so even if we exposed a node runtime you can't build with it.

It's indeed space-inefficient, but it's a tradeoff in isolation between projects etc. For multi-node you're going to wind up with multiple copies anyhow, and a lot of projects we work with wind up needing some multi-node testing.


FYI minikube supports this via minikube docker-env

https://minikube.sigs.k8s.io/docs/handbook/pushing/#1-pushin...
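
That is, you point the local docker CLI at the daemon inside the minikube VM so builds land where the kubelet can see them (image tag hypothetical):

    # Point the local docker CLI at minikube's Docker daemon; subsequent builds
    # are immediately available to pods without a registry push.
    eval $(minikube docker-env)
    docker build -t myapp:dev .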


We're using img [1] with a helper script. The helper script runs a pod in the (potentially) remote Kubernetes cluster that takes the container build context as stdin, then runs img to build the image, and then imports it into the host's containerd instance. It also maps in the right host volumes so that the image build is cached between subsequent runs.

I was surprised to be unable to otherwise find a good local / remote container development workflow, but this was built to replace what we were previously doing, which is setting DOCKER_HOST to point to the remote (single-node) cluster's Docker daemon (over SSH), so that docker CLI commands would execute on that remote box.

In both cases, you'll still want to take steps to minimize the size of your container image build context, but the size of the images doesn't matter. I'm not sure if it'd fit your needs or not, though.

[1] https://github.com/genuinetools/img
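
For anyone curious, the previous DOCKER_HOST-over-SSH workflow mentioned above is just an environment variable (host and tag are placeholders):

    # Run docker CLI commands against a remote daemon over SSH (Docker 18.09+).
    export DOCKER_HOST=ssh://user@remote-node
    docker build -t myapp:dev .   # the build executes on the remote box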


it's a bit more troublesome with microk8s https://microk8s.io/docs/registry-images
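
The linked page amounts to exporting the image from docker and importing it into microk8s' containerd; roughly (image name is a placeholder):

    # Export a locally built image and import it into microk8s' containerd.
    docker save myapp:dev > myapp.tar
    microk8s ctr image import myapp.tar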


A couple of years ago minikube supported this with --vm-driver=none



The first thing I think of with microk8s is their available addons, which are fairly comprehensive, opinionated, "just works" solutions for microk8s. https://microk8s.io/docs/addons#heading--list

As a k3s user (for local dev & my very small personal prod envs), I end up having to assemble a lot of these solutions myself. I prefer my own picks, my own solutions, and learning, but microk8s having these instantly available would be really good for a lot of folks. Until now, though, it never felt like those advantages would be useful in a real prod environment, as if microk8s was not interested in being in prod; HA signals to me that they are interested in broader adoption.


I'm running k3s in production. K3s has hooks to set up Prometheus, autoscaling (for spot instances), etc.

I don't see all of these in microk8s. I'm not sure if this is on the roadmap.

I'm also not sure how customisable microk8s is. We run k3s with the haproxy ingress (which is not the default) and Calico for networking (again, not the default).


microk8s has a Prometheus add-on that you can enable with one command: https://microk8s.io/docs/addons

I'm using it and it's been great so far.
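
For reference, the command is just (dns shown alongside as a common companion addon):

    # Addons are toggled with a single command.
    microk8s enable dns prometheus
    microk8s status   # shows which addons are enabled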


Sadly you need to run with kube-proxy and can't use Calico's own sauce. But I already raised an issue for that; if I end up using it more instead of kubespray (which I will deprecate soon), then I might fix it on my own time.


Several commenters here say they have fewer bugs with MicroK8s, which makes sense since it's less of a modification than k3s. It's slightly bigger, since it keeps the API and worker processes as separate binaries, albeit in a single package.


It uses distributed sqlite (dqlite) vs etcd/RDBMS (postgresql and friends). It's also not part of the CNCF like k3s is (yet?).


dqlite is an RDBMS; it's just SQLite + Raft.


k3s swaps etcd for kine, which is an app that implements enough of the etcd API to support kubernetes. It uses dqlite under the hood, but you can use a different database if you want (though it's not clear to me if you still get Raft if you do so).

It looks like Canonical is just using dqlite directly.


Same question, but with minikube.


I think minikube is for local usage. I use K3S for deploying an actual production cluster, so I’m more curious about how micro-k8s acts on that scale


> I use K3S for deploying an actual production cluster

You mean when you want to run a small cluster of let's say less than 10 nodes (anything in single digits)?

Why doesn't normal k8s work this way? Like, same tech but just less scale?

(I should probably read more on the side myself as well; I'm new to this k8s world.)


It does. Upstream kubeadm - authentic kubernetes - can even run on a single node. Not sure why people choose k3s or microk8s when you can just as easily deploy the real thing.
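
A single-node kubeadm cluster is indeed only a couple of commands, plus removing the control-plane taint so ordinary workloads can schedule there; a sketch (the pod CIDR is a placeholder, and a CNI plugin still needs to be installed separately):

    # Bring up a single-node cluster with upstream kubeadm, then allow regular
    # workloads on the control-plane node by removing the master taint.
    kubeadm init --pod-network-cidr=10.244.0.0/16
    kubectl taint nodes --all node-role.kubernetes.io/master-
    # A CNI plugin still has to be installed via its own manifest.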


For smaller needs, K3s runs fast/stable on servers with 1-2 GB of RAM, whereas K8s proper tends to be a little shaky until you go to 2-4 GB, minimum.


I hear this a lot but have any actual, tangible comparisons been done with k3s vs kubeadm arm resource use?


That's exactly what I am confused about when only reading these things in isolation.


kubeadm is very bare bones though; it gives you a running node but you are still responsible for configuring network and storage providers, an ingress controller and probably a load balancer.


Yep, it's a small self-hosted cluster on my own server hardware (VMs on Proxmox). I'm sure k8s would work, but k3s made it really easy for me to get going, so now it just has inertia.


I see. I assumed that, logically, the Kubernetes community would want to keep things simple and keep it the same across scales. For large scale there could be different variations (like switching the storage backend), but for the small/low-scale tier, keep it dead simple.

I am guessing there were valid reasons for the offshoots (k3s, microk8s etc.).


Is there a rundown on how much memory these Kubernetes flavors use?

I run Docker Swarm at home because my cluster is older, decommissioned machines. The Swarm daemon uses about 50MB of resident memory, which leaves a lot of room to run containers with little overhead.


In my experience the memory usage from the underlying processes is pretty negligible, and then you also need the controller/scheduler pods running.

It doesn't use a ton, but in order of least-to-most lightweight IME it's Minikube -> Microk8s -> k3s

But again, the overhead is marginal so it's not a world of difference.

If you can run Swarm you can definitely run one of the lightweight k8s distributions.


Some Microsoft teams also dislike k8s complexity for .NET backend development, hence Project Tye.

https://devblogs.microsoft.com/aspnet/introducing-project-ty...


Would this be a good candidate for a Docker Swarm replacement? Right now I have about 100 containers on-prem and need to find something to replace Swarm.


It's what we switched to from Swarm and it works great on anything larger than the equivalent of a t3.micro instance.


How is this different from multi-node k3s (aside from minor differences in CLI and, obviously, different development teams)?


MicroK8s is great. The only issue I have with it is I have to use snap - things get needlessly complicated and failure prone.


Have they pulled it out of snap yet? Not being able to pin or control when upgrades happen is a nonstarter for us.


Just pin to a channel.
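
That is, install from (or switch to) a versioned track so refreshes stay within it; the channel name below is an example:

    # Install MicroK8s pinned to a specific Kubernetes track; refreshes stay
    # within that channel rather than jumping to the latest release.
    sudo snap install microk8s --classic --channel=1.19/stable
    # Or move an existing install onto a pinned channel:
    sudo snap refresh microk8s --channel=1.19/stable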


If they update the channel and it causes me a production problem, no.


Just defer updates and do them when it suits you


writing one application was hard so instead let's assume writing lots of tiny applications will be easier (???) but isolating them is hard so let's put them in containers but running containers is complicated so we need orchestration managers like kubernetes or something but running those things is too hard for someone whose job isn't to run kubernetes for a living so microk8s or minikube or k3s or whatever? that's the gist of the industry? that's really what we're doing these days?


No, that's not what we're doing at all. I agree that "microservices" is mostly useless hype but running applications reliably and efficiently is an ongoing challenge which Kubernetes solves pretty well.

Of course it does shift the complexity into the underlying K8S, so installing and operating it can be difficult, but there are many ways to avoid that as a user. Overall you gain much more in productivity and usability, which is why it's taken off so much.


> Running applications reliably and efficiently is an ongoing challenge

Is it? I can take a program written for Windows 95 and run it on Windows 7 (maybe even newer) just fine, and it will run reliably and efficiently and integrate better than containers.

It is a problem only on Linux because the userspace ABI keeps breaking.

Notice that the containers run on the same Linux kernel and not in VMs. Why? Because "we do not break userspace!".


How is that related? Environments and languages change. That's entirely different to running programs with zero-downtime deployments, load-balancing and traffic management, health monitoring, logging and observability, secret and config management, storage volumes, security roles, and much more.

What is your replacement for all that?


> How is that related?

I thought we were talking about running applications reliably. Why is a complete Linux userspace bundled separately into each container?

> Environments and languages change.

Yes.

> That's entirely different to running programs with zero-downtime deployments, load-balancing and traffic management, health monitoring, logging and observability, secret and config management, storage volumes, security roles, and much more.

That's a lot of new requirements in addition to "running applications reliably". Most applications simply do not need that. And I believe this is the point of the original comment you replied to.

- zero-downtime deployments -> not needed for most applications (for example, Twitter outages are not a big deal either). Btw, how do you do zero-downtime deployment of (websocket) streams with Kubernetes? ;)

- load-balancing and traffic management -> in standard k8s you are pushing all traffic through one active LB (nginx) anyway => strip most of the extra layers and you don't even need that LB

- health monitoring, logging etc. -> you can use an existing solution that provides only the functionality you need; most of the work will be in your app anyway (every application needs different metrics ..)

The argument is that most applications do not need to scale at this level (until you need anycast DNS returning per-node IPs, or at least GeoDNS, you are not scaling that much anyway) and can be implemented in a simpler manner, hence easier and cheaper to maintain, audit and secure.

I do not want security roles, storage volumes, config management, or observability; I want my application to reliably and quickly serve my customers and be easy to maintain and debug.

If you want to discuss how to design architecture for scalable applications which keep all state in distributed databases but based on a platform with stable ABI that would surely be an interesting debate as well.


It's a container and can be as thin or fat as you want with its contents. You don't need to include Linux inside if you don't want to. I've built containers with nothing more than a few native executables. It's just a packaging format, but easier to build and deploy than other formats like tarballs.

If you don't need K8S then don't use it. What's the problem? Run your app on your server and ignore everything else.

But most of these features have nothing to do with scale and are more about usability, reliability and consistency. Sure you can do it yourself but that's less efficient than just letting K8S do it all in one standardized way and interface.

> "I want my application to reliably and quickly serve my customers and be easy to maintain and debug."

That's what K8S helps with. I've spent 10 years running large distributed applications handling billions of requests per day in multiple regions. I don't care about the ABI and don't see why that's relevant, but I do know that K8S has made many things easier in actually running these apps.


> If you don't need K8S then don't use it. What's the problem? Run your app on your server and ignore everything else.

See the original comment you replied to; he is clearly complaining about the whole infrastructure and the solutions getting too complex for very little benefit. I just elaborated on that point because there is some truth to it. It is not the case for your scenario handling billions of requests per day in multiple regions - that's where it makes a lot of sense to use k8s! But very few applications need that.

> It's a container and can be as thin or fat as you want with it's contents.

But you can't rely on the platform, except for the kernel, because the Linux kernel ABI is stable; hence why the containers are done in this way. I am not complaining about it, I am explaining the reasoning. Now imagine if you could rely on and share more services provided by the platform than just the kernel ;).

> I don't care about the ABI and don't see why that's relevant

Fair enough, but then I don't understand why you replied to my comment saying the containers are designed in this way because of the unstable userspace ABI if you don't care about this.

> "I want my application to reliably and quickly serve my customers and be easy to maintain and debug."

>> That's what K8S helps with

For certain solutions, absolutely! For other solutions a simple stateful application is simpler and easier to maintain and debug (again, that's how I read the first comment in this thread).


What are you talking about? The Linux ABI is very stable.


The Linux kernel ABI is stable as a rock.

To answer your question, I'm talking about the Linux userspace ABI: you can't rely on the ABI of essential libraries, OpenSSL for example. That's why Docker was born back then.


I have only gotten as far as setting up K3s and am still figuring out how I want to deploy code to it.

Gotta say, the docker ps output on those machines looks like line noise to me.

Not all movement is progress. Young developers are incentivized to embrace new technology because it levels a playing field where time with a tool is your most important asset. That playing field is not real, but a lot of managers seem to think it is, so the strategy works. I don't know how we sell a different version of reality there, but we need to figure it out.

At my first big job, the oldest developer told me shortly before he left that essentially we keep facing the same set of problems in a loop, and that if I watch for it I'll see it happening. That was in many ways a very big shoulder to stand on.

That challenge, added to historical information I had learned in a class on distributed computing, changed my perception of my first loop, like I'd found a shortcut. In my second loop, I found myself having productive conversations with people on their third loop, while my coworkers were still chirping on about how it's going to be different this time.

We just keep playing a game where the rules (like cost inequalities between resource classes) get tweaked every game, but quite often they revert back to the previous rules in the following game (because the people who make hardware for that resource finally figure out how to fix their bottleneck). But instead of recycling or democratizing the tech that worked last time the rules looked that way, we reinvent it badly with new names.

Elixir is a rare exception in this case, which is part of what attracts me to it. It's essentially recycling 25% of Erlang and 45% of Rails and being transparent about it, creating a new recipe out of old ingredients that have worked well in the previous 4 tech cycles.


Everybody who doesn't ends up writing thousands of lines of shell scripts/Ansible/Puppet/whatever to emulate a small version of Kubernetes.


Which is exactly what I did in my previous job before Kubernetes existed. When you're cobbling things together yourself, you can leave features out of the design of your deployment/orchestration system. That is a blessing and a curse. A lot of the time you just end up duct-taping that feature on later because "oh shit, it keeps breaking". Eventually your "simpler" system is just as complex a design, but the system has lots of major flaws:

* It doesn't work as well

* Nobody outside your company has heard of it

* Most of the code is once-off hacky scripts that nobody really maintains

* Breakages will occur when underlying dependencies get upgraded when you patch your OS.

I'll even wager that nobody inside your company completely understands how it works. And when it breaks, you are 100% on the hook for it. There's no chance of official documentation or training, and no point asking for help on Stack Overflow.


Well, I did this as well; that's why I'm pro-k8s. Heck, setting up k8s and then deploying with it is certainly 1000 times easier than my Ansible script was. Worse, my Ansible script had a small downtime; k3s does not have that. The problem is that a lot of people try to solve a problem with k8s that k8s does not solve. k8s is mostly a deployment scheduler with application configuration and an API for secrets. That's it.

It comes with an API for load balancers and ingress, but it does not have one built in; you can schedule these things with it, though. So basically even your low-level primitives can be scheduled with the same API.


You are missing the business opportunities to do conference talks, write blog posts, best practices books, and sell consultancy services on how to sort out those problems.

That is why we are stuck in a fashion industry nowadays.


Not to be confused with https://github.com/micro/micro



