
Has anyone properly solved the CPU throttling issues they are seeing with Kubernetes [1]? Does this release solve it? We are seeing a lot of throttling on every deployment, which impacts our latency, even when setting a really high CPU limit. The solutions seem to be:

- Remove the limit completely. Not a fan of this one, since we really don't want a service going over a given limit...

- Use the static CPU management policy [2]. Not a fan because some services don't need a "whole" CPU to run...

Does anyone have any other solutions? Thanks!

[1] https://github.com/kubernetes/kubernetes/issues/67577

[2] https://kubernetes.io/docs/tasks/administer-cluster/cpu-mana...
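
For reference, [2] boils down to a kubelet-level setting; a rough sketch of the relevant KubeletConfiguration fields (assuming the kubelet runs with a config file, field names from the linked docs):

  apiVersion: kubelet.config.k8s.io/v1beta1
  kind: KubeletConfiguration
  # "static" gives Guaranteed pods with integer CPU requests exclusive cores;
  # everything else stays in the shared pool.
  cpuManagerPolicy: static
  # the static policy requires some CPU to be reserved for the shared pool
  kubeReserved:
    cpu: 500m
  systemReserved:
    cpu: 500m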




It's a bug in the Linux kernel that was introduced in 4.18 and later fixed. You might be OK if you're on an older or a sufficiently new (patched) version.

The primary symptom seems to be heavy throttling even though you haven't actually reached your CPU limit yet.

If you're just seeing heavy throttling AND usage is pegged at the limit, you haven't necessarily hit this issue; raise the limit first and observe.

Also, don't forget to eliminate other possibilities. We initially thought we were experiencing this issue and later discovered the app was just memory-constrained, and extremely heavy GC during certain operations was causing the throttling.
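
If it helps, here's a rough way to tell the two cases apart in Prometheus (assumes the usual cAdvisor metrics plus kube-state-metrics; exact metric and label names vary by version, e.g. older kube-state-metrics exposes kube_pod_container_resource_limits_cpu_cores instead):

  groups:
    - name: cpu-throttling-checks        # hypothetical rule group
      rules:
        # fraction of CFS periods in which the container was throttled
        - record: namespace_pod_container:cpu_throttle_ratio:rate5m
          expr: |
            sum by (namespace, pod, container) (
              rate(container_cpu_cfs_throttled_periods_total{container!=""}[5m])
            )
            /
            sum by (namespace, pod, container) (
              rate(container_cpu_cfs_periods_total{container!=""}[5m])
            )
        # CPU actually used, as a fraction of the configured limit
        - record: namespace_pod_container:cpu_usage_vs_limit:rate5m
          expr: |
            sum by (namespace, pod, container) (
              rate(container_cpu_usage_seconds_total{container!=""}[5m])
            )
            /
            max by (namespace, pod, container) (
              kube_pod_container_resource_limits{resource="cpu"}
            )

If the throttle ratio is high while usage sits well below the limit, that's the pattern described above; if both are pegged, you're simply at the limit and should raise it first.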


1. It makes no sense to put a quota on your CPU, with the exception of very specific cases (like metered usage). You're just throwing away compute cycles.

2. The same applies to dedicated cores, for pretty much the same reasons.

Having said that, if you really, really want a quota but don't want shit tail latency, I suggest setting the CFS quota period to under 5ms via the kubelet flag (--cpu-cfs-quota-period).
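
Something like this in the kubelet config; cpuCFSQuotaPeriod is the config-file equivalent of the --cpu-cfs-quota-period flag, and on older kubelets it sits behind the CustomCPUCFSQuotaPeriod feature gate (a sketch, not a drop-in config):

  apiVersion: kubelet.config.k8s.io/v1beta1
  kind: KubeletConfiguration
  featureGates:
    CustomCPUCFSQuotaPeriod: true
  # default is 100ms; a shorter period means each burst is throttled for
  # less wall-clock time, at the cost of a bit more scheduler overhead
  cpuCFSQuotaPeriod: 5ms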


> It makes no sense to quota your cpu with the exception of very specific cases (like metered usage).

This is not true at all. Autoscaling depends on CPU quotas. More importantly, if you want to keep your application running well without getting chatty neighbors or getting your containers redeployed around for no apparent reason, you need to cover all resources with quotas.


Agree re noisy neighbours, but autoscaling depends on _requests_ rather than _limits_, so you could define requests for HPA scaling but leave out the limits and have both autoscaling and no throttling.
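
A minimal sketch of that combination (names and numbers are made up; the HPA's utilization target is a percentage of the CPU request, so no limit is needed):

  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: my-api                      # hypothetical
  spec:
    replicas: 2
    selector:
      matchLabels: {app: my-api}
    template:
      metadata:
        labels: {app: my-api}
      spec:
        containers:
          - name: app
            image: registry.example/my-api:1.0   # hypothetical
            resources:
              requests:
                cpu: 500m             # used for scheduling and for HPA math
                memory: 256Mi
              limits:
                memory: 256Mi         # keep the memory limit; only the CPU limit is omitted
  ---
  apiVersion: autoscaling/v2
  kind: HorizontalPodAutoscaler
  metadata:
    name: my-api
  spec:
    scaleTargetRef: {apiVersion: apps/v1, kind: Deployment, name: my-api}
    minReplicas: 2
    maxReplicas: 10
    metrics:
      - type: Resource
        resource:
          name: cpu
          target: {type: Utilization, averageUtilization: 80}   # 80% of the request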


The problem with having no throttling is that the system will just keep on running happily, until you get to the point where resources become more limited. You will not get any early feedback that your system is constantly underprovisioned. Try doing this on a multi-tenant cluster, where new pods spawned by other teams/people come and go constantly. You won't be able to get any reliable performance characteristics in environments like that.

For such clusters, it's necessary to set up stuff like the LimitRanger (https://kubernetes.io/docs/concepts/policy/limit-range/) to put a hard constant bound between requests and limits.
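
A sketch of what that can look like for CPU (values made up; maxLimitRequestRatio is the piece that bounds limits relative to requests, and the defaults catch containers that set nothing at all):

  apiVersion: v1
  kind: LimitRange
  metadata:
    name: cpu-bounds          # hypothetical
    namespace: team-a         # hypothetical
  spec:
    limits:
      - type: Container
        defaultRequest:
          cpu: 250m           # applied when a container sets no request
        default:
          cpu: 500m           # applied when a container sets no limit
        maxLimitRequestRatio:
          cpu: "4"            # a container's limit may be at most 4x its request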


And how will you get feedback on being throttled, other than shit randomly failing, e.g. connection timeouts?


Effective monitoring. Prometheus is free and open source. There are other paid options.


That was a trick question, actually: use your Prometheus stack to alert on latency-sensitive workloads with usage over request, and ignore everything else.
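
Roughly, as a sketch (assumes kube-state-metrics; the alert name and the namespace filter are placeholders, scope it to whatever you consider latency-sensitive):

  groups:
    - name: usage-over-request          # hypothetical
      rules:
        - alert: CPUUsageOverRequest    # hypothetical
          expr: |
            sum by (namespace, pod, container) (
              rate(container_cpu_usage_seconds_total{container!="", namespace=~"payments|api"}[5m])
            )
            >
            max by (namespace, pod, container) (
              kube_pod_container_resource_requests{resource="cpu", namespace=~"payments|api"}
            )
          for: 15m
          labels:
            severity: warning
          annotations:
            summary: "{{ $labels.namespace }}/{{ $labels.pod }} is using more CPU than it requested"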


Of course, but you're missing the point. Depending on your application, a little throttling doesn't hurt, and it can save other applications running on the same nodes that DO matter.

In the meantime, you can monitor the rate of throttling and the ratio of CPU usage to limit. Nothing stops you from doing this while also monitoring response latency.

On the other hand, a CPU request DOES potentially leave unused CPU cycles on the table, since it's a reservation on the node whether you're using it or not.

Again, needs may vary.


You got it completely backwards. A request doesn't leave CPU unused, as it's just cpu.shares; a limit does, being a CFS quota that completely prevents your process from being scheduled even if nothing else is using cycles. Don't believe me? Here's one of the Kubernetes founders saying the same thing: https://www.reddit.com/r/kubernetes/comments/all1vg/comment/...
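
Concretely, with cgroup v1 the kubelet derives cpu.shares from the request and the CFS quota from the limit. A sketch with a made-up pod (the arithmetic is request_millicores * 1024 / 1000 for shares, and limit_millicores * period / 1000 for the quota):

  apiVersion: v1
  kind: Pod
  metadata:
    name: cgroup-mapping-demo              # hypothetical
  spec:
    containers:
      - name: app
        image: registry.example/app:1.0    # hypothetical
        resources:
          requests:
            cpu: 250m   # -> cpu.shares ~ 256: a relative weight that only
                        #    matters when the node is actually contended
          limits:
            cpu: 500m   # -> cpu.cfs_quota_us = 50000 with cfs_period_us = 100000:
                        #    a hard cap, throttled after 50ms of CPU per 100ms period
                        #    even if the rest of the node is idle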


Incorrect. If a node has 2 cores and the pods on it have requests totaling 2000m, nothing else will schedule on that node even if total actual usage is 0.

You can overprovision limit.

This is easy to test for yourself.


> Agree re noisy neighbours, but autoscaling depends on _requests_ rather than _limits_, so you could define requests for HPA scaling but leave out the limits and have both autoscaling and no throttling.

I've just checked the Kubernetes docs and I have to say you are absolutely correct. Resource limits are used to enforce resource quotas, but not for autoscaling.


Autoscaling depends on requests, not limits. Read my explanation on "chatty neighbors" in the other thread.


Thanks! Wouldn't that be an issue if a pod "takes over" the node, if for some reason a request uses too much CPU?


Not really. If you ensure every pod sets a CPU request (which sets up cgroup cpu.shares), your kubelet and system services run in separate top-level cgroups (--kube-reserved and --system-reserved flags), and you have reserved enforcement enabled, then on full node contention every container will just consume its proportional share. This is not to say that someone malicious wouldn't be able to DoS a node, but untrusted workloads are a whole separate topic.
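
For reference, the kubelet side of that looks roughly like the following (a sketch; the cgroup paths are examples and have to exist already if you enforce the reserved slices):

  apiVersion: kubelet.config.k8s.io/v1beta1
  kind: KubeletConfiguration
  # carve CPU/memory out of node allocatable for the k8s daemons and the OS
  kubeReserved:
    cpu: 500m
    memory: 512Mi
  systemReserved:
    cpu: 500m
    memory: 512Mi
  # top-level cgroups the reservations are enforced against
  kubeReservedCgroup: /kube.slice          # example
  systemReservedCgroup: /system.slice      # example
  # "pods" is enforced by default; the other two entries turn on enforcement
  # for the kubelet/system cgroups as well
  enforceNodeAllocatable:
    - pods
    - kube-reserved
    - system-reserved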



