That's not what Go does, though. Go looks at the population of the CPU mask at startup and never looks again, which is problematic in K8s, where the visible CPUs may change while your process runs.
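For illustration, here's a minimal sketch of that difference (Linux-only, leaning on golang.org/x/sys/unix, which is not something stock Go does for you): runtime.NumCPU reports the count sampled at startup, while the affinity mask can be re-queried at any time and may have drifted since.

    package main

    import (
        "fmt"
        "runtime"

        "golang.org/x/sys/unix"
    )

    func main() {
        // runtime.NumCPU is sampled once at process startup and cached.
        fmt.Println("NumCPU at startup:", runtime.NumCPU())
        fmt.Println("GOMAXPROCS:       ", runtime.GOMAXPROCS(0))

        // The affinity mask, by contrast, can be re-read at any time and may
        // have changed underneath the process since startup.
        var set unix.CPUSet
        if err := unix.SchedGetaffinity(0, &set); err != nil {
            panic(err)
        }
        fmt.Println("CPUs in the affinity mask right now:", set.Count())
    }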
We use https://github.com/uber-go/automaxprocs after we joyfully discovered that Go assumed any particular pod had the entire node's CPU count. Made for some very strange performance characteristics in goroutine scheduling.
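For anyone who hasn't used it, the wiring is the blank-import pattern from the project's README: at init time it reads the container's CPU quota and sets GOMAXPROCS to match.

    package main

    import (
        "fmt"
        "runtime"

        _ "go.uber.org/automaxprocs" // adjusts GOMAXPROCS from the cgroup CPU quota at init
    )

    func main() {
        // With a pod CPU limit of 2, this prints 2 rather than the node's CPU count.
        fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))
    }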
My opinion is that setting GOMAXPROCS that way is quite a poor idea. It tends to strand resources that could have been used to handle a stochastic burst of requests, and with a capped GOMAXPROCS that burst is converted directly into latency. I can think of no good reason why GOMAXPROCS needs to be 2 just because you expect the long-term CPU rate to be 2. That long-term quota is an artifact of capacity planning, while GOMAXPROCS is an artifact of process architecture.
If you have a pod with Burstable QoS, perhaps because it has a request but not a limit, its CPU mask will be populated by every CPU on the box, less one for the Kubelet and other node services, less all the CPUs requested by pods with Guaranteed QoS. Pods with Guaranteed QoS will have exactly the number of CPUs they asked for, no more and no less, and consequently their GOMAXPROCS is consistent. Everyone else will see more or fewer CPUs as Guaranteed pods arrive on and depart from the node.
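If you wanted a Burstable pod to track that churn, one purely hypothetical option is a background loop that re-reads the affinity mask and nudges GOMAXPROCS when the set of visible CPUs changes; nothing in stock Go or automaxprocs does this for you (sketch below uses golang.org/x/sys/unix, Linux-only).

    package main

    import (
        "log"
        "runtime"
        "time"

        "golang.org/x/sys/unix"
    )

    // watchAffinity polls the process's CPU affinity mask and resizes
    // GOMAXPROCS when the number of visible CPUs changes.
    func watchAffinity(interval time.Duration) {
        for {
            var set unix.CPUSet
            if err := unix.SchedGetaffinity(0, &set); err == nil {
                if n := set.Count(); n > 0 && n != runtime.GOMAXPROCS(0) {
                    log.Printf("affinity mask changed, setting GOMAXPROCS=%d", n)
                    runtime.GOMAXPROCS(n)
                }
            }
            time.Sleep(interval)
        }
    }

    func main() {
        go watchAffinity(30 * time.Second)
        select {} // stand-in for the real server
    }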
Perhaps it is an artifact of our respective container runtimes. For me, taskset shows just 1 visible CPU in a Guaranteed QoS pod with limit=request=1.
Even way back in the day (1996) it was possible to hot-swap a CPU. Used to have this Sequent box, 96 Pentiums in there, 6 on a card. Could do some magic, pull the card and swap a new one in. Wild. And no processes died. Not sure if a process could lose a CPU then discover the new set.
It's not a bad way to guess, up to maybe 16 or so. Most Go server programs aren't going to just scale up forever, so having 188 threads might be a waste.
There's going to be a bunch of missing info, though, in some cases. For example, more and more systems have asymmetric cores. /proc/cpuinfo can expose that information in detail, including (current) clock speed, processor type, etc., while cpu_set is literally just a bitmask (if I read the man pages right) of the system cores your process is allowed to schedule on.
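As a rough illustration of the extra detail that only /proc/cpuinfo carries: just collecting the reported clock speed per logical CPU already separates the core types on an asymmetric part (assuming the kernel exposes a "cpu MHz" field for your architecture, which e.g. some ARM machines do not).

    package main

    import (
        "bufio"
        "fmt"
        "os"
        "strings"
    )

    func main() {
        f, err := os.Open("/proc/cpuinfo")
        if err != nil {
            panic(err)
        }
        defer f.Close()

        // Print the reported clock speed for each logical CPU.
        cpu := ""
        scanner := bufio.NewScanner(f)
        for scanner.Scan() {
            line := scanner.Text()
            switch {
            case strings.HasPrefix(line, "processor"):
                cpu = strings.TrimSpace(strings.SplitN(line, ":", 2)[1])
            case strings.HasPrefix(line, "cpu MHz"):
                mhz := strings.TrimSpace(strings.SplitN(line, ":", 2)[1])
                fmt.Printf("cpu %s: %s MHz\n", cpu, mhz)
            }
        }
    }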
Fundamentally, intelligent apps need to interrogate their environment to make concurrency decisions. But I agree: Go would probably work best if it just picked a standard parallelism constant like 16 and let users know it can be tuned if they have additional context.
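Something like this hypothetical init block (the constant 16 and the "respect an explicit override" convention are just the ideas from this thread, not anything Go actually does):

    package main

    import (
        "os"
        "runtime"
    )

    const defaultParallelism = 16

    func init() {
        // If the operator set GOMAXPROCS explicitly, they have context we don't;
        // the runtime has already honored it, so leave it alone.
        if os.Getenv("GOMAXPROCS") != "" {
            return
        }
        // Otherwise cap at a fixed constant rather than trusting NumCPU blindly.
        n := runtime.NumCPU()
        if n > defaultParallelism {
            n = defaultParallelism
        }
        runtime.GOMAXPROCS(n)
    }

    func main() {
        // ... the actual server ...
    }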
Yes, running on a set of heterogeneous CPUs presents further challenges, for the program and the thread scheduler. Happily there are no such systems in the cloud, yet.
Most people are running on systems where the CPU capacity varies and they haven't even noticed. For example in EC2 there are 8 victim CPUs that handle all the network interrupts, so if you have an instance type with 32 CPUs, you already have 24 that are faster than the others. Practically nobody even notices this effect.
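You can get a feel for this on your own instance by summing the per-CPU interrupt counts from /proc/interrupts. A rough sketch below filters on "ena" (the EC2 Elastic Network Adapter driver name, which is an assumption about your NIC); the CPUs with large totals are the ones absorbing the network interrupts.

    package main

    import (
        "bufio"
        "fmt"
        "os"
        "strconv"
        "strings"
    )

    func main() {
        f, err := os.Open("/proc/interrupts")
        if err != nil {
            panic(err)
        }
        defer f.Close()

        scanner := bufio.NewScanner(f)
        scanner.Scan()
        ncpu := len(strings.Fields(scanner.Text())) // header row: CPU0 CPU1 ...
        totals := make([]uint64, ncpu)

        // Sum per-CPU counts for IRQ lines belonging to the NIC driver.
        for scanner.Scan() {
            line := scanner.Text()
            if !strings.Contains(line, "ena") {
                continue
            }
            fields := strings.Fields(line)
            for i := 0; i < ncpu && i+1 < len(fields); i++ {
                n, err := strconv.ParseUint(fields[i+1], 10, 64)
                if err != nil {
                    break // reached the non-numeric tail of the line
                }
                totals[i] += n
            }
        }
        for cpu, n := range totals {
            fmt.Printf("CPU%-3d %d\n", cpu, n)
        }
    }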
> in EC2 there are 8 victim CPUs that handle all the network interrupts, so if you have an instance type with 32 CPUs, you already have 24 that are faster than the others
Fascinating. Could you share any (or all) of the detail you know about this? Is it specific instance types, only ones that use Nitro (or only ones without)? This might be related to a problem I've seen in the wild but never tracked down...