You’d worry about 100% CPU because even if the OS were successfully optimizing for throughput (something Linux is very good at), latency/p99 is certain to suffer as spare cycles disappear.
That’s not a concern with typical GPU workloads, which are batch/throughput-oriented.
For low-latency applications, which can be anything from your computer's UI to a webserver, latency starts to climb quickly at around 80% utilization. If it's CPU-intensive work on your own computer, you can run the CPU hog at a lower priority. If it's a webserver, you're forced to trade off throughput against latency.
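A rough way to see why the knee shows up around 80% is basic queueing theory. Here's a minimal Python sketch, assuming an M/M/1 queue and a made-up 1 ms service time (both are illustrative assumptions, not measurements):

    # M/M/1 rule of thumb: mean response time W = S / (1 - rho),
    # where S is the per-request service time and rho is utilization.
    service_time_ms = 1.0  # assumed service time, purely illustrative

    for rho in (0.5, 0.8, 0.9, 0.95, 0.99):
        mean_response_ms = service_time_ms / (1 - rho)
        print(f"utilization {rho:.0%}: mean response {mean_response_ms:.1f} ms")

    # utilization 50%: mean response 2.0 ms
    # utilization 80%: mean response 5.0 ms
    # utilization 90%: mean response 10.0 ms
    # utilization 95%: mean response 20.0 ms
    # utilization 99%: mean response 100.0 ms

Going from 80% to 90% doubles the mean response time, and tail latency degrades faster still.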
Surely you're referring to occasional full load, not 24/7 load? Or 100% usage on some but not all cores? A CPU at 100% usually means an unresponsive system and things crashing.
Your point is taken, but if things are crashing because your CPU is running at 100%, you either have an Intel CPU or some other hardware problem. There should be no issue running a CPU at 100%, 24/7, indefinitely.
If the load meter ever reads "100%", that means you were at 100% long enough to cause problems; it's averaging over a period much longer than a millisecond. How big those problems are, and whether you want to pay money to avoid them, depends on your use case, but they exist even before you hit 100%, and even if it's only briefly that high.
Peaking at 90% over the monitoring interval does not mean you can fit 10% more load without compromises. It does not mean your CPU is oversized.
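Here's a toy illustration of why the averaged reading undersells the problem. It assumes the meter averages 1 ms busy/idle samples over one second; the numbers are invented:

    # A meter that averages over 1 s can read 90% even though the CPU
    # was fully saturated for 900 ms straight.
    samples_per_second = 1000           # one busy/idle sample per ms (assumed)
    busy = [1] * 900 + [0] * 100        # saturated 900 ms, idle 100 ms

    meter = sum(busy) / samples_per_second
    print(f"meter reads {meter:.0%}")   # meter reads 90%

    # Any work arriving during the saturated stretch queues up; the
    # averaged meter never shows that.
    runs = "".join(map(str, busy)).split("0")
    print(f"longest saturated stretch: {max(len(r) for r in runs)} ms")

A 90% reading is consistent with anything from a smooth 90% load to long stretches of outright saturation, and only the first of those leaves real headroom.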
You're seriously telling people that a CPU working at 90% is processing faster than a CPU working at 100%?
Context switching, memory access, how parallelizable the problem is: sure, those matter. But CPU usage as the defining metric? Seriously, stop smoking that stuff.
> You're seriously telling people that a CPU working at 90% is processing faster than a CPU working at 100%?
If your CPU sits at 100% for several seconds then your task is almost certainly bottlenecked by the CPU part of the time. Therefore, if you use a faster CPU your task will get done faster.
So keep everything else the same: same hardware, same workload, same CPU design, changing only the clock speed. The CPU that reads 90% on that workload must be the faster one. It spends less time as the bottleneck, so your task gets done sooner.
For the CPU upgrade not to matter, you'd need a task that is never bottlenecked by the CPU. That is very unlikely if your CPU reads 100% for several seconds.
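To put numbers on it, here's a minimal sketch with an invented workload split (6 s CPU-bound, 4 s waiting on disk/network; the split is an assumption, not data):

    # Amdahl-style arithmetic: speeding up the CPU only shrinks the
    # CPU-bound portion of the task.
    cpu_bound_s = 6.0   # assumed CPU-bound time
    other_s = 4.0       # assumed I/O-bound time, unaffected by CPU speed

    for cpu_speedup in (1.0, 1.25, 2.0):
        total_s = cpu_bound_s / cpu_speedup + other_s
        print(f"{cpu_speedup:.2f}x CPU -> task takes {total_s:.1f} s")

    # 1.00x CPU -> task takes 10.0 s
    # 1.25x CPU -> task takes 8.8 s
    # 2.00x CPU -> task takes 7.0 s

As long as the CPU-bound fraction is nonzero, which a sustained 100% reading strongly suggests, the faster CPU finishes sooner.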
Edit to reply:
> Ok, so your grand solution to this is - faster CPUs process faster than slower CPUs. Wow. Who would have thunk it.
You were the one saying that a fast CPU is "oversized". So I explained in detail why avoiding 100% does not make your CPU oversized.
Yes, it's obvious; glad you agree now.
> 100% of a CPU is faster than 90% of that same CPU.
It has more throughput, but for most types of software it now has worse latency.
If you care about latency, then you don't want to increase throughput by pegging your CPU at 100%. You want to increase throughput by getting more/better CPUs and keeping them at a lower percent.
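A minimal sketch of that trade, using the M/M/1 rule of thumb W = S / (1 - rho); the 1 ms service time and 95% load are assumed numbers, not measurements:

    service_time_ms = 1.0  # assumed per-request service time

    def mean_response_ms(utilization):
        # M/M/1 approximation: response time blows up as utilization -> 1
        return service_time_ms / (1 - utilization)

    # One CPU running hot vs. two CPUs splitting the same load.
    print(f"one CPU at 95%:         {mean_response_ms(0.95):.1f} ms")
    print(f"two CPUs at 47.5% each: {mean_response_ms(0.95 / 2):.1f} ms")

    # one CPU at 95%:         20.0 ms
    # two CPUs at 47.5% each: 1.9 ms

Total capacity doubled and mean latency dropped by an order of magnitude, which is exactly the "more CPUs at a lower percent" argument.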
Well, people are also pretty bad at logistical reasoning.
From a capital expenditure perspective, you're still effectively renting the CPU you bought, in the form of opportunity cost.
What people do have some sense of is that there's an ascribable value to having capability in reserve, versus discovering you don't have it when you need it.