Server chips tend to be sold with a lower base clock than their consumer equivalents. If you underclock your consumer CPU to whatever the server equivalent currently runs at, you aren't going to see much throttling under any workload. But you're also going to be the guy who has a 2.2 GHz processor when all your friends claim to have a 5 GHz processor.
In my experience, the limiting factor tends to be power delivery. I had a 6950X that I overclocked, and I could get it to hard-crash consistently just by starting an AVX task. My filesystems did not appreciate that! (I eventually spent a ton of time debugging why that happened, and it was just that the power supply couldn't keep the 12V rail at 12V. Turns out that Intel knows what sort of equipment is out on the market, and designed their chips accordingly. I did upgrade the power supply (1200W Corsair -> 850W Seasonic Prime) and got stable AVX without downclocking. But the whole experience killed overclocking for me. It just isn't worth the certainty that your computer will crash for no good reason at random times.)
Server chips have the same license-based limitations. In fact, the limitations first appeared on those chips and in some generations only on those chips.
The term "license" comes from Intel terminology; I didn't invent it.
It has nothing to do with legal/software licensing. Rather, it refers to the chip having permission (a "license") from the power management unit and voltage regulator to run certain types of instructions that might place a large amount of stress on the power delivery components.
Server chips are sometimes already downclocked for stability, or are binned for their ability to handle more heat/voltage/clock, and so may have more headroom in some cases.
Doing computation produces heat, and doing more computation at once produces more heat. If you build a CPU that does 512 bits of math at a time and get it to its max clock, you'll be able to do math 32 bits at a time at a higher clock, because it runs a bit cooler. It is just physics.
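To put rough numbers on the "just physics" point, here is a minimal sketch using the usual first-order model for CMOS dynamic power, P ≈ C·V²·f. The model is a textbook approximation rather than anything from this thread, and every value below (power budget, switched capacitance, voltage) is made up for illustration. It ignores leakage and the fact that voltage usually has to rise with frequency.

```python
def max_frequency_hz(power_budget_w: float,
                     switched_capacitance_f: float,
                     voltage_v: float) -> float:
    """Highest clock that fits a power budget under P = C * V^2 * f.

    All inputs are hypothetical; real chips also have leakage power and
    a voltage/frequency curve, which this sketch deliberately ignores.
    """
    return power_budget_w / (switched_capacitance_f * voltage_v ** 2)


# Hypothetical core: 35 W budget at 1.0 V.
NARROW_CAPACITANCE_F = 1.0e-8  # assumed: only narrow execution units switching
WIDE_CAPACITANCE_F = 2.0e-8    # assumed: wide vector units switch twice as much

narrow_hz = max_frequency_hz(35.0, NARROW_CAPACITANCE_F, 1.0)
wide_hz = max_frequency_hz(35.0, WIDE_CAPACITANCE_F, 1.0)
```

With these made-up inputs the narrow case lands around 3.5 GHz and the wide case at half that. Real chips lose far less than half, but the direction is the same: more transistors switching per cycle means fewer cycles fit in the same power envelope.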
https://stackoverflow.com/a/56861355
https://travisdowns.github.io/blog/2020/01/17/avxfreq1.html
https://lemire.me/blog/2018/09/07/avx-512-when-and-how-to-us...
Excerpted from the Stack Overflow answer:
There are three frequency levels, so-called licenses, from fastest to slowest: L0, L1 and L2. L0 is the "nominal" speed you'll see written on the box: when the chip says "3.5 GHz turbo", they are referring to the single-core L0 turbo. L1 is a lower speed sometimes called AVX turbo or AVX2 turbo, originally associated with AVX and AVX2 instructions. L2 is a lower speed than L1, sometimes called "AVX-512 turbo".
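A rough way to picture the three licenses is as a lookup from "what kind of instructions is this core running" to a turbo ceiling. The sketch below is a simplification under stated assumptions: the GHz values are invented, the names `required_license` and `core_license` are hypothetical, and the light/heavy split (heavy meaning roughly floating-point and multiply-type work) follows the usual description of these chips rather than any official table. Real hardware also needs a sustained density of heavy instructions before it switches, and takes time to switch back.

```python
from enum import IntEnum


class License(IntEnum):
    L0 = 0  # fastest: the "on the box" turbo
    L1 = 1  # slower: often called AVX/AVX2 turbo
    L2 = 2  # slowest: often called AVX-512 turbo


# Hypothetical single-core turbo ceilings in GHz (not from any datasheet).
TURBO_GHZ = {License.L0: 3.5, License.L1: 3.1, License.L2: 2.8}


def required_license(vector_bits: int, heavy: bool) -> License:
    """Simplified mapping from one instruction class to a license."""
    if vector_bits >= 512:
        return License.L2 if heavy else License.L1
    if vector_bits == 256 and heavy:
        return License.L1
    return License.L0  # scalar, 128-bit, and light 256-bit work


def core_license(instruction_mix) -> License:
    """The core is held to the slowest license any instruction class needs."""
    return max(required_license(bits, heavy) for bits, heavy in instruction_mix)


# Mostly scalar code with some heavy 512-bit math drags the core to L2.
mix = [(64, False), (256, False), (512, True)]
ceiling_ghz = TURBO_GHZ[core_license(mix)]
```

The point of taking the `max` is that the penalty applies to the whole core, so a small amount of heavy wide-vector code can slow down the scalar code running alongside it.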