This situation gets absolutely awful when you consider that the Bronze and Silver Xeons downclock even more aggressively. On Bronze, clock speed plummets even if only one core is executing AVX instructions.
Compilers can't hope to realistically handle this. JITs at least have a chance, but adding in handling for this behaviour surely requires a lot more complexity than I'd imagine most runtime developers would want to add to their code.
Since Bronze and Silver parts largely have only one AVX-512 FP unit, running AVX-512 under the L2 license is almost totally pointless: you'd often be better off issuing twice as many AVX/AVX2 instructions on the two 256-bit units, since the FLOP/cycle is the same and you run at a higher frequency.
The exception would be if your kernel can make some good use of other wide instructions such as memory access or shuffles.
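To put rough numbers on the FLOP/cycle argument, here is a back-of-the-envelope sketch. The unit counts follow the comment above (one 512-bit FP unit versus two 256-bit units); the clock speeds are made-up placeholders, not measured values for any real SKU, and the function names are just for illustration.

```python
def peak_flops_per_cycle(fma_units: int, vector_bits: int, element_bits: int = 64) -> int:
    """Peak FLOP/cycle for one core, counting each FMA as 2 FLOPs (multiply + add)."""
    lanes = vector_bits // element_bits  # elements processed per instruction
    return fma_units * lanes * 2


def peak_gflops(fma_units: int, vector_bits: int, freq_ghz: float, element_bits: int = 64) -> float:
    """Peak GFLOP/s for one core at the given sustained clock."""
    return peak_flops_per_cycle(fma_units, vector_bits, element_bits) * freq_ghz


# ASSUMPTION: hypothetical sustained clocks, chosen only to show the shape of
# the trade-off. Look up the real per-license frequencies for your CPU.
AVX2_GHZ = 2.0    # clock while running 256-bit code
AVX512_GHZ = 1.6  # clock while running 512-bit code (L2 license)

avx2_gflops = peak_gflops(fma_units=2, vector_bits=256, freq_ghz=AVX2_GHZ)
avx512_gflops = peak_gflops(fma_units=1, vector_bits=512, freq_ghz=AVX512_GHZ)

print(f"AVX2:    {peak_flops_per_cycle(2, 256)} FLOP/cycle, {avx2_gflops:.1f} GFLOP/s")
print(f"AVX-512: {peak_flops_per_cycle(1, 512)} FLOP/cycle, {avx512_gflops:.1f} GFLOP/s")
```

Both configurations come out at the same double-precision FLOP/cycle, so under these assumptions whichever one clocks higher wins on peak throughput. This only models FP throughput; it says nothing about the wide loads/stores or shuffles mentioned above, which is exactly where the exception would come from.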