Of course, the scheduler can easily be told to run both benchmark threads on the two hyperthreads of a given core.
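For anyone who wants to try that pinning, here is a rough sketch, assuming Linux (it relies on the sysfs `thread_siblings_list` file and on `os.sched_setaffinity`, neither of which exists on macOS or Windows). The helper names `parse_cpu_list`, `smt_siblings` and `pin_to_core_of` are made up for illustration; this is not taken from any particular benchmark harness.

```python
import os

def parse_cpu_list(text):
    """Parse a Linux CPU list such as "0,4" or "0-3,8" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus

def smt_siblings(cpu=0):
    """Return the logical CPUs that share a physical core with `cpu` (Linux only)."""
    path = f"/sys/devices/system/cpu/cpu{cpu}/topology/thread_siblings_list"
    with open(path) as f:
        return parse_cpu_list(f.read())

def pin_to_core_of(cpu=0):
    """Restrict the caller to the hyperthreads of one physical core.

    pid 0 means the caller; threads started afterwards inherit the mask,
    so two benchmark threads launched after this call share that core.
    """
    siblings = smt_siblings(cpu)
    os.sched_setaffinity(0, siblings)
    return siblings
```

Calling `pin_to_core_of(0)` before starting the two benchmark threads should keep both on the same physical core; on a machine without SMT the sibling set is just the one CPU.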
The argument to be made is that, because of the resources dedicated to SMT, comparing a single thread on AMD/Intel against a single thread on Apple does not measure the true potential performance of the whole core. In principle, a multithreaded workload spread over all available threads could be a better metric for whole-processor performance.
Fair enough.
But if you're comparing single threaded performance, there isn't a reason to split the workload into two threads for AMD/Intel and have it as a single thread for Apple.
That is what matters if I have a single-threaded application, or a single-threaded critical path in a multithreaded application.
Benchmarking multicore Performance+Efficiency (Apple) versus SMT (AMD) versus SMT+Wide vectors (Intel) is never going to provide perfect apples-to-apples comparisons. There's an entirely reasonable argument that single-thread performance is oversold as a metric, and that the focus on it advantages some platforms over others.
At the end of the day, benchmarks are inherently only an approximate measure of how real-world code will perform. SMT, basically by definition, is rarely going to benchmark well, but is inherently going to show more of a benefit when running real-world mixed workloads.
SMT is designed to boost performance in multithreaded workloads.
It can be thought of as multitasking for a CPU core.
Using two threads for one benchmark and one thread for another is not comparing single-core performance.
Because why wouldn't that CPU schedule the load on other cores?