Unfortunately the authors are using AMD machines, which behave differently from most others because pairs of cores are conjoined into "modules" that share resources. On these AMD processors, a CPU as seen by the OS is a core. On almost all other processors, a CPU is an SMT thread (aka a hyperthread).
In figure 1 the levels/shades represent distance from node/socket 1, darker being closer. So node 1 is distance 0 from itself, two other nodes are distance 1, and one node is distance 2.
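As a rough sketch of how those distance levels could be derived: assume, hypothetically, four nodes wired in a ring (the figure's actual interconnect may well differ). A breadth-first search from node 1 then gives the hop counts described above. The names `LINKS` and `hop_distances` are made up for illustration.

```python
from collections import deque

# Hypothetical 4-node ring; the real interconnect in figure 1 may differ.
LINKS = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [1, 3]}

def hop_distances(links, start):
    """Return {node: number of hops from start} using breadth-first search."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in links[node]:
            if neighbor not in dist:
                # First time we reach this node, so this is the shortest path.
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

distances = hop_distances(LINKS, 1)
print(distances)
```

With this assumed ring, node 1 is at distance 0, nodes 2 and 4 are at distance 1, and node 3 is at distance 2.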
The only thing shared in a Bulldozer module is the FPU (and perhaps some cache, not sure). For all other purposes, a Bulldozer module contains two full CPU cores.
Also, what CPUs besides Intel's and IBM's use SMT?
In 'Dozer, the arithmetic and memory units were pretty much the only things that were separate. Each core pair shared a scheduler, a dispatcher, a branch predictor, cache, etc. It really was an oddly thought out design.
It's probably even more complicated than that, as they separated more parts per core during the evolution of 'Dozer. The decoder, for example, was a single unified one at first and became two separate ones later.
Yeah, an oddly thought out design for sure. The idea seems to make sense to me but execution wasn't good enough I guess.
From what I understand from this presentation, the 'scheduling domain' abstraction is reused at different layers of the hierarchy. So, for example, the two hyperthreads on one physical core are also modeled as a 'scheduling domain'.
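A toy sketch of that idea, not the kernel's actual data structures: the same `Domain` type is reused at every level, so SMT siblings, cores in a node, and nodes in the machine are all just nested domains. The machine shape (8 hardware threads, 2 per core, 2 cores per node) and the names `Domain`, `group` and `chain` are assumptions made up for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Domain:
    name: str                      # level label, e.g. "core" or "node"
    cpus: frozenset                # hardware threads this domain spans
    children: list = field(default_factory=list)

def group(domains, size, level):
    """Merge consecutive domains into parent domains of `size` children each."""
    parents = []
    for i in range(0, len(domains), size):
        kids = domains[i:i + size]
        cpus = frozenset().union(*(k.cpus for k in kids))
        parents.append(Domain(level, cpus, kids))
    return parents

# Hypothetical machine: 8 hardware threads, 2 per core, 2 cores per node.
threads = [Domain("thread", frozenset({i})) for i in range(8)]
cores = group(threads, 2, "core")     # hyperthread siblings form the lowest domain
nodes = group(cores, 2, "node")
machine = group(nodes, 2, "machine")[0]

def chain(root, cpu):
    """List every domain containing `cpu`, from the top level down."""
    path = []
    node = root
    while node is not None:
        path.append((node.name, sorted(node.cpus)))
        node = next((c for c in node.children if cpu in c.cpus), None)
    return path

for level, span in chain(machine, 5):
    print(level, span)
```

For hardware thread 5 this walks machine, node, core, thread, with the span shrinking at each step. A load balancer built on such a hierarchy could run the same logic at every level, which seems to be the point of reusing the abstraction.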