That could actually be a good explanation for why reducing code cache pressure can help even in cases where it doesn't seem like it should: the other thread is also using that cache.
Though I wonder if that's true of all SMT chips; do any chips have dual L1 caches for exactly this reason?