This isn't really true. Micro-op caches are fairly new, branch predictors are massively improved, caches have gone from one level to three, and lots of operations have gotten far more efficient (64-bit division, for example, has gone from around 60 cycles to around 30 between 2012 and now). Out-of-order execution has also massively improved, which allows for major speed increases.
L3 caches have been in consumer Intel CPUs since 2008, and uop caches were already there in the Pentium 4 (released in 2000, almost 22 years ago :-)). Hardly new. Of course there are interesting iterative improvements, but nothing earth-shattering.
I double-checked, and L3 was actually also there in the P4 in 2003, and the P4 itself had been in the works since 1998. For me that's closer to the late 90s (which is also what I referred to) than to today; that's almost as many years as there were between the two world wars...