Most good static compilers can symbolically evaluate even linear systems of equations/recurrences/etc that are solvable, and transform loops that calculate them into constants, etc.
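As a rough illustration (my own sketch, not taken from any particular compiler's docs): the accumulator and the induction variable below form a solvable linear recurrence, and a compiler such as GCC or Clang at -O2 will typically evaluate it symbolically and reduce the whole function to a constant return.

    /* Sketch: the loop computes a closed-form value, so the optimizer
       typically compiles this down to "return 5050;". */
    int sum_to_100(void) {
        int sum = 0;
        for (int i = 1; i <= 100; i++)
            sum += i;            /* closed form: n * (n + 1) / 2 */
        return sum;
    }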
Yeah, it sometimes surprises me how much value the simplest optimizations have and how much they break people's attempts to measure performance.
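A minimal sketch of how that plays out (my own example): the timed loop below produces a result that is never used, so plain dead-code elimination is free to delete it, and the "measurement" says nothing about the work you thought you were timing.

    #include <stdio.h>
    #include <time.h>

    int main(void) {
        clock_t start = clock();
        long acc = 0;
        for (long i = 0; i < 100000000; i++)
            acc += i * 3 + 1;    /* acc is never consumed, so the loop is dead code */
        clock_t end = clock();
        printf("loop took %f s\n", (double)(end - start) / CLOCKS_PER_SEC);
        return 0;                /* at -O2 this will often report ~0 seconds */
    }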
There is another side to this coin, which is best expressed in a quote I picked up from a relatively old paper:
"A survey of the literature on optimization uncovers many techniques that are hard to understand, harder to implement, more general than necessary and of marginal value. Our approach was to do the most profitable machine independent optimizations in the most profitable way" (A Portable Optimizing Compiler for Modula-2 by Michael L. Powel)
In some sense, the more microbenchmarks an optimization breaks, the bigger its impact on real-world code is :)
While that paper makes a generally true statement, trying to do the most profitable optimizations in the most profitable way is what dead-ended gcc for many, many years :)
At some point, you have to step back from what you are doing and actually design and engineer optimizations coherently, not just "do this little thing and that little thing"
LICM is indeed very simple to do; I wrote it for a JIT emulator I was working on a while ago. It turns out it doesn't always make things faster: there are a bunch of weird cases that can destroy your performance.
LICM is highly dependent on the quality of your trace recording or method inlining.
Not disagreeing with what you wrote at all, just wanted to provide an anecdote.
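For readers who haven't written one: the core transformation is just moving computations that cannot change across iterations out of the loop. A tiny before/after sketch (my own, in C rather than a JIT IR):

    /* before: the invariant multiply-add is recomputed every iteration */
    void scale_before(float *out, const float *in, int n, float s, float b) {
        for (int i = 0; i < n; i++)
            out[i] = in[i] * (s * 2.0f + b);   /* (s * 2.0f + b) is loop-invariant */
    }

    /* after LICM: the invariant expression is hoisted ahead of the loop */
    void scale_after(float *out, const float *in, int n, float s, float b) {
        float k = s * 2.0f + b;
        for (int i = 0; i < n; i++)
            out[i] = in[i] * k;
    }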
Indeed, there are some relatively well understood negative consequences to LICM and various other redundancy elimination optimizations: e.g. they increase the lifetime of values, which can have a negative impact on register allocation. Another thing is that getting LICM right in the presence of conditional control flow within the loop is a non-trivial exercise; in dynamically typed languages this becomes a problem because a JIT generally wants to hoist type guards aggressively, but it has to account for conditional control flow to avoid weird/unnecessary deopts.
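To make the register-pressure point concrete (again, a sketch of my own, not from the comment above): once a batch of invariant loads has been hoisted, all of those values stay live across the entire loop body and compete for registers there, whereas before each one only had to be live around its use.

    struct consts { double a, b, c, d, e, f, g, h; };

    /* After aggressive hoisting, a..h are all live across the whole loop;
       on a register-poor target the allocator may have to spill some of
       them, which can cost more than reloading p->x inside the loop did. */
    double poly(const struct consts *p, const double *x, int n) {
        double a = p->a, b = p->b, c = p->c, d = p->d;
        double e = p->e, f = p->f, g = p->g, h = p->h;
        double acc = 0.0;
        for (int i = 0; i < n; i++)
            acc += ((((((a * x[i] + b) * x[i] + c) * x[i] + d) * x[i]
                     + e) * x[i] + f) * x[i] + g) * x[i] + h;
        return acc;
    }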
> The set of things you are worrying about is actually precisely the set of things CLZ solves :)
I think the set of things I worry about is slightly different, because they operate in a static environment and I operate in a dynamic one. For example, they don't seem to be worried about the transformation from:
    loop {
      if (pred) {
        // o and o.f are loop invariants
        Guard(o is X);
        v = Load(o.f);
        Use(v)
      }
    }
to
    Guard(o is X);
    v = Load(o.f);
    loop {
      if (pred) {
        Use(v)
      }
    }
This transformation might or might not be beneficial depending on the relation between the predicate and the guard - e.g. it can lead to code which just deoptimizes on the guard before ever entering the loop. There is no way of knowing this statically, so you just have to take a bet (unless you have clear indications that the guard would fail when hoisted) and then undo your decision later.
Some of the optimizations compilers/JITs perform would blow your mind.
LICM is a very simple thing to do.
There are more advanced things, like http://www.cs.rice.edu/~keith/512/2011/Lectures/L06CLZ-1up.p...
etc
That's just on the code movement side too.