LICM is indeed very simple to do; I wrote it for a JIT emulator I was working on a while ago. It turns out that it's not always faster, though. A bunch of weird cases come up that can destroy your performance.
LICM is highly dependent on the quality of your trace recording or method inlining.
Not disagreeing with what you wrote at all, just wanted to provide an anecdote.
Indeed, there are some relatively well-understood negative consequences of LICM and various other redundancy-elimination optimizations. For example, they increase the lifetime of values, which can have a negative impact on register allocation. Another thing: getting LICM right in the presence of conditional control flow within the loop is a non-trivial exercise. In dynamically typed languages this becomes a problem because a JIT in general wants to hoist type guards aggressively, but it has to account for conditional control flow to avoid weird or unnecessary deopts.
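To make the lifetime point concrete, here is a toy sketch in Python. It is not any real register allocator: the instruction format, the `max_pressure` helper, and the value names are all made up for illustration, and "pressure" is just a coarse count of values whose live ranges overlap at each instruction.

```python
def max_pressure(instrs, loop):
    """Coarse estimate of the peak number of simultaneously live values.

    instrs: list of (defined_name, [used_names]) in program order.
    loop:   (first, last) indices of the loop body, inclusive.
    Hypothetical model for illustration only.
    """
    first, last = loop
    start, end = {}, {}
    for i, (defined, uses) in enumerate(instrs):
        start.setdefault(defined, i)
        end.setdefault(defined, i)
        for used in uses:
            end[used] = i  # the last use wins because we scan in order
    for name in start:
        # A value defined before the loop and used inside it must survive
        # every iteration, so it stays live until the end of the body.
        if start[name] < first and first <= end[name] <= last:
            end[name] = last
    return max(
        sum(1 for n in start if start[n] <= i <= end[n])
        for i in range(len(instrs))
    )

# Invariants t1..t3 are recomputed inside the loop body (indices 1..6).
before = [("a", []),
          ("t1", ["a"]), ("s1", ["t1"]),
          ("t2", ["a"]), ("s2", ["t2"]),
          ("t3", ["a"]), ("s3", ["t3"])]

# The same invariants hoisted in front of the loop body (indices 4..6).
after = [("a", []),
         ("t1", ["a"]), ("t2", ["a"]), ("t3", ["a"]),
         ("s1", ["t1"]), ("s2", ["t2"]), ("s3", ["t3"])]

pressure_before = max_pressure(before, (1, 6))
pressure_after = max_pressure(after, (4, 6))
```

In this toy model the hoisted version keeps three invariants alive across the whole loop instead of one input plus a short-lived temporary, so the peak count goes up. Whether that actually hurts depends on how many registers the target has.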
> The set of things you are worrying about is actually precisely the set of things CLZ solves :)
I think the set of things I worry about is slightly different, because they operate in a static environment and I operate in a dynamic one. For example, they don't seem to be worried about the transformation from:
    loop {
      if (pred) {
        // o and o.f are loop invariants
        Guard(o is X);
        v = Load(o.f);
        Use(v)
      }
    }

to

    Guard(o is X);
    v = Load(o.f);
    loop {
      if (pred) {
        Use(v)
      }
    }

This transformation might or might not be "beneficial" depending on the relation between the predicate and the guard: e.g. it can lead to code which will just deoptimize on the guard before entering the loop. There is no way of knowing this statically, so you just have to take a bet (unless you have clear indications that the guard will fail when hoisted) and then undo your decision later.
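A rough way to see the bet and the undo is to simulate it in Python. Everything here is a stand-in: `Deopt`, `guard_is_x`, `run_with_fallback`, and the `state` dict are hypothetical names, and a real JIT would patch or discard compiled code rather than flip a flag.

```python
class Deopt(Exception):
    """Hypothetical stand-in for a failed speculative guard."""

class X:
    def __init__(self, f):
        self.f = f

def guard_is_x(o):
    if not isinstance(o, X):
        raise Deopt("o is not X")

def loop_original(o, preds):
    # Guard and load sit under the predicate, as in the first snippet.
    total = 0
    for pred in preds:
        if pred:
            guard_is_x(o)
            v = o.f
            total += v
    return total

def loop_hoisted(o, preds):
    # Guard and load are hoisted in front of the loop, as in the second snippet.
    guard_is_x(o)
    v = o.f
    total = 0
    for pred in preds:
        if pred:
            total += v
    return total

def run_with_fallback(o, preds, state):
    """Bet on the hoisted loop; on a deopt, remember that and undo the bet."""
    if state["hoist_ok"]:
        try:
            return loop_hoisted(o, preds)
        except Deopt:
            # The hoisted guard fails before any loop side effect happens,
            # so rerunning the unhoisted loop from the start is safe here.
            state["hoist_ok"] = False
    return loop_original(o, preds)

state = {"hoist_ok": True}
not_x = object()
# pred is never true, so the original loop never reaches the guard.
result = run_with_fallback(not_x, [False, False, False], state)
```

With `pred` always false and `o` not an `X`, `loop_original` runs to completion while `loop_hoisted` deopts before the first iteration. After that first failure the flag keeps later calls on the unhoisted path.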