Modern CPUs also depend on coherent branching behavior, and conditions like this throw it off. I bet most of the 15X slowdown that the author is experiencing is not from the extra conditions, but from thrashing in the BPU (branch prediction unit)!
You might be underestimating the efficiency of branch prediction on modern CPUs. It's getting close to the point where if you (the human) can easily predict the pattern based on past history, the CPU will predict it perfectly as well. Manufacturers are usually a little cagey about the exact specifications, so one is often limited to empirical testing, but a simple pattern like the one in the example is going to be predicted perfectly after the first couple of iterations of training. Here's what one authority who's done lots of testing has to say about Haswell, the CPU the post is concerned with:
The Haswell is able to predict very long repetitive jump
patterns with few or no mispredictions. I found no specific
limit to the length of jump patterns that could be
predicted. Loops are successfully predicted up to a count
of 32 or a little more. Nested loops and branches inside
loops are predicted reasonably well.
http://www.agner.org/optimize/microarchitecture.pdf
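If you want to get a feel for why a short repeating pattern trains so quickly, here's a toy simulation. To be clear, this is a textbook-style two-level predictor (a history register indexing 2-bit saturating counters), not Haswell's actual design, which isn't publicly documented. The function name `simulate`, the `history_bits` parameter, and the 7-step pattern are all made up for illustration.

```python
def simulate(pattern, iterations, history_bits=8):
    """Run a toy two-level branch predictor over a repeating pattern.

    Returns a list of booleans, one per branch execution:
    True where the predictor guessed wrong.
    This is an illustrative model only, not any real CPU's predictor.
    """
    table = {}                       # history value -> 2-bit counter (0..3)
    history = 0                      # last `history_bits` outcomes, as bits
    mask = (1 << history_bits) - 1
    mispredicted = []

    for i in range(iterations):
        taken = pattern[i % len(pattern)]

        # Counters start at 1 ("weakly not taken"); >= 2 means predict taken.
        counter = table.get(history, 1)
        predicted_taken = counter >= 2
        mispredicted.append(predicted_taken != taken)

        # Saturating update toward the actual outcome.
        if taken:
            counter = min(counter + 1, 3)
        else:
            counter = max(counter - 1, 0)
        table[history] = counter

        # Shift the actual outcome into the history register.
        history = ((history << 1) | int(taken)) & mask

    return mispredicted


# Hypothetical branch pattern with period 7 (True = taken).
pattern = [True, True, False, True, False, False, True]
result = simulate(pattern, 1000)

print("mispredictions in first 100 branches:", sum(result[:100]))
print("mispredictions in remaining 900 branches:", sum(result[100:]))
```

Once every history value that occurs in steady state has been seen a couple of times, the counters all point the right way and the mispredictions stop. On real hardware you'd check this with performance counters rather than a model like this one.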