>FizzBuzz has many properties that make it very suitable for these kinds of optimizations that might not be applicable to general purpose code: + extremely small working set (a few registers worth of state) + extremely predictable branching behavior + no I/O
The working set depends on how you organize the data.
I/o can be made in larger chunks most of the times. You have to consider about the data you need and when you need it.
As for branching, that most of the time depends on programmer rather than on the problem being solved. The situation where you need very complicate branching to solve a problem are rare. And I've seen O(n log n) algorithms beating clever O(n) algorithms because they could be better optimized for the CPU.
The working set depends on how you organize the data. I/o can be made in larger chunks most of the times. You have to consider about the data you need and when you need it.
As for branching, that most of the time depends on programmer rather than on the problem being solved. The situation where you need very complicate branching to solve a problem are rare. And I've seen O(n log n) algorithms beating clever O(n) algorithms because they could be better optimized for the CPU.