I wonder where the trade-off is for loop unrolling.
Like, unrolling a `for` loop that only has 5 iterations makes sense. But with 100 iterations, the unrolled code gets a lot bigger, and the extra pressure on the instruction cache might actually make it slower than just keeping the `for` loop.
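To make the trade-off concrete, here is a minimal sketch of a middle ground: partial unrolling by a fixed factor instead of unrolling every iteration. The function names `sum_rolled` and `sum_unrolled4` and the factor of 4 are made up for illustration. Whether the unrolled version is actually faster depends on the compiler and the CPU, so treat it as something to measure, not a guarantee.

```c
#include <stddef.h>

/* Plain loop: one add plus one compare-and-branch per element. */
long sum_rolled(const int *a, size_t n) {
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += a[i];
    }
    return total;
}

/* Unrolled by a factor of 4 (an arbitrary choice for this sketch):
 * one compare-and-branch per four elements, at the cost of a larger
 * loop body and a cleanup loop for the leftover elements. */
long sum_unrolled4(const int *a, size_t n) {
    long total = 0;
    size_t i = 0;

    /* Main loop: handles four elements per iteration. */
    for (; i + 4 <= n; i += 4) {
        total += a[i];
        total += a[i + 1];
        total += a[i + 2];
        total += a[i + 3];
    }

    /* Cleanup loop: handles the remaining 0 to 3 elements. */
    for (; i < n; i++) {
        total += a[i];
    }
    return total;
}
```

The code size here grows with the unroll factor, not with the iteration count, which is why this form scales to 100 or 100,000 iterations without the footprint problem. Optimizing compilers often do this kind of transformation on their own at higher optimization levels, so the hand-written version is mostly useful for seeing what the trade-off looks like.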
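In some cases, you can use Duff's Device, which unrolls the loop by a fixed factor and jumps into the middle of the unrolled body to handle the leftover iterations when the count isn't a multiple of that factor. https://en.wikipedia.org/wiki/Duff%27s_device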
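For reference, a rough sketch of Duff's Device in C. This is the memory-to-memory copy variant that usually gets shown (the original wrote to a single output register instead of incrementing `to`), and the name `duff_copy` and the zero-count guard are additions for this example. It relies on `case` labels being allowed inside the `do`/`while` body, so the `switch` jumps into the middle of the unrolled loop to take care of the remainder first.

```c
#include <stddef.h>

/* Copy `count` bytes from `from` to `to`, with the loop unrolled 8 times.
 * The guard for count == 0 is an addition for this sketch; without it
 * the do/while body would run once and copy 8 bytes. */
void duff_copy(char *to, const char *from, size_t count) {
    if (count == 0) {
        return;
    }

    /* Number of passes through the do/while, rounding up so the
     * partial first pass is counted. */
    size_t n = (count + 7) / 8;

    /* Jump into the unrolled body at the point that leaves exactly
     * count % 8 copies in the first pass (or 8 if the remainder is 0).
     * Every later pass runs all 8 copies. */
    switch (count % 8) {
    case 0: do { *to++ = *from++;
    case 7:      *to++ = *from++;
    case 6:      *to++ = *from++;
    case 5:      *to++ = *from++;
    case 4:      *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
            } while (--n > 0);
    }
}
```

On current compilers this is mostly a curiosity: a plain loop or `memcpy` is usually at least as fast, and the interleaved `switch`/`do` can get in the way of the optimizer, so it is worth benchmarking before reaching for it.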