I hope that the Rust compiler team is considering integrating BOLT by default too
https://github.com/facebookincubator/BOLT
(I'm aware it's being integrated into LLVM, but Rust could integrate the external BOLT in its toolchain in the meantime; a free performance gain for everyone would be nice to have right now)
This was discussed here [1]. The conversation died off around a year ago. Not sure why... Although, it is pretty clear that BOLT's optimizations are not "free". For example:
> BOLT uses enormous amount of memory (about 6GB).
> Perf needs to be ran with LBR support; this is almost always unsupported with VMs (which means you don't want to run the measurement inside CI).
BOLT just permutes link order? The object permutation from a BOLT run should be good for months until the underlying objects have substantially drifted?
At any price? You could throw something like Souper at the problem [0]. It makes compilation about 20 times slower in some benchmarks in their paper. This is on the IR level; IIRC similar optimizers exist for assembly generation, so you can throw that in as another pass.
Seems to me that raising the ceiling on optimisation is good to have, even if the computational cost is very high. If it works reliably, it wouldn't be necessary to do optimised release builds very often, and they could be offloaded onto a heavyweight build server, perhaps on the cloud.