The article mentions debug build time as a legitimate pain point of the LLVM project. General good advice, which unfortunately isn't enabled by default: use dynamic linking, make sure you're using lld, enable split DWARF (and -Wl,--gdb-index for faster gdb load times!), and enable optimized TableGen[0]. ccache doesn't hurt, but isn't actually such a big benefit unless you switch between branches a lot.
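For concreteness, here is a sketch of a CMake configure line that turns all of those on for an LLVM debug build. The directory layout is an assumption (an llvm-project checkout next to the build dir), and it presumes ninja and lld are installed; the flags themselves are standard LLVM CMake options.

```shell
# BUILD_SHARED_LIBS=ON       -> dynamic linking (much faster relinks)
# LLVM_USE_LINKER=lld        -> link with lld instead of the system linker
# LLVM_USE_SPLIT_DWARF=ON    -> split DWARF (.dwo files), lighter link inputs
# LLVM_OPTIMIZED_TABLEGEN=ON -> host tblgen built optimized even in debug builds
# --gdb-index                -> emit .gdb_index for faster gdb startup
cmake -G Ninja ../llvm-project/llvm \
  -DCMAKE_BUILD_TYPE=Debug \
  -DBUILD_SHARED_LIBS=ON \
  -DLLVM_USE_LINKER=lld \
  -DLLVM_USE_SPLIT_DWARF=ON \
  -DLLVM_OPTIMIZED_TABLEGEN=ON \
  -DCMAKE_EXE_LINKER_FLAGS="-Wl,--gdb-index" \
  -DCMAKE_SHARED_LINKER_FLAGS="-Wl,--gdb-index"
ninja
```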
Oh, and buy a Threadripper. Seriously, you won't regret it.
[0] TableGen is LLVM's internal DSL swiss army knife. You can separately enable optimizations for TableGen independently of optimizations in the resulting LLVM builds, which you should basically always do unless you're working on TableGen itself.
Does anyone know why it isn't easier to offload compilation to the cloud yet? Compilation is an embarrassingly parallel problem and shouldn't take longer than the slowest file plus linking. Compilation is also a pure function (I know it's not quite set up like that in practice, but it is in theory), so it's stateless and cacheable. I would have thought developers could share a massive cluster and have thousand-way parallel compilation instantly. For some reason it doesn't seem to be a thing people are building or using in practice.
Developers at my company who work on a large shared C++ codebase all use Icecream[0] to distribute their builds across their workstations. I've only hooked into it once since I don't work in that codebase, but it seems to scale pretty well (I think on the order of 50-60 build servers and about the same number of clients). As for caching, I think sccache[1] is designed to handle that in a scalable way, although I've personally only used it locally for my Rust builds.
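For anyone curious what wiring these in looks like, a rough sketch (assumes icecc and sccache are installed, an icecc-scheduler is reachable on the LAN, and the -j count is made up):

```shell
# Icecream: point CC/CXX at the icecc wrapper and raise parallelism
# well past the local core count; icecc farms compiles out to the cluster.
make CC="icecc gcc" CXX="icecc g++" -j60

# sccache for C/C++: use it as a compiler launcher in CMake.
cmake -DCMAKE_C_COMPILER_LAUNCHER=sccache \
      -DCMAKE_CXX_COMPILER_LAUNCHER=sccache ..

# sccache for Rust: set it as the rustc wrapper.
export RUSTC_WRAPPER=sccache
cargo build
sccache --show-stats   # inspect cache hit rate
```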
I nearly have this working using AWS Lambda (a few months ago I got super angry at the compile times for a project I was working on and spent a week getting a proof of concept ready, which I've been refining since). I've sadly been busy with some other tasks I have to get done (age-old story), but I intend to push it to the point where you could easily add it to your own project soon (as long as your build system isn't stupid; a lot of people think they are using super-advanced build systems, but in practice they handle toolchain hacks extremely poorly :/).
If I was writing a new compiler from scratch today I think I'd make it client-server from the start, even locally, so that the compiler could be transparently made remote. The client would be responsible for doing all IO on behalf of the server so that the server was completely separated from the system it ran on.
That's good to know, and I'm not surprised that there are ways to speed up the build. I just found that enabling release mode made linking reasonably fast and didn't spend more time on it (and experimenting more would involve more complete recompiles :p)
Threadrippers seem really nice, but I did this on my laptop, not a desktop. Plus, when linking with debug symbols, my 16GB of RAM is by far a bigger limit than my four cores; I can't even run two link processes simultaneously without one of them being OOM killed (if I'm lucky).
> Oh, and buy a Threadripper. Seriously, you won't regret it.
How much difference are we talking? Do you have a concrete Intel CPU you can compare against to give ballpark numbers? And is this with various mitigations enabled or disabled?
We have Ryzen 9 3900X workstations at work and I am using a 3950X at home. That's already a huge boost to compile times, and they cost way less than the Threadrippers.