How did things get this bad? The Axioms of linking (2019)

klodolph · on Feb 8, 2021

To me, the damning thing is that so many people distribute “header-only libraries”. My experience is that this makes the library worse in many different ways, except one—linking is easier. Developers are so annoyed with dependencies that this is the result. There are much better solutions than just shoving everything in a header.

cout · on Feb 8, 2021

The main reason IME for a header-only C++ library is that when it is template-heavy (as a lot of C++ code is these days), a lot of it ends up in headers anyway, so it's not a big leap to header-only.

I've never seen a widely used header-only C Lucas library, and while it wouldn't surprise me, it seems harder to justify.

nn3 · on Feb 8, 2021

Header only usually means terrible compile times. So what they are saving from writing a few lines of Makefile they will pay forever in far longer feedback loops during development. Also it's well known that if the compile time goes over a specific threshold the development productivity drops dramatically because you can't do interactive debugging and exploration anymore. It's really a terrible tradeoff.

panic · on Feb 8, 2021

At least in C, "header only" libraries typically require a #define LIBRARYNAME_IMPLEMENTATION somewhere in a .c file to define the symbols declared in the header. That way you're not including a copy of every function in every object file, which is what would slow down compile times. C compilers are very fast at skipping #ifdef blocks.

fxtentacle · on Feb 7, 2021

C++ with vcpkg is a bliss to work with. Depending on which triplet you configure, it can build either static or dynamic libraries.

And CMake will take care of transitive dependencies for static libraries for you. Just declare it as a PUBLIC link dependency and all downstream projects will link against it automatically.

What the article fails to mention is that dynamic libraries can also be an attack vector. Users can replace a .so with a variant that does something you don't expect. For statically linked apps, it is much easier to ensure your code has not been tampered with.

Also, the CMake integration in CLion is great. "How did things get this bad?" What are you talking about? It's a great time to be a C++ programmer! The tools are finally stable and powerful (and not either MSVC or ICC) and there's lots of good libraries to choose from.

klodolph · on Feb 8, 2021

> And CMake will take care of transitive dependencies for static libraries for you.

This assumes that everything is correctly expressed as part of the CMake system. If you stick everything in one build system, of course it’s going to be fine.

> Users can replace a .so with a variant that does something you don't expect. For statically linked apps, it is much easier to ensure your code has not been tampered with.

If your users are able to replace libraries, it’s reasonable to assume that they can mess with the code inside your application too. It is generally hard to ensure your code hasn’t been tampered with, unless you’re in control of the hardware.

bobthebuilders · on Feb 8, 2021

I think it's more of an accidental or unintentional threat they're talking about than a malicious one.

DDSDev · on Feb 8, 2021

> "How did things get this bad?" What are you talking about? It's a great time to be a C++ programmer!

I think like most things C++, the modern solutions are great, but a large number of programmers are still stuck with legacy systems that they cannot or will not change for a variety of reasons.

I don't think there is a magic bullet for people experiencing these problems in legacy codebases - it's either update to a more modern build system and use libraries within that ecosystem, or learn how to work around the potential issues outlined in the original article.

accountofme · on Feb 8, 2021

Mind you it has been a few years since I have worked in C++ professionally. I understand why you might like it. But in my opinion, it is a dumpster fire.

My friend who is a music producer has a saying: "You can't polish a turd."

With C++ and CMake, this is apt.

eqvinox · on Feb 8, 2021

The entire post is really indicative of widespread misunderstandings on how linking is supposed to fit into the larger ecosystem.

Dynamic linking — on Linux — isn't intended to work on its own. There's a system around it, notably with a package manager and its dependency handling. It's only a component of the larger system, and it's failing because other pieces are just straight-up missing.

No, you can't build on one OS platform, transfer the executable to another, and expect it to work. Yes, other systems (notably Windows, but also Solaris and to some degree the BSDs) make that guarantee, but Linux doesn't. That's just how it is, and it's different across systems, and you need to drop that expectation. For comparison, on OpenBSD you can't link statically (not across OpenBSD versions at least) because syscalls are not an ABI. You need the system's libc, matching the kernel. (cf. recent Go story) Platforms have their own rules. Learn them.

The package manager is also what will fix some of the not-quite-linking-related problems the post cites. Headers and #defines matching the library version — you get that from the package manager's build dependencies. It'll also apply compiler & linker flags that match the platform.

But the most disingenious part of the article is this:

> If you want to move a binary from one machine (where it was compiled) to another

No. You're not moving a binary from one machine to another. You're moving it from one platform to another. Because if the machines have the same platform, it actually works.

bscphil · on Feb 14, 2021

Very late, but just want to say this is a fantastic comment.

To me it feels like the old ways (dynamic linking to system vendored libraries with dependency handling using a package manager) are being lost and the newer methods lose something of the simplicity and beauty of those approaches. I realize that's just my own emotional reaction to it, without any logical justification, but I just wanted to say that.

CountHackulus · on Feb 8, 2021

If you want to get really deep into this, you can read the book Linkers and Loaders by John Levine. It's useful if you're writing either a compiler or an OS, or if you really want to understand the subtleties of how executables are loaded and executed. It may not help you solve your immediate linking or loading problem, but it'll at least help you understand how you got there and what they computer is trying to do.

SulphurCrested · on Feb 8, 2021

Point 22 in the linked article confuses static linking against the C runtime with static linking against your own libraries.

It is perfectly all right on macOS to statically link your own stuff, or other peoples' stuff. Apple even recommend in a WWDC video that you do it with OpenSSL if you absolutely must use that instead of the framework they provide. (The stated reason is that the OpenSSL API is not stable.)

Like almost every other system except for Linux, it is not perfectly all right to statically link against the C runtime, because that’s what knows how to do system calls. But neither is it ever a problem to dynamically link against it, since it will be kept up to date by system updates. Any normal C++ program will dynamically link against libc++ and libSystem in /usr/lib.

Apple packages its APIs as "Frameworks", and a normal application will dynamically link against those.

rramadass · on Feb 8, 2021

This is a nice list of linking "gotchas". However, i feel that the author is venting somewhat since this has been well known to experienced C/C++ developers. If you follow a few thumb rules, the situation is not bad and quite simple/dumb.

pjc50 · on Feb 8, 2021

And if you want your program to work on windows as well, you face a similar set of problems with different tools and names. It's been interesting watching C# go round this again too; finally with .Net core you can actually link an executable rather than ship a pile of DLLs.

However, one Linux tip: .a files aren't magic, they're actually just an 'ar' archive, and in some cases you can use this to combine existing .a files together if you want to for some reason.

tux3 · on Feb 8, 2021

And, fun fact, Windows .lib files are also ar archives! (sometimes with a little more structure to them). And so with macos .a files, except that they're a slightly incompatible variant of the ar format.

skybrian · on Feb 8, 2021

Do problems linking static libraries happen in languages other than C and C++? It seems like languages with their own build systems (like Go) have solved this.

whomst · on Feb 8, 2021

I've had issues with Rust and OpenSSL, but that's because its effectively just FFI bindings. Rust, by default, links all dependencies statically except for libc, libm and other assumed-to-exist libraries.

amelius · on Feb 7, 2021

If only there was a tool in binutils to convert dynamically linked programs into statically linked ones ...

ramzyo · on Feb 8, 2021

Does this exist? If so, what is it? Genuinely curious.

amelius · on Feb 8, 2021

There are tools, see links below. But they are not in binutils. So they won't be widely used or supported in different build scripts, and they won't be ported to new architectures as naturally as the other tools in binutils. Also some of these tools are not free software.

http://www.magicermine.com/

http://statifier.sourceforge.net/

puzzledobserver · on Feb 8, 2021

I wanted to do something similar a long time ago, and couldn't find anything useful. Is there any reason an "integrate-everything-statically" tool isn't provided with binutils?

gumby · on Feb 8, 2021

Item 29 (“linking” static libraries) is simply the `ar` program.

forrestthewoods · on Feb 8, 2021

Ship your dependencies. Let me repeat that. Ship your damn dependencies!

Semi-related, environment variables are pure evil. Operating systems shouldn’t allow environment variables and programming languages shouldn’t allow global variables.

C++ headers are insane and miserable. Makes me sad. And causing me endless grief.

One distinction the author doesn’t quite make (unless I missed it) is the difference in shipping a pre-compiled static lib and shipping code intended to be compiled in a static lib. The first is almost the same thing as shipping a shared lib. You need a pure C API to ensure compatibility. The second you can do whatever because users will compile with their own settings and types will be consistent.

saagarjha · on Feb 8, 2021

> Semi-related, environment variables are pure evil. Operating systems shouldn’t allow environment variables and programming languages shouldn’t allow global variables.

Defending absolutist opinions is usually difficult and I this one is no different. What do you have against environment variables, which provide a unified interface to expose user-facing tunables? Where do you store shared program state?

jcranmer · on Feb 8, 2021

At my work, the development environment relies way too heavily on environment variables to configure things like "where is the build located," which breaks things like being able to easily use the built products on a different machine. Annoyingly, this also means that the tooling can't be regular scripts, since it has to change the environment variables of your current shell, which also means that you can't write regular scripts to abstract over the tooling because it's actually crazy shell shenanigans and not executables that are doing everything.

Admittedly, that is a bad design, and it is a problem in large part because it is a bad design and not because of using environment variables by itself. In general, though, my preference is that environment variables be the last resort of the programmer for configuration. Providing command-line switches or files in well-known locations is almost always a better idea, and environment variables should largely be reserved only for when the use is of a debugging or other unusual configuration.

fctorial · on Feb 8, 2021

Explicit is better than implicit.

forrestthewoods · on Feb 8, 2021

Global variables are evil. Environment variables are global variables. Lord knows how much time I’ve wasted helping people figure out why something works on one machine but not another. The answer is often because they had something in some obscure envvar that may have been set once upon a time by lord knows what.

The alternative? Users launching a program should explicitly specify configuration information. Choose sane defaults. Let them be configurable via cmdline args or a config file. Ideally both. An explicitly specified config file is so much cleaner than envvar imho. Scripts that locally override global envvar is super sketch, hidden, and error prone.

TBH I think the fact that Docker exists is testament to the INSANITY of Linux environment configuration. It’s so incredibly hard to reliably launch a program that we built ecosystems of containers to encapsulate system wide configuration. And even that is hard and error prone.

pjc50 · on Feb 8, 2021

> It’s so incredibly hard to reliably launch a program

This is just silly.

Global state is inevitable. If it's not in the environment it would be in the filesystem or the Windows registry.

forrestthewoods · on Feb 8, 2021

Windows registry is also evil. Because it’s basically the same thing.

Global state may be inevitable. But that doesn’t mean you should throw the towel and sprinkle global state all over the place. You should try really really hard to have the barest minimum global state. You might be surprised how little you actually need.