Levine's /Linkers and Loaders/ is a great book, but it's still in print, and this is an unauthorized copy.
The author's home page (https://www.iecc.com/linker/) used to host a PostScript version for download, but it no longer does, now saying: "Chapters were available in an excessive variety of formats, but are not any longer due to chronic piracy."
These days there is lots of information about linkers and loaders to be had without violating Levine's copyright; see https://www.toolchains.net/ for many links.
"you're not done actually when you learn how compilers work you also need to
learn how linkers and loaders work and you need to learn how to operating
systems work before all of the magic is gone. So you really want to learn
compilers and operating systems and then get this book that I have called Linkers
and Loaders it's like the only book on linkers and loaders. I should have
brought it and uh and it's really good"
This is a good book, probably still relevant but not up to date. See Advanced C and C++ Compiling [0] for a more modern treatment, including using tools like objdump, nm, and readelf to explore compilation artifacts.
Ian Lance Taylor (author of gold, major binutils contributor, and one of the core Go team) wrote a book on linkers as a series of 20 blog posts in 02008, which of course also includes using tools like objdump, nm, and readelf to explore compilation artifacts: https://lwn.net/Articles/276782/
In terms of linkers, the state of the art has definitely been pushed in recent years (mold, the new linker from Apple, and the already mentioned gold linker), but some of these are so new that no book has been written on them yet.
There's a PhD thesis I can't find on the topic that mentions some strategies for concurrent linking, the big idea being that a lot of the work can be done in parallel and concurrently with the compiler generating object code: while the compiler is still emitting object files and before the final link step, scan the object files for their symbol definitions and requirements so the final link itself can be done very fast (IIRC the idea came from gold).
In terms of loading, so few people work on these things, and there's so little interest in making them better, that there isn't a ton of literature on the subject. The best sources for the state of the art are the RTLD source in glibc, ld.so in musl, and the equivalents in the BSD/Darwin/Linux kernels.
Loading is a lot "easier" than linking, and easy enough to make fast that I don't think a ton of research has been put into it. It's mostly determined by the executable format, and Linkers and Loaders is an excellent reference on their design. I also think that the people who work on this are doing it in environments that don't lend themselves to publishing papers and textbooks, but rather to application notes and internal documentation at the companies that need to deal with the serious performance and security implications of a loader and executable format. And the executable formats have proven to be very robust over the last few decades.
Language designers also take this for granted because they need ABI compatibility with system libraries to be useful, so most new language projects just reuse the linker/loader that is on the system instead of inventing a new one. And new projects have to be compatible with the executable formats created by existing toolchains, so you have a chicken/egg problem there.
I don't really know, but RISC-V is important now, CHERI is deployed as Morello, PIE for ASLR and separation of code pages from read-only data are defaults, and AArch64 I think was brand new in 02008, if it existed at all. I'd guess C++ compilers have broken ABI compatibility once or twice since then. Oh, also LTO is a big deal now, enabled by LLVM bitcode in many cases. Also UBSan and ASan didn't exist, I think; not sure if those require linker support. Also Fuchsia and seL4 exist now, and FreeRTOS is a lot more mature. So probably there is some new material that could be productively included.
https://practicalbinaryanalysis.com/ may not be focused on linking and loading per se, but touches on these topics in a way that felt (to me) more lucid than the book you've linked.
Absolutely start with Levine. There's a lot that hasn't changed and Levine will give you a solid background in the old school. I have a hard copy and it's not going anywhere.
Next up, Ian Lance Taylor's 20-part series. This will give you some motivation for why the Levine world needed to change: performance.
Next, look at LLVM's lld and the Developer Meeting tutorials on it. There have been many, but Rui Ueyama's 2017 "lld: A Fast, Simple and Portable Linker" and Peter Smith's "How to add a new target to LLD" are where to start. Bluntly, Rui utterly solved/crushed/killed the linker performance problem. He probably should get an ACM Software System Award but won't.
After that Oracle's (!) Linker and Libraries Guide and their Linker Aliens blog posts have a lot of gritty detail.
Possessing any information on the subject makes one feel like a wizard if one is the only person on the team with such knowledge. Kind of like being able to explain which parts of the network stack are affected by the values emitted from ip route.
The big outlier not listed here is Apple. Quick overview from someone who's written binary analysis tools targeting most of these:
Mach-O is the format they use, going back to NeXTSTEP. They use it everywhere, including their kernels and the internal-only L4 variant they run on the Secure Enclave. Instead of being structured as a declarative table of the expected address space when loaded (like PE and ELF), Mach-O is built around a command list that has to be run to load the file. So instead of an entry point address in a table somewhere, Mach-O has a 'start a thread with this address' command in the command list. Really composable, which means binaries can do a lot of nasty things with that command list.
They also very heavily embrace their dyld shared cache, to the point that they don't bother including a lot of the system libraries on the actual root filesystem anymore, and the kernel is ultimately a cache too: the minimal kernel image itself plus at least the drivers needed to boot, all in one file (and on iOS, I think, all of the drivers, since loading a driver that isn't in the cache is disabled?).
There's a neat wrapper format for Mach-O called "fat binaries" that lets you hold multiple Mach-O images in one file, tagged by architecture. This is what lets current macOS ship the same application as both a native arm64 binary and a native x86_64 binary; the loader just picks the right one based on the current architecture and settings.
I think those are the main points, but I might have missed something; this was pretty off the cuff.
> Mach-O is built around a command list that has to be run to load the file. So instead of an entry point address in a table somewhere, mach-o has a 'start a thread with this address' command in the command list. Really composable, which means binaries can do a lot of nasty things with that command list.
Conceptually not much has changed since the book was written, but in practice there has been a lot of advancement. For example, ASLR and the increase in the number of libraries have greatly increased the pressure to make relocations efficient, modern architectures including PC-relative load/store and branch instructions have greatly reduced the cost of PIC code, and code signing has made mutating program text to apply relocations problematic.
On Darwin we redesigned our fixup format so it can be efficiently applied during page-in. That did include adding a few new load commands to describe the new data, as well as a change in how we store pointers in the data segments, but those are not really properties of Mach-O so much as the runtime.
I generally find that a lot of things attributed to the file format are actually more about how it is used than what it supports. Back when Mac OS X first shipped, people argued about PEF vs Mach-O, but what all the arguments boiled down to was the calling convention (TOC vs GOT), either of which could have been supported by either format.
Another example is symbol lookup. Darwin uses two-level namespaces (where binds include both the symbol name and the library it is expected to be resolved from), and Linux uses flat namespaces (where binds include only the symbol name, which is then searched for in all the available libraries). People often act as though that is a file format difference, but Mach-O supports both (though the higher-level parts of the Darwin OS stack depend on two-level namespaces, the low-level parts can work with a flat namespace, which is important since a lot of CLI tools that are primarily developed on Linux depend on that). Likewise, ELF also supports both: Solaris uses two-level namespaces (they call it ELF Direct Binding).
I don’t disagree about the nature of load commands, but Apple has been moving away from, say, LC_UNIXTHREAD for years at this point. For the most part load commands are becoming more and more similar to what ELF has: a list of segments, some extra metadata, etc.
The mechanisms for Windows DLLs have changed a lot (like how thread-local vars are handled). Besides, this book could not cover C++11's magic statics, or Windows' ARM64X format, or Apple's universal2, because these things are very new. Windows now has the apiset concept, which is unique. On top of it there are direct forwarders and reverse forwarders.
I think for C/C++ programmers it is more practical to know that:
1. The construction/destruction order for global vars in DLLs (shared libs) is very different between Linux and Windows. It means the same code might work fine on one platform but not the other, which imposes challenges on writing portable code.
2. On Linux it is hard to get a shared lib cleanly unloaded, which might affect how global vars are destructed and might cause unexpected crashes at exit.
3. Since Windows has a DLL loader lock, there are a lot of things you cannot do in C++ class constructors/destructors if the class could be used to define a global variable. For example, no thread synchronization is allowed.
4. It is difficult to clean up a thread-local variable if the variable lives in a DLL and the variable's destructor depends on another global object.
5. When one static lib depends on another, the linker won't use this information to sort the initialization order of global vars. It means it could be the case that A.lib depends on B.lib but A.lib gets initialized first. The best way to avoid this problem is dynamic linking.
For Windows related topics I highly recommend the "Windows Internals" book.
I have a hard copy of this book, and it seems like the PDF isn't the final version, judging by the hand-drawn illustrations at least.
The book does dive into some old and arcane object file formats, but that's interesting in its own way. Nice to get a comparison of how different systems did things.
After reading that book and other sources, I examined how people typically create libraries in the UNIX/Linux world. Some of the common practices seem to be lacking, so I wrote an article about how they might be improved: https://begriffs.com/posts/2021-07-04-shared-libraries.html
Edit: That covers the dynamic linking side of things at least.
Still pretty much relevant in terms of introductory understanding. Some specific details seem slightly anachronistic for the general population (segmented memory for example, which still exists in modern PC hardware but is of little import to the great majority).
The concepts are still relevant, but the specifics are mostly outdated. If you read this and then read the relevant standards documents for your platform, you should have a good grounding. I don't know of any other books that cover the topic well.
I remember seeing this book on my boss' desk in the late 90s. I commented something like "There's a whole book on linkers and loaders?!". Turns out he was reading it because he was working on a side project that would eventually become his paid gig (he left about a year later).