The big outlier not listed here is Apple. Quick overview from someone who's written binary analysis tools targeting most of these:
Mach-O is the format they use, going back to NeXTSTEP. They use it everywhere, including their kernels and the internal-only L4 variant they run on the Secure Enclave. Instead of being structured as a declarative table of the expected address space when loaded (like PE and ELF), Mach-O is built around a command list that has to be run to load the file. So instead of an entry point address in a table somewhere, Mach-O has a 'start a thread with this address' command in the command list. Really composable, which means binaries can do a lot of nasty things with that command list.
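To make the 'command list' idea concrete, here's a rough sketch of my own (not from any Apple source) that mmaps a thin 64-bit Mach-O and walks its load commands, printing segments plus the old LC_UNIXTHREAD 'start a thread' command and its newer replacement LC_MAIN. Error handling is minimal and fat binaries aren't handled.

    // lc_dump.cpp -- rough sketch: walk the load command list of a thin,
    // 64-bit, little-endian Mach-O. Build on macOS: clang++ lc_dump.cpp
    #include <mach-o/loader.h>
    #include <cstdint>
    #include <cstdio>
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(int argc, char** argv) {
        if (argc != 2) { std::fprintf(stderr, "usage: %s <macho>\n", argv[0]); return 1; }
        int fd = open(argv[1], O_RDONLY);
        struct stat st;
        if (fd < 0 || fstat(fd, &st) != 0) { std::perror("open"); return 1; }
        auto* base = static_cast<uint8_t*>(mmap(nullptr, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0));
        if (base == MAP_FAILED) { std::perror("mmap"); return 1; }

        auto* hdr = reinterpret_cast<mach_header_64*>(base);
        if (hdr->magic != MH_MAGIC_64) { std::fprintf(stderr, "not a thin 64-bit Mach-O\n"); return 1; }

        // Load commands follow the header; each begins with { cmd, cmdsize },
        // and the loader simply walks the list in order.
        auto* cmd = reinterpret_cast<load_command*>(hdr + 1);
        for (uint32_t i = 0; i < hdr->ncmds; ++i) {
            if (cmd->cmd == LC_SEGMENT_64) {
                auto* seg = reinterpret_cast<segment_command_64*>(cmd);
                std::printf("LC_SEGMENT_64 %.16s vmaddr 0x%llx\n", seg->segname, seg->vmaddr);
            } else if (cmd->cmd == LC_MAIN) {
                auto* ep = reinterpret_cast<entry_point_command*>(cmd);
                std::printf("LC_MAIN       entry offset 0x%llx\n", ep->entryoff);
            } else if (cmd->cmd == LC_UNIXTHREAD) {
                std::printf("LC_UNIXTHREAD (old-style 'start a thread here' command)\n");
            } else {
                std::printf("cmd 0x%x (%u bytes)\n", cmd->cmd, cmd->cmdsize);
            }
            cmd = reinterpret_cast<load_command*>(reinterpret_cast<uint8_t*>(cmd) + cmd->cmdsize);
        }
    }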
They also very heavily embrace their dyld shared cache, to the point that they don't bother including a lot of the system libraries on the actual root filesystem anymore. The kernel works the same way: the kernelcache is the minimal kernel image plus at least the drivers needed to boot, all in one file (and on iOS I think it's effectively all of the drivers, since loading a driver that isn't in the cache is disabled).
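You can see the first point for yourself on macOS 11 or later: the dylib file is no longer on disk, yet dlopen still resolves it out of the shared cache. A quick check of my own, just illustrative:

    // cache_check.cpp -- on recent macOS, system dylibs exist only inside the
    // dyld shared cache: stat() on the path fails, yet dlopen() still works.
    #include <dlfcn.h>
    #include <sys/stat.h>
    #include <cstdio>

    int main() {
        const char* path = "/usr/lib/libSystem.B.dylib";

        struct stat st;
        std::printf("stat():   %s\n", stat(path, &st) == 0 ? "present on disk"
                                                           : "not on the filesystem");

        void* h = dlopen(path, RTLD_NOW);
        std::printf("dlopen(): %s\n", h ? "loaded (served from the shared cache)"
                                        : dlerror());
    }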
There's a neat wrapper format around Mach-O called "fat binaries" that lets you hold multiple Mach-O images in one file, tagged by architecture. This is what lets current macOS ship the same application as both a native arm64 binary and a native x86_64 binary; the loader just picks the right slice based on the current architecture and settings.
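Here's roughly what reading that wrapper looks like, a sketch of my own that just lists the slices; the fields live in <mach-o/fat.h> and are stored big-endian on disk (64-bit FAT_MAGIC_64 headers are ignored here). `lipo -info <file>` gives you the same information from the command line.

    // fat_list.cpp -- sketch: list the architecture slices of a fat (universal) binary.
    #include <mach-o/fat.h>
    #include <libkern/OSByteOrder.h>
    #include <cstdint>
    #include <cstdio>

    int main(int argc, char** argv) {
        if (argc != 2) { std::fprintf(stderr, "usage: %s <file>\n", argv[0]); return 1; }
        FILE* f = std::fopen(argv[1], "rb");
        if (!f) { std::perror("fopen"); return 1; }

        fat_header fh;
        if (std::fread(&fh, sizeof fh, 1, f) != 1 ||
            OSSwapBigToHostInt32(fh.magic) != FAT_MAGIC) {
            std::fprintf(stderr, "not a (32-bit-header) fat binary\n");
            return 1;
        }

        uint32_t n = OSSwapBigToHostInt32(fh.nfat_arch);
        for (uint32_t i = 0; i < n; ++i) {
            fat_arch fa;                       // one entry per embedded Mach-O
            std::fread(&fa, sizeof fa, 1, f);
            std::printf("slice %u: cputype 0x%08x offset 0x%x size 0x%x\n", i,
                        OSSwapBigToHostInt32(fa.cputype),
                        OSSwapBigToHostInt32(fa.offset),
                        OSSwapBigToHostInt32(fa.size));
        }
        std::fclose(f);
    }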
I think those are the main points, but I might have missed something; this was pretty off the cuff.
> Mach-O is built around a command list that has to be run to load the file. So instead of an entry point address in a table somewhere, mach-o has a 'start a thread with this address' command in the command list. Really composable, which means binaries can do a lot of nasty things with that command list.
Conceptually not much has changed since the book was written, but in practice there has been a lot of advancement. For example, ASLR and the increase in the number of libraries have greatly increased the pressure to make relocations efficient, modern architectures with PC-relative load/store and branch instructions have greatly reduced the cost of PIC code, and code signing has made mutating program text to apply relocations problematic.
On Darwin we redesigned our fixup format so it can be efficiently applied during page-in. That did include adding a few new load commands to describe the new data, as well as a change in how we store pointers in the data segments, but those are not really properties of Mach-O so much as the runtime.
I generally find that a lot of things attributed to the file format are actually more about how it is used than about what it supports. Back when Mac OS X first shipped, people argued about PEF vs Mach-O, but all the arguments boiled down to the calling convention (TOC vs GOT), either of which could have been supported by either format.
Another example is symbol lookup. Darwin uses two-level namespaces (where binds include both the symbol name and the library it is expected to be resolved from), and Linux uses flat namespaces (where binds only include the symbol name, which is then searched for in all the available libraries). People often act as though that is a file format difference, but Mach-O supports both (the higher-level parts of the Darwin OS stack depend on two-level namespaces, but the low-level parts can work with a flat namespace, which is important since a lot of CLI tools that are primarily developed on Linux depend on that). Likewise, ELF also supports both: Solaris uses two-level namespaces (they call it ELF Direct Binding).
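If you want to poke at the difference from the runtime side, this little sketch (an analogy of mine, not the bind records themselves) contrasts a flat-style lookup across every loaded image with a lookup scoped to one library handle, which is roughly what a two-level bind bakes into the binary. The path and symbol are macOS-specific.

    // namespace_lookup.cpp -- flat-style vs scoped symbol lookup, as a rough
    // runtime analogue of flat vs two-level namespaces (macOS paths used here).
    #include <dlfcn.h>
    #include <cstdio>

    int main() {
        // Flat-style: search every loaded image, in order, for the first "cos".
        void* flat = dlsym(RTLD_DEFAULT, "cos");

        // Two-level-style: name both the library and the symbol, the way a
        // two-level bind records "cos, from libSystem".
        void* handle = dlopen("/usr/lib/libSystem.B.dylib", RTLD_NOW);
        void* scoped = handle ? dlsym(handle, "cos") : nullptr;

        std::printf("flat lookup:   %p\n", flat);
        std::printf("scoped lookup: %p\n", scoped);
        if (handle) dlclose(handle);
    }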
I don't disagree about the nature of load commands, but Apple has been moving away from, say, LC_UNIXTHREAD (in favor of LC_MAIN) for years at this point. For the most part load commands are becoming more and more similar to what ELF has, with a list of segments, some extra metadata, etc.
The mechanisms for Windows DLLs have changed a lot (like how thread-local variables are handled). Besides, this book could not cover C++11's magic statics, Windows' ARM64X format, or Apple's universal2, because these things are much newer. Windows now has the API set concept, which is quite unique; on top of it there are direct forwarders and reverse forwarders.
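For anyone who hasn't met "magic statics": since C++11 the compiler must make the first initialization of a function-local static thread-safe (it emits a hidden guard), which is exactly the kind of runtime machinery the book predates. A tiny self-contained demo of my own:

    // magic_statics.cpp -- C++11 guarantees the local static below is
    // constructed exactly once, even with concurrent first calls.
    #include <cstdio>
    #include <thread>
    #include <vector>

    struct Config {
        Config() { std::puts("Config constructed exactly once"); }
        int value = 42;
    };

    Config& config() {
        static Config instance;  // compiler-generated guard serializes first use
        return instance;
    }

    int main() {
        std::vector<std::thread> threads;
        for (int i = 0; i < 8; ++i)
            threads.emplace_back([] { (void)config().value; });
        for (auto& t : threads) t.join();
    }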
I think for C/C++ programmers it is more practical to know that:
1. The construction/destruction order of global variables in DLLs (shared libraries) is very different between Linux and Windows. The same code might work fine on one platform but not the other, which makes writing portable code harder.
2. On Linux it is hard to get a shared library cleanly unloaded, which affects how global variables are destroyed and can cause unexpected crashes at exit (see the sketch after this list).
3. Since Windows holds a DLL loader lock while DLLs are initialized, there are a lot of things you cannot do in a C++ class's constructor/destructor if the class could be used to define a global variable. For example, no thread synchronization (such as waiting for another thread) is allowed.
4. It is difficult to clean up a thread-local variable if the variable lives in a DLL and its destructor depends on another global object.
5. When one static library depends on another, the linker won't use that information to order the initialization of global variables. It can happen that A.lib depends on B.lib, yet A.lib's globals get initialized first. The best way to avoid this problem is to use dynamic linking.
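A minimal sketch of point 2 on Linux/glibc (the library name ./libplugin.so is just a placeholder): dlclose() only drops your reference, and dlopen(..., RTLD_NOLOAD) reports whether the object is actually gone. Common reasons it sticks around include -Wl,-z,nodelete / RTLD_NODELETE, STB_GNU_UNIQUE symbols, or something else still referencing the library.

    // unload_check.cpp -- does dlclose() really unload the library?
    #define _GNU_SOURCE 1   // RTLD_NOLOAD is a GNU extension in <dlfcn.h>
    #include <dlfcn.h>
    #include <cstdio>

    int main() {
        void* h = dlopen("./libplugin.so", RTLD_NOW);   // placeholder library
        if (!h) { std::fprintf(stderr, "dlopen: %s\n", dlerror()); return 1; }

        dlclose(h);  // drops our reference; unmapping is not guaranteed

        // RTLD_NOLOAD loads nothing; it returns a handle if the object is
        // still resident, or NULL if it was really unloaded.
        if (void* still = dlopen("./libplugin.so", RTLD_NOLOAD)) {
            std::puts("still loaded after dlclose()");
            dlclose(still);  // balance the extra reference we just took
        } else {
            std::puts("actually unloaded");
        }
    }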
For Windows-related topics I highly recommend the "Windows Internals" book.
I have a hard copy of this book, and it seems like the PDF isn't the final version, judging by the hand-drawn illustrations at least.
The book does dive into some old and arcane object file formats, but that's interesting in its own way. Nice to get a comparison of how different systems did things.
After reading that book and other sources, I examined how people typically create libraries in the UNIX/Linux world. Some of the common practices seem to be lacking, so I wrote an article about how they might be improved: https://begriffs.com/posts/2021-07-04-shared-libraries.html
Edit: That covers the dynamic linking side of things at least.
Still pretty much relevant in terms of introductory understanding. Some specific details seem slightly anachronistic for the general population (segmented memory for example, which still exists in modern PC hardware but is of little import to the great majority).
The concepts are still relevant, but the specifics are mostly outdated. If you read this and then read the relevant standards documents for your platform, you should have a good grounding. I don't know of any other books that cover the topic well.