So happy someone is spending time on this issue; it's like a breath of fresh air and intelligence in the midst of all the usual software (security/privacy/etc., take your pick) mayhem. It's worth reading https://reproducible-builds.org/ for a brief (re-)reminder of why this project is important.
Excerpt: "With reproducible builds, multiple parties can redo this process independently and ensure they all get exactly the same result. We can thus gain confidence that a distributed binary code is indeed coming from a given source code."
Reproducible builds are extremely useful, and there are more benefits. For example, suppose you have a build server compiling software packages. If your builds are not reproducible and you want to debug a core dump, but you have no debug information, you are out of luck (well, you could dive into the assembly code, but it's inconvenient). If you want to keep debug information, you need to store it for every single build (what a waste of storage...) because the binary for each build is different. Not so with reproducible builds: you could simply check out the old version and compile it with debug information!
Another wonderful side effect: making it easier to change out other parts of the build infrastructure while being able to verify that you haven't broken anything. e.g. trying out experimental changes to ccache.
Most huge Debian packages carry a separate -debug package so you can get the symbols without long recompilation times + having to set up a toolchain and the associated dev libraries of all the packages.
If you're interested: the old way was to have an explicit -dbg package which often had to be requested, but now they are mostly autogenerated as -dbgsym packages. These are also uploaded to a separate archive these days to keep the mirror sizes down.
I'm nitpicking here, but the convention is -dbg or (recently) -dbgsym as the suffix for packages in the Debian repositories. The Debian repository containing the debug packages is itself suffixed with -debug though, as in "unstable-debug".
Debug information is often chock-full of environment-specific goodies that can easily cause builds not to be reproducible.
For example, debug information for several languages I've worked with embeds the full path to where the source code file was originally compiled. This can easily vary across two machines, so that when debug information is turned on the builds are different, but when it's turned off they are the same.
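Even without -g, the same class of leak is easy to demonstrate in C; a contrived sketch (the file name is made up): compile it from two different directories, or at two different times, and the binaries differ even though the source is identical.

    /* leak.c - the preprocessor bakes the build environment into the
     * binary: __FILE__ carries the path the compiler was invoked with,
     * and __DATE__/__TIME__ carry the moment of compilation, so two
     * otherwise identical builds can differ byte for byte. */
    #include <stdio.h>

    int main(void)
    {
        printf("compiled from %s on %s at %s\n", __FILE__, __DATE__, __TIME__);
        return 0;
    }

For the debug-info case, compilers have since grown options like GCC's -fdebug-prefix-map to rewrite the recorded paths into something machine-independent.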
You are correct, but to use a core file with debug info, the binaries don't have to be bit-perfectly identical. All that has to be equal is the offset of each function and statement and that is the case in reproducible builds. Debug info is only "tacked on"; you can strip a binary, generate a core dump and then use the original binary with debug information to debug the core dump.
While the reproducible builds efforts try to make builds path-independent, at the moment they often settle for reproducibility when built at the same common path. That's still enough that you can reproduce the build yourself.
Does path-independent mean what I think it means? One thing I hate about Linux is having to set the prefix at build time. You can't tar up a piece of software and move it to your home directory on another machine. That, and the lack of compatibility between base libraries/kernels, means you can't hope to upgrade Chrome for yourself easily like anyone on Windows or Mac can trivially do.
Those kinds of differences are exactly part of what the reproducible builds project has worked on.
However, debug symbols and "debug mode" are two different things. At least in Debian, packages are generally compiled with debug symbols on; the symbols are then separated into their own package automatically and the final package is stripped as a separate step.
Beyond providing security, reproducible builds also provide an important ingredient for caching build artifacts (and thus accelerating build times) across CI and developer machines. They also can form the basis of a much simpler deploy and update pipeline, where the version of source code deployed is no longer as important. Instead a simple (recursive) binary diff can identify which components of a system must be updated, and which have not changed since the last deploy. This means a simpler state machine with fewer edge cases that works more quickly and reliably than the alternative.
I'm very grateful for the work that this project has done and continues to do. Thank you!
Amazing work. Thanks so much to everyone who's contributing. The upstream bugs filed are especially appreciated since they make the whole Linux ecosystem more solid, not just Debian.
In some cases, a very long while. I brought up the question of build reproducibility at BSDCon 2003 because it was relevant to FreeBSD Update, and a lot of my early FreeBSD commits were working on this.
Yeah; I've seen both the work and the developer mindset about this for a long time on BSD-centric mailing lists. I tried to keep it short here though, seeing how the Debian developers have done a great job, and didn't want to shift that focus [in this thread].
I think it's great that we have come to a point where packagers are shifting their mindset from "it works" to "we can reproduce the results" in more than one package manager.
Does anyone know if they've made the Packages file (repository metadata file, listing the packages in the repository) build reproducibly yet?
I tripped over this a couple weeks ago and was both amused and annoyed, since it seemed that packages were being listed in the file in a random order. I'm asking here because it might already be fixed; we're using a slightly old version of the package/repository tools.
Thanks! FWIW, I don't think there's any alternative to throwing a sort in here somewhere, since ftw doesn't have any reproducible-order flag. But whether you want to sort the list of packages and then generate the output for each package, or generate the output as you traverse the tree and then sort that output, I don't know.
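If the sort ends up living next to the traversal, one option is to swap ftw for scandir(3) with alphasort, which hands back the entries already in a deterministic order. A rough sketch only; the directory path and emit_stanza() are placeholder names, not anything from the real tooling:

    /* Emit per-package stanzas in alphabetical order by using
     * scandir(3) + alphasort instead of ftw(3), which makes no
     * ordering guarantee. */
    #include <dirent.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Placeholder: real code would read the package's control data here. */
    static void emit_stanza(const char *name)
    {
        printf("Package: %s\n\n", name);
    }

    int main(void)
    {
        struct dirent **entries;
        int n = scandir("pool/main", &entries, NULL, alphasort);
        if (n < 0) {
            perror("scandir");
            return 1;
        }
        for (int i = 0; i < n; i++) {
            if (entries[i]->d_name[0] != '.')
                emit_stanza(entries[i]->d_name);
            free(entries[i]);
        }
        free(entries);
        return 0;
    }

Sorting the final output instead works just as well; the only requirement is that the order stops depending on the filesystem.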
You're sorting file names, right? Is this guaranteed to DTRT when packages are updated? (i.e., a version number changing can't result in two packages switching order?)
Why do you care about the order of packages in Packages?
In my personal case, so that when I build a repository which has some new packages and some old packages, when I look at the resulting pull request on GitHub I can see that the packages which haven't changed have indeed not changed.
What does "build reproducibly" even mean in this context?
Two repositories with the same packages have identical Packages files. Or for me, slightly more generally, when the Packages file changes, it changes as minimally as possible.
Two repositories with the same content should have the same metadata, for the same reasons we want reproducible binaries - to spot differences. (It also prevents useless re-downloads in some cases, I guess.)
Guix and Nix are input-reproducible. Given the same input description (the inputs being the source files and any dependencies), an output comes out. Builds are then looked up in a build cache based on the hash of all the combined inputs. However, the _output_ of Nix artifacts is not reproducible: running the same input twice can yield a different result.
Nix does some tricks to improve output reproducibility, like building things in sandboxes with a fixed time and using tarballs without modification dates, but bit-by-bit reproducible output is not their goal. They also don't have the manpower for this.
Currently, a build is produced by a trusted build server for which you have the public key. You look up the build by input hash, but you have no way to check whether the thing the build server is serving is legit. It's fully based on trust.
However, with Debian putting so much effort into reproducible output, Nix can benefit too. In the future, we would like to get rid of the 'trust-based' build servers and instead move to a consensus model (sketched below): say, if 3 servers give the same output hash for an input hash, then we trust that download and avoid a compile from source. If you still don't trust it, you can always build from source yourself and check.
Summary: Nix does not do bit-by-bit reproducibility, but we benefit greatly from the work that Debian is doing. In the future we will look at setting up infrastructure for build servers with an output-hash-based trust model instead of an input-based one. However, this will take time.
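To make the consensus idea concrete, here's a rough C sketch (nothing here is real Nix code; the hashes would come from independent build servers): a client only accepts a substitute if at least a quorum of builders report the same output hash for the given input.

    /* Accept a binary substitute only if at least `quorum` of the
     * independent builders agree on one output hash. */
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdio.h>
    #include <string.h>

    static bool quorum_reached(const char *hashes[], size_t n, size_t quorum)
    {
        for (size_t i = 0; i < n; i++) {
            size_t votes = 0;
            for (size_t j = 0; j < n; j++)
                if (strcmp(hashes[i], hashes[j]) == 0)
                    votes++;
            if (votes >= quorum)
                return true;
        }
        return false;
    }

    int main(void)
    {
        /* Made-up hashes reported by four builders. */
        const char *reported[] = { "sha256-aaaa", "sha256-aaaa",
                                   "sha256-aaaa", "sha256-bbbb" };
        printf("trusted: %s\n", quorum_reached(reported, 4, 3) ? "yes" : "no");
        return 0;
    }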
> bit-by-bit reproducible output is not their goal
I think you are wrong.
The Nix people (and the Guix people, including myself) are also involved in the reproducible builds project. I met with a couple of them in Berlin last year. It's not just Debian doing this.
I can't speak for Nix but for the Guix project bit-for-bit reproducibility is an explicitly stated goal. It's very important and the reason why Guix is used in several HPC environments as the foundation for reproducible research.
Disclaimer: I'm co-maintainer of GNU Guix and currently at a bioinfo conference where I talked about Guix and reproducibility.
Nix and Guix sound interesting. I run Ubuntu currently; what's the easiest way to get started with one or the other? I hear Guix is more user-friendly, is that so?
Do I need to install a whole other OS, or can I install Guix in Ubuntu?
You can use Guix as a package manager on top of practically any variant of the GNU system. By design it is completely independent of the libraries that your system provides.
At work I'm using the same binaries on a cluster with CentOS 6 and on workstations running a mix of Ubuntu, Fedora, CentOS 7, etc.
GuixSD ("Guix System Distribution") is the variant of the GNU system where the principles of Guix are extended to the operating system, but you don't have to use it if all you want is play with the package manager.
If you are into Lisp you'll feel right at home with extending Guix. If you don't care for Lisp you might at least find the command line interface to be a little easier to understand than that of Nix, but really: that's a personal preference.
Nix is a package manager. It can be installed on many GNU/Linux systems or even on non-Linuxes, such as macOS.
However, I personally found it limited to building software (nix-shell for reproducible environments and building images), and weird to use for desktop stuff. It was just odd to have two package managers competing. Maybe someone has some neat ideas on how to make it work, but I just went to NixOS.
NixOS is a GNU/Linux distro that uses Nix and leverages it to build the whole system configuration. There, you have generations, right in the bootloader, and can boot into any of those. Which is extremely nice: I've simply stopped worrying before upgrades. If they fail (unless they break the bootloader, which is quite unlikely), I can just roll back to where I was.
Installation is performed from a shell session and requires some knowledge about how GNU/Linux works, just like, e.g., Gentoo or Arch Linux. Essentials are all covered in the manual: https://nixos.org/nixos/manual/index.html#sec-installation and examples of the rest (desktop environments, etc.) can be found online.
Nix/NixOS may not be the best in terms of UI/UX/user-friendliness (some stuff looks weird until you get used to it; I guess it's the same with any unfamiliar tech), but I have the impression that the community there is very nice and very helpful (and maintains a lot of useful documentation).
I haven't used Guix or GuixSD, so can't comment about it.
tl;dr: I went with installing NixOS on a separate partition, and mounting my old /home there. No regrets and not looking back. (Whenever I need Debian or Ubuntu compatibility - e.g. build a .deb package, I just do things in a debootstrapped chroot or in Docker)
Huh, that's interesting, thanks. Especially the part about booting into different configurations, that sounds like it would make the dreaded OS upgrades a breeze. I'm going to have to try it out, thank you.
That was exactly why I converted. I got fed up with yet another system upgrade breaking the sound subsystem, and installed NixOS. Now, whenever there are any regressions, I just roll back, look for a report (usually there is one already, so I don't have to file anything) and wait for a fix.
One suggestion: if you have a separate /boot, make sure it's large enough to hold a dozen kernels + initramfses. Like, at least 256MiB.
I had a 128MiB partition, and given that every kernel or initramfs change leaves a new copy there, things got a little inconvenient.
It didn't break anything, just that `nixos-rebuild switch --upgrade` failed every now and then, requiring me to clean up old generations even though / still had plenty of disk space.
> One suggestion: if you have a separate /boot, make sure it's large enough to hold a dozen kernels + initramfses.
This is different in GuixSD. The complete operating system is just another store item. `/boot` hardly grows because all that happens is that a new GRUB menu is installed.
The system is "checked out" from /gnu/store by the init and then booted.
The whole operating system configuration is just a single Scheme file. The configuration is unique, I think, in that GuixSD has a "system services" composition framework that allows for building up a complex graph of "system services".
A system service is not to be misunderstood as just a daemon process. It's much more flexible than that.
Here's a very good introduction to the service composition framework:
Ah, thanks for the tip. That's a problem with Ubuntu as well, as old kernels don't get autoremoved often, so it's nothing worse than what I'm dealing with currently.
What does "reproducibility" mean? I understand and appreciate the importance of reproducibility in the context of scientific experiments, but I don't understand what it means in terms of computer programs. I am guessing it has to do with being able to build on different architectures without issue?
In the context of "reproducible builds", it means that if you compile the same source code with the same compiler and build system, the output will be completely identical, bit by bit. This is surprisingly hard to achieve in practice.
Once they have reproducible builds, they can easily prove that each binary package was built from the corresponding source code package: just have a third party compile the source code again and generate the binary package, and it should be identical (except for the signature). This reduces the need to trust that the build machines haven't been compromised.
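The comparison itself is the easy part, since "identical" really does mean byte for byte; all the hard work is in getting two builds to that point. A minimal sketch of the check (in practice you'd compare cryptographic checksums rather than whole files):

    /* Compare two build outputs byte for byte. */
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        if (argc != 3) {
            fprintf(stderr, "usage: %s file1 file2\n", argv[0]);
            return 2;
        }
        FILE *a = fopen(argv[1], "rb");
        FILE *b = fopen(argv[2], "rb");
        if (!a || !b) {
            perror("fopen");
            return 2;
        }
        int ca, cb;
        do {
            ca = fgetc(a);
            cb = fgetc(b);
            if (ca != cb) {
                puts("builds differ");
                return 1;
            }
        } while (ca != EOF);
        puts("builds are bit-for-bit identical");
        return 0;
    }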
Just piggybacking on this comment: you can do a whole lot more than just trust what a few people have automated. Most people on Ubuntu get non-distro packages from Launchpad, for example, which uses its own build servers. With reproducible builds you can require BOTH Launchpad's signature and the developer's signature for a package to be valid, which tremendously improves the security situation.
It has a similar meaning to research. What it means is that you can reproduce (compile in most cases) from source code the same bit-for-bit identical binary independently. While this might sound like something that should be trivial to do, it turns out to be far from trivial (timestamps and other environment information leaks into binaries all the time).
There's a website that describes this project in much more detail as well as how they worked around the various problems they found. https://reproducible-builds.org/
In addition to all the sibling comments: this is also important in research, which increasingly uses computers. If you provide a paper, source code, a dataset and a description of the system(s) used, can someone reproduce your research?
It would certainly be convenient if you could point to a version/snapshot of Debian (or another distribution), and it would then be possible to take your (say C) source code and compile and run the same binary used for the research.
It's true that often getting the algorithm more-or-less right is enough - but the more research is augmented by computing devices, the more important it becomes to maintain reproducibility - and the more complex and capable these computer devices (say a top-100 supercomputer, software stack in C++ on top of MPI, some Fortran numeric libraries etc) become - the harder it becomes to maintain it.
Imagine verifying research done today by repeating experiments in 50 years.
It has taken, and continues to take, a surprising amount of work to make two builds of a program produce the same output.
There are many sources of issues. For example: dates and times stored in the output, a program's output affected by the order in which files are read from a directory (which has no fixed ordering), hash tables keyed on pointer values (so iteration order depends on where objects happen to be stored, which differs between executions), parallel programs behaving differently on different runs, and others.
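The pointer-keyed case is worth a contrived sketch, because it surprises people: nothing in the source changes, yet the output order can change from run to run simply because ASLR moves the heap around. The tiny "hash table" and the names below are made up for illustration:

    /* A pointer-keyed table: the bucket an object lands in is derived
     * from its address, so iteration order (and any output derived
     * from it) can differ between executions. */
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define NBUCKETS 8

    int main(void)
    {
        const char *names[] = { "alpha", "beta", "gamma", "delta" };
        const char *buckets[NBUCKETS] = { 0 };

        for (int i = 0; i < 4; i++) {
            char *obj = strdup(names[i]);
            size_t b = ((uintptr_t)obj >> 4) % NBUCKETS; /* address-derived key */
            while (buckets[b] != NULL)                   /* linear probing */
                b = (b + 1) % NBUCKETS;
            buckets[b] = obj;
        }

        /* "Iteration order" here is bucket order, which depends on
         * where malloc happened to place the strings this run. */
        for (int b = 0; b < NBUCKETS; b++)
            if (buckets[b])
                printf("%s\n", buckets[b]);
        return 0;
    }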
It's about being able to reproduce the binary from source. You might think this is pretty much impossible in the Debian context, but things like timestamps, and underspecified dependencies can end up shifting a build's result over time.
If we want to claim that open source code is secure on the basis of source code analysis, we need a verifiable build chain, so that the code and binaries an analysis covers are the same as what we get later.
It means each time you build the same code in a known setup, you get bit for bit the same binary. That allows you to assure that the code that's shipped actually matches the source code.
It sounds trivial, but the full paths and timestamps that get added at multiple points in the process are enough to screw this up, and those are the easy problems.
I think it's for security. It means that there's a deterministic relationship between the source of a program and its final compiled artifacts.
If software has reproducible builds that means that third-parties can independently verify that artifacts have been built safely from sources, without any sneaky stuff snuck in.
Once we have reproducible builds, will it be possible to have verifiable builds? As in, can we cryptographically show that source + compiler = binary?
Right now we can sign source code and we can sign binaries, but we can't show that a given source produced a given binary. I would feel much happier about installing code if I knew it was from a particular source or author.
What do you mean by "cryptographically show"? With reproducible builds, anyone that has repeated a build can verify that a claimed binary matches, and could sign a statement saying so. But I don't think there are solutions that don't include someone repeating the build, or a clear way of proving that you actually did repeat the build.
Yes. The first standard for securing computer systems mandated some protections against this. They were partly devised by Paul Karger, who invented the compiler subversion Thompson wrote about a decade later. Most just focused on that one thing, whereas Karger et al went on to build systems that were secure from the ground up, with some surviving NSA pentesting and analysis for 2-5 years. Independently, people started building verified compilers and whole stacks with CPUs. They were initially simple with a lot to trust, but got better over time. Recently, the two schools have been merging more. Mainstream INFOSEC and IT just ignores it all, slowly reinventing it piece by piece with knock-offs. It's hard, has a performance hit, or is built in something other than C, so they don't do it. (shrugs)
Well you may want to mention stage0
https://savannah.nongnu.org/projects/stage0/
It starts with just a 280-byte hex monitor and builds up to a rather impressive Lisp AND Forth, while building tools such as a text editor and a line macro assembler along the way.
rain1 started a page for people interested in bootstrapping or countering Karger's attack. Several of us are putting up as many links as we can find to small, human-understandable tools such as compilers. I added some formally verified or otherwise justifiable ones (e.g. a 25-core CPU ain't gonna be simple).
No. Or at least this doesn't provide that. I think in theory you could make a crypto compiler that proves the binary corresponds to the source, but I suspect the verification effort wouldn't be much less than recompiling.
On NixOS, Python is patched so that if the environment variable DETERMINISTIC_BUILD is present, the interpreter sets the bytecode timestamps to 0. I suppose they did something similar.
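For reference, the distro-neutral form of that trick is the SOURCE_DATE_EPOCH convention documented by reproducible-builds.org; a minimal sketch of the pattern (not the actual CPython or Nix patch):

    /* A tool that would normally embed "now" in its output first checks
     * SOURCE_DATE_EPOCH and, if set, uses that fixed timestamp instead. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    static time_t build_timestamp(void)
    {
        const char *epoch = getenv("SOURCE_DATE_EPOCH");
        if (epoch != NULL)
            return (time_t)strtoll(epoch, NULL, 10); /* pinned, reproducible */
        return time(NULL);                           /* falls back to "now" */
    }

    int main(void)
    {
        time_t t = build_timestamp();
        char buf[64];
        strftime(buf, sizeof buf, "%Y-%m-%d %H:%M:%S", gmtime(&t));
        printf("embedding build date: %s\n", buf);
        return 0;
    }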
One (misguided) counterargument I've heard from otherwise fantastic devs is the notion of adding randomness to unit tests in the hope that if there's a bug, at least some builds will fail. In practice, I've seen those builds, and developers saying "yeah, sometimes you need to build it twice".
I think the solution is to give those devs who favor such techniques a separate but easy to use fuzzing tool set that they can run just like their unit tests, separate from their usual 'build' command. Give them their ability to discover new bugs, but make it separate from the real build.
Why would the randomness in unit tests affect the binary? RNGs are invoked when the tests are run, not when they're built, and anyway, test code shouldn't be part of the final binary.
The test suite is used to validate the build. Intermittently failing test suite -> intermittently failing build. You can always disable running a problematic test suite, but that doesn't exactly inspire confidence in the result.
My take on that is that those devs are probably fooling themselves about the state of their test suite. I'm working on a codebase like that right now - it's generally in a relatively good state, but just this week I've run into a moderately nasty bug involving global state not getting properly reset, which the tests didn't cover because their specific execution order happened to work.
Or just seed the PRNG with some digest of the relevant source code (perhaps as simple as the version number). Doesn’t solve the problem that your tests can suddenly break on unrelated changes, but does solve the problem that your tests can break without any changes.
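A sketch of that in C (the version string and the djb2 hash here are just stand-ins; any stable digest of the relevant code would do). The test suite's "random" order then only changes when the version does:

    /* Seed the test PRNG from a digest of the version string, so the
     * "random" order is stable for a given version of the code. */
    #include <stdio.h>
    #include <stdlib.h>

    static unsigned long djb2(const char *s)
    {
        unsigned long h = 5381;
        for (; *s; s++)
            h = h * 33 + (unsigned char)*s;
        return h;
    }

    int main(void)
    {
        const char *version = "1.4.2";        /* hypothetical version string */
        srand((unsigned int)djb2(version));   /* same version -> same order */
        printf("first draws: %d %d %d\n", rand(), rand(), rand());
        return 0;
    }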
A "build" often includes running the unit test suite to verify that the software works in the build environment, so randomness in the test suite can indeed lead to non-reproducible output: when a bad RNG seed causes a test failure and thus a failed build. I've spent my fair share of time debugging issues like this.
Compare this to Windows or OSX, where not only are you unable to build packages yourself, but they are installed from downloads you find in disparate places on the web, are not cryptographically signed by people you can trust, and often include spyware anyway.
Has anyone played with the tool they mentioned, diffoscope? Sounds interesting, and I wonder how good it is at, for example, comparing Excel files with VBA code and formulas etc.