There are things not to like about Linux's kernel CSPRNG; most notable among them is the pointless distinction made between random and urandom.
But it's not clear that this paper has a lot of practical impact. Specifically, it refers to scenarios in which attackers control all the entropy inputs to the CSPRNG; it would be relevant, perhaps, in the case of a CSPRNG design that depended entirely on a closed HWRNG like RDRAND (Linux does not, nor does any other reputable CSPRNG). In reality, though, the only way an attacker can find themselves with the vantage point to launch theoretical attacks like these is if they've thoroughly owned up your kernel.
OpenBSD has done several things with the random device and its userland interface, but nothing like what you have described. The opposite in fact. Until recently, there was no /dev/random node. Now there is.
Initial thoughts on the paper from Ted Ts'o (the maintainer of /dev/random), in the comments on Bruce Schneier's blog: [1], [2]. (TLDR: "What the authors of this paper seem to be worried about is not even close to the top of my list in terms of things [about /dev/random] I'm worried about.")
This property states that a good PRNG should be able to eventually recover from compromise even if the entropy is injected into the system at a very slow pace, and expresses the real-life expected behavior of existing PRNG designs.
If someone figures out the internal state of a PRNG that doesn't take input entropy, they can predict all future outputs.
If someone figures out the internal state of a PRNG that does take input entropy, they can mostly predict near-future output but shouldn't be able to predict what the output will be after enough new input entropy has been provided.
I think this paper is saying that that "shouldn't" either isn't true of Linux, or takes more input entropy to happen than it ought to?
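To make that concrete, here is a toy hash-based sketch (this is not Linux's actual construction; the class, the "out"/"step" labels, and the whole design are just illustrative). Knowing the state lets you predict the next output exactly, but a prediction computed from the leaked state fails once fresh, unseen entropy has been mixed in.

    import hashlib
    import os

    class ToyPRNG:
        """Toy hash-based PRNG with input. Not the Linux design."""

        def __init__(self, seed: bytes):
            self.state = hashlib.sha256(seed).digest()

        def refresh(self, entropy: bytes) -> None:
            # Mix fresh entropy into the internal state.
            self.state = hashlib.sha256(self.state + entropy).digest()

        def next_block(self) -> bytes:
            # Derive an output block, then step the state forward.
            out = hashlib.sha256(b"out" + self.state).digest()
            self.state = hashlib.sha256(b"step" + self.state).digest()
            return out

    prng = ToyPRNG(os.urandom(32))

    # Suppose an attacker learns the internal state at this moment:
    leaked = prng.state

    # With the state known, the next output is fully predictable.
    assert hashlib.sha256(b"out" + leaked).digest() == prng.next_block()

    # After fresh entropy the attacker cannot see is mixed in,
    # predictions based on the leaked state no longer match.
    prng.refresh(os.urandom(32))
    stale_guess = hashlib.sha256(b"out" + hashlib.sha256(b"step" + leaked).digest()).digest()
    assert stale_guess != prng.next_block()

The paper's question is essentially about the second half of this: how quickly, and under what assumptions about the entropy inputs, that recovery happens.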
This is certainly naive, but I have wondered a long time so I'll stick the foot in my mouth. :-)
Shouldn't an easy way of getting reasonably good random values be to take three or four of the best-of-breed implementations of algorithms, with as much entropy injection as possible, and then XOR their outputs (roughly as in the sketch below)?
(Or rather, use one of several alternative ways of mixing the bit streams together, of which XOR is one, chosen based on a random byte from one of the streams. Another byte would determine how long until the mixing algorithm changes.)
Sure, you lose a third of the random data.
Hell, it should be enough to randomly throw away most of the generated random bits?
It's been a long time since I did any computer security, but shouldn't this work?
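To make the idea concrete, here is a rough sketch of what I mean (the counter-mode hash generators are toy stand-ins, not real best-of-breed implementations, and the function names are made up):

    import hashlib
    import os
    from functools import reduce
    from itertools import islice
    from operator import xor

    def hash_stream(seed: bytes):
        """Counter-mode hash generator, standing in for one
        'best of breed' algorithm. Yields one byte at a time."""
        counter = 0
        while True:
            block = hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
            counter += 1
            yield from block

    def xor_combined(seeds, nbytes):
        """XOR the byte streams of independently seeded generators."""
        streams = [hash_stream(s) for s in seeds]
        return bytes(reduce(xor, column) for column in islice(zip(*streams), nbytes))

    # Three generators with their own entropy, combined byte by byte.
    print(xor_combined([os.urandom(32) for _ in range(3)], 64).hex())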
No. The strength is only as good as your best algorithm. XOR'ing two mediocre algorithms does not give you a better algorithm. In fact, you might be more likely to end up with collisions in the final product if you XOR two results together than you would have had in the primary results themselves, making your random results even worse.
To quote:
in any iterated hash function it is relatively easy to find exponential sized multicollisions, and thus the concatenation of several hash functions does not increase their security.
I am quite new to both topics as well, but it seems like there are some similarities. That is, from the excerpt in the original article, the PRNG used by /dev/random essentially takes noise from some external input and runs it through "a cryptographic function that outputs random numbers from the continually internal state". In both cases (cryptography and random numbers) the goal is to transform known data into something that is indistinguishable from random.

The way I understand it, the entropy in the random number generator doesn't come from the algorithms; it comes from noise gathered off the hardware signals. The algorithms just make sure that even if the noise feeding the input is biased, attackers will still have a very hard time using that to control the end result.

I think this is a very interesting question, by the way, and I hope my previous answer didn't come off as dismissive. I am still a novice in the world of security, so I wanted to put links in instead of trying to explain this myself, as I would probably get it horribly wrong.
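As a toy illustration of that last point (this is not how the kernel actually does it; the pool, the simulated noise source, and the "extract" label are all just a sketch): even heavily biased samples, once mixed through a hash-based pool, produce output that looks uniform to anyone who can't guess the inputs.

    import hashlib
    import random

    def biased_sample(n: int) -> bytes:
        # Stand-in for a hardware noise source whose bits are heavily
        # biased. random here only simulates the noise; it is not a CSPRNG.
        return bytes(random.choice((0, 0, 0, 1)) for _ in range(n))

    pool = bytes(32)
    for _ in range(16):
        # Mix the biased samples into a hash-based pool.
        pool = hashlib.sha256(pool + biased_sample(64)).digest()

    # The extracted output looks uniform even though the inputs were far
    # from uniform; its unpredictability still depends on the attacker
    # not being able to guess all of the samples.
    print(hashlib.sha256(b"extract" + pool).digest().hex())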
Thanks for the time. (I never read the 2nd Knuth book at Univ. :-) )
OK, obviously the obfuscation algorithms using entropy are equivalent to some form of hashing/encryption function.
So if I understand this:
Just keeping every Xth bit (where X varies over time) of the generated data doesn't help (it only adds a little obfuscation to the algorithm?). And combining multiple algorithms, each with its own source of entropy, is not superior to e.g. just feeding the three entropy sources into a single generator to get better data?
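Something like the following is what I picture for "just using the three entropy sources": one hash-based pool fed by all the sources, and a single generator on top. The source names are hypothetical placeholders and the pool construction is only a sketch.

    import hashlib
    import os

    # Hypothetical entropy sources (placeholders only).
    def read_interrupt_timings() -> bytes:
        return os.urandom(16)

    def read_disk_jitter() -> bytes:
        return os.urandom(16)

    def read_hwrng() -> bytes:
        return os.urandom(16)

    # All three sources feed one hash-based pool...
    pool = bytes(32)
    for read in (read_interrupt_timings, read_disk_jitter, read_hwrng):
        pool = hashlib.sha256(pool + read()).digest()

    # ...and a single generator draws from the combined pool.
    print(hashlib.sha256(b"extract" + pool).digest().hex())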
https://news.ycombinator.com/item?id=6548893