It hadn't occurred to me that even a photo with the lens cap on still contains decent entropy, although in hindsight it seems fairly obvious.
I really enjoyed how this article covered a variety of different "hacker-spirit" things: turning real-world entropy into meaningful digital-world use cases, plus a whole extra "one more thing" timelock encryption example at the end.
As someone who uses Cloudflare Workers / Pages heavily these days whenever I can, it's quite fun to see both "how the sausage is made" as well as the culture (playfulness?) behind it. Kinda makes me want to go visit the Austin office since I'm local.
Kudos and thanks to the Cloudflare team for writing stuff like this up! One of the more enjoyable tech pieces I've read in the past couple of weeks, and I learned multiple things along the way.
I worked on research that built machine learning models to take that 'entropy', or sensor pattern noise, and match it against photographs, to trace image lineage when EXIF and similar metadata are stripped out.
For a practical application: as you can imagine, there are certain crimes where it really makes a difference if an image is just on a phone, or if it was verifiably taken by that phone. Possession-of vs. Production-of...
That's interesting. But, assuming the research found it possible to verify which device a photo came from based on the sensor noise, doesn't that kind of go against the idea that there is a lot of entropy in sensor pattern noise?
No, because you could have it that the device is always identifiable, but nevertheless producing a randomly varying sequence of images.
Take printers, for example, some of which are known to print an identifying signature. They can clearly still print a sheet in a solid colour chosen at random (or a random number 0-100, or a random character, or whatever), given some random source and control driving them, despite the device being identifiable.
I spend a lot of time thinking about randomness, and after running some tests on the entropy of dark images, I have started to believe that there is a lot less entropy in dark CCD images than people think, but there is still enough to get a useful entropy stream.
A substantial portion of the "noise" from a CCD is definitely not random.
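If you want to eyeball this yourself, here's the kind of crude check I mean (a minimal sketch; "dark_frame.raw" is a hypothetical raw capture, and byte-level Shannon entropy is only a rough upper bound, not a measure of cryptographic entropy):

    import math
    from collections import Counter

    # Load a raw grayscale dark frame (hypothetical file name).
    with open("dark_frame.raw", "rb") as f:
        data = f.read()

    # Shannon entropy in bits per byte: 8.0 is the ideal;
    # dark frames typically score well below that.
    n = len(data)
    counts = Counter(data)
    h = -sum((c / n) * math.log2(c / n) for c in counts.values())
    print(f"{h:.3f} bits/byte over {n} bytes")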
I'm curious, how close to raw CCD data did you get from consumer cameras? It wouldn't surprise me if hard-wired camera internal postprocessing often almost immediately regularizes random noise, even with raw images and software postprocessing turned off. Just a wild-ass guess though.
If I were going to take a stab at this, I would guess that most of the "camera" is really unnecessary and that you could do this using just the image sensor.
A lot of the camera is just functionality for making actual pictures look better, which doesn't apply here. E.g. you don't need to control exposure with shutter speed if it's in a black box.
Having a whole camera might even be counterproductive. E.g. actuating the shutter is predictable, so it might reduce entropy if actuating the shutter creates a signal that shows up in the randomness.
Or maybe they just mean pro-quality cameras, but I'm not sure why you'd want a whole camera instead of just the sensor. The reasons aren't readily apparent, and I don't expect anyone to correct me by revealing trade secrets.
The sensor contains decent entropy, assuming you're actually getting raw samples out of the sensor, and your sensor isn't being affected by EMF either in the air or via its power supply.
Noise reduction algorithms are going to affect the entropy if you can't get raw values, and the level of AI crap in cell phone cameras makes it even worse these days.
I wonder what the added effectiveness is of such elaborate setups vs an off-the-shelf HRNG dongle plugged into the server (which I assume is what every other company with such requirements uses). Are the lava lamps and pendulums actually that much more functional or just marketing?
The use of lava lamps as a source of randomness is, to use your term, "just marketing" -- it is not fundamentally more secure than other sources of randomness.
The use of a group of trusted randomness generators, a majority of whom would have to collude in order to trick a consumer into thinking an input was random when it was actually staged, offers genuine functionality that cannot be dismissed as "just marketing".
As long as it's done correctly, mixing new entropy sources into an entropy pool will never _decrease_ the entropy. So in the case of LavaRand, even if it only ever returned a string of zeros, systems that mix its output into their entropy pools wouldn't be any worse off than before. Perhaps we could have made this point more clearly in the post. (I'm one of the authors.)
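To sketch what "done correctly" means here (an illustration of the general principle, not our actual implementation): if the pool state and a new input are combined through a cryptographic hash, even a fully attacker-controlled input can't make the result more predictable than the pool already was.

    import hashlib

    def mix(pool: bytes, new_input: bytes) -> bytes:
        # Hash-combine the pool with a new source. Predicting the output
        # still requires knowing the old pool, even if new_input is known.
        return hashlib.sha256(pool + new_input).digest()

    pool = b"previous pool state (32 bytes in practice)"
    pool = mix(pool, b"\x00" * 32)  # an all-zero source doesn't hurt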
So, if your source of random data doesn't decrease the entropy of the pool, but generating random numbers does draw entropy from the pool, then over a long enough timeline aren't you going to deplete it anyway?
Randomness from a CSPRNG (cryptographically secure pseudorandom number generator) never really gets "depleted": as long as the seed contains enough entropy and isn't compromised, it's computationally infeasible to learn anything about the internal state of the CSPRNG from its outputs. See https://research.nccgroup.com/2019/12/19/on-linuxs-random-nu... for a nice overview.
On older systems that have a notion of entropy depletion, you would eventually deplete the entropy counter and /dev/random would start blocking if you aren't feeding new entropy into the system.
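A toy construction makes the "no depletion" point concrete (purely illustrative; real kernels use vetted designs like ChaCha20-based generators, not this). Each output is a one-way function of the state, so observing outputs never reveals the seed, no matter how many you draw:

    import hashlib

    class ToyCSPRNG:
        def __init__(self, seed: bytes):
            self.state = hashlib.sha256(seed).digest()

        def next_block(self) -> bytes:
            out = hashlib.sha256(self.state + b"out").digest()          # visible output
            self.state = hashlib.sha256(self.state + b"step").digest()  # ratchet forward
            return out

    rng = ToyCSPRNG(b"at least 256 bits of real entropy")
    blocks = [rng.next_block() for _ in range(1000)]  # computation, not depletion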
I don't think that's what the post says. I understand it as: a hardware RNG is not like a CSPRNG, which is designed to pass validation. If you measure a truly random hardware RNG for long enough, you'll eventually get a string of 9s, however unlikely that is. It doesn't mean that the HRNG is broken in any way.
That isn't it. When the hardware RNG is initialized, it first runs tests on the output to verify that the entropy-gathering hardware isn't broken. Then it feeds that entropy into a CSPRNG, which is what gets exposed to code requesting hardware randomness.
That person's point was that it's impossible to know for sure if entropy collection is broken. In practice it isn't an issue as you can make the false positive rate very small even if it can never be 0.
I wasn't criticising hwrngs "as such" merely saying that actual evidence exists for some that their entropy pool can be exhausted and that they can degrade. "just use the cpu hwrng" is too-simple advice.
I wouldn’t consider an image to be sufficiently random, because you don’t have a uniform probability of any specific value: the walls aren’t changing colors, I don’t see any blue lava lamps, camera probably spits out an image with some compression, it has predictable changes through the day if the lighting changes, etc
So… I think it's relying on a cryptographically secure hash function to condense all that into output that's random enough to rely on, correct?
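To illustrate the shape of what I'm imagining (purely my guess, not Cloudflare's actual pipeline):

    import hashlib

    # Hypothetical camera frame; biased, compressed, partly predictable.
    with open("lava_lamp_frame.jpg", "rb") as f:
        frame = f.read()

    # Hashing condenses whatever unpredictability the frame contains into a
    # short, uniform-looking seed. This is only as good as the real entropy
    # in the frame; the hash can't create randomness that isn't there.
    seed = hashlib.sha256(frame).digest()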
I almost edited that out when I looked at this post but I have a rule that we should allow authors on the Cloudflare blog to have their own style and so I didn't.
There are two links in the comment. They provide the context. The point being made is that pendulums connected to some support structure can exhibit synchronization caused by transmission of energy across the support. We have not observed synchronization in London, which may be due to the chaotic nature of the double pendulums and/or the fact that they are very light compared to the structure they are on.
They're also on relatively long stalks, which makes the shared support not very rigid, and the individual stalks are seemingly random heights, which might desynchronize the effect.
I run a randomness company. It's impossible to prove it 100%.
What you can do instead is run code audits, analyze the theoretical basis of the entropy source, and test large amounts of data for its statistical properties. That can get you to near certainty, but it's still empirical.
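For a taste of what those statistical tests look like, here's the simplest one, the monobit frequency test from NIST SP 800-22 (a minimal sketch; real suites run dozens of such tests over much larger samples):

    import math
    import os

    def monobit_p_value(data: bytes) -> float:
        # Counts ones vs. zeros; a strong imbalance yields a tiny p-value,
        # which flags the data as suspiciously non-random.
        n = len(data) * 8
        ones = sum(bin(b).count("1") for b in data)
        s_obs = abs(2 * ones - n) / math.sqrt(n)
        return math.erfc(s_obs / math.sqrt(2))

    print(monobit_p_value(os.urandom(125000)))  # good data: p is rarely tiny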
Both, depending on the desires of customers, but it's currently leaning towards high-end hardware/software. A broad-market randomness-as-a-service offering is coming this month.
You can never prove it absolutely just from the output, because there is always a possibility that a true random number generator would generate some non-random-looking output, like a run of one thousand ones (probability 2^-1000: astronomically unlikely, but not zero).
The piece that helps security-wise is that we're mixing in entropy from a trusted external source, so not solely relying on the local random number generation from a machine in a data center somewhere. Is it likely that local random number generation would be compromised? No. But it does give us a little extra peace of mind.
This is more of a defense in depth for the paranoid, but cryptographic PRNGs (and even hardware RNGs) can be compromised in ways that are not easy to find. Since they generate your keys, a compromise of the RNG chain is very valuable for a threat actor.
Right, so they've gone to all this effort to generate input as close as possible to random, then they undermine it by opening it up to the public?
Would be interesting to see what the camera placement is like and if that's at least secure, otherwise someone is just going to stick a still picture in front of it...
If you're going to go to all this effort to create randomness, at least secure it from physical interference...
Sorry, I know I'm being a cynical here. This just feels like a marketing gimmick.
> Knowing the approximate current state cannot be used to predict future states
The reason these entropy sources are used is that there's no such thing as a perfectly random algorithm. If there's a way to remove the entropy from the system, then the whole thing becomes pointless, and you may as well go back to using a pseudorandom algorithm. That's my point.
If you care this much about ensuring true randomness, then I'd argue the security of the system should be a primary consideration – perhaps the primary consideration. If you can't guarantee that your entropy source is random, then you can't be confident in the randomness of the system generally.
I'm not an expert on this though so if someone wants to explain why I'm wrong then please do so.
You're getting downvoted for your cynicism, but your threat assessment is correct. It's possible for someone to put a still photo in front of the lens.
That being said, the randomness on the sensor alone would probably defeat that, but also you could just check to make sure the previous image and the current image don't have the same hash, which I suspect Cloudflare does just as a basic error check.
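Something like this would do as that basic error check (a speculative sketch; I have no idea what Cloudflare actually runs):

    import hashlib

    def frame_changed(prev_frame: bytes, cur_frame: bytes) -> bool:
        # Identical consecutive frames indicate a frozen feed or looped input;
        # sensor noise should make genuine captures differ every time.
        return hashlib.sha256(prev_frame).digest() != hashlib.sha256(cur_frame).digest()

Note this catches a frozen or looped feed rather than a staged scene: a printed photo in front of the lens would still produce fresh sensor noise in each frame.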