It hadn't occurred to me that even a photo with the lens cap on still contains decent entropy, although in hindsight it seems fairly obvious.
I really enjoyed how this article covered a variety of different "hacker-spirit" things: turning real-world entropy into meaningful digital-world use cases, plus a whole extra "one more thing" timelock encryption example at the end.
As someone who uses Cloudflare Workers / Pages heavily these days whenever I can, it's quite fun to see both "how the sausage is made" as well as the culture (playfulness?) behind it. Kinda makes me want to go visit the Austin office since I'm local.
Kudos and thanks to the Cloudflare team for writing stuff like this up! One of the more enjoyable tech pieces I've read in the past couple of weeks, and I learned multiple things along the way.
I worked on research that built machine learning models to take that 'entropy', or sensor pattern noise, and match it against photographs, to trace image lineage when EXIF and similar metadata are stripped out.
For a practical application: as you can imagine, there are certain crimes where it really makes a difference if an image is just on a phone, or if it was verifiably taken by that phone. Possession-of vs. Production-of...
That's interesting. But, assuming the research found it possible to verify which device a photo came from based on the sensor noise, doesn't that kind of go against the idea that there is a lot of entropy in sensor pattern noise?
No, because you could have it that the device is always identifiable, but nevertheless producing a randomly varying sequence of images.
Take printers, for example, some of which are known to print an identifying signature. They can clearly still print a sheet in a solid colour chosen at random (or a random number 0-100, or a random character, or whatever), given some random source and control driving them, despite the device being identifiable.
I spend a lot of time thinking about randomness, and after running some tests on the entropy of dark images, I have started to believe that there is a lot less entropy in dark CCD images than people think, but there is still enough to get a useful entropy stream.
A substantial portion of the "noise" from a CCD is definitely not random.
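If you want to eyeball this yourself, here's the kind of crude check I mean (a minimal sketch; "dark_frame.raw" is a hypothetical raw capture, and byte-level Shannon entropy is only a rough upper bound, not a measure of cryptographic entropy):

    import math
    from collections import Counter

    # Load a raw grayscale dark frame (hypothetical file name).
    with open("dark_frame.raw", "rb") as f:
        data = f.read()

    # Shannon entropy in bits per byte: 8.0 is the ideal;
    # dark frames typically score well below that.
    n = len(data)
    counts = Counter(data)
    h = -sum((c / n) * math.log2(c / n) for c in counts.values())
    print(f"{h:.3f} bits/byte over {n} bytes")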
I'm curious, how close to raw CCD data did you get from consumer cameras? It wouldn't surprise me if hard-wired camera internal postprocessing often almost immediately regularizes random noise, even with raw images and software postprocessing turned off. Just a wild-ass guess though.
If I were going to take a stab at this, I would guess that most of the "camera" is really unnecessary and that you could do this using just the image sensor.
A lot of the camera is just functionality for making actual pictures look better, which doesn't apply here. E.g. you don't need to control exposure with shutter speed if it's in a black box.
Having a whole camera might even be counterproductive. E.g. actuating the shutter is predictable, so it might reduce entropy if actuating the shutter creates a signal that shows up in the randomness.
Or maybe they just mean pro-quality cameras, but I'm not sure why you'd want a whole camera instead of just the sensor. The reasons aren't readily apparent, and I don't expect anyone to correct me by revealing trade secrets.
The sensor contains decent entropy, assuming you're actually getting raw samples out of the sensor, and your sensor isn't being affected by EMF either in the air or via its power supply.
Noise reduction algorithms are going to affect the entropy if you can't get raw values, and the level of AI crap in cell phone cameras makes it even worse these days.
I wonder what the added effectiveness is of such elaborate setups vs an off-the-shelf HRNG dongle plugged into the server (which I assume is what every other company with such requirements uses). Are the lava lamps and pendulums actually that much more functional or just marketing?
The use of lava lamps as a source of randomness is, to use your term, "just marketing" -- it is not fundamentally more secure than other sources of randomness.
The use of a group of trusted randomness generators, a majority of whom would have to collude in order to trick a consumer into thinking an input was random when it was actually staged, offers genuine functionality that cannot be dismissed as "just marketing".
As long as it's done correctly, mixing new entropy sources into an entropy pool will never _decrease_ the entropy. So in the case of LavaRand, even if it only ever returned a string of zeros, systems that mix its output into their entropy pools wouldn't be any worse off than before. Perhaps we could have made this point more clearly in the post. (I'm one of the authors.)
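To sketch what "done correctly" means here (an illustration of the general principle, not our actual implementation): if the pool state and a new input are combined through a cryptographic hash, even a fully attacker-controlled input can't make the result more predictable than the pool already was.

    import hashlib

    def mix(pool: bytes, new_input: bytes) -> bytes:
        # Hash-combine the pool with a new source. Predicting the output
        # still requires knowing the old pool, even if new_input is known.
        return hashlib.sha256(pool + new_input).digest()

    pool = b"previous pool state (32 bytes in practice)"
    pool = mix(pool, b"\x00" * 32)  # an all-zero source doesn't hurt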
So, if your source of random data doesn't decrease the entropy of the pool, but generating random numbers does draw entropy from the pool, then over a long enough timeline aren't you going to deplete it anyway?
Randomness from a CSPRNG (cryptographically secure pseudorandom number generator) never really gets "depleted": as long as the seed contains enough entropy and isn't compromised, it's computationally infeasible to learn anything about the internal state of the CSPRNG from its outputs. See https://research.nccgroup.com/2019/12/19/on-linuxs-random-nu... for a nice overview.
On older systems that have a notion of entropy depletion, you would eventually deplete the entropy counter and /dev/random would start blocking if you aren't feeding new entropy into the system.
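A toy construction makes the "no depletion" point concrete (purely illustrative; real kernels use vetted designs like ChaCha20-based generators, not this). Each output is a one-way function of the state, so observing outputs never reveals the seed, no matter how many you draw:

    import hashlib

    class ToyCSPRNG:
        def __init__(self, seed: bytes):
            self.state = hashlib.sha256(seed).digest()

        def next_block(self) -> bytes:
            out = hashlib.sha256(self.state + b"out").digest()          # visible output
            self.state = hashlib.sha256(self.state + b"step").digest()  # ratchet forward
            return out

    rng = ToyCSPRNG(b"at least 256 bits of real entropy")
    blocks = [rng.next_block() for _ in range(1000)]  # computation, not depletion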
I don't think that's what the post says. I understand it as: a hardware RNG is not like a CSPRNG, which is designed to pass validation. If you measure a truly random hardware RNG for long enough, you'll eventually get a string of 9s, however unlikely that is. It doesn't mean that the HRNG is broken in any way.
That isn't it. When the hardware RNG is initialized, it first runs tests on the output to verify that the entropy-gathering hardware isn't broken. Then it feeds that entropy into a CSPRNG, which is what gets exposed to code requesting hardware randomness.
That person's point was that it's impossible to know for sure if entropy collection is broken. In practice it isn't an issue as you can make the false positive rate very small even if it can never be 0.
I wasn't criticising hwrngs "as such" merely saying that actual evidence exists for some that their entropy pool can be exhausted and that they can degrade. "just use the cpu hwrng" is too-simple advice.
I wouldn’t consider an image to be sufficiently random, because you don’t have a uniform probability of any specific value: the walls aren’t changing colors, I don’t see any blue lava lamps, camera probably spits out an image with some compression, it has predictable changes through the day if the lighting changes, etc
So… I think it's relying on a cryptographically secure hash function to condense all that into output that's random enough to rely on, correct?
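To illustrate the shape of what I'm imagining (purely my guess, not Cloudflare's actual pipeline):

    import hashlib

    # Hypothetical camera frame; biased, compressed, partly predictable.
    with open("lava_lamp_frame.jpg", "rb") as f:
        frame = f.read()

    # Hashing condenses whatever unpredictability the frame contains into a
    # short, uniform-looking seed. This is only as good as the real entropy
    # in the frame; the hash can't create randomness that isn't there.
    seed = hashlib.sha256(frame).digest()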
I almost edited that out when I looked at this post but I have a rule that we should allow authors on the Cloudflare blog to have their own style and so I didn't.
There are two links in the comment. They provide the context. The point being made is that pendulums connected to some support structure can exhibit synchronization caused by transmission of energy across the support. We have not observed synchronization in London, which may be due to the chaotic nature of the double pendulums and/or the fact that they are very light compared to the structure they are on.
They're also on relatively long stalks, which makes the shared support not very rigid, and the individual stalks are seemingly random heights, which might desynchronize the effect.
I run a randomness company. It's impossible to prove it 100%.
What you can do instead is run code audits, analyze the theoretical basis of the entropy source, and test large amounts of data for its statistical properties. That can get you to near certainty, but it's still empirical.
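For a taste of what those statistical tests look like, here's the simplest one, the monobit frequency test from NIST SP 800-22 (a minimal sketch; real suites run dozens of such tests over much larger samples):

    import math
    import os

    def monobit_p_value(data: bytes) -> float:
        # Counts ones vs. zeros; a strong imbalance yields a tiny p-value,
        # which flags the data as suspiciously non-random.
        n = len(data) * 8
        ones = sum(bin(b).count("1") for b in data)
        s_obs = abs(2 * ones - n) / math.sqrt(n)
        return math.erfc(s_obs / math.sqrt(2))

    print(monobit_p_value(os.urandom(125000)))  # good data: p is rarely tiny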
Both, depending on the desires of customers, but it's currently leaning towards high-end hardware/software. A broad-market randomness-as-a-service offering is coming this month.
You can never prove it absolutely just from the output, because there is always a possibility that a true random number generator would generate some non-random-looking output, like a run of one thousand ones (probability 2^-1000: astronomically unlikely, but not zero).
The piece that helps security-wise is that we're mixing in entropy from a trusted external source, so not solely relying on the local random number generation from a machine in a data center somewhere. Is it likely that local random number generation would be compromised? No. But it does give us a little extra peace of mind.
This is more of a defense in depth for the paranoid, but cryptographic PRNGs (and even hardware RNGs) can be compromised in ways that are not easy to find. Since they generate your keys, a compromise of the RNG chain is very valuable for a threat actor.
Right, so they've gone to all this effort to generate input as close as possible to random, then they undermine it by opening it up to the public?
Would be interesting to see what the camera placement is like and if that's at least secure, otherwise someone is just going to stick a still picture in front of it...
If you're going to go to all this effort to create randomness, at least secure it from physical interference...
Sorry, I know I'm being a cynical here. This just feels like a marketing gimmick.
> Knowing the approximate current state cannot be used to predict future states
The reason these entropy sources are used is that there's no such thing as a perfectly random algorithm. If there's a way to remove the entropy from the system, then the whole thing becomes pointless, and you may as well go back to using a pseudorandom algorithm. That's my point.
If you care this much about ensuring true randomness, then I'd argue the security of the system should be a primary consideration – perhaps the primary consideration. If you can't guarantee that your entropy source is random, then you can't be confident in the randomness of the system generally.
I'm not an expert on this though so if someone wants to explain why I'm wrong then please do so.
You're getting downvoted for your cynicism, but your threat assessment is correct. It's possible for someone to put a still photo in front of the lens.
That being said, the randomness on the sensor alone would probably defeat that, but also you could just check to make sure the previous image and the current image don't have the same hash, which I suspect Cloudflare does just as a basic error check.
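Something like this would do as that basic error check (a speculative sketch; I have no idea what Cloudflare actually runs):

    import hashlib

    def frame_changed(prev_frame: bytes, cur_frame: bytes) -> bool:
        # Identical consecutive frames indicate a frozen feed or looped input;
        # sensor noise should make genuine captures differ every time.
        return hashlib.sha256(prev_frame).digest() != hashlib.sha256(cur_frame).digest()

Note this catches a frozen or looped feed rather than a staged scene: a printed photo in front of the lens would still produce fresh sensor noise in each frame.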