I'm very confused by the use case here, and this doesn't make sense to me:
> A good example of this would be applying a machine learning model on some sensitive data where a user would be able to know the result of model inference on their data without revealing their input to any third party (e.g., in the medical industry).
I don't get why I would care that the answer was generated specifically by GPT-4. It sounds like they're billing this as some sort of "run a model on input with homomorphic encryption," but that doesn't really sound possible, and to the extent that it is, I don't think you could ever convince me that the people managing the model on the GPU couldn't get access to both the plaintext input and the plaintext output.
The way to get this kind of security is both simple and hard: make models that can run on consumer hardware.
An important use case is federated learning, which Google and many healthcare/pharmaceutical companies are very interested in.
In federated learning, multiple companies or groups with their own private data come together to train a model jointly on all the private data, while keeping the data private.
You need more than zero-knowledge proofs to actually do federated learning securely, but to my limited knowledge they are one tool in the toolbox that can be useful.
It sounds almost too good to be true, but SNARKs enable a prover to convince a verifier in O(log n) time that a statement of size n is true. In fact, many constructions enable this in O(1) verifier time (though the prover is quite slow). With zk-SNARKs, part of the statement can even be private: the proof reveals nothing about the input, yet it can still convince a verifier.
All of this is probabilistic and makes some assumptions about the computational power of an adversary, but that is very normal in cryptography. We consider EdDSA signatures secure even though one could in theory find the private key by brute force. SNARKs “convince” a verifier in the same manner: generating a proof of a false statement is computationally infeasible, but in principle not impossible.
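For reference, these guarantees are usually stated along the following lines (a sketch only; exact definitions vary by proof system, λ is the security parameter, and R is the relation of valid statement/witness pairs):

```latex
% Completeness: an honest prover holding a valid witness w for statement x always convinces the verifier.
\Pr\big[\mathsf{Verify}(vk, x, \pi) = 1 \;\big|\; \pi \leftarrow \mathsf{Prove}(pk, x, w),\ (x, w) \in R\big] = 1

% Soundness: for a statement x with no valid witness, any efficient cheating prover P^*
% succeeds only with negligible probability.
\Pr\big[\mathsf{Verify}(vk, x, P^*(x)) = 1\big] \le \mathrm{negl}(\lambda)

% Succinctness: proof size and verification time grow at most polylogarithmically in the statement size n
% (constant in lambda alone for many constructions).
|\pi| = \mathrm{poly}(\lambda, \log n), \qquad T_{\mathsf{Verify}} = \mathrm{poly}(\lambda, \log n)

% Zero knowledge: the proof can be simulated without w, so it reveals nothing about the private input.
```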
All that can really be conveyed is the "truth" that the model produced the output in response to the input. Given the other vulnerabilities of neural networks (biases, opaqueness, etc.) this is a bit like worrying about a MITM attack when communicating with a sock puppet.
I think you're misunderstanding the example use case (which is understandable; it took me a decent amount of completely focused time to really understand zero-knowledge cryptography).
The medical-data use case described isn't like homomorphic encryption (where computation is done on an untrusted device, but with encrypted inputs/outputs). It's more like being able to prove to an insurer that you have some medical condition without your doctor having to hand them the lab results, by instead providing a ZK proof that an ML model which detects that condition produced that result.
Using these kinds of proofs, you can start to silo your information off to just the parties that absolutely must have access to it, while also allowing parties that need to be able to verify certain narrow attributes (like an insurer needing proof you needed some procedure) to verify it without needing to invade your privacy.
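A minimal sketch of that flow, assuming a hypothetical `zkml` library with a prove/verify split (real projects such as EZKL are shaped roughly like this, but every name below is illustrative, not a real API):

```python
# Hypothetical sketch of the insurer example above. None of these names are a
# real API; they just mark who holds which data and what crosses the wire.

# --- Patient's machine: the lab results never leave it ----------------------
lab_results = load_lab_results("bloodwork.json")     # private input (hypothetical helper)
model = load_model("condition_detector.onnx")        # public model both parties agree on

diagnosis = model.run(lab_results)                   # plain inference, done locally
proof = zkml.prove(
    model=model,                  # public: pins which model was run
    private_input=lab_results,    # never revealed; only bound inside the proof
    public_output=diagnosis,      # the claim being attested to
)
send_to_insurer(diagnosis, proof)                    # only the result + proof go out

# --- Insurer's side: sees the claim, never the lab results ------------------
ok = zkml.verify(
    model_commitment=model.commitment(),  # hash/commitment to the agreed model weights
    public_output=diagnosis,
    proof=proof,
)
# ok is True only if this exact model produced this output on some input the
# patient committed to; the input itself stays private.
```

The key point is what crosses the wire: the diagnosis and the proof, never the lab results.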
The neural engine in the A16 Bionic on the latest iPhones can perform about 17 TOPS. The A100 is about 1250 TOPS. Both of these figures are very sensitive to how you measure them, and I'm absolutely not sure I'm comparing apples to bananas properly.
However, we'd expect the iPhone has already reached its maximum thermal load. So without increasing power use, it should match the A100 after about 6 to 7 doublings, which would take roughly 10 to 11 years. In 20 years the iPhone would be expected to reach the performance of on the order of 100 A100s.
At which point anyone will be able to train a GPT-4 in their pocket in a matter of days.
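As a sanity check, here is the same extrapolation spelled out. The doubling time is an assumption (roughly the 18-month rate the thread is debating), and the TOPS figures are the rough ones quoted above:

```python
import math

# Back-of-the-envelope check of the numbers above. Both TOPS figures are rough
# and very measurement-dependent, and the doubling time is an assumption.
iphone_tops = 17        # A16 neural engine, as cited above
a100_tops = 1250        # A100, as cited above
doubling_years = 1.5    # assumed doubling time for on-device ML throughput

doublings_to_parity = math.log2(a100_tops / iphone_tops)
print(f"doublings to match an A100: {doublings_to_parity:.1f}")        # ~6.2
print(f"years to parity: {doublings_to_parity * doubling_years:.1f}")  # ~9-10

tops_in_20_years = iphone_tops * 2 ** (20 / doubling_years)
print(f"A100-equivalents after 20 years: {tops_in_20_years / a100_tops:.0f}")  # ~140
```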
You're assuming no algorithmic improvements and missing the ongoing shift from 16-bit to 4-bit operations, which will soon give ML hardware a 4x improvement on top of everything else.
We could be training GPT-4s in our pockets by the end of this decade.
To be fair, they're also being extremely generous about HW scaling. There's no way we're going to see doublings every 18 months for the next 6+ years when we've already stopped doing that for the past 5-10 years.
Have you read the Wikipedia page? Moore's law started ending ~23 years ago, followed by Dennard scaling ~18 years ago. It hasn't necessarily fully stopped, because other architectural improvements have been delivered along the way, but we have nearly reached the end of the road for this kind of scaling due to a combination of heat-dissipation challenges and the inability to shrink transistors further. 3D packaging might push things further, but it's difficult and an area of active research (+ once you do that, afaik you've unlocked the "last" major architectural improvement). I think current estimates put the complete end to further HW improvements at ~2050 or so.

You can still improve software or build dedicated ASICs/accelerators for expensive software algorithms, but that's the pre-Moore world, which saw most accelerators die off because the exponential growth of CPU compute obviated the need for most of them (except GPUs). We're coming back to it with things like Tensor cores. Reversible computing is the way forward after we hit the wall, but no one knows how to do it yet.
> But in 2011, Koomey re-examined this data[2] and found that after 2000, the doubling slowed to about once every 2.6 years. This is related to the slowing[3] of Moore's law, the ability to build smaller transistors; and the end around 2005 of Dennard scaling, the ability to build smaller transistors with constant power density.
Wikipedia mis-cited it in the text and should have said "But in 2016". However, the 2016 analysis misses the A11 Bionic through A16 Bionic and M1 and M2 processors -- which instantly blew way past their competitors, breaking the temporary slump around 2016 and reverting us back to the mean slope.
Mainly because they're now analyzing only "supercomputers", and honestly that arena has changed: quite a bit of HPC work has moved to the cloud (e.g., Graviton), not all of it, but a lot. And I don't think they're analyzing TPU pods, which probably have far better TOPS/watt than traditional supercomputers like the ones on top500.org.
One large use case for ZKML is verifiable computing: an IoT device can require an untrusted supercomputer to prove that it ran a specific program on the device's data correctly.
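Roughly the same prove/verify split as in the insurer sketch above, but with the roles reversed: the heavy proving work lands on the untrusted supercomputer, and the constrained device only runs the cheap verifier. Names are again hypothetical, not a real API:

```python
# Hypothetical sketch: expensive inference and proving run on the untrusted
# machine; the IoT device only runs the cheap verification step.

# --- Untrusted supercomputer -------------------------------------------------
def handle_job(program_commitment, sensor_data):
    output = run_program(program_commitment, sensor_data)   # heavy computation
    proof = zk.prove(program=program_commitment,            # slow: can take minutes
                     public_input=sensor_data,
                     public_output=output)
    return output, proof

# --- IoT device ---------------------------------------------------------------
output, proof = submit_to_cloud(program_commitment, sensor_data)
assert zk.verify(program_commitment, sensor_data, output, proof)  # fast and cheap
# If verification passes, the device can trust `output` without trusting the cloud.
```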
A 17-million-parameter model (~ResNet-50) takes more than 50 s of proof time. Is that on top of the inference time?
I can see some niche applications for this system, but I am very skeptical of its ability to handle larger models (100M+ parameters) and of its scalability under increased demand.
If Facebook releases Llama, and updated models thereafter, for purchase or as freeware, there will not really be as much need for this since everything will happen safely, locally, no?
It would be cool to see Meta release a 7B-parameter model as shareware, and subsequent larger models for a fee.