Anyone have a good explanation on the intuition of non-interactive zero-knowledge proofs? For example, I thought the "paint-mixing" analogy for Diffie-Hellman key exchange (https://en.wikipedia.org/wiki/Diffie–Hellman_key_exchange#Ge...) really helped me handwave the math into "mixing easy, unmixing hard".
An intuitive explanation is that of proving you can find Waldo in a picture without revealing his exact location. Digital wallets can be interpreted as fancy signature schemes that operate on third-party issued commitments C instead of public keys that directly link users to their identities.
A simple signature scheme is based on proof of knowledge PoK{x : pk = g^x}, which is transformed into a noninteractive variant via the Fiat-Shamir transformation, where the message is appended to the hash. Range proofs work similarly, with the simplest form being for a single bit: PoK{(b,r) : C = g^b * h^r & b(b−1)=0}. This proves that commitment C contains a bit b in {0,1} without revealing which value it is.
Arbitrary ranges can then be constructed using the homomorphic properties of commitments. For an n-bit range, this requires n individual bit proofs. Bulletproofs optimize this to O(log n) proof size, enabling practical applications.
The commitment C can be issued by a trusted third party that signs it, and the user can then prove certain properties to a service provider, such as age ranges or location zones (constructed from latitude and longitude bounds).
A key challenge is that reusing the same commitment C creates a tracking identifier, potentially compromising user privacy.
for explanation i've seen for the where's waldo analogy: imagine the single page of the where's waldo puzzle, and another giant piece of paper with the shape of waldo cut out of it.
by providing a picture of waldo in the cut-out, you can prove you know where he is without providing the location. a zero knowledge proof.
I think the Where's Waldo example, while not technically zero knowledge, gives a pretty good intuition of the idea behind it.
It certainly gives a "layperson" example of being able to prove you know something without revealing it, which isn't the whole definition of ZK but is the idea driving it.
Imagine it isn't Waldo, but an unknown figure and you are only given the silhouette to find. If you can draw what's within the silhouette or something, you've proven you've located it to high certainty without saying where.
Say the whole image looked like noise and was generated from quantum measurements, and the coordinates to hash for the problem were generated with quantum measurements, and you were given the silhouette and the hash of the noise within to look for. I could see it for proof of work: you could slide along a hashing window and prove you actually did work examining half the image on average or whatever.
I think my example isn't great and would need to be modified like maybe give the hash of a neighboring area to prove you found it, so your answer couldn't be used by others to find the location much more cheaply.
This is an interactive example, isn't it? It doesn't help me understand non-interactive proofs like SNARKs/STARKs, where the verifier isn't communicating live with the prover.
* The prover commits to a starting value (public input)
* Instead of waiting for an interactive challenge, they hash it and use the resulting hash output as if it were a challenge
If we believe the hash is a random oracle (as we do for cryptographic hash functions), then it is hard for the prover to manipulate the challenges. Is that it?
You got it. There are a few nuisances, e.g. the "theorem statement" must be hashed as well so that proving that name=Mickey has a different oracle than proving that name=Goofy, but your basic understanding is correct.
This is what I was going to post. It helped me a lot by first giving a very intuitive understanding of the concept of ZKPs using the Where's Waldo/puffin-among-the-penguins example, but then also going deeper with the graph-coloring example.
Was looking to see if someone posted this video. The first few interviews are excellent - the later ones, not so much (in terms of explaining ZK - they're good chats, of course).
Usually in an IP, the prover (Bob) has to answer questions from the verifier (Alice), and Alice chooses her questions by flipping a coin. If the Bob doesn’t really know the answer, he’ll get caught cheating with high probability.
So now the trick: Bob starts generates his initial answer. Then he hashes it (“commits” in the jargon), and uses the hash as “Alice’s first coin flip”. Then he answers the question for that flip, hashes the whole thing for “Alice’s second coin flip”… etc.
Bob does this say, 100 times, and then sends the whole simulated conversation to Alice. Alice can verify that he didn’t cheat by checking the intermediate hashes.
The whole thing depends on the ability to not control the result of the hash function, so it’s vital to use a cryptographically secure one.
It "feels" much easier to generate random non-solutions and check if the random questions happen to pass, though. Is it really all there is to it? You increase the number of questions to compensate and that's the whole scheme? Wouldn't the responses be a ludicrous amount of data?
Yes, basically. The hard work happens in constructing the interactive proof to ensure that you can’t reneg on the earlier steps.
All the proofs that I know of allow one to get lucky with probability about .5 in each round. When you do an interactive proof with 100 rounds, you have a 2^-100 chance of getting away with cheating.
When you go non-interactive with 100 rounds, an adversary could cheat by trying about 2^100 proofs. So you replace a stronger guarantee with a weaker one, but 2^100 is a pretty big barrier.
(I just looked and the Wikipedia page and it’s very confusing fwiw)
The trick is called the Fiat-Shamir transform, and you're right, it does require more questions to get an equivalent security level, precisely because you can try a large number of random non-solutions without anyone catching you doing it.
But the number of questions you need to compensate grows only a little.
For example, interactively if you ask for Merkle tree proof that selected leaf values have a particular property, you only have to ask for about k leaves to get probability 1-(2^-k) that you'd catch a dishonest prover who had committed a Merkle root with less than half the leaves having the property.
Non-interactively, a dishonest prover could secretly grind attempts, say 2^g times, and then you'd have a lower probability of catching them, approximately 1-(2^(g-k)). But g can't be all that large, so you can increase k to compensate without making the proof much larger.
You.can also require certain hashes to have a fixed prefix, like Bitcoin mining, forcing every prover to have to grind 2^p times. This reduces the effective g that a dishonest prover can achieve, allowing k to be smaller for the same security, so allowing the non-interactive proof to be smaller. At the cost of honest provers having to grind.
The surprising part of STARKS and SNARKS comes down to the nature of polynomials. It's surprisingly easy to tell two polynomials apart with a small number of random checks (Schwartz Zippel lemma). In light of this it's not surprising there is good reading comparing them to erasure codes which rely on exactly this property of polynomials.
The non-interactive piece is pretty straightforward you just simulate challenge response conversation with unbiasible public randomness and show the transcript (Fiat Shamir transform).
Another area worth exploring is how some of these proof systems can have such incredibly small proofs (192 bytes for any computation in groth16 zk snarks). That relies on the much more difficult to intuit theory of elliptic curve pairing functions.
Yeah I'm also interested in some of the details here, but the linked library repo is a bit too low-level for my current understanding.
For example, in the usecase of providing a proof-of-age to a website: who provides the verification data (the government?); what form does that take (a file in a standard format?); who holds/owns the verification data (the user?); who runs the verification software (the end-user's web browser?).
Can the user use any implementation to provide the proof, or must it be a "blessed" implementation such as Google Wallet?
The specifics depend on local regulations, but roughy speaking: the government gives you a document in a standard format (eg MDOC). Your phone stores the document, with cooperation from a secure element that binds the document to the phone. The website you visit verifies the proof. The government gives documents to whatever wallet they want, which may be a special government wallet. They may or may not give the document to Google Wallet.
> Your phone stores the document, with cooperation from a secure element that binds the document to the phone. The website you visit verifies the proof.
So it does require a "blessed" implementation, and I have to trust Google or Apple to handle my data? I cannot own the document myself and use an open-source client that I trust to provide the proof?
It depends on local regulations. As far as I can tell Europe will require some sort of blessing of the wallet. To be clear, governments will develop their own apps and it's not clear that Google will be blessed. We (Google) are giving them the code pro bono to improve privacy.
Hmm. This introduces a third party to the protocol, right? Specifically the developer of the wallet. So we now have three parties: the user, the wallet developer, and the relying party. Does this zk protocol protect the user's privacy from the wallet developer as well as the relying party?
In other words, does the protocol give the wallet access to information about the relying party? For example, could this wallet that I don't control tell its owner, or the government, that I am using it to access a certain website?
Yes, a malicious wallet could leak your information. This is why some governments will insist on using only blessed wallets. However, wallet+zk is strictly better than sending the plaintext MDOC to the relying party.
There are no solutions in this space, only tradeoffs, and elected representatives have picked one tradeoff.
That's too bad :( I wish the protocol had been designed with that in mind. Requiring users to trust proprietary software from Google & Apple to be in complete control over their digital identities is a pretty crummy direction to go in.
> Controlled by users: The EU Digital Identity Wallets will enable people to choose and keep track of their identity, data and certificates which they share with third parties. Anything which is not necessary to share will not be shared.
I think where the ZKP stuff being discussed here fails to meet this criteria is the wallet provider is also a third (non-user) party. You stated elsewhere that a malicious wallet could leak data about a transaction: that's exactly the vulnerability that is not being accounted for by this protocol.
> If you think you have a better idea shoot me an email.
Sure, will do. It does seem to me like a solvable problem. I think this kind of tech is really important and I'd love to see this hole get closed so I can feel better about supporting it.
Update: After some email discussion with Matteo, it looks like my fears are unfounded. The EU regulations seem to require wallets to be open source[1]. Assuming that wallets do not need to pass any sensitive transaction data down to the OS libraries, then it should be possible for users to verify the behavior of their wallet software by examining the source and possibly even by building & deploying it themselves.
In principle, you could use an open source implementation, but not a user-modifiable implementation.
Nothing stops a government from making their code open source and providing you with reproducible builds. You just won't be able to change the code to do something the government doesn't deem legal.
(1) in this case, an identity issuer provides the source of truth identity information. Examples include state DMV, your passport (you can try "Id pass" in Google wallet), etc.
(2) One of the goals of this project was to layer ZK on top of current identity standards that DMVs already issue, so that gov orgs don't have to change what they currently do to support the strongest user privacy. One example format is called Mdoc.
(3) The user holds the identity information on their device only. No other copies. The user's device makes the zkp proof on-device. This was one of the major technical challenges.
(4) The relying party (eg a website) runs the zk verification algorithm on the proof that is produced by the device to ensure soundness.
(5) Yes, the user can use any compatible implementation to produce the proof. We have open-sourced our implementation and we have a spec for the proof format that others can also reimplement.
If you can achieve RCE on the chip and run arbitrary code without invalidating signatures, does the protocol still stay secure?
If so, what's the point of requiring your implementation to run on a verified secure element? If not, the protocol seems only as strong as the weakest chip, as obtaining just a single private key from a single chip would let you generate arbitrary proofs.
The role of the secure element is only to "bind" the credential to the device, so that if you copy the credential somewhere else then the credential is useless. Concretely, the secure element produces a ECDSA signature that must be presented together with the credential. This is the normal protocol without ZKP. Concretely, the SE is in the phone, but could be a yubikey or something else.
The ZKP library does not run on the secure element. It runs on the normal CPU and produces a proof that the ECDSA signature from the SE is valid (and that the ECDSA signature from the issuer is valid, and that the credential has not expired, and ...) If you crack the ZKP library, all you are doing is producing an incorrect proof that will not verify.
Am I correctly understanding that I'd get the credential from say my state DMV once, and then later whenever I want to prove my age to a website the proof protocol is just between that website and my device? The DMV gets no information about what websites I use the DMV credential with and they get no information about when I use the credential even if the website and the DMV decide to cooperate? All they would be able to get was that at time T someone used a credential on the site that came from the DMV?
I tried to sketch out a design an age verification system, but it involved the DMV in each verification, which made timing attacks a problem. Briefly the website would issue a token, you'd get a blind signature of the token from the DMV's "this person is 18+" service, and return the token and unblinded signature to the website. I think that can be made to work but if the site and DMV cooperated they would likely be able to unmask many anonymous site users by comparing timing.
Getting the DMV out of the picture once your device is set up with the credential from them nicely eliminates that problem.
You are correct. The property that the colluding website and DMV still cannot identify you is called "unlinkability" and as far as I can tell cannot be achieved without zero-knowledge proofs. See https://github.com/user-attachments/files/15904122/cryptogra... for a discussion on this issue.
However, the timing attack resurfaces once you allow the DMV to revoke credentials. Exactly how the revocation is done matters. We are actively pushing back against solutions that require the DMV to be contacted to verify that the credential has not been revoked at presentation time, but this is a very nuanced discussion with inevitable tradeoffs between privacy and security.
>> The DMV gets no information about what websites I use the DMV credential with and they get no information about when I use the credential even if the website and the DMV decide to cooperate?
> You are correct. The property that the colluding website and DMV still cannot identify you is called "unlinkability" and as far as I can tell cannot be achieved without zero-knowledge proofs.
Well, no. This is true only if you trust the unverifiable wallet software on your phone, which was provided by a for-profit, American big tech advertising company. In this protocol, the wallet may secretly leak the transaction details back to the DMV or whoever else they wish[1].
MatteoFrigo is suggesting that unlinkability requires ZKPs.
Your observation that a bad wallet could compromise unlinkability is not a refutation of that. To refute it you need to show that it is possible to achieve unlinkability without using a ZKP.
One part that I don't understand yet: How does the system ensure "sybil resistance"? (not sure if that's the right term in that context)
By providing both attestation of individual attributes combined with "unlikability", how would even a single verifying party ensure that different attestations don't come from the same identity?
E.g. In the case of age attestation a single willing dissenting identity could set up a system to mint attestations for anyone without it being traceable back to them, right? Similar to how a single of-age person could purchase beer for all their under age friends (+ without any feat of repercussions.
Great question. The current thinking, at least in high level-of-assurance situations, is this. The identity document is only usable in cooperation with a hardware security element. The relying party picks a random nonce and sends it to the device. The device signs the nonce using the SE, and either sends the signature back to the relying party (in the non-ZKP case), or produces a ZKP that the signature is correct. The SE requires some kind of biometric authentication to work, e.g. fingerprint. So you cannot set up a bot that mints attestations. (All this has nothing to do with ZKP and would work the same way without ZKP.)
In general there is a tradeoff between security and privacy, and different use cases will need to choose where they want to be on this spectrum. Our ZKP library at least makes the privacy end possible.
That seems a bit like a game of whack-a-mole where as long as the forging side is willing to go further and further into out-of-hardware emulation (e.g. prosthetic finger on a robot hand to trick fingerprint scanners), they are bound to win. Biometrics don't feel like they hold up much if you can have collusion without fear of accountability.
> Our ZKP library at least makes the privacy end possible.
Yes, that's also one of the main things that make me excited about it. I've been following the space for quite some time now, and I'm happy that it becomes more tractable for standard cryptographic primitives and thus a lot more use-cases.
Thanks for your contributions to the space and being so responsive in this thread!
No. ZK has a technical definition I don't want to get into, but note that the described system is deterministic and it always produces the same proof for Alice on a given day, and the proof for a later day can be derived from the proof for an earlier day. So two proofs can be linked back to Alice, and thus the system is not ZK. You need some kind of randomness for ZK.
Thanks for the reply. So in theory, I could get this MDOC file and store it on my desktop computer, and use an open-source library whose behavior I can verify, to provide the proof to the website via my web browser. Yeah? This sounds good to me.
No. Using the MDOC requires a signature from a hardware security key in the phone, and a lot of the complexity is how to avoid leaking the private key, which would identify you.
An alternative would be some secure chip in a credit-card size plastic document, but nobody seems to like that idea. We (Google) don't make these choices.
Another approach could be for a component in the protocol that I do trust (eg an open source web browser) to serve as an intermediary, providing only the information required to each of the components that I don't trust (wallet, website). The wallet does not need to know who is requesting the proof, right?
I hear you. The main problem is how to prevent you from giving your document to somebody else, and things have converged on certified smartphone with security key plus biometrics.
Yeah, Passkeys are doing the same thing, expecting users to just blindly trust American Big Tech companies. It's distressing that no one working on these protocols considers the developers of the software that implements the protocol to be a party in the protocol. What are the wallet provider's interests in this exchange? How can the user be protected from the wallet provider? Seems no one asks these questions :(
Anyone can implement passkeys. The feature where passkeys can be made to attest to the hardware provider is optional and no site I've used requires it. Firefox defaults to not allowing passkeys to attest to the hardware unless you click through a permission dialog.
I don't want to get into a Passkey derail, but no. The Passkey spec requires clients to handle the user's own data in certain ways, and the Passkey spec authors threaten clients that allow users to manage their own data with client bans.
Are you trying to say that there’s a signed blob called an MDOC, that happens to have the age and name of the user, and this library allows a website to prove that the provided age belongs to the person with the MDOC, but not also see the name?
But to be clear, mdoc already accounts for this through its selective disclosure protocol, without the need for a zero knowledge proof technology. When you share an mdoc you are really just sharing a signed pile of hashes ("mobile security object") and then you can choose which salted pre-images to share along with the pile of hashes. So for example your name and your birth date are two separate data elements and sharing your MSO will share the hashes for both, but you might only choose to share the pre-image representing your birthday, or even a simple boolean claim that you are over 21 years old.
What you don't get with this scheme (and which zero knowledge proofs can provide) is protection against correlation: if you sign into the same site twice or sign into different sites, can the site owners recognize that it is the same user? With the design of the core mdoc selector disclosure protocol, the answer is yes.
It is decentralized. The holder provides the data, which was ultimately provided to them by the government, they're the client. The verifier is the entity that wants to know how old the holder is, the server.
The form are eg things like the JSON Web Token (JWT), Digital Credentials, and the Federated Credential Management API (FedCM).[1][2][3][4][5] The software can be anything since they're expected to use open protocols, so yes, web browsers.[6] Per the Commission, "For remote presentation flows, … the Wallet Instance implements the OpenID for Verifiable Presentation protocol OpenID4VP in combination with the W3C Digital Credentials API."[7]
The explanation that one person gave me was basically that you use an RNG to generate the challenges. Not sure if this is quite "proper", but it makes sense to me so long as you can't game the system. Perhaps make the RNG slow to prevent picking a convenient sequence?
You want to prove to everyone that you know where the Waldo on Page 12 of Where's Waldo In Iceland, so you hold a big white sheet of paper with a hole in it in front of the page such that the hole is centered on Waldo. Then you let your friend see. Your friend now knows that you know where Waldo is, but they still don't know where Waldo is, because they don't know the relative position of the book under the sheet. This is also why they can't use your proof to falsely prove to anyone else that they know where Waldo is too.
https://blog.cryptographyengineering.com/2014/11/27/zero-kno... was a good intro for interactive ZK proofs but I haven't been able to find something for non-interactive ones.
This blog post comparing ZK-STARKs to erasure coding is in the right flavor but didn't quite stick to my brain either: https://vitalik.eth.limo/general/2017/11/09/starks_part_1.ht...