
Regarding that rate, I’m no expert, but my guess is that it’s the result of math, not actual testing of 1+ trillion images. Complaining about that sounds like calling bullshit on “you have a one-in-a-trillion chance of winning the lottery.”


The author addresses this point:

> Perhaps Apple is basing their "1 in 1 trillion" estimate on the number of bits in their hash? With cryptographic hashes (MD5, SHA1, etc.), we can use the number of bits to identify the likelihood of a collision. If the odds are "1 in 1 trillion", then it means the algorithm has about 40 bits for the hash. However, counting the bit size for a hash does not work with perceptual hashes.

> With perceptual hashes, the real question is how often do those specific attributes appear in a photo. This isn't the same as looking at the number of bits in the hash. (Two different pictures of cars will have different perceptual hashes. Two different pictures of similar dogs taken at similar angles will have similar hashes. And two different pictures of white walls will be almost identical.)

> With AI-driven perceptual hashes, including algorithms like Apple's NeuralHash, you don't even know the attributes so you cannot directly test the likelihood. The only real solution is to test by passing through a large number of visually different images. But as I mentioned, I don't think Apple has access to 1 trillion pictures.

> What is the real error rate? We don't know. Apple doesn't seem to know. And since they don't know, they appear to have just thrown out a really big number. As far as I can tell, Apple's claim of "1 in 1 trillion" is a baseless estimate. In this regard, Apple has provided misleading support for their algorithm and misleading accuracy rates.
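
For the "about 40 bits" figure, the arithmetic is simple: with a uniform cryptographic hash, 1-in-1-trillion collision odds for a single comparison corresponds to log2(10^12) ≈ 39.9 bits. A quick sanity check in Python (illustrative only, not anything from Apple's documentation):

    import math

    # Bits implied by 1-in-10^12 odds, assuming a uniformly
    # distributed hash where every value is equally likely.
    print(math.log2(1e12))  # 39.86...
    print(2 ** 40)          # 1099511627776, just over a trillion

And that uniformity assumption is exactly what the author says perceptual hashes break.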

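To see why bit-counting breaks down, here's a toy "average hash" (a classic perceptual hash; this is not Apple's NeuralHash, and the "images" are synthetic stand-ins): near-duplicate images land a few bits apart, and low-detail images collapse to identical hashes no matter how many bits the hash has.

    import numpy as np

    def average_hash(img, size=8):
        # 64-bit perceptual hash: 8x8 block means, thresholded at their mean.
        h, w = img.shape
        img = img[: h - h % size, : w - w % size]   # trim to a multiple of size
        blocks = img.reshape(size, img.shape[0] // size,
                             size, img.shape[1] // size).mean(axis=(1, 3))
        return (blocks > blocks.mean()).ravel()

    def hamming(a, b):
        return int(np.count_nonzero(a != b))

    rng = np.random.default_rng(0)
    photo = rng.integers(0, 256, (64, 64)).astype(float)
    brighter = np.clip(photo + 25, 0, 255)       # same "scene", brightened
    unrelated = rng.integers(0, 256, (64, 64)).astype(float)
    wall_a = np.full((64, 64), 240.0)            # two different "white walls"
    wall_b = np.full((64, 64), 238.0)

    print(hamming(average_hash(photo), average_hash(brighter)))   # small
    print(hamming(average_hash(photo), average_hash(unrelated)))  # ~32 of 64
    print(hamming(average_hash(wall_a), average_hash(wall_b)))    # 0

The hash is 64 bits, but the collision rate is nowhere near 1 in 2^64: it depends on how often real photos share the same coarse structure, which is the thing Apple would actually have to measure.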

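On "test by passing through a large number of visually different images": strictly speaking you don't need a trillion photos, because collisions are counted over pairs, and N images give N*(N-1)/2 pairs. But that sizing only holds if pairs collide independently at a uniform rate, which is precisely the assumption perceptual hashes violate. Rough numbers under that (unrealistic) assumption:

    import math

    # Hypothetical test sizing: to expect ~1 collision at a true rate
    # of 1 in 10^12 per pair, you need about sqrt(2 * 10^12) images --
    # IF every pair collided independently at that uniform rate.
    n = math.isqrt(2 * 10**12) + 1
    print(n, n * (n - 1) // 2)  # ~1.42 million images, ~1e12 pairs

So even the "just test it" route needs a defensible statistical model before the resulting rate means anything.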

