
So let's skip the malice condition and go to the plain accident one: there are more chunks of content in the world than there are bits in any of the hashes we could use. We are eventually going to run into collisions.


Arbitrary example: if you're using SHA256, you're emitting 256-bit hashes. Hopefully the distribution of SHA256 outputs is indistinguishable from random (or close enough), or else we've got big problems in other domains.

If you're drawing uniformly at random from a pool of size k, you need approximately sqrt(k) draws (closer to 1.18 * sqrt(k) if you want to be precise) until you reach a ~50% chance of a collision[0].

With 256 bits, there are 2^256 possibilities, so following the rule of thumb you'd need roughly 2^128 draws until you had a 50% chance of a collision.
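
To make that arithmetic concrete, here is a rough sketch using the standard birthday approximation n ~ sqrt(2 * k * ln(1 / (1 - p))). The function name draws_for_collision is made up for illustration, and the whole thing assumes the hash outputs are uniformly random.

```python
import math

def draws_for_collision(pool_size: int, p: float = 0.5) -> float:
    """Approximate number of uniform random draws from a pool of
    `pool_size` values needed to reach probability `p` of at least
    one collision (birthday approximation, not an exact count)."""
    return math.sqrt(2 * pool_size * math.log(1 / (1 - p)))

# Sanity check against the classic birthday problem (365 days).
print(draws_for_collision(365))        # roughly 22.5, i.e. the familiar 23 people

# SHA256: 2^256 possible outputs.
n = draws_for_collision(2**256)
print(math.log2(n))                    # a little over 128, i.e. about 2^128 draws
```

The log2 result lands slightly above 128 because of the ~1.18 constant that the sqrt(k) rule of thumb drops.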

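2^128 is about 3.4 * 10^38. That's not more than the # of atoms in the universe (commonly estimated at around 10^80), but it's still vastly more chunks than anyone is ever going to hash.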

If you adjust your risk tolerance you'll have different numbers come out, but the chance of a collision in any realistic scenario is negligible.
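
If you want to see how the numbers move with risk tolerance, a minimal sketch along these lines works. It uses the approximation P ~ 1 - exp(-n(n-1) / 2k) and its inverse; the function names and the 10^18-chunk workload are hypothetical, picked only for illustration, and uniform outputs are assumed.

```python
import math

HASH_BITS = 256  # SHA256 output size

def collision_probability(chunks: int, bits: int = HASH_BITS) -> float:
    """Approximate chance of at least one collision among `chunks` hashes."""
    exponent = chunks * (chunks - 1) / (2 * 2**bits)
    # -expm1(-x) is 1 - exp(-x) without losing precision when x is tiny.
    return -math.expm1(-exponent)

def max_chunks_for_risk(risk: float, bits: int = HASH_BITS) -> float:
    """Approximate number of chunks you can hash before the collision
    chance reaches `risk` (inverse of the formula above)."""
    # -log1p(-risk) is -ln(1 - risk), accurate for very small risk.
    return math.sqrt(2 * 2**bits * -math.log1p(-risk))

for risk in (0.5, 1e-6, 1e-18):
    print(f"risk {risk:g}: about {max_chunks_for_risk(risk):.2e} chunks")

# Hypothetical workload: 10^18 chunks, far more than most systems will see.
print(collision_probability(10**18))
```

Even at a one-in-10^18 risk tolerance the budget comes out on the order of 10^29 chunks, and the hypothetical 10^18-chunk workload gives a probability that is tiny by any practical standard.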

[0]: https://en.wikipedia.org/wiki/Birthday_problem


> there are more chunks of content in the world than there are bits in any of the hashes we could use

The probability of collision is still negligible.


Come on, you must know that hash spaces are astronomically large, and that the outputs are unpredictable and collision-free for any earthly purpose. If not, just add a bit.



