Hacker News new | past | comments | ask | show | jobs | submit login

After skimming Knuth's paper — does the algorithm work if values are hashed, that is, the "uniform deviate" is selected deterministically for each unique value of the stream?



Not sure which Knuth paper you're referring to but skimming through the article my understanding is this algorithm works /only/ if the values are hashable. IOW how else does one define unique/distinct values ?



https://cs.stanford.edu/~knuth/papers/cvm-note.pdf

note how "u"s are selected every time value is not in a list. I don't read it as being a hash.


I think the analysis relies on independent random "u"s, even for the same key.




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: