
EMD is something that really needs more, better implementations.

The one that everyone uses from Python isn't the easiest thing to install, doesn't have a great API and isn't easy to extend.

I think Gensim added it recently, though I believe it uses the same backend solver.

Edit: this is a better article on EMD anyway: https://markroxor.github.io/gensim/static/notebooks/WMD_tuto...

Edit 2: I forgot that Textacy has an implementation built on spaCy. It still uses the same backend solver, but the API is nice (https://chartbeat-labs.github.io/textacy/api_reference.html#...)



Solving it takes something like the Hungarian algorithm (strictly, that covers the case where EMD reduces to an assignment problem: two equal-sized sets of equally weighted points), and it's not the easiest algorithm to implement. In fact, it's by far the hardest algorithm I've implemented, though I can't remember exactly why. I wrote it in Common Lisp and worked on the performance quite a bit. It's still an O(n^3) algorithm, though.
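
For anyone curious what that looks like, here is a minimal sketch in Python (not the Common Lisp code mentioned above, which isn't shown here). It assumes the simplest setting: two point sets of the same size with every point carrying equal weight, so EMD is just the average cost of the cheapest one-to-one matching. It uses the common potentials-plus-augmenting-path formulation of the Hungarian algorithm, which is O(n^3). The names `hungarian` and `emd_equal_weights` are made up for illustration; general EMD with unequal weights is a transportation problem and needs a min-cost-flow style solver instead.

```python
import math


def hungarian(cost):
    """Min-cost assignment for a square (or n <= m) cost matrix, O(n^2 * m).

    Returns a list where entry i is the column assigned to row i.
    Indices are 1-based internally; column 0 is a dummy start node.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials
    v = [0.0] * (m + 1)   # column potentials
    p = [0] * (m + 1)     # p[j] = row currently matched to column j (0 = none)
    way = [0] * (m + 1)   # way[j] = previous column on the augmenting path

    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)      # cheapest reduced cost to reach each column
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if used[j]:
                    continue
                cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                if cur < minv[j]:
                    minv[j], way[j] = cur, j0
                if minv[j] < delta:
                    delta, j1 = minv[j], j
            # Shift potentials so the next cheapest edge becomes tight.
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:          # reached an unmatched column
                break
        # Walk the path backwards, flipping matched and unmatched edges.
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1

    assignment = [0] * n
    for j in range(1, m + 1):
        if p[j] != 0:
            assignment[p[j] - 1] = j - 1
    return assignment


def emd_equal_weights(xs, ys):
    """EMD between two equal-sized point sets, each point with weight 1/n."""
    if len(xs) != len(ys):
        raise ValueError("this sketch only handles equal-sized point sets")
    cost = [[math.dist(a, b) for b in ys] for a in xs]
    match = hungarian(cost)
    return sum(cost[i][j] for i, j in enumerate(match)) / len(xs)
```

Calling `emd_equal_weights(xs, ys)` with two lists of coordinate tuples returns the average distance each point has to move under the best matching. The fiddly part is keeping the potentials and the `way` back-pointers consistent while the path grows, which is probably where most of the implementation pain comes from.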



