
Whilst abhorrent, it's easy to see how this happened. No matter how much you train a machine, it will make mistakes in this kind of classification when natural language is involved. Identifying what, or whom, an article is about is a very, very hard problem.


I agree. I feel this points towards the limits of purely linguistic technology. Relying on text strings alone to find correlations/matches is not enough. What I wonder is: why don't recommendation engines go beyond that and try to cluster entities by more sophisticated means? It must be possible to determine that, in this case, the person involved was not clustered (not "close enough") to the universe of entities related to white supremacists. Relying just on words/names will lead to these kinds of results, especially where ambiguity is involved (the person's name is unfortunately very common).


True, but they could set the threshold higher. Like, for instance, requiring a match on first name, last name, and company. There is a tradeoff between precision (fewer false matches) and recall (fewer missed matches), and that can be tuned.



