
Superior in what ways?



With Elasticsearch and Solr, you can easily customize analysis and scoring. Several scoring algorithms are built in [1], such as BM25 (now the default), which is considered the state of the art for keyword relevance. For analysis, you can remove stopwords, stem, apply synonyms, etc. [2]. Elasticsearch is specifically designed to scale across multiple machines, which is necessary for TB-sized datasets. There are also things like "more like this" queries and context-aware spell checking. You can do some of that with PG, but not all of it, and what PG can do is usually harder to set up.

1. https://www.elastic.co/guide/en/elasticsearch/reference/curr...

2. https://www.elastic.co/guide/en/elasticsearch/guide/current/...
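
To make the analysis and scoring steps above concrete, here is a minimal sketch of stopword removal followed by BM25 scoring in plain Python. It is not how Elasticsearch or Solr implement it internally. The function names (`analyze`, `bm25_scores`) and the toy stopword list are made up for illustration, the IDF uses the commonly cited Lucene-style smoothing, and `k1=1.2`, `b=0.75` are the conventional default parameters.

```python
import math
from collections import Counter

# Toy stopword list, an assumption for this sketch only.
STOPWORDS = {"the", "a", "an", "is", "of", "and", "to", "in"}


def analyze(text):
    """Lowercase, split on whitespace, strip punctuation, drop stopwords."""
    tokens = (tok.strip(".,!?;:\"'()") for tok in text.lower().split())
    return [tok for tok in tokens if tok and tok not in STOPWORDS]


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Return one BM25 score per document for the given query string."""
    analyzed = [analyze(d) for d in docs]
    n_docs = len(analyzed)
    if n_docs == 0:
        return []
    # Average document length in tokens; fall back to 1.0 to avoid
    # dividing by zero when every document is empty after analysis.
    avg_len = sum(len(d) for d in analyzed) / n_docs or 1.0

    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for d in analyzed:
        doc_freq.update(set(d))

    query_terms = analyze(query)
    scores = []
    for d in analyzed:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed IDF, always positive.
            idf = math.log(
                1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5)
            )
            # Term-frequency saturation with length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "PostgreSQL full text search",
    "Elasticsearch scales search across machines",
    "The cat sat",
]
scores = bm25_scores("search machines", docs)
```

In this toy corpus the second document matches both query terms and should rank highest, while the third matches neither and scores zero. Stemming and synonyms would slot into `analyze` as extra steps before the stopword filter.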


The PostgreSQL team is working on some of the weak spots. Please have a look at the new 'RUM' index, which should improve ranking:

https://www.pgcon.org/2016/schedule/attachments/436_pgcon-20...

https://github.com/postgrespro/rum


A rich library of tokenizers and analyzers. A well-tested analyzer model and pipeline. For full-text search, it supports several scoring modes that go beyond the trivial tf-idf model mentioned here. And how are you going to do field-centric ranking in Postgres?

As far as I am concerned, it is far superior to the goodies mentioned in this article.



