Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

playing devils avocate

> the fulltext tool, can and should hold only 'active' data

very possible with postgres, too. instead of augmenting your primary table to support search, you would have a secondary/ephemeral table serving search duties

> as data size is smaller, it better fits in RAM

likewise, a standalone table for search helps here, containing only the relevant fields and attributes. this can be further optimized by using partial indexes.

> as FTS tools are usually schema-less, there is no outage during schema changes (compared to doing changes in ACID db)

postgresql can be used in this manner by using json/jsonb fields. instead of defining every field, just define one field and drop whatever you want in it.

> as the indexed data are mostly read-only, the can be easily backup-ed

same for postgres. the search table can be exported very easily as parquet, csv, etc.

> as the backups are smaller, restoring a backup can be very fast

tbh regardless of underlying mechanism, if your search index is based on upstream data it is likely easier to just rebuild it versus restoring a backup of throwaway data.

> The PostgreSQL has had several longer outages - during major upgrades, because of disk corruption, because of failed schema migrations, because of 'problems' between the chair and keyboard etc...

to be fair, these same issues can happen with elasticsearch or any other tool.

how big was your data in solr?



PostgreSQL is over 90GB, SOLR is under 1GB.

The PostgreSQL has to handle writes, reports, etc..., so I doubt it will cache as efficiently as full-text engine, you'll need to have full or partial replicas to distribute the load.

And, yes, I agree, almost all of this can be done with separate search table(s), but this table(s) will still live in a 'crowded house', so again replicas will be probably necessary at some point.

And using replicas brings new set of problems and costs ;-)

One client used MySQL for fulltext search, it was a single beefy RDS server, costing well over $1k per month and the costs kept raising. It was replaced with a single ~$100 EC2 machine running Meilisearch.


Meilisearch is good, probably sufficient for a vast majority of use cases.

In fact self hosting it is so easy and performant that I wonder how their paid saas is doing.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: