> the fulltext tool can and should hold only 'active' data
very possible with postgres, too. instead of augmenting your primary table to support search, you would have a secondary/ephemeral table serving search duties
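roughly like this, as a sketch (the table and column names are hypothetical; the generated column needs PostgreSQL 12+):

    -- hypothetical: a dedicated search table, separate from the primary data
    CREATE TABLE article_search (
        article_id  bigint PRIMARY KEY,
        title       text,
        body        text,
        search_vec  tsvector GENERATED ALWAYS AS (
            setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
            setweight(to_tsvector('english', coalesce(body,  '')), 'B')
        ) STORED
    );

    CREATE INDEX article_search_vec_idx ON article_search USING gin (search_vec);

    -- querying it:
    -- SELECT article_id FROM article_search
    --  WHERE search_vec @@ websearch_to_tsquery('english', 'some query');

keep it populated by a trigger, by the app, or by a periodic refresh from the primary table, whatever fits.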
> as the data size is smaller, it fits better in RAM
likewise, a standalone table for search helps here, containing only the relevant fields and attributes. this can be further optimized by using partial indexes.
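for instance, assuming the search table carries an active flag (again just a sketch):

    -- hypothetical: keep only 'active' rows in the index so it stays small and cache-friendly
    ALTER TABLE article_search ADD COLUMN active boolean NOT NULL DEFAULT true;

    CREATE INDEX article_search_active_idx
        ON article_search USING gin (search_vec)
        WHERE active;

    -- the planner only considers this index when the query also filters on active:
    -- SELECT article_id FROM article_search
    --  WHERE active AND search_vec @@ websearch_to_tsquery('english', 'foo');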
> as FTS tools are usually schema-less, there is no outage during schema changes (compared to doing changes in an ACID db)
postgresql can be used in this manner by using json/jsonb fields. instead of defining every field, just define one field and drop whatever you want in it.
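a minimal sketch of that, assuming the searchable text sits under a couple of known keys (names are made up):

    -- hypothetical: one jsonb column instead of a fixed schema
    CREATE TABLE search_docs (
        id   bigserial PRIMARY KEY,
        doc  jsonb NOT NULL
    );

    -- expression index over whichever keys should be searchable
    CREATE INDEX search_docs_fts_idx ON search_docs USING gin (
        to_tsvector('english',
            coalesce(doc->>'title', '') || ' ' || coalesce(doc->>'body', ''))
    );

    -- the query has to repeat the same expression for the index to be used:
    -- SELECT id FROM search_docs
    --  WHERE to_tsvector('english',
    --            coalesce(doc->>'title', '') || ' ' || coalesce(doc->>'body', ''))
    --        @@ websearch_to_tsquery('english', 'postgres fulltext');

adding a new field is then just putting a new key into the jsonb, no ALTER TABLE, no migration outage.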
> as the indexed data are mostly read-only, they can be easily backed up
same for postgres. the search table can be exported very easily as parquet, csv, etc.
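csv is built in (parquet needs an external tool or a foreign data wrapper), e.g. with the hypothetical table from above:

    -- server-side CSV export of the search table
    COPY (SELECT article_id, title, body FROM article_search)
      TO '/tmp/article_search.csv' WITH (FORMAT csv, HEADER true);

    -- or from psql on the client, without touching the server filesystem:
    -- \copy article_search TO 'article_search.csv' WITH (FORMAT csv, HEADER true)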
> as the backups are smaller, restoring a backup can be very fast
tbh, regardless of the underlying mechanism, if your search index is derived from upstream data it is likely easier to just rebuild it than to restore a backup of throwaway data.
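e.g. with the separate-table approach above, 'restoring' is just repopulating from the source of truth (the articles table and its archived flag are made-up names):

    -- hypothetical rebuild: wipe the derived table and refill it from the primary data
    TRUNCATE article_search;

    INSERT INTO article_search (article_id, title, body)
    SELECT id, title, body
      FROM articles
     WHERE NOT archived;   -- only 'active' data, per the point above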
> PostgreSQL has had several longer outages - during major upgrades, because of disk corruption, because of failed schema migrations, because of 'problems' between the chair and keyboard, etc...
to be fair, these same issues can happen with elasticsearch or any other tool.
PostgreSQL has to handle writes, reports, etc., so I doubt it will cache as efficiently as a dedicated full-text engine; you'll need full or partial replicas to distribute the load.
And yes, I agree, almost all of this can be done with separate search table(s), but those table(s) will still live in a 'crowded house', so again replicas will probably become necessary at some point.
And using replicas brings a new set of problems and costs ;-)
One client used MySQL for fulltext search on a single beefy RDS server costing well over $1k per month, and the costs kept rising. It was replaced with a single ~$100 EC2 machine running Meilisearch.
how big was your data in solr?