Solr and Elasticsearch are not relational database systems and have none of the optimizations for relational data that Postgres does. They are built on top of Lucene and target document retrieval via an inverted index [1] rather than dealing with relational data.
It's not just relational features that matter but the decades of work on reliability, performance, flexibility, security, and general usability of these databases that makes them so great.
ElasticSearch doesn't have ACID or very good OLTP manners. More importantly, it still continues to have data loss issues so it's not a good fit as a primary datastore.
It's not about results, the data itself within ElasticSearch is not guaranteed safe as a primary datastore, so you should have a reliable database for the primary data that is then synced to ES.
That's good for you, but it sounds like you're extrapolating it on just your experience rather than research? Perhaps you've been lucky? Do you really have the info to objectively state it as "simply not true"?
Read the jepsen tests (old but show core problems that aren't fixed):
After reading the links you gave, I’d like to clarify that I meant that edge cases that can cause data loss are not common enough, in my experience, for one to be forced to treat ElasticSearch as an unreliable data store. Anecdotally, I’ve never seen it happen. You’re right, maybe I’m lucky; I would like to see a case study of some project where these issues were significant enough to treat ElasticSearch as an unreliable data store.
I've never built something that did not need a relational database to model the domain, and search is such a common UX element that it's taken for granted in everything you use.
I can't even think of something that could use search that's not relational. Can you?
> I've never built something that did not need a relational database to model the domain
Yeah, so perhaps you don't have much experience from outside the relational model. Makes perfect sense to see a single truth then.
> I can't even think of something that could use search that's not relational. Can you?
Not sure how wide a blanket you're throwing by "not relational", but here are a couple examples that might fit: Google's search engine, most filesystems and their files, hierarchical and flat data structures in general, twitter.
Elasticsearch stores data in a JSON document store style format. So any situation where you need O(1) lookup for all data related to a particular entity you can have nested structures within JSON that facilitates this. In a RDBMS it is O(n) since you have to do a costly join for each embedded structure.
A document store is not a replacement for a relational database, and there are few cases where it's truly a better fit.
More importantly, ES is not a document store. It is a search system that can be used as a similarity index between all kinds of data, most commonly json docs but can be anything like images or audio waveforms.
ES can return none of the source data or not even store it in the first place. In addition, it lacks ACID and transactions and generally has an overall poor reputation for data integrity.
If you just want a json blob, you can already use a relational database column for it and get the key/value performance semantics, while still retaining all the reliability and manageability properties.
If you want to save JSON documents, you can pick Postgres [1] or MySQL and still have a RDBMS for free additionally.
Elasticsearch does that too, but its access certainly isn't O(1) - it has a complexity of O(log n), just like Postgres or MySQL would have, because all of them use tree-like index data structures. (Please note that ElasticSearch is not meant as a data storage layer)
Now, if you just want to have the JSON, you're done. O(log n) for ElasticSearch, Postgres or MySQL. That's it.
But normally, you want to do something with that data. You have a user and want all the products. Or you have a book and want all the authors, because you have a real app to write.
Most data in our world is relational: A student can partitipate in courses, a store sells products, and so on.
Queriyng your database using JOINs will make use of indexes in each of the tables that you have previously set up.
These indexes will prevent the theoretical worst-case (the "Full Table Scan", where your complexity is O(n) for the table). The query planner will decide which index is better to use for the optimal response time.
You can, however, choose to not leverage this feature of RDBMS. What do you do now, if you need all the courses of a student? Do you get all the JSONs and compare their IDs? Do you parse the JSON keys in your own app? But having your app iterate over all the JSON keys already has O(n) complxity. So you end up recreating tree-like indexes anyway somewhere in your stack.
But all of that is already provided for you, with decades of debugging and performance optimization - for free.
To sum up: Use Postgres JSONB type if you only need to write JSON documents. You might later want to query the nested JSON data structure and join it. I highly recommend "Mastering PostgreSQL in Application Development" [2]
The comparison was between ElasticSearch and a relational store. PostgreSQL's JSON type is not relational. And if you are fetching any entity by a primary key then it should be O(1). The point was that document stores offer you the ability to nest data structures whilst relational forces you to join. And so if that is your query/data pattern then a document store can be orders of magnitude faster than a purely relational store e.g. 360 Customer View.
Primary key fetching will always be O(log n) due to how indexes are implemented. Access time complexity will never be independent of the item count in your data store, relational or not.
There's not many use cases I come across where the 'nested' data is the only instance of that data. A student record might have some nested class information, but the class information itself lives as a master someplace else. If you nest things, you now have data living in multiple places, no? Updates need to update multiple locations, which would potentially introduce lots of errors.
I've got a handful of situations where nested structures are making sense, but it doesn't seem to be the norm (at least in the data worlds I work in).
[1] https://en.wikipedia.org/wiki/Inverted_index