Hacker News

I don't know; my personal opinion is that RethinkDB had its head on straight, MongoDB is garbage, and still neither is so much better than PostgreSQL that I would switch away from it.

Postgres is the default datastore (because schemas are awesome) for me, and I haven't had a use case yet where I needed something that Postgres wouldn't do. Maybe if you have a very specific need, you'd reach for another datastore (come to think of it, I have successfully used Cassandra for multi-DC deployments of a distributed read-only store), but that's not the norm for me.



I was truly hoping that RethinkDB would be the Mongo that Mongo could have been: a NoSQL database, but one with joins; a NoSQL database that is actually CP [0]; a database that comes out of the box with granular real-time updates (so you don't even need to worry at first about the extra moving parts of a Redis server or other queue). For a business starting from scratch with those needs, I'm making do with Mongo, but I was anxiously waiting for RethinkDB to get more battle-tested... alas, it wasn't meant to be. And I fear that we'll be stuck with Mongo for quite a while.

The truth is that NoSQL has great promise, but the folks with actual money to spare today are the ones who want a better SQL database, because they can afford the tradeoff of development time for reliability that SQL administration and migration provide. And people who solve the SQL pain points, such as the excellent folks at CockroachDB, and the various folks building replication layers on top of Postgres, can hopefully keep the market for database startups alive long enough for someone else to fix the NoSQL landscape as well.

[0] https://aphyr.com/posts/284-call-me-maybe-mongodb , where one finds that Mongo isn't actually guaranteed consistent during partitions. That one did hit us in production, and we had to dig through a rollback file manually (shudder); fun times. On the other hand, see: https://aphyr.com/posts/330-jepsen-rethinkdb-2-2-3-reconfigu...


I think Rethink would be that, but I don't know; I've never found SQL databases that hard to manage. I've found NoSQL databases hard to manage, because, invariably, someone will be doing a write right when you're trying to do a data migration/transformation, and now you have a single record that looks wrong, and it's a bug waiting to happen on read, rather than on write (where it should be).
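The race described above can be sketched in a few lines of Python (the records and field names are made up for illustration): a migration renames `name` to `full_name`, a writer that raced the migration still inserts the old shape, and nothing complains until read time.

```python
# A schemaless "collection": just a list of dicts, with no write-time validation.
users = [{"_id": 1, "name": "Ada"}, {"_id": 2, "name": "Grace"}]

def migrate(collection):
    """Rename 'name' to 'full_name' on every record present right now."""
    for record in collection:
        record["full_name"] = record.pop("name")

migrate(users)

# A writer that raced the migration inserts the OLD shape -- nothing
# rejects it, because there is no schema to enforce.
users.append({"_id": 3, "name": "Edsger"})

def display_name(record):
    # Read-time assumption: the migration finished, so every record has 'full_name'.
    return record["full_name"]

names = []
for record in users:
    try:
        names.append(display_name(record))
    except KeyError:
        names.append(None)  # the stale record only blows up here, at read time

print(names)  # ['Ada', 'Grace', None]
```

With a SQL schema, the straggler's INSERT naming the old column would be rejected immediately, surfacing the bug on write instead of on some later read.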

Rethink definitely had its use cases, I just never saw it as my primary data store.


That's been my experience as well. Postgres handles everything until you hit an edge case on some of your data (high-volume writes with scale-out, performance-critical advanced functionality, heavy BI work with huge datasets, etc.).

When you hit those edge cases or specific needs, you implement the specialty solution. Most people never come close to those edge cases, though. When they do, PG's foreign data wrappers make the adjustment as smooth as possible.


> heavy BI work with huge datasets

Because of single threaded query execution, lack of MPP, or something else?


I agree Postgres is pretty awesome, but I think the one thing it has going against it is the lack of a good GUI client. MySQL has way too many good options, whereas Postgres is stuck on this front.


Please try Postage. It's free for all users, open source, comes in server and desktop distributions, and we listen to our users. You can get it at https://github.com/workflowproducts/postage

If it doesn't meet your needs please file an issue, our goal is to remove the no-good-GUI limitation that was imposed on postgres sixteen years ago.


What do you mean with "imposed"?


Imposed, as in, to put something in place on purpose.


Who imposed it? There's not exactly much central control over postgres in general, and just about none over the tooling built outside the core repo.


I've tried really hard to like Postgres, but coming from SQL Server land, the tooling is mind-blowingly atrocious. Every few months I go on a Google spree trying to find the magic tool to make it less painful, but everything is either just as awful as pgAdmin or costs way too much.


Microsoft gets a lot of shit for various and mostly good reasons. But their tooling is seriously top notch. I would murder kittens to get tools like SSMS for postgres.

Hell Python Tools for Visual Studio almost makes me want to do python dev work on a PC.


We would love to have your help on Postage. We want to be the Microsoft SQL Server Management Studio of PostgreSQL. Just file issues at GitHub for bugs and features; we'll do our best to keep up.


JetBrains DataGrip is also worth considering - another polished quality app from JetBrains: https://www.jetbrains.com/datagrip/


I've been meaning to try this. I'll check it out.


I worked with people in the past that refused to touch postgres because it doesn't have a UI they like. They insisted on using MySQL for that sole reason.


+1. The main reason I didn't jump on to Postgres decades ago was because the tooling was really bad. Third-party tools for MySQL were streets ahead, even back then.

End result is that I jumped on the MySQL bandwagon, and even all the negative posts against it now and the lauding of Postgres in here all the time cannot make me change tracks. I just have too much data and IP invested in it now.


Highly recommend DBeaver http://dbeaver.jkiss.org/


Just downloaded it, and it seems effective and well-polished. Thanks for the recommendation; MySQL Workbench has some killer bugs that made me stop using it.


Will try. Thanks


Postico is very nice. I've been using it (and its predecessor, PG Commander) for a few years. Worth the money I paid for it.

https://eggerapps.at/postico/


I developed and use(d) pgxplorer (github.com/pgxplorer). So, I can tell you this: the best out there is psql (CLI). There exists no use case where a GUI is better than psql.


Not even diagramming relationships between tables?


Those fall under db authoring/modeling tools. https://github.com/pgmodeler/pgmodeler is a good pick.


Postgres is horribly painful to cluster though, while RethinkDB clusters are trivially easy to set up.


Based on the postmortem, you weren't alone in thinking that way; it's just that all the people who did think so didn't value the difference enough to actually pay for it. And, indeed, when something like Postgres is free...


Rethink, Mongo, et al. are scalable (at the expense of some ACID guarantees). PG is certainly a good DB (I use it in production), but it is at heart a traditional RDBMS, not designed for scalability. Apples vs. pears.


What do you mean, PG isn't designed for scalability? Its SMP scaling is incredibly good, close to linear.


Postgres scales neither vertically nor horizontally.

v9.6 finally brought some parallel scans, but they are still limited in scope and far behind the commercial databases. The same goes for replication and failover, although there are decent third-party extensions to get them working.


The vast majority of us probably won't ever need that scalability anyway. Most places I've seen that needed something more scalable really just needed to stop being so careless with the tools they already had.


As for clustering in general... scalability may not be important, but replication and failover should be. I believe every place that has existed long enough must have had its "ouch, our master DB host went down" moment.

I wasn't able to set it up properly some years ago and settled for a warm standby replica (streaming + WAL shipping) and manual failover. Automating it (and even automating the recovery once the old master is back online and healthy) was certainly possible, and I got the overall idea of how it should operate. But the effort required to set it up was just too big for me, so I decided it wasn't worth the hassle and settled for the "uh, if it dies, we'll get alerted and switch to the backup server manually" scenario.
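For reference, a minimal sketch of that 9.x-era warm/hot standby setup (host names and archive paths here are placeholders):

```ini
# postgresql.conf on the primary
wal_level = hot_standby          # emit enough WAL for a standby to replay
max_wal_senders = 3              # allow streaming replication connections
archive_mode = on
archive_command = 'cp %p /wal_archive/%f'   # WAL shipping

# recovery.conf on the standby
standby_mode = 'on'
primary_conninfo = 'host=primary.example.com user=replication'
restore_command = 'cp /wal_archive/%f %p'   # fall back to shipped WAL
```

Failover was then a manual `pg_ctl promote` (or touching the configured `trigger_file`) on the standby, which is exactly the hands-on step described above.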

Having something along the lines of "you start a fresh server, tell it some other server address to peer with, and it does the magic (with exact guarantees and drawbacks noted in the documentation)" would be really awesome. RethinkDB is just like that. PostgreSQL - at least 8.4 - wasn't, unless I've really missed something. I haven't yet checked newer versions' features in any detail, so I'm not sure about 9.5/9.6.
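For contrast, a sketch of what that peering looks like with RethinkDB (the host name is a placeholder; 29015 is RethinkDB's default intracluster port):

```shell
# first node
rethinkdb --bind all

# each additional node: point it at any existing peer and it joins the cluster
rethinkdb --join existing-node.example.com:29015 --bind all
```

That is essentially the whole clustering story from the operator's point of view; replication and failover settings are then adjusted per table.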


> you start a fresh server, tell it some other server address to peer with, and it does the magic

This is the one thing I ask of every database software, yet we still don't really have it. 90% of problems could be solved if there was a focus on the basics like easy startup, config and clustering.


SQL Server is almost that easy, but still doesn't handle schema changes well.


How so? Aerospike, ScyllaDB, Rethink, MemSQL, Redis are the only databases that get close to this.

SQL Server availability groups require Windows Server Failover Clustering, which is not quick or easy.


You can try http://repmgr.org/ for automation.

It offers easy commands to perform a failover and even has an option to configure an automatic one.

After reading about GitHub's availability incident[1], I am a bit cautious about automatic failover, though.

[1] https://github.com/blog/1261-github-availability-this-week


Thanks for the link; I don't think I've seen this one before.

As for automation... things can always go wrong, sure. But I wonder how many times HA and automatic failover have saved the day at GitHub, so that no outside observers had the faintest idea something was failing in there.


Vertical scalability is important for everyone as it makes better use of hardware (lower costs or more performance).

Horizontal scalability is important for HA which every production environment would like or need.


I suppose the parent comment was more about scaling up for transactional workloads - the story there is a lot better than for analytics workloads (although there are still issues we haven't tackled, mostly on very big machines). But yes, while we made progress in 9.6, there are still a lot of important things lacking to scale a larger fraction of analytics queries.


Don't have experience with Rethink, but Mongo scalable? That's a joke right?

Here how it compares: http://www.datastax.com/wp-content/themes/datastax-2014-08/f...

The only values where Mongo comes out higher than the rest are on page 13, but those are latencies, where lower is better.

This paper shows clearly that Mongo doesn't scale.

And even on a single instance, Mongo is slower than Postgres with JSON data: https://www.enterprisedb.com/postgres-plus-edb-blog/marc-lin...

There are actually add-ons to Postgres[1] that add MongoDB protocol compatibility, and even with that overhead Postgres is still faster.

And even such benchmarks don't tell the full story. For example, I worked at one company that used Mongo for regional mapping (mapping latitude/longitude to a ZIP code and IP addresses to ZIPs). The database on Mongo was using around 30GB of disk space (and RAM, because Mongo performed badly if it couldn't fit all data in RAM), mainly because Mongo was storing data without a schema and also had a limited set of types and indices. For example, to do the IP-to-ZIP mapping they generated every possible IPv4 address as an integer; think how feasible that would be with IPv6.

With Postgres + ip4r (an extension that adds a type for IP ranges) + PostGIS (an extension that adds geo capabilities - the latitude/longitude lookups Mongo has, but way more powerful), things looked dramatically different.

After putting the data into correct types and applying proper indices, all of it took only ~600MB, which could fit in RAM on the smallest AWS instance (Mongo required three beefy machines with large amounts of RAM). Basically, Postgres showed how trivial the problem really was when you store the data properly.
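The range-versus-enumeration point can be illustrated in a few lines of Python (the ranges and ZIPs are made up): instead of materializing every address, store (start, end, zip) ranges sorted by start and binary-search them with `bisect`, which is essentially the lookup an ip4r index gives you inside Postgres.

```python
import bisect
import ipaddress

def ip_int(s):
    """Convert a dotted IPv4 (or IPv6) address to an integer."""
    return int(ipaddress.ip_address(s))

# (start, end, zip) ranges, sorted by start -- three rows instead of
# millions of enumerated addresses. Values are made up for illustration.
ranges = [
    (ip_int("10.0.0.0"),    ip_int("10.0.255.255"),  "10001"),
    (ip_int("172.16.0.0"),  ip_int("172.16.0.255"),  "94105"),
    (ip_int("192.168.1.0"), ip_int("192.168.1.255"), "60601"),
]
starts = [r[0] for r in ranges]

def zip_for_ip(ip):
    """Map an address to a ZIP by binary search over the range starts."""
    n = ip_int(ip)
    i = bisect.bisect_right(starts, n) - 1   # last range starting at or before n
    if i >= 0 and n <= ranges[i][1]:         # is n inside that range?
        return ranges[i][2]
    return None

print(zip_for_ip("10.0.3.7"))      # 10001
print(zip_for_ip("192.168.1.42"))  # 60601
print(zip_for_ip("8.8.8.8"))       # None
```

The same code handles IPv6 unchanged, since `ipaddress` converts both families to integers, whereas enumerating every IPv6 address is physically impossible.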

[1] https://github.com/torodb/torodb



