Hacker News

It’s difficult to accept the results without looking at the query plans to see how the queries are running and if they are optimal. Seems like it’s just a straight dump of data into PostgreSQL and letting SQLAlchemy handle some queries and indexes but no analysis of the indexes at all.

You could probably squeeze more perf out of both SQLite and PostgreSQL.
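To make "looking at the query plans" concrete on the SQLite side, here is a minimal sketch using Python's built-in `sqlite3` module and `EXPLAIN QUERY PLAN`. The `readings` table and `idx_readings_device` index are hypothetical names made up for illustration, not the article's schema. On PostgreSQL the analogous step would be running `EXPLAIN` (or `EXPLAIN ANALYZE`) on the same query.

```python
import sqlite3

# Hypothetical schema for illustration only; not the article's actual tables.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings (id INTEGER PRIMARY KEY, device_id INTEGER, value REAL)"
)
conn.executemany(
    "INSERT INTO readings (device_id, value) VALUES (?, ?)",
    [(i % 50, float(i)) for i in range(1000)],
)


def plan(sql, params=()):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


query = "SELECT value FROM readings WHERE device_id = ?"

# Before the index exists, SQLite has to scan the whole table.
plan_without_index = plan(query, (7,))

# After the index exists, the planner can search through it instead.
conn.execute("CREATE INDEX idx_readings_device ON readings (device_id)")
plan_with_index = plan(query, (7,))

print(plan_without_index)
print(plan_with_index)
```

The exact wording of the plan text varies between SQLite versions, so it is safer to look for the index name in the output than to match the whole string.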



The point of this exercise was to determine how these databases perform with the same schema, and if it makes sense to make the jump from SQLite to PostgreSQL or not. As a side note and since you ask, the indexes on this database are fairly well thought out, I have exhausted all ideas on indexing improvements to help with performance. I do mention a major implementation enhancement which is to precalculate monthly totals.

Also, another important point I tried to make is that this benchmark is useful for this particular deployment. I don't really want to convince you that the numbers I've got will also apply to you; my proposal is that you should test your application and make decisions based on your own results, which can certainly be different from mine.


The indexes /may/ be pretty well thought out, but that doesn’t mean that they are useful.

Indexes don’t work the same between SQLite and PostgreSQL so a 1:1 transition between the two databases on indexes probably isn’t going to yield the best results.

For example UUIDs were stored incorrectly in PostgreSQL.

SQLite does not enforce varchar lengths, and typically in PostgreSQL you just use text type and not impose any length restrictions unless absolutely necessary.

PostgreSQL can use multiple indexes in a single query, which may help with the 2 GB RAM limit.

PostgreSQL supports partitioning which could help with the range queries.

All I’m saying is that imo the performance you get from SQLite could be better, and the performance from PostgreSQL a lot better.
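The point above about SQLite not enforcing varchar lengths is easy to check. This is a small sketch with Python's built-in `sqlite3` module, using a hypothetical `notes` table. In an ordinary (non-STRICT) SQLite table, the declared length only contributes to the column's type affinity, so the long value is stored whole, whereas PostgreSQL would reject the same insert into a `varchar(5)` column with a "value too long" error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table; in SQLite the "(5)" is accepted but not enforced.
conn.execute("CREATE TABLE notes (body VARCHAR(5))")

# Insert a value 20x longer than the declared length.
conn.execute("INSERT INTO notes (body) VALUES (?)", ("x" * 100,))

# Read back how many characters were actually stored.
stored_length = conn.execute("SELECT length(body) FROM notes").fetchone()[0]
print(stored_length)
```

So a schema copied 1:1 from SQLite to PostgreSQL can start enforcing limits that were never actually in effect before.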


> typically in PostgreSQL you just use text type and not impose any length restrictions unless absolutely necessary.

There was a horror story posted on here a while back about some old API causing an outage because there happened to be a random text field that a user was effectively using as an S3 bucket. Their conclusion was to always set a sane length constraint (even if very large, say 1MiB for the smallest fields) on field sizes.

Hyrum's Law. Yes user input validation, blah blah, but that is infinitely more surface area than having a sane (or absurdly large) limit by default.


Right, but use a check constraint and not a data type constraint (i.e. `varchar(x)`).
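A rough sketch of the check-constraint approach, run against SQLite through Python's built-in `sqlite3` module so it is self-contained. The `comments` table, the `body_max_len` constraint name, and the 1000-character limit are all made up for illustration. A `CHECK (length(body) <= 1000)` clause of the same shape is also valid in PostgreSQL, where the column would typically be `text`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: unbounded TEXT column plus a named length check.
conn.execute(
    """
    CREATE TABLE comments (
        id   INTEGER PRIMARY KEY,
        body TEXT NOT NULL,
        CONSTRAINT body_max_len CHECK (length(body) <= 1000)
    )
    """
)

# A value within the limit is accepted.
conn.execute("INSERT INTO comments (body) VALUES (?)", ("short",))

# A value over the limit violates the check and raises IntegrityError.
try:
    conn.execute("INSERT INTO comments (body) VALUES (?)", ("x" * 1001,))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT count(*) FROM comments").fetchone()[0]
print(rejected, row_count)
```

Because the limit lives in a named constraint rather than in the column type, adjusting it later means swapping the constraint, not changing the column's data type.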

That sounds more like a code review issue -- `blob` would've been an appropriate storage type in that case.


> That sounds more like a code review issue -- `blob` would've been an appropriate storage type in that case.

No, it was supposed to be a short text field. Not an infinite storage endpoint!


Personally I use and prefer varchar(1000); it's extremely easy to change in the future if needed.


There are also a shitload more indexing options available to you in Postgres.


> As a side note and since you ask, the indexes on this database are fairly well thought out, I have exhausted all ideas on indexing improvements to help with performance. I do mention a major implementation enhancement which is to precalculate monthly totals.

So, this is what data warehousing is all about: reporting-related optimizations (typically what monthly totals and such are aimed at). The idea is that you trade disk for speed, so that kind of optimization is actually pretty normal if you've identified it as a bottleneck. There's no need necessarily to separate data warehousing constructs into another DB, though it's often good practice because sooner or later you're going to start running into problems on your important system related to storage inflation and non-indexed sorts/filters.
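As a minimal sketch of the "precalculate monthly totals" idea, assuming a hypothetical `sales` detail table and a `monthly_totals` summary table (neither comes from the article), using Python's built-in `sqlite3` module. The summary table costs extra disk but lets reports read a handful of rows instead of aggregating every detail row each time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical detail table with one row per sale.
conn.execute("CREATE TABLE sales (sold_on TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("2023-01-05", 10), ("2023-01-20", 15), ("2023-02-03", 7)],
)

# Hypothetical summary table: one precalculated row per month.
conn.execute(
    "CREATE TABLE monthly_totals (month TEXT PRIMARY KEY, total INTEGER NOT NULL)"
)


def refresh_monthly_totals(conn):
    """Rebuild the summary table from the detail rows in one transaction."""
    with conn:
        conn.execute("DELETE FROM monthly_totals")
        conn.execute(
            """
            INSERT INTO monthly_totals (month, total)
            SELECT strftime('%Y-%m', sold_on), SUM(amount)
            FROM sales
            GROUP BY strftime('%Y-%m', sold_on)
            """
        )


refresh_monthly_totals(conn)

# Reports now read the small summary table instead of scanning sales.
totals = dict(conn.execute("SELECT month, total FROM monthly_totals"))
print(totals)
```

A full rebuild is the simplest variant; an incremental refresh of only the current month would be the obvious next step if the rebuild itself became slow.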


I'm surprised you are the only comment mentioning this. I don't have quite enough knowledge to speak up about it here, but it seems that while this article is about SQLite vs PostgreSQL, neither is really the ideal answer in this scenario.


exactamundo



