
I'm not saying you need to follow every "best practice" blindly, but if insert speed is your problem (at a high level), batching is the very first thing to explore.

Without any specific numbers backing up the 10x, we can only guess what improved 10x. All of those things you listed show up as CPU wait events as well. Without specifics, I assume he means they inserted the same row count in 1/10th of the time, not that there was a direct drop in CPU tasks.
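For anyone unsure what batching looks like in practice, here is a minimal sketch, not the author's actual code. It groups rows into chunks and builds one multi-row INSERT per chunk instead of one statement per row. The table name `events`, the columns `id` and `name`, and the helpers `chunked` and `multirow_insert` are all made up for illustration, and the `%s` placeholders assume a driver that uses that parameter style (psycopg does). Nothing here touches a database; it only builds the statements and parameter lists.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Sequence, Tuple


def chunked(rows: Iterable[Sequence], size: int) -> Iterator[List[Sequence]]:
    """Yield successive lists of at most `size` rows."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def multirow_insert(
    table: str, columns: Sequence[str], batch: Sequence[Sequence]
) -> Tuple[str, list]:
    """Build one INSERT statement covering every row in `batch`.

    `table` and `columns` are interpolated directly into the SQL text,
    so they must come from trusted code, never from user input.
    Row values are passed separately as parameters.
    """
    row_placeholder = "(" + ", ".join(["%s"] * len(columns)) + ")"
    sql = "INSERT INTO {} ({}) VALUES {}".format(
        table,
        ", ".join(columns),
        ", ".join([row_placeholder] * len(batch)),
    )
    # Flatten the rows into one parameter list, in placeholder order.
    params = [value for row in batch for value in row]
    return sql, params


# Hypothetical data: five rows, sent in batches of two.
rows = [(i, "event-%d" % i) for i in range(5)]
statements = [
    multirow_insert("events", ["id", "name"], batch)
    for batch in chunked(rows, 2)
]
```

Each `(sql, params)` pair would then be handed to the driver's `execute` call. Five rows become three statements here. How much that helps in a real system depends on where the time actually goes, which is the point being argued in this thread.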



Author here. We were under the assumption that CPU was mostly being used for evaluating the partial index predicates. Under this assumption, we figured batching was unlikely to yield much of a benefit. It wasn't until we actually profiled Postgres that we realized batching would be worth a try.

As for the numbers, we specifically got a 10x improvement in ingestion throughput.



