
Huge "it depends", but typically organizations are not querying all of their data at once. Usually, they're processing it in some time-based increments.

Even if it's in the TB range, we're at the point where high-spec laptops can handle it (my own benchmarking: https://ibis-project.org/posts/1tbc/). When I tried to go up to 10 TB TPC-H queries on large cloud VMs, I did hit some malloc (or other memory) issues, but that was a while ago and I imagine DuckDB can fly past that these days too. Single-node definitely has limits, but it's hard to see how 99%+ of organizations really need distributed computing in 2025.
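For anyone who wants to sanity-check this at smaller scale, DuckDB's tpch extension generates the benchmark data and ships the official query text. A rough sketch (sf = 10 is ~10 GB of raw data; sf = 1000 is roughly the 1 TB workload from the post above):

    import duckdb

    con = duckdb.connect("tpch.duckdb")
    con.sql("INSTALL tpch; LOAD tpch")

    # Generate TPC-H data at scale factor 10 (~10 GB raw).
    con.sql("CALL dbgen(sf = 10)")

    # Run TPC-H query 1; the query text ships with the extension.
    con.sql("PRAGMA tpch(1)").show()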


