
I wasn’t able to achieve the same performance with Redshift aggregates. I tried that first, before I decided to migrate from Redshift to Druid back in 2014. We deal with dozens of dimensions per event, and no combination of distribution keys in Redshift was able to give us the same performance over arbitrary scans + aggregations.

Druid is fast not only because it pre-aggregates, but also because its in-memory structure is designed for scans over hundreds of dimensions.
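
To make the pre-aggregation part concrete, here is a rough sketch of ingestion-time rollup in plain Python. It is not Druid's actual implementation; the `rollup` function, the field names, and the sample events are all made up for illustration. The idea is only that rows sharing a time bucket and the same dimension values collapse into one row with summed metrics, so later scans touch fewer rows.

```python
from collections import defaultdict


def rollup(events, dimensions, metric, granularity_seconds=3600):
    """Sum `metric` per (time bucket, dimension values).

    Hypothetical helper: `events` is a list of dicts with an epoch
    timestamp under "ts"; nothing here is a Druid API.
    """
    totals = defaultdict(int)
    for event in events:
        # Truncate the timestamp down to the start of its bucket.
        bucket = event["ts"] - event["ts"] % granularity_seconds
        key = (bucket,) + tuple(event[d] for d in dimensions)
        totals[key] += event[metric]
    return dict(totals)


# Made-up sample data: four raw events, two of which share a bucket
# and the same dimension values.
events = [
    {"ts": 3600, "country": "US", "device": "ios", "clicks": 1},
    {"ts": 3700, "country": "US", "device": "ios", "clicks": 2},
    {"ts": 3800, "country": "DE", "device": "web", "clicks": 5},
    {"ts": 7300, "country": "US", "device": "ios", "clicks": 4},
]

rolled = rollup(events, ["country", "device"], "clicks")
for key, clicks in sorted(rolled.items()):
    print(key, clicks)
```

With dozens of dimensions the number of distinct keys grows quickly, so how much this saves depends on the cardinality of the data.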

Materializing views in BigQuery is just one DAG task. Unless you have nothing like Airflow in your stack, I don’t see how it is worth mentioning. We are talking about denormalized, time series data.

I am speaking from experience with each one of these products. Perhaps I did it all wrong, but we certainly achieved the objectives we were after.


