Do different kinds of indexes work better for columnar storage? Or is it the sam...

hodgesrm · on Aug 3, 2023

Different principles of indexing, at least based on my experience with ClickHouse.

* Column-based stores have really fast scans due to compression and vectorization, so you'll generally read down the whole column. The way to speed that up is to have "skip indexes" that let you skip blocks entirely, i.e., you don't even bother to read or decompress them.

* Commonly used indexes need to be very sparse, so they fit in memory even when tables run to hundreds of billions of rows.

* Finally, highly compressed columns can be used as indexes to filter data rapidly. ClickHouse calls this PREWHERE processing.

Edit: clarify skip indexes

pella · on Aug 3, 2023

We need a spatial index for spatial (columnar) data!

- https://www.crunchydata.com/blog/the-many-spatial-indexes-of...

- http://postgis.net/workshops/postgis-intro/indexing.html

- Spatial indexes for OSM in PostGIS (PDF) : https://pretalx.com/media/sotm2019/submissions/CAD93S/resour...