This is a common operation for my team as well. On my old project, so many of the indexes we added were just assumptions about the usage patterns of the data. Not only did they blow up the size of the table and INSERT times, they also became a kind of reverse red herring: developers would stop digging and look elsewhere because the table "already has indexes".
Only adding indexes after careful use of EXPLAIN/EXPLAIN ANALYZE against the expected query patterns yielded good results. Often we'd just blow the indexes away and fiddle until we'd added the right one to enable an index scan / index-only scan.
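Roughly the loop, sketched against a hypothetical orders table (the table, columns and index name are made up for illustration):

    -- See what the planner actually does for the query we care about
    EXPLAIN ANALYZE
    SELECT customer_id, total
    FROM orders
    WHERE status = 'shipped' AND created_at > now() - interval '7 days';

    -- If the plan shows a sequential scan, try an index matching the predicate;
    -- INCLUDE-ing the selected columns lets the planner use an index-only scan
    CREATE INDEX CONCURRENTLY orders_status_created_at_idx
        ON orders (status, created_at) INCLUDE (customer_id, total);

    -- Re-run the EXPLAIN ANALYZE; if it didn't help, blow the index away and try again
    DROP INDEX CONCURRENTLY orders_status_created_at_idx;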
Other big boons for us have been using ENUM types where appropriate (a small, known set of values for a column) -- now your column takes 4 bytes instead of N bytes for a string. I find them a bit easier to work with than foreign keys for this optimization because of their direct string mapping.
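For example (a hypothetical order_status type and orders table, just to show the shape of it):

    -- A small, known set of values stored as a 4-byte enum instead of text
    CREATE TYPE order_status AS ENUM ('pending', 'shipped', 'delivered', 'cancelled');

    CREATE TABLE orders (
        id          bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
        customer_id bigint NOT NULL,
        status      order_status NOT NULL,
        total       numeric(12,2) NOT NULL,
        created_at  timestamptz NOT NULL DEFAULT now()
    );

    -- String literals map straight onto the enum; no lookup table involved
    INSERT INTO orders (customer_id, status, total) VALUES (42, 'shipped', 99.95);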
We've had really good mileage with Datadog's APM tools, RDS Performance Insights and pghero as you mentioned.
The ENUM type sounds like a great little tweak; I've got quite a few small lookup tables with a fixed set of rows, so I'll definitely investigate that.
I'm my own worst enemy when it comes to indexes, because I'm not only the database administrator but also the analyst. So when I'm trying to solve an ad-hoc problem and encounter a long-running query, there's a 99% chance I've "accidentally" created a new index to accelerate that query and then forgotten to remove it when I was done.
Lesson learned, and also a good argument for disaggregating the administrative and analytical user permissions.
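One way to catch those leftovers is a query against Postgres's built-in statistics views; a rough sketch (note the counters reset with pg_stat_reset, and unique/constraint indexes can legitimately show zero scans):

    -- Indexes that have never been used since statistics were last reset
    SELECT schemaname, relname, indexrelname,
           pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
    FROM pg_stat_user_indexes
    WHERE idx_scan = 0
    ORDER BY pg_relation_size(indexrelid) DESC;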
More recently I've got into the habit of creating a new schema when I'm analysing / patching data, so I can pull the data I need into new tables and index them accordingly. Because I do it from vim and have a plugin to execute the queries, I end up with a record of everything as I go (and I set the search_path to hit my new schema first so any new tables land there).
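Roughly what that looks like (the schema and table names here are made up):

    -- Scratch schema for the ad-hoc work; everything created here is easy to find later
    CREATE SCHEMA scratch;
    SET search_path TO scratch, public;

    -- Unqualified names now resolve to the scratch schema first
    CREATE TABLE slow_orders AS
        SELECT * FROM public.orders WHERE total > 1000;
    CREATE INDEX ON slow_orders (customer_id);

    -- When the analysis is done, the whole lot goes in one shot
    DROP SCHEMA scratch CASCADE;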
> Other big boons for us have been using ENUM types where appropriate (a small, known set of values for a column) -- now your column takes 4 bytes instead of N bytes for a string. I find them a bit easier to work with than foreign keys for this optimization because of their direct string mapping.
Can you give an example?
It sounds like normalising the schema would have similar (perhaps more) benefits.
Normalizing has similar size benefits. For our use case (big aggregate reporting tables), the incoming external data is not normalized, meaning we'd need another process to iterate over it with the referenced table in memory and map each value to its foreign key. Enums can be ingested transparently without this normalization step, while taking equivalent space. Normalizing also adds a level of indirection between the column and its value. Both are fine.
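A minimal sketch of the ingestion difference, using hypothetical staging/target tables (the enum version just casts the incoming string; the foreign-key version needs the join):

    -- Raw, denormalized incoming data
    CREATE TABLE staging_events (event_type text, payload jsonb);

    -- Enum column: incoming strings cast straight onto the enum, no lookup table involved
    CREATE TYPE event_kind AS ENUM ('click', 'view', 'purchase');
    CREATE TABLE events (event_type event_kind, payload jsonb);

    INSERT INTO events (event_type, payload)
    SELECT s.event_type::event_kind, s.payload
    FROM staging_events s;

    -- Foreign-key version: the same load has to resolve each string against the lookup table
    CREATE TABLE event_types (id int PRIMARY KEY, name text UNIQUE);
    CREATE TABLE events_fk (event_type_id int REFERENCES event_types(id), payload jsonb);

    INSERT INTO events_fk (event_type_id, payload)
    SELECT t.id, s.payload
    FROM staging_events s
    JOIN event_types t ON t.name = s.event_type;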
Postgres keeps ENUM definitions in its catalog caches, so they're effectively in memory at all times. A lookup table you have to join against is probably the slower option, all other things being equal. There are cases where it's worthwhile (dynamically expanding the set of values that would be in that enum, for example), but it's a tradeoff.