Unrelated to the post, but since you seem well informed in the field: would you agree that if a schema is not likely to change and is controlled, as you put it, there is no reason to attempt to store that data as a denormalized document?
Or at least, as you suggest: where required for performance, the data would still be stored denormalized, materialized / document-ized as needed?
At my current company there seems to be a belief that everything should be moved off SQL Server and onto Mongo / Cosmos (as a document store) for performance reasons. But I really think the issue is that the code uses an in-house ORM that requires code generation for schema changes and probably generates less-than-ideal queries.
Then again, I am also aware of the ease of horizontal scaling with the more NoSQL-oriented products, and I am trying to be aware of my bias as someone who did not write the original code base.
> would you agree that if a schema is not likely to change and is controlled, as you put it, there is no reason to attempt to store that data as a denormalized document
As a general rule of thumb, yes. Starting with denormalization often opens you up to all sorts of data consistency issues and data anomalies.
> Denormalization is a strategy used on a previously-normalized database to increase performance.
The nice thing about starting with a normalized schema and then materializing denormalized views from it is that you always have a reliable source of truth to fall back on (and you'll appreciate that, on a long enough timeline).
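To make that concrete, here's a minimal sketch using Python's built-in sqlite3 (the table and field names are made up for illustration): the normalized tables stay the source of truth, and the denormalized document is something you can regenerate at any time.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL REFERENCES authors(id),
        title TEXT NOT NULL
    );
    INSERT INTO authors VALUES (1, 'Ada');
    INSERT INTO posts VALUES (10, 1, 'Normalize first');
""")

def materialize_author_doc(author_id: int) -> dict:
    """Rebuild a denormalized 'author' document from the normalized tables.

    The tables remain the source of truth; this document is disposable and
    can always be regenerated if it drifts or its shape needs to change.
    """
    (name,) = conn.execute(
        "SELECT name FROM authors WHERE id = ?", (author_id,)
    ).fetchone()
    posts = conn.execute(
        "SELECT id, title FROM posts WHERE author_id = ?", (author_id,)
    ).fetchall()
    return {
        "id": author_id,
        "name": name,
        "posts": [{"id": pid, "title": title} for pid, title in posts],
    }

print(json.dumps(materialize_author_doc(1)))
```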
You also tend to get better data validation, referential integrity, type checking, and data compactness with a lot less effort. That is, it comes built into the DB rather than requiring some additional framework or serialization library in your application layer.
I guess it's worth noting that denormalized data and document-oriented data aren't strictly the same, but they tend to be used in similar contexts with similar patterns and trade-offs (you could, however, have normalized data stored as documents).
Typically I suggest you start by caching your API responses, possibly breaking one API response into multiple cache entries along what would be document boundaries. Denormalized documents are, through a certain lens, basically cache entries with an infinite TTL, so it helps to start by thinking of them as a cache. And if you give them a TTL, then at least when you get inconsistencies, or need to make a massive migration, you just have to wait a little bit and the data corrects itself for "free".
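As a rough illustration of that idea (a toy in-process cache with hypothetical names and fetch functions; in practice you'd use something like Redis or Memcached, which support TTLs natively):

```python
import time

class TTLCache:
    """Toy in-process cache; stands in for Redis/Memcached in this sketch."""

    def __init__(self):
        self._store = {}  # key -> (expires_at, value)

    def set(self, key, value, ttl_seconds):
        self._store[key] = (time.monotonic() + ttl_seconds, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            # Expired entries fall out, so stale data corrects itself "for free".
            del self._store[key]
            return None
        return value

cache = TTLCache()

def get_order_page(order_id, fetch_order, fetch_customer):
    """Assemble one API response from per-document cache entries.

    Each entry sits on what would be a document boundary; a denormalized
    document store would hold roughly the same shapes, just with no TTL.
    """
    order = cache.get(f"order:{order_id}")
    if order is None:
        order = fetch_order(order_id)  # hit the normalized DB
        cache.set(f"order:{order_id}", order, ttl_seconds=300)

    customer = cache.get(f"customer:{order['customer_id']}")
    if customer is None:
        customer = fetch_customer(order["customer_id"])
        cache.set(f"customer:{order['customer_id']}", customer, ttl_seconds=300)

    return {"order": order, "customer": customer}
```

The point of the per-entry keys is that each cached piece can be invalidated or expire independently, which is exactly the property you lose when you bake everything into one big denormalized document.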
Also, there are really great horizontally scalable caching solutions out there and they have very simple interfaces.
Thanks for your response. The comparison between infinite-TTL cache entries and denormalized docs is an insight I can't say I've had before, and it makes intuitive sense.