This is really awesome. I've spent a ton of time optimizing high-throughput PG instances. Every time one of my optimizations failed, it was from an incomplete understanding of the physical storage layer and/or MVCC.
>First, TOAST stands for The Oversized-Attribute Storage Technique, probably the best acronym in the history of Computer Science.
Actually, that prize IMHO goes to GIN (Generalized Inverted Index), which I actually believed to be a coincidence until the people behind Postgres' GIN index announced their new VODKA index format.
TWAIN isn't actually an acronym - it comes from "and never the twain shall meet", replacing east and west in that line with scanners, printers and other digital image sources/sinks, representing the difficulty people had getting examples of such technology to cooperate with each other.
I took a database class in college that had a decent amount of theory behind it, and I found it pretty interesting how Postgres (and, I presume, other databases) handles multiple simultaneous transactions, including the isolation level of each transaction (read uncommitted, read committed, repeatable read, and serializable).
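To make the difference between two of those levels a bit more concrete, here is a toy Python sketch of snapshot-based reads. It is not how Postgres is actually implemented: `ToyStore`, `Reader`, and the single-row model are all made up for illustration, and the only idea it tries to capture is that a read committed reader takes a fresh snapshot for every statement while a repeatable read reader keeps one snapshot for the whole transaction (taken at transaction start here, purely for simplicity).

```python
from dataclasses import dataclass


@dataclass
class Version:
    value: int
    commit_seq: int  # sequence number at which the writer committed


class ToyStore:
    """Hypothetical single-row store that keeps every committed version."""

    def __init__(self, initial: int) -> None:
        self.seq = 0
        self.versions = [Version(initial, 0)]

    def commit_write(self, value: int) -> None:
        # Each committed write appends a new version instead of overwriting.
        self.seq += 1
        self.versions.append(Version(value, self.seq))

    def read_as_of(self, snapshot_seq: int) -> int:
        # Newest version committed at or before the snapshot is visible.
        visible = [v for v in self.versions if v.commit_seq <= snapshot_seq]
        return visible[-1].value


class Reader:
    """Hypothetical reading transaction with a chosen isolation level."""

    def __init__(self, store: ToyStore, isolation: str) -> None:
        self.store = store
        self.isolation = isolation
        self.snapshot = store.seq  # toy assumption: snapshot at start

    def read(self) -> int:
        if self.isolation == "read committed":
            # Fresh snapshot per statement: sees anything committed so far.
            self.snapshot = self.store.seq
        return self.store.read_as_of(self.snapshot)


store = ToyStore(100)
rc = Reader(store, "read committed")
rr = Reader(store, "repeatable read")

first = (rc.read(), rr.read())
store.commit_write(200)  # another transaction commits in between
second = (rc.read(), rr.read())

print(first, second)  # (100, 100) (200, 100)
```

The read committed reader sees the new value on its second read (a non-repeatable read), while the repeatable read reader keeps seeing the value from its original snapshot. Serializable and write conflicts are left out of the sketch entirely.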
> Coming from pretty heavy background in MSSQL internals, this article is really great.
Do you know of any good resources for the undocumented function fn_dblog? I'm looking to understand the structure of RowLog Contents and Log Record in different Operations/Contexts to reconstruct DDL/DML.
It wouldn't be particularly hard to raise it, although it'd require some code restructuring/duplication (for on-disk backward compat). I doubt it's really worth it anytime soon; such large datums imo have bigger problems than the length limits.
These 2 docs really help with a deeper understanding if you want to learn more: http://momjian.us/main/writings/pgsql/internalpics.pdf http://momjian.us/main/writings/pgsql/mvcc.pdf