Based on my understanding of DB storage (which is decades old and as I write the...

jhgb · on July 11, 2022

> That data is stored in a similar order on disk (is this still true?).

There should be no requirement for this. Columns in relations are not conceptually ordered, so it shouldn't matter for the things you're doing with the data anyway. The database should be able to reorder the data in whatever way it likes, since the desire to isolate the user from physical data structures was one of the main reasons for the rise of RDBMS.

masklinn · on July 11, 2022

1. That’s not exactly true: AFAIK all RDBMS return columns in table order when order is unspecified (`*`), and while they could reorder on retrieval

2. postgres definitely does not, and column tetris is absolutely a thing in the same way struct packing is (with the additional complexity of variable-size columns)
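To make the struct-packing comparison concrete, here is a rough sketch of how alignment padding makes row size depend on column order. It is a simplified model, not PostgreSQL's actual on-disk format: it ignores the tuple header, null bitmap, and variable-size columns, and the column sizes, alignments, and the 8-byte row alignment are all assumptions chosen for illustration.

```python
def padded_row_size(columns, row_align=8):
    """Return the byte size of a row whose fixed-size columns are laid out
    in declaration order, each padded up to its own alignment.

    columns: list of (size, alignment) pairs (hypothetical values).
    row_align: assumed alignment of the whole row.
    """
    offset = 0
    for size, align in columns:
        # Round the offset up to the next multiple of this column's alignment.
        offset = (offset + align - 1) // align * align
        offset += size
    # Round the total up so the next row would start aligned.
    return (offset + row_align - 1) // row_align * row_align


BOOL = (1, 1)    # assumed 1-byte value, 1-byte alignment
INT = (4, 4)     # assumed 4-byte value, 4-byte alignment
BIGINT = (8, 8)  # assumed 8-byte value, 8-byte alignment

# Small and large columns interleaved: padding before each bigint.
interleaved = [BOOL, BIGINT, BOOL, BIGINT, INT]

# Same columns, widest first: the small ones share the tail.
widest_first = [BIGINT, BIGINT, INT, BOOL, BOOL]

print(padded_row_size(interleaved))   # 40
print(padded_row_size(widest_first))  # 24
```

Under these assumptions the same five columns take 40 bytes in one order and 24 in the other, which is the whole game of column tetris.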

jhgb · on July 11, 2022

But the order of attributes in the presentation of the result relation has nothing to do with physical layout of base relvars -- primarily because any relationship between the two is purely coincidental. The vast majority of useful queries will not reuse the order of attributes in base relvars, so optimizing for the unusual trivial case by prohibiting better rearrangements that could be useful for a much larger number of use cases seems rather pointless.

And what PostgreSQL does is of course an implementation detail of PostgreSQL.

MichaelApproved · on July 11, 2022

> the database should be able to reorder the data in whatever way it likes

My point was that you’re trying to prevent the DB from reordering the data because there’s a performance cost when that happens.

jhgb · on July 11, 2022

Why would you be trying to prevent the DB from reordering the data? You're not supposed to have better knowledge of what's good for your use case than an RDBMS that can collect usage statistics on queries and such. Ditto for compilers rearranging structures and such. When you start having hundreds of tables and thousands of queries, I don't see how you can do a better job than an automated system at that point.

MichaelApproved · on July 11, 2022

> Why would you be trying to prevent the DB from reordering the data?

Sure. “Reduce the need” would be a better phrase than “prevent”.

If I can do a good job organizing the table columns (as described above) it’ll lower the need for the DB to reorder data.

A reduced need to reorder improves performance.

jasonwatkinspdx · on July 11, 2022

DB storage is a lot more sophisticated than what you were taught.

Most databases use Slotted Pages to organize storage. Pages are fixed size and numbered by their offset within the database file. The page header contains the number of rows, followed by an array of offsets for individual rows within the page. Rows themselves generally are stored at the end of the page filling downwards. The storage engine can move around rows in arbitrary ways to consolidate free space.
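The slotted page layout described above can be sketched roughly as follows. This is a toy model under stated assumptions, not any particular engine's format: the 4096-byte page size, the 2-byte row count header, and the `SlottedPage` class are hypothetical, and each slot stores a length alongside the offset so rows can be read back without a row header.

```python
import struct

PAGE_SIZE = 4096               # assumed fixed page size
HEADER = struct.Struct("<H")   # page header: number of rows
SLOT = struct.Struct("<HH")    # slot entry: (row offset, row length)


class SlottedPage:
    def __init__(self):
        self.buf = bytearray(PAGE_SIZE)
        self.free_end = PAGE_SIZE  # rows grow downward from the page end

    def row_count(self):
        return HEADER.unpack_from(self.buf, 0)[0]

    def insert(self, row: bytes) -> int:
        """Store a row at the end of free space and return its slot number."""
        n = self.row_count()
        slots_end = HEADER.size + (n + 1) * SLOT.size
        start = self.free_end - len(row)
        if start < slots_end:
            raise ValueError("page full")
        self.buf[start:self.free_end] = row
        SLOT.pack_into(self.buf, HEADER.size + n * SLOT.size, start, len(row))
        HEADER.pack_into(self.buf, 0, n + 1)
        self.free_end = start
        return n

    def slot(self, slot_no: int):
        """Return the (offset, length) pair stored in the slot array."""
        if not 0 <= slot_no < self.row_count():
            raise IndexError(slot_no)
        return SLOT.unpack_from(self.buf, HEADER.size + slot_no * SLOT.size)

    def read(self, slot_no: int) -> bytes:
        offset, length = self.slot(slot_no)
        return bytes(self.buf[offset:offset + length])


page = SlottedPage()
first = page.insert(b"alice")
second = page.insert(b"bob")
```

Because callers only hold a slot number, the engine is free to move the row bytes within the page and just update the offset in the slot array.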

Fundamentally, there's no connection between SQL schema order and how table storage is organized on disk. For example, in a column store there's often no contiguous row stored anywhere; instead, there are just separate indexes per column.
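As a rough illustration of the column store case, the same table can be held as one sequence per column, with a row only materialized on demand. The table contents and the helper names `to_columns` and `get_row` are made up for this sketch, and real column stores add compression and indexing on top.

```python
# Row-oriented view: each row is one contiguous record.
rows = [
    {"id": 1, "name": "alice", "age": 30},
    {"id": 2, "name": "bob", "age": 25},
]


def to_columns(rows):
    """Split row records into one list per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def get_row(columns, i):
    """Reassemble row i by picking position i from every column."""
    return {name: values[i] for name, values in columns.items()}


columns = to_columns(rows)
print(columns["age"])       # [30, 25]
print(get_row(columns, 1))  # {'id': 2, 'name': 'bob', 'age': 25}
```

In this layout the declared column order has no effect on where any value lives; it only matters when a row is reassembled for output.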

lsaferite · on July 11, 2022

That would seem like an argument in favor of allowing column reordering.