jamie=# ((select a from nums) union (select a from nums)) order by b;
ERROR: column "b" does not exist
This seems to point to a missunderstanding of how sql works. It's not a table query language, but a set query language. Tables are just a way to store sets while thay are not in use.
(select a from nums)
This part creates a new set from the "a" column of nums. The column name is carried over as convinience but the nums table is not visable from the union.
What I’ve noticed in my years with various databases is that many SQL parsers error on ambiguity and not necessarily strict compliance. In the case above the parser is likely dropping the parentheses where in the union there’s ambiguity because of the use of two sets and it can’t tell between which set’s b you’d like to order by.
I wonder what it would do if you aliased the subquery and ordered by alias_one.b
I think it's even simpler. Starting a subquey here doesn't make sense since there is no main query. So the parens gets dropped promoting this query to main. Same with the union
But, if "QE is a <query expression body> that is a <query term> that is a <query primary> that is a <simpletable> that is a <query specification>" then the <order by clause> may select columns from the <table expression> even though those don't really exist any more by the time the <order by clause> runs. It's tricky to implement.
I agree, I should have written that it helps to think in sets rather than tables. SQL implementors do a lot of fantastic shit to get our queries to run in a reasonable time, and SQL is not one of those langages that are like "understand this one core principle, and you get it" like lisp, prolog, TCL, Smalltalk, Forth etc. Have a look at the query plan if your DBMS can output it, it's not always easy to guess what it will spit out. In the case of
(select a from nums) order by b;
my best guess is that the parser helpfully strips out the parens as because the "order by" needs to be a part of a table expression to make sense. This expression is a syntax error in SQLite by the way.
IIRC most SQL engines do FROM → WHERE → GROUP BY → HAVING → ORDER BY → SELECT → LIMIT → UNION, which is why you can SELECT after ORDER BY in the previous example but not after union.
If I were to start SQL from scratch I would allow statements in an arbitrary order and execute them in whichever order I coded them.
But then i will not be SQL. SQL is built like this because all these operations are projections on a set. You can not really change their order and have meaningful results.
Some oeprators can be swapped, but this part of the optimization process and nothing you need to be aware of.
that hellscape doesn't exist and I am glad it doesn't. The order should just be considered a spec and accepted and worked around -- I think SQL has proven its staying power over the years.