That is strong typing. Python, a dynamically typed language, exhibits the same q...

mytherin · on Jan 30, 2023

Types are static within the context of a query, not necessarily within the context of the lifetime of a database. Types are determined and propagated at query compile time. In a query, the value "v" refers to a VARCHAR column (as determined by the table definitions). At query compile time an error is thrown when a type violation is detected.

It is distinct from Python because in Python individual objects have types - and type resolution is done strictly at run-time. There is no compile time type checking because types do not exist at compile time at all. Note how in my example I am preparing (i.e. compiling) a query, whereas in your example you are executing the statement.

If you refrain from changing the types of columns in a database and prepare your queries you can detect all type errors before having to process a single row. That is not possible in a dynamically typed language like Python, and is very comparable to the guarantees that a language like Typescript offers you.

randomdata · on Jan 30, 2023

> Types are static within the context of a query

Types in Python are also static within the context of a statement. That's not particularly meaningful, though, as programs don't run as individual statements in a vacuum. Just like SQL, the state that has been built up beforehand is quite significant to what each individual statement ends up meaning.

> types do not exist at compile time at all.

Same goes for SQL. Given the previous program, it is impossible to determine what the type of v is until you are ready to execute the SELECT statement. If the CREATE TABLE statement fails, v will not be the VARCHAR you think it is. If someone else modifies the world between your CREATE TABLE statement and your SELECT statement, v also will not be the VARCHAR you think it is. All you can do is throw the code at the database at runtime and hope it doesn't blow up.

> whereas in your example you are executing the statement.

What do you think "" + 42 executes to, then? The thing is, it doesn't execute because it doesn't satisfy the type constraints. It fails before execution. If it were a weakly typed language, like Javascript, then execution may be possible. Javascript produces "42".

> and is very comparable to the guarantees that a language like Typescript offers you.

Not at all. Something like Typescript provides guarantees during development. SQL, being a dynamically typed language, cannot know the types ahead of time – they don't exist until the program has already begun execution – and will throw errors at you in runtime after you've moved into production if you've encountered a problem.

mytherin · on Jan 30, 2023

> Same goes for SQL. Given the previous program, it is impossible to determine what the type of v is until you are ready to execute the SELECT statement. If the CREATE TABLE statement fails, v will not be the VARCHAR you think it is. If someone else modifies the world between your CREATE TABLE statement and your SELECT statement, v also will not be the VARCHAR you think it is. All you can do is throw the code at the database at runtime and hope it doesn't blow up.

If you are working in a hypothetical adversarial world where people are changing data types out from under you then that might happen. That will also happen with any other shared data source, regardless of what language you might be using. If you are sharing a set of JSON documents and people start altering documents and removing keys then your Typescript program that ingests those documents will also start failing.

That is not a problem of the language but a fundamental problem of having a shared data source - changing to a different language will not solve that problem.

I think you are conflating SQL the language with a relational database management system. A relational database management system models a shared data source - and a shared data source can be modified by others. That introduces challenges no matter what language you are using.

> What do you think "" + 42 executes to, then? The thing is, it doesn't execute because it doesn't satisfy the type constraints. It fails before execution. If it were a weakly typed language, like Javascript, then execution may be possible. Javascript produces "42".

It is fundamentally different. Given a Python program, you cannot know whether or not it will produce type errors at run-time without executing the program. Type checking is done at run-time as part of the execution of the "+" operator. In SQL, type checking is done as a separate compilation step that can be performed without requiring any data. In Python the object has a type. In SQL the variable has a type.

Given a fixed database schema and a set of queries, you can compile the queries and know whether or not the queries will provide type errors. You cannot do this with Python.

> Not at all. The benefits of something like Typescript is that the guarantees are provided during development. SQL cannot know the types ahead of time – they don't exist until the program has begun execution – and will throw errors at you in runtime after you've moved into production if you've encountered a problem. That's exactly when you don't want to be running into type programs, and why we're largely moving away from dynamically typed languages in general.

SQL can provide the same guarantees as Typescript. However, unlike Typescript in which the variables are fixed, SQL always deals with processing data from an external data source - the tables. The data sources (i.e. tables) can be changed. That is not a problem a language can solve.

randomdata · on Jan 30, 2023

> Given a Python program, you cannot know whether or not it will produce type errors at run-time without executing the program.

Like you mentioned earlier, if you refrain from changing types, a compiler can theoretically determine Python types as well. The Typescript type checker when run on Javascript code is able to do this. But that puts all the onus on you to be super careful instead of letting the language hold your hand.

> SQL can provide the same guarantees as Typescript.

1. It cannot as it does not provide guarantees about its memory model. Declaring a variable as a given type does not guarantee the variable will be created with that type. Typescript does guarantee that when you declare a type that's what the variable type is going to be.

2. As SQL is dynamically typed, allowing you to change the type of a variable mid-execution, what type a variable is depends on when that statement is up for execution. And as SQL is not processed sequentially, at least not in a meaningful sense, it is impossible for a compiler to determine when in the sequence it will end up being called upon.

mytherin · on Jan 31, 2023

Again, you are still conflating tables in a relational database with variables in Typescript. While I can understand the confusion - given that SQL makes it so natural to interact with tables - they are not equivalent.

SQL as a language is statically typed. It has variables that are statically typed [1], much like Typescript. All columns in a query have fixed types, and there is a separate compilation phase that resolves types of parameters - much like any other statically typed language.

SQL generally operates on tables - and tables model a persistent, shared data source. The Typescript equivalent to a table is JSON files on disk. There is a fundamental need to be able to change data sources as time goes on and business requirements change. That is why SQL supports modifying shared data sources, and supports operations like adding columns, changing column types and dropping columns. SQL deals with the fact that changes can be made to tables by rebinding (recompiling) queries when a data source is changed.

Any hypothetical language that will replace SQL will have to deal with the very same realities of people wanting to modify shared data sources. A different language cannot solve this problem because there is a fundamental need to change how data is stored at times.

Perhaps what you are looking for is better tooling around this problem. SQL can be used to power such tooling, because of its strong static type system that works alongside the persistent shared data source. For example - you could check if all your queries will still successfully compile after changing the type of a column.

[1] https://www.postgresql.org/docs/current/plpgsql-declarations...

randomdata · on Jan 31, 2023

> Again, you are still conflating tables in a relational database with variables in Typescript.

No, we're only talking about SQL here, and specifically to the example you gave. It demonstrated strong typing, but not static typing. I am not sure where you think Typescript comes into play with respect to the core discussion.

> SQL as a language is statically typed.

As before, SQL defines the ALTER statement. SQL is no doubt strongly typed. Once you declare a type it will enforce that type until such time as it is explicitly changed (SQLite notwithstanding), but it can be changed. To be static means that it cannot change.

> SQL generally operates on tables - and tables model a persistent, shared data source. The Typescript equivalent to a table is JSON files on disk.

That's but an implementation detail. Much of the beauty of SQL, the language, is that it abstracts the disk and other such machine details away. Conceptually all you have is a machine that has functions that operate on sets (or vectors, perhaps, since SQL deviates from the relational model here) of tuples. Perhaps your struggle here is that you are hung up on specific implementations rather than the concepts expressed in these languages?

> Any hypothetical language that will replace SQL will have to deal with the very same realities of people wanting to modify shared data sources.

That's not a reasonable assumption. In practice you don't want arbitrary groups of people modifying a shared data source. You want one master operator that maintains full control of that data source, extracting the information other people need on their behalf. The n number of users feeding a virtual machine code fragments to build up an entire application is an interesting concept, but one that I am not sure has proven to be successful. It turns out we've learned a lot about software development since the 1970s. These days we usually build a program that sits in front of the SQL program to act as the master source in order to hide this grievous flaw, but a hypothetical SQL replacement can address it directly.

> SQL can be used to power such tooling, because of its strong static type system

Let's go back to your original example:

    CREATE TABLE varchars(v VARCHAR);
    ALTER TABLE varchars ALTER COLUMN v TYPE INTEGER USING v::integer;
    PREPARE v1 AS SELECT v + 42 FROM varchars;

What type is v? Well, it could either be a VARCHAR or an INTEGER. It depends on when the machine gets the `PREPARE v1 AS SELECT v + 42 FROM varchars;` statement. SQL makes no claims about order of operations, so PREPARE could come at any point. Therefore it is impossible for such tooling to figure out what type v is. If SQL were statically typed you could derive more information, but as it is dynamically typed...

Now, you said before that each statement when observed in isolation is statically typed. While I suppose that is true to the extent that the the type won't randomly change in the middle of the execution of that statement, the statement alone doesn't provide type information either. Parsing `PREPARE v1 AS SELECT v + 42 FROM varchars;` in isolation, we still don't know what v is.