A Gentle Introduction to Liquid Types (ucsd.edu)
138 points by marvinborner on Sept 3, 2023 | 104 comments


Liquid Types enabled a 6x speed-up in high-performance parsing of UDP packets, according to Gabriella Gonzalez's talk "Scrap your Bounds Checks with Liquid Haskell" [1].

With Liquid Haskell, the bounds checks are moved from runtime to compile time, handled semi-automatically by SMT solvers. Static types help programmers write correct programs faster, and the programs also run faster.

As an aside, speeding up programs with static analysis (constrained dynamism) is also a theme in Mojo (a variant of Python) and Swift [2].

[1]: https://github.com/Gabriella439/slides/blob/main/liquidhaske... "Scrap your Bounds Checks with Liquid Haskell"

[2]: https://github.com/modularml/mojo/discussions/466 "Mojo and Dynamism"
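
To give a flavor of the annotations, here is a minimal hedged sketch (not code from the talk; it assumes Liquid Haskell's bundled specs for Data.Vector, which expose a vector's length to the solver as the measure vlen, and the function names are made up):

    import qualified Data.Vector as V
    import Data.Word (Word8)

    -- The refinement requires the index to be in bounds, so the unchecked
    -- access in the body needs no runtime guard.
    {-@ nth :: v:V.Vector a -> {i:Int | 0 <= i && i < vlen v} -> a @-}
    nth :: V.Vector a -> Int -> a
    nth v i = V.unsafeIndex v i

    -- Accepted: the guard tells the solver that index 7 is in bounds.
    lastHeaderByte :: V.Vector Word8 -> Maybe Word8
    lastHeaderByte v
      | V.length v >= 8 = Just (nth v 7)
      | otherwise       = Nothing
    -- Using index 8, or dropping the guard, would be rejected at compile time.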


I assume Liquid Types could greatly increase performance in C programs, where pointer checks are littered everywhere.


Maybe naive question, but what if a UDP packet comes in that doesn't meet the requirements of the parser? Undefined behavior?


The type system removes checks that are redundant or unnecessary given the rest of the code. It doesn't assume things about external data.


Yeah, anything you assert has to be proved at runtime. Liquid Haskell just a) checks your assertions are consistent with the type returned and b) avoids you having to check again.

You could theoretically remove all the unnecessary checks by hand, but it’s tricky.


This seems to be a complication to haskell to fix a performance problem that is only in haskell.


I don't think that's fair. Automated bounds checking exists in many languages (and most modern high-level languages). In Rust you might be incentivised to use unsafe blocks for array access in performance sensitive code. Julia has its `@inbounds` annotation. Some languages might not have an escape hatch at all.

Automatically removing bounds checking where safe seems like a good goal, but in all but the most trivial cases it's quite complex for the compiler to decide what's safe.


I think it's more than fair. C++ doesn't have this problem at all: bounds checks are in the standard library in debug mode only. Rust and Julia make it very easy to take the bounds checks out. They don't need 'liquid types' or any extra complexity beyond a single keyword, and even that is a little quirky.

Also if you are accessing a data structure out of bounds, that's a show stopping bug that shouldn't ever happen. That should be something that is prevented ahead of time, not recovered from. Tanking a program's performance to catch bugs in released programs is a bold strategy to begin with.


> C++ doesn't have this problem at all, bounds checks are in the std library in debug mode only. Rust and Julia make it very easy to take the bounds checks out.

Haskell can do that too: with unsafeTake and unsafeDrop [1], you can opt out of runtime bounds checks as easily as in C++, Rust, and Julia. But opting out of machine-checked guarantees is unsafe, and Haskell can do better.

What's more, Liquid Haskell can move such runtime bounds checks to compile time, made ergonomic by SMT solvers. This way, you get machine-checked guarantees (so programmers can write correct programs faster) with no runtime overhead (the programs also run faster).
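
To make the opt-out concrete, here is a hedged sketch (illustrative, not the talk's code) of the unsafe escape hatch with a single manual length check; Liquid Haskell's contribution is that it can additionally verify, at compile time, that such unchecked calls really are safe:

    import qualified Data.ByteString as BS
    import Data.ByteString.Unsafe (unsafeDrop, unsafeTake)

    -- One runtime length check, then unchecked slicing of an 8-byte header;
    -- nothing machine-checks that 8 is actually a safe split point.
    splitHeader :: BS.ByteString -> Maybe (BS.ByteString, BS.ByteString)
    splitHeader bs
      | BS.length bs >= 8 = Just (unsafeTake 8 bs, unsafeDrop 8 bs)
      | otherwise         = Nothing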

> Also if you are accessing a data structure out of bounds, that's a show stopping bug that shouldn't ever happen. That should be something that is prevented ahead of time, not recovered from.

Yes, that's the whole point of static analysis, to move runtime checks to compile time, made ergonomic with Liquid Types. Note that this compile time verification is much stronger than testing at development time, which is what C++, Rust, and Julia offer without Liquid Types.

> That should be something that is prevented ahead of time, not recovered from.

Gabriella Gonzalez agrees with you, and calls the idea "Prefer pushing fixes upstream over pushing problems downstream" the golden rule of programming, on the first slide of her talk [2]. Her talk shows how such machine-checked verification is possible with Liquid Haskell, which is not possible in C++, Rust, or Julia without Liquid Types.

[1]: https://github.com/Gabriella439/slides/blob/main/liquidhaske... "Documented preconditions - Scrap your Bounds Checks with Liquid Haskell"

[2]: https://github.com/Gabriella439/slides/blob/main/liquidhaske... "Scrap your Bounds Checks with Liquid Haskell"


> Yes, that's the whole point of static analysis, to move runtime checks to compile time

Move simple runtime checks to compile time. Most if not all of these can be done with iterators or simple loops. Index lookups that depend on user input and data still have to be checked somewhere and should probably be done outside of a loop anyway.

Static analysis is always welcome, but this seems like a lot to go through to solve something that isn't much of an issue in the first place. Bounds checks were a big problem in C, but with more explicit iteration and runtime checks they are almost never a problem I see anymore.


> this seems like a lot to go through to solve something that isn't much of an issue in the first place...but with more explicit iteration and runtime checks they are almost never a problem I see anymore.

If you can afford runtime checks (if they are not much of an issue in the first place), Haskell, C++, Rust, and Julia can have runtime checks.

If you want to opt-out of runtime checks, Haskell, C++, Rust, and Julia can opt-out easily, but unsafely.

So, to be fair to Haskell or any of them, this is not a complication specific only to Haskell.

But Haskell enables you to do more. With around 30 lines of Liquid Haskell (those lines between {-@ and @-}), you can upgrade simple runtime checks to compile time verification, see the full example [1]. You get great efficiency, safely, without unnecessary runtime checks, which is impossible in C++, Rust, or Julia without Liquid Types.

[1]: https://github.com/Gabriella439/slides/blob/main/liquidhaske... "Full program - Scrap your Bounds Checks with Liquid Haskell"


> But Haskell enables you to do more. With around 30 lines of Liquid Haskell (those lines between {-@ and @-}), you can upgrade simple runtime checks to compile time verification, see the full example [1]. You get great efficiency, safely, without unnecessary runtime checks, which is impossible in C++, Rust, or Julia without Liquid Types.

Just a side-note, julia's compiler is very adept at proving situations where array accesses are guaranteed to be in-bounds and will automatically elide bounds checks in those situations. Not saying this to take anything away from Liquid Haskell though, that seems like a very cool project with some very cool capabilities.


Thanks for the great note, I learned something today.


> you can upgrade simple runtime checks to compile time verification

These aren't the same though. Runtime checks can check every access, static analysis isn't going to deal with every scenario like access based on inputs. Lumping them together is pretending there is something happening that isn't.


> static analysis isn't going to deal with every scenario like access based on inputs.

Not true.

With proper abstraction, static analysis proves the absence of bugs under all relevant inputs, which gives stronger guarantees than testing at development time (which only checks for a number of inputs), and is more performant than runtime checks (without runtime overhead).

An example is the use of a static type system, which eliminates a whole class of bugs under all possible values of the same type (those due to type mismatch: passing a string into a function expecting an integer), which is stronger than a test suite testing certain inputs, and does not have runtime overhead (unlike runtime checks).

Going further than usual type systems, people study refining types by possible values (such as positive integers as a refinement of all integers), and the result is refinement types, which is what Liquid Types are based on. And you can see this in action in the full example [1], such as { n : Int | 0 <= n }, which refines n to be a non-negative integer.

[1]: https://github.com/Gabriella439/slides/blob/main/liquidhaske... "Full program - Scrap your Bounds Checks with Liquid Haskell"
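
To illustrate, here are two more refinement signatures of the same shape (a hedged sketch, not taken from the linked slides):

    {-@ type NonNeg  = {v:Int | 0 <= v} @-}
    {-@ type NonZero = {v:Int | v /= 0} @-}

    -- The divisor is refined to be non-zero, so no runtime check is needed.
    {-@ safeDiv :: Int -> NonZero -> Int @-}
    safeDiv :: Int -> Int -> Int
    safeDiv x y = x `div` y

    -- safeDiv 10 0 is rejected at compile time; safeDiv 10 n is accepted only
    -- where the solver can prove n /= 0, e.g. inside an `if n /= 0` branch.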

> Lumping them together is pretending there is something happening that isn't.

Correct. Static analysis can formally verify correctness under every access (like runtime checks), but without the runtime overhead (unlike runtime checks), with proper abstraction.

That is why compile time verification is an upgrade from runtime checks.

OK, in reality the trick is that static analysis removes redundant runtime checks (check exactly once), see the sibling discussion https://news.ycombinator.com/item?id=37376963.

PS: I figured that setting the record straight may benefit others, such as https://news.ycombinator.com/item?id=37380176, hence this reply.


> static analysis removes redundant runtime checks

Seems like back-pedaling. If part of the input data is used to index into an array, in C++ you would check that input ahead of time so that you don't have to check the range in the middle of a loop.

Then you can have performance that is comparable to everything else, instead of just to other programs written in the same language.


I think the main confusion is that "Scrap your Bounds Checks with Liquid Haskell" presented a toy example/implementation that scales to real implementations, for the task of high-performance parsing of UDP packets (used by Awake Security, I guess).

The toy example/implementation is chosen to teach Liquid Types/Haskell, and therefore, covers the simple case of parsing UDP headers, which is a simple bounds check of "indexing an array of length 8". In this case, there is no need to use Liquid Types in any language.

But in the real use case (not shown in the talk), when you want to dig deeper into the content of UDP packets and such, you will need more than "check the input data (length) ahead of time so that you don't have to check the range in the middle of a loop". This is the case where Liquid Types help, tremendously. But such complicated cases, where Liquid Types shine, may not fit into a talk.

Again, it is not fair to think this is a complication specific to Haskell.


> Again, it is not fair to think this is a complication specific to Haskell.

It is fair, because other languages have a single keyword or setting instead of a whole 'type system' with fancy labels that introduces lots of complexity for something simple.


It is not fair to think this is a complication specific to Haskell, because you can opt-out of runtime bounds checks as easily as C++, Rust, and Julia, with functions such as unsafeTake and unsafeDrop.

https://news.ycombinator.com/item?id=37382215

If you can afford runtime checks (if they are not much of an issue in the first place), Haskell, C++, Rust, and Julia can have runtime checks.

https://news.ycombinator.com/item?id=37387005

It is unfair not to respond to these points while calling this "a complication specific to Haskell", because Haskell can do what the others can do, and more.

> lots of complexity for something simple.

"indexing an array of length 8" for parsing UDP headers is simple, agreed. And Haskell can handle that simply with unsafe functions.

Keeping track of properties with complicated dependencies to parse arbitrary content (perhaps application-logic-aware content, not "just checking for array length before indexing in a loop") is simple? Strong disagree. See the section "Beyond Bounds Checking" of "Why Liquid Haskell matters" for an example [1]. This might be used to implement the kinds of static analysis needed for Mojo or Swift, for instance [2].

[1]: https://www.tweag.io/blog/2022-01-19-why-liquid-haskell/ "Why Liquid Haskell matters"

[2]: https://github.com/modularml/mojo/discussions/466 "Mojo and Dynamism"

https://news.ycombinator.com/item?id=37376010

And this cannot be solved in the languages you mentioned, either. Unless you are claiming that all parsing checks are "just checking for array length before indexing in a loop".

https://news.ycombinator.com/item?id=37416911

We are still working on completely different scales.

The talk is using a toy example to show what is possible, and you are using that example too literally, without realizing that the example scales to handle much more complex problems that people need to deal with in the real world.

It is unfair to lump together different scales and call them the same thing.

Your reaction is like criticizing a talk which shows sorting 5 numbers with programs, with "it is simple to sort 5 numbers by hand, why bother learning how to program with all that complexity for something this simple?". And then calling this fair, without realizing that in the real world, sometimes people need to sort way more than 5 numbers.


C++ doesn't have bounds checks in the std library in release builds by default, so you can't get better than that.

> Your reaction is like criticizing a talk which shows sorting 5 numbers with programs,

It is not, because this idea that loops and bounds checks are complex in the first place isn't true. No other language invests so much complexity in a simple problem that is barely an issue in modern languages in the first place.


> C++ doesn't have bounds checks in the std library in release builds by default, so you can't get better than that.

In Haskell you can provide your own prelude, which uses unsafe functions without bounds checks by default. With CPP or cabal tricks, you can easily have bounds checks only for debug builds, but no bounds checks for release builds. No Liquid Types needed either, though this is only testing during development, not compile-time verification. Less safe, but the same effort and result as in C++.
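
A hedged sketch of that conditional-compilation trick (the DEBUG define and module name are made up; the flag would be wired up in the .cabal file):

    {-# LANGUAGE CPP #-}
    module SafeIndex (index) where

    import qualified Data.Vector as V

    -- Bounds-checked in debug builds, unchecked in release builds,
    -- selected by a preprocessor define such as -DDEBUG.
    index :: V.Vector a -> Int -> a
    #ifdef DEBUG
    index = (V.!)
    #else
    index = V.unsafeIndex
    #endif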

> loops and bounds checks are complex in the first place isn't true.

Liquid Types people are not thinking about only loops and bounds checks. They think about any properties, for example:

* pointer checks, a huge security issue, see [1] or https://news.ycombinator.com/item?id=37376946.

* valid expression checks (Beyond Bounds Checking of [2]), see https://news.ycombinator.com/item?id=37440896.

[1]: https://arxiv.org/abs/2207.04034 "Flux: Liquid Types for Rust"

[2]: https://www.tweag.io/blog/2022-01-19-why-liquid-haskell/ "Why Liquid Haskell matters"

Sorting 5 numbers is complex in the first place isn't true. No other technology invests so much complexity as programming and this is barely an issue in modern problem solving in the first place.

My previous comments already said loops and bounds checks are not complex in the first place, and they are only used as a demo in a talk, as Liquid Types scale to handle any properties.

I am pretty certain that a reasonable reader can judge the merits of our arguments and make up their mind. Thanks for raising a possible misconception about Haskell and Liquid Types, and this forced me to strengthen the arguments to clear the misconception for other reasonable readers. This is my last comment on this issue.


> In Haskell you can provide your own prelude, which uses unsafe functions without bounds checks by default.

So your solution is to rewrite the Haskell standard library yourself?

> With CPP or cabal tricks

I don't know what cabal tricks is supposed to mean here.


(Can't resist this one line rebuttal)

> So your solution is to rewrite the haskell standard library yourself?

No need--the simplest solution is to pass flags correctly to the dependent library (set all flags BoundsChecks, UnsafeChecks, and InternalChecks to false for the vector package) for release builds.

Explanation after the one-line rebuttal:

This exactly matches the C++ behavior: no bounds checks for release builds, but with bounds checks for debug builds. This shows that Haskell can do what C++ can do, very easily, just by passing compiler/cabal/stack flags correctly.
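
For concreteness, a hedged sketch of what that could look like in a cabal.project stanza (flag names as given above; the equivalent stack.yaml syntax differs):

    package vector
      flags: -BoundsChecks -UnsafeChecks -InternalChecks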

With this rebuttal, I think the issue is completely settled, at least for reasonable readers.


> No need--the simplest solution is to pass flags correctly to the dependent library (set all flags BoundsChecks, UnsafeChecks, and InternalChecks to false for the vector package) for release builds.

If performance is an issue, why not just do this instead of dealing with "Liquid Types"?

And with this rebuttal, the issue is now settled for all readers and even those differently abled that are using voice readers.



But why have a separate type system instead of just building it into the language? C++ could build in pointer checks and it would be in the same type system.

And with this rebuttal, the issue is now double settled.


> But why have a separate type system instead of just building it into the language.

Excellent question. The existing "mainstream" languages (C++, Rust, Julia, or even Haskell) are not designed to support such applications as bounds checks or pointer checks in their type systems, where certain runtime values/checks can affect/refine the types (as in dependent types, or refinement types, which Liquid Types are based on).

It could be done (in some newer languages: Idris, Agda, Lean), but people have not figured out an ergonomic way to do so in more "mainstream" languages (e.g., not hurting type inference, light and maintainable proof burdens). Liquid Haskell is an attempt to bring refinement types to Haskell, as an extrinsic type system.

> C++ could build in pointer checks and it would be in the same type system.

People have not figured out how to do so (and I doubt if it is possible at all). Suffice to say:

* Rust is an attempt to build in lifetime-and-ownership-checks into the type system of C++, and Rust is a new language,

* Liquid Rust is an attempt to add pointer checks (refinement types) to Rust, and even then it is not built into Rust.

> And with _this_ rebuttal, the issue is now double settled.

People are still researching solutions (see Rust and Liquid Rust over C++) to this problem (building in pointer checks in the type system of C++), and they would love it if you can settle their problems.


> With CPP or cabal tricks

> I don't know what cabal tricks is supposed to mean here.

It means conditional compilation, i.e. what the C preprocessor also gives you (but somewhat more principled).


It seems a little disingenuous to call something written in an obscure Haskell dialect a "high performance UDP packet parser"


Why? Looks like they're doing realtime network monitoring for security purposes. Sounds like a high performance application to me. Haskell has a reputation for being slow, but the proof is in the pudding - and what's being asserted is that a stronger type system enables better optimization.


Interesting - I didn't realise the performance of a system was measured in how "obscure" the source language is.

(edit: I am Australian - this is sarcasm.)


I’m also Australian and can confirm this was sarcasm


An important thing to remember about fancy type systems is that they can only make guarantees in a closed world. If your data comes from the network or a file, you will likely need a reader or parser that does runtime checks somewhere, but a good type system can provide assurance that you did each check exactly once.

On a complicated single-process system (a monolith), that can be a big improvement. For distributed systems (microservices), there will still be plenty of runtime checks. That's what allows each service to have its own lifecycle.

I'm wondering what Elixir will do, given that Erlang servers divide up work into lots of independent, distributed components?


This is a fantastic point, and a huge issue that I had with TypeScript.

Libraries for TypeScript like Zod[0] allow you to define types and validate against them at runtime. For example: https://stackblitz.com/edit/typescript-pwzng4?file=index.ts

At work, we use this to validate API responses, local storage contents, or URL/router data. It solves the problem you described -- dirty data coming from outside of the type system.

This isn't a plug for Zod or TypeScript -- there are other similar libraries for TypeScript, and I would imagine other statically typed languages have something to fill this role.

[0]: https://zod.dev/


Two random thoughts:

- why is there `safeParse`? Why is parsing not safe by default? Have an `unsafeParse` as an escape hatch, not the other way around.

- the imperative check for `success` seems unfortunate. Then again, I am familiar with pydantic from Python, which seems awfully similar to Zod, and there you'd get an Exception on failure which you'd have to catch. Some would call that uglier (I don't). Point being, I don't know of a better alternative, but sure hope native liquid/dependent types wouldn't suffer from this anymore


You can use `parse` which will throw an exception if the data does not match the schema, or `safeParse` if you want to manually check.

> the imperative check for `success` seems unfortunate.

Yes! But the cool thing is that `success` being `true` narrows the type and ensures that the `data` field is present. You cannot access `data` without checking for success

Here's another example showing those concepts: https://stackblitz.com/edit/typescript-wbuikz?file=index.ts


JS doesn't have any useful built-in way to deal with success/failure, other than exceptions. We use Boxed[1] for this:

  Result
    .fromExecution(() => schema.parse(input))
    .mapOk(parsed_input => /* */)
    .mapError(parse_errors => /* */)
which is an alright way to deal with things IMO.

I would hazard a guess that `parse` is a function introduced long ago, and `safeParse` came afterwards. (Safe as in "do not throw an exception")

[1]: https://github.com/swan-io/boxed


`schema.parse()` is equivalent to pydantic `MyModel.parse_obj()` and throws if invalid.

`const result = schema.safeParse(val)` does not throw, and if `!result.success`, `result.error` contains a `ZodError` object with all the details. It's common to further transform this error object, perhaps to make it more human-readable or suitable for an API response.

Typically I find myself using `safeParse` far more than `parse`.


TypeScript became so much more pleasant to use after we started using parsers regularly. We use Runtypes in some projects and a hand rolled lib I wrote in others, and being able to parse unknown data into a known shape is a life saver.

Glad to not have to write seemingly thousands of lines of interfaces and type guards any more.


For certain data you can exhaust the space of values and thoroughly test that your parser validates only the correct datatypes, and if you can do that, then you can be fearless in your impl with those "fancy type systems" because you guarantee that within the context of the impl, the software is sound. See [1], and the respective discussion here [2].

[1] https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-va...

[2] https://news.ycombinator.com/item?id=35053118


Most modern languages with static types have some way to preserve them across network/file transports. You can use a serialisation format that is typed (e.g. FlatBuffers etc.) or a library that validates the types on load, e.g. Serde. Even Python has Pydantic.


Type systems for serialization (protobufs and flatbuffers) are partially closed-world in the sense that we assume all past and future versions of the same schema come from the same place and we assume the developers followed certain rules when updating the schema. Because they allow for version mismatches, they’re designed rather differently than most native type systems for programming languages. There’s often an impedance mismatch with the native type system.

Validating on load is a form of runtime type checking. It’s not relying on a type assertion, which would allow the compiler to omit the type-check and the programmer to omit the error-handling when the check fails. Instead of preserving knowledge about the type constraints on some data, we prove the constraints again.


> Validating on load is a form of runtime type checking. It’s not relying on a type assertion, which would allow the compiler to omit the type-check and the programmer to omit the error-handling when the check fails.

You can totally do that if you want and you fully trust that the thing that generated the data used the right types.

Similar to what you do every time you use a dynamic library.


Yes, you could, but not verifying input usually isn't advisable. It's making a closed-world assumption that isn't warranted in an environment where there are no guarantees. Networks are all about sharing data with other organizations. Filesystems contain files with uncertain origins.

A type system should be designed based on what invariants you can reasonably expect in the environment where the system operates, versus what you can expect will change. For example, Swift has an interesting type system that's specifically designed to make upgrading dynamic libraries easier and safer. Apple can make assumptions about what sort of changes there will and won't be to the libraries that ship with its operating systems.

Contrast with Go, which makes the assumption that there are no dynamic libraries; all Go source code is seen by the compiler.


> Yes, you could, but not verifying input usually isn't advisable.

It's not not verifying input. It's just not verifying the types. There are no additional security issues unless you do something spectacularly stupid.

For example you can load random bytes as a Protobuf file. It never validates that the data is in fact the right type, it just loads the bytes assuming they are. The only issue you'll see is junk data (or a parse error).

In any case I think you know that. I'm not really sure what point you were trying to make.


I'm not that clear about what point you're making either. We seem to talking past each other?

The way I see it, types are sometimes used to represent security guarantees, so junk data getting cast to the wrong type could be a security hole. It depends on which types we're talking about.

For example, you can declare a SafeHTML type whose underlying representation is a string, and if you somehow deserialize it without actually doing the runtime check, it could be a security bug.

Verifying types and verifying input are closely related; the type keeps track of what you verified when you constructed an object of that type.


I honestly thought that this was going to be about actual liquid types -- from the regular ones that (macroscopically) differ largely in terms of viscosity, to non-Newtonian liquids which have non-linear viscosity characteristics, to long-chain (PEG-type) polymers that have slightly unusual flow characteristics, to superfluids that have extremely weird characteristics and zero viscosity.

...I'd read that article. Now I'm starting to think that I should write it.


I’d read it! That sounds really interesting.


Had the exact same assumption, definitely still interested in something like that.


This sort of idea is why I sketched out pdv [0] in Zig. Even extremely simple type systems can handle arbitrary invariants when you are allowed to compute with those invariants at compile time.

[0] https://github.com/hmusgrave/pdv


Very cool idea - starred!


Happy to help :)


I like this sort of idea for typifying dynamic languages (like JS)

What if, instead of TS, you had plain JS as-is (no compiler needed!) and then just added a comment like:

    /* x and y are integers, x > 0 ... */
    function add(x, y) { return x + y; }
Sorry the example ain't great but the point is that you can be precise with refinement types. I think refinement types (less all the mathy explanations) are more intuitive. They would be like assertions or C# code contracts. The content of those types would look "business-ey" rather than "category theory-ish".


The difference is an informative error message at compile time when a value fails to satisfy the 'refinement'.


The DML link in the article is broken - might I instead point you to the relevant ATS-lang documentation (ATS2 being a successor of DML) about `Dependent Types in ATS2`: https://ats-lang.sourceforge.net/DOCUMENT/INTPROGINATS/HTML/...

This link discusses, among other useful things, the use of dependent types for the likes of array bounds checking, creating "sorts" of numbers satisfying various predicates, and, more interestingly, the utility of combining Dependent Types with PURE Linear Types (as opposed to Affine Types a la Rust, lol)...

AFAIK, DML was originally intended as a gateway to constructing typed assembly (DARPA)

Having been a researcher in this field, this quote from the original paper on 'Liquid Types' originally seemed suspect to me: "a system that combines Hindley-Milner type inference with Predicate Abstraction to automatically infer dependent types precise enough to prove a variety of safety properties" ... type inference with dependent types is undecidable!? Then again, this is one of the primary focuses of ATS3. I digress...

This area of research is rich in pipe-dreams and cerebral investigation. In my biased opinion, ATS3 (the current successor of DML, being actively developed by the PI, Professor Hongwei Xi) is the only language in this space aimed at practical programming in industry... of course, there is always Haskell...

What I mean is: templates + linear types + dependent types + Hindley-Milner type inference...

At the end of the day, who doesn't love hyper efficient meta-programming magic?


ATS is great, I’ve used ATS1 and ATS2 quite a bit and it worked well. Unfortunately it suffers a bit from the “rewrite every few years” as research directions change.


How do you define operations on these liquid types? You can't even define subtraction or addition on this "non zero integer type". These types are not closed so how is it useful?


Liquid types are represented under the hood using ordinary data types, so e.g. the nonzero integer type will be stored as a value of ordinary Haskell integer type at runtime. Subtraction on the underlying type is defined as usual.

Now, the Liquid Haskell type checker knows a lot about integers: it can deduce whether your invocation of x-y could give a negative result or not.

If it cannot result in a negative value (e.g. because y has a particular liquid type), the type checker will accept your program, and you'll be able to use the subtraction function as you normally would.

However, if your invocation of x-y could potentially result in a negative value (e.g. because the values of x,y are not sufficiently constrained), the type checker will reject your program, and tell you to do something else (e.g. add runtime checks that rule out a negative result before invoking the subtraction).
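
A hedged sketch of that workflow (illustrative names, not Liquid Haskell's own prelude):

    {-@ type Nat = {v:Int | 0 <= v} @-}

    -- The result is claimed to be a Nat, so the solver must prove x - y >= 0,
    -- which the precondition y <= x guarantees.
    {-@ minus :: x:Nat -> {y:Nat | y <= x} -> Nat @-}
    minus :: Int -> Int -> Int
    minus x y = x - y

    -- A call site where y is not known to be <= x is rejected until a runtime
    -- check establishes the precondition:
    diffOrZero :: Int -> Int -> Int
    diffOrZero x y
      | 0 <= y && y <= x = minus x y   -- accepted: the guard proves it
      | otherwise        = 0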


Presumably it will reject it with a helpful message like "no"?

I think that's the main disadvantage of these types. It's now no longer a decidable system with (usually) easy to understand error messages. You're in the realm of formal verification and SMT solvers where the only error message is "didn't work; try something else".

Probably still good but I think it's important to be aware of the downsides.


Yes, that was my first thought.

As long as there's a good way to map from the real world into the liquid typed world, either by just telling the liquid typed world that this value can only be a positive integer, or by the liquid typed world having functions that ensure this at the boundary and return nice errors on runtime failure of this check, then it sounds really interesting.


Addition isn't a problem, any nonzero integer plus a nonzero integer is a nonzero integer.

For subtraction it becomes fallible. You get back an option, 5 - 4 = Some(1), 4 - 5 = None. (These are probably methods rather than infix operators, your language may vary.)

See for example the methods documented here: https://doc.rust-lang.org/stable/std/num/struct.NonZeroUsize...

As for its utility, it enables certain optimizations and it can be a useful constraint, e.g. for ensuring that you don't divide or multiply by zero.
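
Sketched in Haskell to stay consistent with the rest of the thread (a hypothetical NonZero wrapper, not a standard library type):

    newtype NonZero = NonZero Int deriving (Eq, Show)

    mkNonZero :: Int -> Maybe NonZero
    mkNonZero 0 = Nothing
    mkNonZero n = Just (NonZero n)

    -- Subtraction can produce zero, so it is fallible:
    --   subNonZero (NonZero 5) (NonZero 4) == Just (NonZero 1)
    --   subNonZero (NonZero 4) (NonZero 4) == Nothing
    -- (An unsigned variant, like Rust's NonZeroUsize, would also fail when
    -- the result would go negative.)
    subNonZero :: NonZero -> NonZero -> Maybe NonZero
    subNonZero (NonZero a) (NonZero b) = mkNonZero (a - b)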


It is worth pointing out that the same problems apply to "normal" integer types like int32/uint32; you can trivially over/underflow the bounds with basic arithmetic. In that sense it doesn't really matter what the bounds are exactly.


> any nonzero integer plus a nonzero integer is a nonzero integer

That’s not true. 5 + (-5) is 0.


You're right, I was thinking of a nonzero unsigned integer. D'oh! Thanks for the correction!


It would be interesting to have memory bounds for the program too!


I wonder how pragmatic these can become. Usually I don't really care about "exact" exactness, a common theme in Haskell. What I'd want is a sort of "under common assumptions" (no underflow/overflow/OOM etc.) proof of laws and properties. This eliminates a whole class of bugs and strengthens your ability to reason, but doesn't make everything super, super complicated by thinking about all those edge cases.

One usually doesn't program airplane control software. Some bugs are okay and speed of development and correctness is a trade-off.


I wonder what is possible in more common languages like Java or Python.

Certainly, the language itself won't have liquid types in the compiler any time soon. But taking Ruby's Sorbet as inspiration, I wonder if you could annotate Java classes with semantic rules and do some simple verifications as a post-compilation step?

There's too much bad code out there that can't just be replaced. Tools to help make it better would be a huge boon.


Java has JML [0] and Python has all sorts of DBC libraries that can accomplish something similar [1].

[0] https://www.thestrangeloop.com/2018/contracts-for-getting-mo...

[1] https://www.hillelwayne.com/talks/beyond-unit-tests/


In 2008 at university I had a few classes using http://krakatoa.lri.fr/. Tools definitely exist, although you could probably argue that the production-ready version of this that people use in practice is really Ada/SPARK (built on similar technologies, if not the same).


Reminds me a little of Eiffel's design by contract. It looks really promising.


Your intuition is right. Liquid (refinement) types, design by contract and Hoare logic have, essentially, the same expressiveness.

Personally I find refinement types more promising than dependent types, but time will tell.


It made me wonder too what the (practical) advantage over contracts is. If anything, Liquid Types seem less powerful. E.g., you can't use them to express a relation between two function parameters. They seem to be a macro for contracts (the type name implicitly copies its restriction as a contract clause), so in that sense they could be welcome as a practical way of improving correctness.


If an Int without 0 should be used, why not use an IntWithoutZero type?


Title needs to append (2015).


Programming language design took a wrong turn with these extremely complicated type systems. They end up hurting developer productivity, and you spend more time fiddling with types than solving the problem at hand. We should be focusing on more effective forms of abstraction instead of types.


What languages are you talking about? I find that the weaker the type system the more I feel fiddling with it is a waste, eg Python, but I don't feel that way when I'm working with a strong type system that can not only help eliminate bugs but aid the compiler in analysis and actually deliver a better artefact, eg Rust.

LSPs are a sort of middle ground where I derive a lot of the value from, and even weak type systems really help the analysis of the LSP. They make me immensely more productive. I've been writing a lot of TypeScript lately, and I may not have taken the plunge to the frontend if LSPs hadn't made it dramatically easier. So in that case the impact on my productivity is +100%.


I'm not the parent but at least for me, my #1 complaint is "whiny computer".

Especially when dealing with someone else's code, it can be a ceremonial roadblock to moving data around that could easily be processed without issue, but artificial constraints have been erected in order to service an orthodoxy, making things needlessly more brittle.

In the name of reducing bugs, in practice it can often impose workarounds that increase the chance of confusion and more bugs.

This isn't always the case, but most of the time things are built by incompetent hands (including my own). Accommodating this reality is probably the better move. Imposing absolutist rules doesn't improve the behavior of incompetence; it just makes it more subversive and the defects it creates more insidious.


Hmm interesting. Are these projects that were originally untyped but then gradually typed later? I ask because I've not really encountered that, the most I've had to do is add asserts or ifs to help explain to the typechecker something that I could see as a human. A few times in a gradually typed language I've needed to add an ignore to a line. But it's been rare in my experience.

I try not to think of it as whining, I try to think of it as the type checker helpfully pointing out, "hey, you haven't done your due diligence here." I ran into this advice when learning Rust and it made it much more pleasant. In the same way that you might chase a series of failing tests to finish a task, you can chase a series of type warnings. I have them integrated into my editor, so usually the "series" is length 1 because I'm knocking them out as they arise.

The projects I've worked with that were previously untyped (Python SaaS backends), I never actually ran the type checker on the entire source code. But if I had a megabyte of type errors that weren't really going to provide value to the business, yeah that would feel like a waste of time. Unless we were having serious quality issues and it was between annotating the entire codebase and rewriting it from scratch, I'd leave well enough alone - it's a tool, not a religion. But the new code I'm checking in, that's getting annotated for sure.

I don't really see types as arbitrary absolutist rules. They're the semantics of the language, they're the rules you're following whether you have a typechecker or not. And if you violate them you're gunnuh get bugs and crashes regardless. I too am a flawed human who frequently makes mistakes, and I like having them caught immediately, before I've even checked my code in.


> I try not to think of it as whining, I try to think of it as the type checker helpfully pointing out, "hey, you haven't done your due diligence here."

It can be both true that you haven't done your due diligence in the sense that you haven't taken care to handle the entire range of all possible inputs (or at least it seems like it when you're only considering the relevant code section in isolation), but that as a practical matter it's just not a problem. So the time spent attending to what it's nagging you to take care of is time wasted. Alternatively, you're not even a program maintainer, you're just trying to build the thing so you can run it for your own one-off use, but it refuses to even build because of some issue totally irrelevant to what it is you're about to use it for.

The approach that the TypeScript compiler takes is beneficial—complain about the thing, but don't let that prevent the compiler from running to completion and still producing a program that you can run.


Sure, as a mindset thing I find thinking of it as "whining" leads to a deeply frustrating experience where I feel like I'm fighting my tooling, but thinking of it as "helping" makes the work lighter.


I probably just choose my projects wrong. I get attracted to charity cases and disasters for some reason. I've tried to stop but there's obviously something complicated going on


Haha. It's a niche and I respect it. All software is bad but it's worth supporting software that's useful.


I am familiar with both Python and its typing, and Rust. Obviously, Rust has ADTs, which open up a whole new class of types. However, I wouldn't call Python's type system weak! It's still under very active development, but making full use of it (i.e., type annotations) and the strictest settings in mypy makes for a very powerful combination. I've never felt the system was too weak; if anything, I was.

There's type narrowing, warnings for not awaiting Awaitables, generics, protocols/interfaces, even whole function signatures can be lifted into the type system (ParamSpec). Dictionaries can be turned into typed versions as well, which is quite powerful.

By default, it's weaker, as it's optional. Rust's isn't optional. But start a greenfield project with type annotations and you won't be disappointed, I hope. Features like ADTs will still be absent, but I feel that's an orthogonal issue to type safety etc.


Yeah. Maybe weak is too strong a claim. I have done a lot of typed Python code fwiw. I'm certainly never going back to untyped Python for anything outside of a scripting context.

"Python wasn't weak, I was" is an accurate call out, I have abandoned complex annotations in Python before because there was a bunch of cruft involved and the documentation wasn't always as straightforward as I'd have liked.

The lack of ADTs (and thus a Result type) combined with the lack of exceptions as part of the function signature (like Java) makes it really difficult to develop a reliable application. Maybe that's not quite a type safety issue, but it's a safety issue for sure.

Digging in, it looks like people are working around this by returning exceptions, which is very clever and I hadn't thought of it before.


> Digging in, it looks like people are working around this by returning exceptions, which is very clever and I hadn't thought of it before.

Never came across that! Where did you find that? Pretty interesting. I'd call it deeply unpythonic and very surprising. It's unfortunate to not have a Result type/idiom, but IMHO faking it isn't the way to go either.


I saw it in this Stack Overflow answer:

https://stackoverflow.com/a/74192977

I haven't given it a great deal of thought; it would probably mess up stack traces if you raised it later, which would be bad. I agree it's unpythonic, I've implemented result-like types in Python before which has worked okay.

ETA: That answer links to this library, which uses a decorator technique that I think will still print a good traceback if you reraise it later (definitely will if you use `raise ... from` syntax), but I can't test it at the moment:

https://github.com/alexandermalyga/poltergeist

I probably won't be putting this into production anytime soon but it was charming food for thought.


A lot of people probably don’t know that Python is adding optional types FWIW. I started and stopped using Python before it was even thought about.


complexity doesn't go away, you distribute it

you either create a complex program, with simple language constructs, or a simple program with complex language constructs

if you use only the simplest types, your program will be very complex; if you put all the complexity in types, your type system will be very complex

you need to balance things

I think OOP is the extreme where all the complexity is in the object system (or type system)

functional programming with complex types, I think, gives a nicer balance, where some logic goes into the program flow and some goes into the types

anyway, balance is everything; keep things balanced, don't lean too much in any direction


Well, intrinsic complexity can be split, but artificial complexity might be added on top of it.

Languages and libraries that are well designed and compose better end up being great at taming complexity. Functional languages are made to compose better, but their limited popularity is a hint that they are not that simple to use effectively.


> more effective forms of abstraction instead of types

do you have any in mind?

I often think about a blend of functional and OO as 'protocol graphs' but it's a faint idea for now


This protocol graph idea sounds intriguing. I'd love to hear more about it if you're interested in sharing.


Array programming is an extremely powerful paradigm that is criminally underused outside of data science applications. It also feels like the full power of array programming hasn't been explored yet.


Array programming doesn't really have much effect on how relevant a type system is for a particular language. At most it will let you make operations generic over the rank of the inputs (scalar, array, array of arrays, etc), which is nice, but the scalar values at the bottom of that hierarchy are subject to all the same factors that motivate the type systems present in any other modern language.


My point is that academic PL research has diverged significantly from the kinds of problems people are really facing. That's why we've seen array programming emerge organically to address a direct industry need, and it didn't come from the PL crowd at all.


APL started at Harvard (though it developed further at IBM), and even nowadays there's the ARRAY workshop co-located with one of the main PL conferences.

The thing is that while array programming is amazing for some specific problems it's not going to help you make sure that you don't have memory errors, race conditions, or wrong states in your program


The implementations of array programming in use today are quite different from APL, and they all came from industry.

> it's not going to help you make sure that you don't have memory errors, race conditions

Not only does array programming abstract away those kinds of errors, it also solves the problems that I care about: code that is faster, more expressive, and easier to read and write. Unwieldy type systems do not help me solve those problems.


I like (and share) your enthusiasm for array programming! but can you please stop creating accounts for every few comments you post? We ban accounts that do that. This is in the site guidelines: https://news.ycombinator.com/newsguidelines.html.

You needn't use your real name, of course, but for HN to be a community, users need some identity for other users to relate to. Otherwise we may as well have no usernames and no community, and that would be a different kind of forum. https://hn.algolia.com/?sort=byDate&dateRange=all&type=comme...

Also, can you please stop using trollish usernames? Those aren't allowed either, because they basically troll every thread the account posts to. https://hn.algolia.com/?sort=byDate&dateRange=all&type=comme...


I can see how array programming helps with certain memory errors, but I don't think it helps with race conditions. Whereas eg Rust ownership, borrowing, and lifetimes eliminate one class of race conditions (data races) in safe code.

Type systems also enable optimizations that do make code faster, e.g. monomorphization. Consider that JITs make untyped code faster by inferring types at runtime. See also this comment: https://news.ycombinator.com/item?id=37376010

I honestly can't make heads or tails of the idea that type systems don't help make code more expressive? They allow you to express your precise intent in a way that's legible to both humans and the compiler.


> academic PL research has diverged significantly from the kinds of problems people are really facing

Was it ever not?


As the programming landscape becomes too large, PLT cannot address every crowd on the distribution. I guess he means that type theory is too far away from most people's duties.


Similar arguments here: https://bower.sh/typescript-terrible-for-library-developers

As a library developer I spend substantially more time on making the types right than the actual implementation. It can be brutal at times.


Think of how much time you'd have to spend writing trivial tests instead if you didn't have an automated system to detect basic mistakes like typos and type confusion!

I guess you can complain that Typescript forces you to spend time making sure your code meets a minimal bar of correctness...


Library developers in particular should see type complexity as a warning that something isn’t great in the design. This seems largely an issue for libraries migrating to TS which isn’t a surprise.


I disagree. These "complicated" type-systems enable fearless refactoring and give you guarantees without cluttering your code with validation checks because, well, there is the guarantee that faulty states are not representable.



