let x = id id id id id id id id id id id id id id id id id id id id id id id id
See how long that takes (add ids if your machine can take it). To deduce a type for x, ghci will use 2^k type variables, if you have k ids. This is easily seen by induction:
If k = 1, then the type of x is a -> a. This takes 2 = 2^1 = 2^k variables.
If k > 1, the expression is of the form id f. Let t be the type of f; by induction, t has 2^{k - 1} variables. The outermost id must therefore take something of type t (that's 2^{k - 1} variables) and produce something of that same type t (another 2^{k - 1} variables), i.e. it's of type t -> t, and 2^{k - 1} + 2^{k - 1} = 2^k.
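For example, with k = 2 the inner id has type a -> a (2 = 2^1 variable occurrences), so the outer id must be instantiated at (a -> a) -> (a -> a), which mentions its variable 4 = 2^2 times.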
Note that this information is purely at compile time. Haskell isn't typed at runtime, this is all just to check types. At runtime, x is a no-op, as expected.
I used to be able to kill Scala compiler performance with just a few identifiers. Well, Scala is trying to deal with nominal subtyping, which is well known to be quite hard (so the type inference algorithm isn't really H-M, and must rely on heuristics).
This was in 2006/7 when I was working on scalac, so the early 2.1/2.2 days. I often seemed to push the type system into places that were not well defined, mostly because my Scala style was so different (very OO, but with heavy recursive use of type parameters). Unfortunately, I've forgotten what the cases were, but they were usually simple constructions.
That's very interesting! This seems to me like it might be a different kind of edge case though, since the type of x is still t -> t, even if it takes a long time for the type checker to arrive at this. From the little I got out of Mairson's paper, it seems that the proof he uses depends on the exponential size of the type itself, but I could be mistaken... either way I wonder how they're related?
Actually, these examples are similar to the first example in my post (under the first "pathological case" heading) and don't quite exhibit the worst case behavior since the type can be represented in linear space as a dag.
The title's a little misleading, as it suggests that there's something strange or even wrong about the type that OCaml or Haskell infer for the program. Of course the type that these systems infer for the program is correct, and the author does demonstrate an understanding of that. But that a small input to a system can produce a large output should come as little surprise to anyone who works with complex systems, especially computer systems.
The underlying mechanism that leads to the computational complexity in HM(x) systems--the process of generalizing types to schemas and instantiating schemas to types in order to support let-polymorphism--would make for an interesting discussion, but the author hasn't hit on that yet. Unfortunately, the only people that really understand that part of the algorithm are the people that have implemented the algorithm, and there aren't many of those people around. Strange but true. The algorithm that's taught and implemented in most undergraduate PL courses (PL:AI-based courses included) does not support let-polymorphism. If the author's reading, upgrade to ATTAPL[0] and start implementing.
Also, to answer the question "How to compile Turing machines to ML types", have a look at typo[1] for one way to do it in Haskell (by reducing it to lambda calculus (shameless plug)).
Thanks! Typo made it to HN last summer, so here's a response to your question on motivation[0]. Also, the 'cat' thing has come up before. Here's my response to that[1] as well. But now that I'm rereading my response it doesn't seem clear what I mean. The README for the language demonstrates how you can use shell grouping as a poor man's module system. When you do that, you have to use 'cat' on single files in certain situations. So I stuck with 'cat' for the simple example so it's consistent with the more complex examples. If you take a look at the typo script[2] that wraps the compiler, you'll see that I know when to use redirects and when to use 'cat'. But regardless of all that, I think most modern machines can spare an extra process.
Thanks for the helpful reply. By the way, I didn't mean to suggest that you didn't know when to use `cat` vs. re-directs; I just thought it might have been an oversight. I'm sorry if it came across as rude.
f0 x = (x,x)
f1 x = f0 (f0 x)
f2 x = f1 (f1 x)
f3 x = f2 (f2 x)
f4 x = f3 (f3 x)
This is worse than O(c^n): the size of the last function's type is O(2^2^n).
(In spacemanaki's blog example, each new expression doubles the size of the previous one. Here, each new expression squares the size. f4 builds 2^2^4 = 65,536 copies of 'x', and f5 will crash your compiler if you define it).
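To see the squaring concretely, here's the OCaml analogue with the types the compiler prints (just an illustration; the sizes are the point):
let f0 x = (x, x)        (* val f0 : 'a -> 'a * 'a *)
let f1 x = f0 (f0 x)     (* val f1 : 'a -> ('a * 'a) * ('a * 'a) *)
f1's printed type already contains 4 copies of 'a; f2's contains 16, f3's 256, and f4's 65,536.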
This is a fantastic post. I'm passingly familiar with ML and have heard the term "let polymorphism" a number of times, but this is the first time I've seen it clearly explained.
> Why can’t we just allow polymorphic lambda-bound variables?
A few reasons. The first is that in general, type inference for higher-rank polymorphism is undecidable. The second is that even if it were decidable, it would be impractical - I think that if we have a program like this one:
let test(f) = (f(1), f(true))
the assumption that the programmer made an error is statistically much more likely than the assumption that `f` should have a polymorphic type. Another reason is that it's not entirely clear what kind of polymorphic type to infer for `f` - should it be `forall a. a -> a` or e.g. `forall a. a -> int`?
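For instance, a plain OCaml version of that program is rejected (a small illustration; the exact error wording may vary by compiler version):
(* Rejected: f is lambda-bound, so HM keeps it monomorphic. The application
   f 1 pins f to the type int -> 'a, and f true then fails with an error
   along the lines of "This expression has type bool but an expression was
   expected of type int". *)
let test f = (f 1, f true)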
> What does this have to do with different “ranks” of polymorphism?
Higher-rank polymorphism means that polymorphic types can appear within other types; in a system with support for higher-rank polymorphic types, the parameter `f` in the function `test` above could be declared with type `forall a. a -> a`, the function `test` would have a type `(forall a. a -> a) -> int * bool`. Higher-rank polymorphism is formalized using System F, and there are a few implementations of (incomplete, but decidable) type inference for it - see e.g. Daan Leijen's research page [1] about it, or my experimental implementation [2] of one of his papers. Higher-rank types also have some limited support in OCaml and Haskell.
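To make the "limited support" concrete, here's one common OCaml encoding - a sketch, with names invented for illustration - using a record with a polymorphic field:
type id_fun = { f : 'a. 'a -> 'a }     (* the field itself is polymorphic *)
let test r = (r.f 1, r.f true)         (* test : id_fun -> int * bool *)
let result = test { f = fun x -> x }   (* (1, true) *)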
> How is let-polymorphism implemented? How do you implement it without just copying code around?
The "standard" implementation consists of two operations, generalization and instantiation. At let bindings, types are generalized by replacing all unbound type variables with polymorphic type variables (care must be taken not to generalize type variables that can be bound later, but that's a secondary issue here). Every time a variable with a polymorphic type is used, its type is instantiated, which means that all polymorphic type variables are replaced with fresh unbound type variables.
let f = fun x -> (x, x)
print f(1), f("boo")
In the example above, the type inferred for `fun x -> (x, x)` is `_a -> _a * _a`, where `_a` is an unbound (but not polymorphic) type variable. At let binding, this type is transformed into `forall a. a -> a * a` by replacing `_a` with a polymorphic type variable `a` (in most ML languages, the `forall` is implicit). Then, when `f` is used in `f(1)`, its type is instantiated into `_b -> _b * _b` and `_b` is unified with the type of `1`, `int`. When `f` is used in `f("boo")`, its type is instantiated into `_c -> _c * _c`, and `_c` is unified with type of `"boo"`, `string`. Since `_b` and `_c` are different variables, there is no error here (if `f` had a monomorphic type, `_b` and `_c` would in fact be the same, and this example would result in an error "cannot unify type int with type string").
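In actual OCaml syntax the example reads like this (types as the toplevel prints them); both uses of f typecheck because each gets a fresh instantiation:
let f = fun x -> (x, x)     (* val f : 'a -> 'a * 'a *)
let pair = (f 1, f "boo")   (* : (int * int) * (string * string) *)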
> What’s the relationship between let enabling exponential function composition and the exponential time result?
So there is quite a bit of copying happening every time a variable with polymorphic type is used. I've never studied the worst-case scenarios of Hindley-Milner, but I imagine that it has to do with instantiation copying large polymorphic types around.
> Do implementations of Hindley-Milner actually represent types as dags and utilize structural sharing?
Yes; an unbound type variable is typically represented as a reference cell, which gets assigned the type that the variable is unified with. So, if we have
double : forall a. a -> a * a
double (1, 1, 1) : (int * int * int) * (int * int * int)
then the type of double is first instantiated, yielding `_b -> _b * _b`, and then the type of the parameter is unified with the type of the argument, `int * int * int`. At this point, the reference cell in the internal representation of `_b` is updated to point to `int * int * int`, which means that both occurrences of `_b` in the result type actually point to the same representation of `int * int * int`.
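If you're curious what that looks like in code, here's a minimal OCaml sketch of the representation - hypothetical names, not any particular compiler's actual code:
(* A type variable is a mutable cell: either still unbound (tagged with an id)
   or a link to the type it has been unified with. Following links is what
   gives the dag-style sharing described above. *)
type ty =
  | TInt
  | TArrow of ty * ty
  | TTuple of ty * ty
  | TVar of tv ref
and tv =
  | Unbound of int
  | Link of ty

(* Unifying an unbound variable with a type just overwrites the cell, so every
   occurrence of that variable now shares the same structure. *)
let bind (cell : tv ref) (t : ty) =
  match !cell with
  | Unbound _ -> cell := Link t
  | Link _ -> ()   (* already bound; a real unifier would recurse here *)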
> Another reason is that it's not entirely clear what kind of polymorphic type to infer for `f`
How about forall a. a -> b? If we make the assumption that f is polymorphic, then nothing can be inferred about the return value, so it should also be polymorphic?
Let's instead talk about the type of `test`. It could be one of these:
test : (forall a. a -> a) -> int * bool
test : forall b. (forall a. a -> b) -> b * b
test : (forall a b. a -> b) -> (forall c d. c * d)
The first case is the most "obvious" one with polymorphic types, the one that I described above.
The second is the one that I hinted at, but I used the less general type `(forall a. a -> int) -> int * int`, i.e. instantiating `b` with `int`. The second type above is more general, as it could also be instantiated into other types such as `(forall a. a -> bool) -> bool * bool`, but this is the standard first-rank HM polymorphism.
The third type above is the type that would be inferred if `f` had different polymorphic type variables as parameter and return types. However, there are no values that have the type `forall a. a`, so a function with type `forall a b. a -> b` either (1) cannot return (i.e. diverges or raises an exception) or (2) is a dynamic cast function, i.e. a hack (`Obj.magic` is an example of such a function, but using it is "dangerous" in the sense that it bypasses the type system and so might result in undefined behavior at runtime). Disregarding case (2), the function `test` would also not return, so it's equivalent to say that its type would be the third one above, with a result type that could be anything at all.
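For illustration, both flavors are easy to see in plain OCaml (where `forall a b. a -> b` is written `'a -> 'b`):
let diverges _ = failwith "never returns"   (* inferred: 'a -> 'b, flavor (1) *)
(* Obj.magic : 'a -> 'b is the flavor-(2) escape hatch mentioned above. *)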
Not sure why "test : forall b. (forall a. a -> b) -> b * b" is "the standard first-rank HM polymorphism". The first comment implied "test" is a type error in HM, typed as "forall a b. (a -> b) -> b * b", which fails to unify a with both int and bool.
Based on the principle of least privilege, "test : forall b. (forall a. a -> b) -> b * b" feels like the right type. Only the body of test feeds a to the first argument, so forall a is scoped to the first argument, whereas the caller of test needs to consume b, so we scope forall b to the whole expression.
What I meant is that going between `forall b. (forall a. a -> b) -> b * b` and `(forall a. a -> int) -> int * int` is something that is possible in standard HM (i.e. if we consider the `forall a. a -> ...` part as opaque to HM).
I guess you're right, the "least polymorphism" we need to add to make `test` pass the type checking is to make just the argument of `f` polymorphic... However, in standard HM without any hacks/dynamic types, a function with the type `forall a. a -> b` for some fixed `b` can't really do anything useful with its parameter - it can't inspect it, it can only wrap it up in a data structure, but it can't return the data structure, unless the type of the parameter is wrapped in an existential type (but unwrapping it, there is still nothing to be done with it...).
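For example, about the only total, non-magic things you can write at a type like `'a -> int` in OCaml ignore their argument entirely (a tiny illustration):
let k _ = 42   (* val k : 'a -> int; it cannot inspect what it was given *)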
Great example, it now clicks. We lifted b all the way up because it needs to be consumed by the caller, might as well lift a as well, as types with free variables, like b in "forall a.a -> b", are useless. Which looks a lot like let-polymorphism :)
If you're busy, just read the original Stack Overflow Q&A: http://stackoverflow.com/questions/22060592/. It's 1/20th the size of the blog post and I got all I needed out of it.
I found them to be very useful for understanding how the HM type algorithm works.
They rephrase the problem as "writing an expression interpreter" for your program that recursively evaluates the abstract type of each expression instead of its concrete value.
I found this to be a very intuitive way to understand it.
It also then makes it easier to see why some "type inference" algorithms might guarantee to terminate while others might run forever: the more complicated your type system, the less abstract the "type values" are. They become like actual "values", and the behaviour becomes closer to the actual running of the program, which may not terminate.
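Here's a toy sketch of that framing in OCaml - an invented mini-language, not anything from the article - showing how the type checker is just the interpreter run over abstract values:
(* A tiny expression language. *)
type expr =
  | Int of int
  | Bool of bool
  | Add of expr * expr
  | If of expr * expr * expr

(* The ordinary interpreter computes concrete values... *)
type value = VInt of int | VBool of bool

let rec eval = function
  | Int n -> VInt n
  | Bool b -> VBool b
  | Add (a, b) ->
      (match eval a, eval b with
       | VInt x, VInt y -> VInt (x + y)
       | _ -> failwith "type error at runtime")
  | If (c, t, e) ->
      (match eval c with
       | VBool true -> eval t
       | VBool false -> eval e
       | _ -> failwith "type error at runtime")

(* ...while the "type interpreter" runs the same recursion over abstract
   values: it forgets which int, keeping only "some int". *)
type ty = TInt | TBool

let rec infer = function
  | Int _ -> TInt
  | Bool _ -> TBool
  | Add (a, b) ->
      (match infer a, infer b with
       | TInt, TInt -> TInt
       | _ -> failwith "type error at compile time")
  | If (c, t, e) ->
      (match infer c, infer t, infer e with
       | TBool, t1, t2 when t1 = t2 -> t1
       | _ -> failwith "type error at compile time")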
Fascinating behavior. Since this is an edge case, would it ever occur in human-written code? Could there be a situation where a computer would end up generating such code?
I've had this kind of thing happen when I try to use dependent types in Scala. E.g. if you want to select a HList sublist out of another HList (using the types) then that's quadratic, because just figuring out that a HList contains a particular type takes linear time.
(The idea was to do DI using HLists, to get something like Spring but that worked at compile time. It worked, but as the program got larger the compile times became too long for it to be practical)
I've written a lot of OCaml code and I don't recall having the compiler blow up on hand-written code. The OCaml type inference engine works well except on objects where you sometimes need to annotate variables to tell it what object type it is. (But then we rarely use OO in OCaml).
It's a different matter with generated code. For a while I had a program that generated OCaml code (from descriptions of Linux kernel structs). That would result in 100000+ line generated OCaml programs which frequently caused very long compile times (eg 20 minutes and more), with the time being spent in the type inference part of the compiler.
It's unlikely to occur as the type is exponential in size and thus quite unwieldy to do anything with! I haven't worked with generated ML or Haskell code, but I still think it's unlikely to be an issue in "real" code.
It's rarely a compile-time performance issue because O(2^N) algorithms perform well on modern hardware when N ~= 3. You could nest the `double` example from the article 6 times and it would still only be a 64x slowdown.
Where it does show up, annoyingly, is in error messages. You actually see this all the time with GCC's STL error messages (C++ templates provide a parameterized type system very similar to Haskell or ML, but without the inference, at least until C++11). STL containers are usually parameterized not only over the key and value, but over the comparator and allocator. Nest a couple of them and you suddenly have a type with several dozen variables. GCC expands the whole type out, without typedefs, when printing error messages, which means that you will sometimes get compile errors that are 10 pages long. Clang fixes this by preserving the typedefs, without which you wouldn't want to write such a type. Haskell has the same problem, except worse because it infers types, but GHC has gone to great pains to format error messages so they are at least readable if verbose.
XML/SGML people know this sort of thing (exponential resource use due to expanding references) as the "billion laughs" attack. There are real-world consequences to this sort of stuff.
No, the HM algorithm has terrible worst-case complexity, but the worst cases don't correspond to programs humans write, and in practice it generally works perfectly well.
IIRC, the algorithm used by the OCaml implementation is guaranteed to be close to linear time as long as you don't have this sort of nested polymorphic let.
In my experience the failure rate is far below 5% for the vanilla HM algorithm. I've never seen a pathological input in the wild that wasn't contrived specifically to test the typechecker.
Actually, there's some use case for these things:
you can simulate varargs, and yes, the compilation time can be made arbitrarily large:
https://gist.github.com/toolslive/5957292
$> time ocamlc vararg.ml
real 1m9.372s
user 1m9.264s
sys 0m0.044s
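For anyone curious what the trick looks like, here's a minimal CPS-style sketch of faking varargs in plain OCaml (the gist's actual encoding may differ, and it is tuned to stress the type checker far harder than this):
(* Each `arg` consumes one more value, threading a running total to a
   continuation; `stop` ends the chain and returns the total. *)
let sum k = k 0                              (* start the accumulator at 0 *)
let arg x acc k = k (acc + x)                (* take one more argument *)
let stop acc = acc                           (* finish: return the total *)
let six = sum (arg 1) (arg 2) (arg 3) stop   (* evaluates to 6 *)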
Thanks for pointing that out! I hadn't made the connection and now I've got another reason to figure out how fold works (have yet to wade through all of http://mlton.org/Fold)
Question: anyone know where the pronunciation of "!" as "bang" came from? I always called these exclamation points but see a lot of programmers calling them "bangs." Anyone?
Well, they are exclamation points. But for many punctuation characters, there's a hacker nickname (or more than one). I've heard "bang" and "shriek" for "!", "hook" for "?", and "splat" for "*".
Well, I look like an idiot.. I read the whole article and then came back to the comments and it popped into my mind. Which was apparently just me remembering it.