Functional Typelevel Programming in Scala (github.com/dotty-staging)
70 points by nuriaion on July 6, 2018 | 16 comments



Looks like this[1] could eliminate quite a bit of typeclass boilerplate. And this[2] opens the door for typed literals in Scala (which are quite a nice feature in TypeScript).

Will be interesting to see what use cases library authors apply TLP to in Dotty/Scala 3.

[1] https://github.com/dotty-staging/dotty/blob/add-transparent/...

[2] https://github.com/dotty-staging/dotty/blob/add-transparent/...
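For a sense of what typed literals look like, here is a minimal sketch in Scala 3 (Dotty) syntax, where each literal value has its own singleton type:

```scala
// Literal (singleton) types: each literal value gets its own type
// with exactly one inhabitant.
val n: 42 = 42
val s: "hello" = "hello"

// They widen to the ordinary types on demand:
val m: Int = n
val t: String = s

// val bad: 42 = 43  // would not compile: 43 does not have type 42
```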


I've never heard of typelevel programming. Can someone explain what it is?


It's using the type system as a language in which to perform computations. For example, you can do type-level programming in C++: 'std::list<x>' is a function that maps the type 'x' to a type representing a linked list of 'x's. You can make type systems compute many interesting invariants.

For example, in a language like Idris, you can define the type of arrays of length 'n' and with elements of type 'a' as 'Vector n a'.

Then, if you want to append two arrays together, you can use a function:

append : (x : Vector m a) -> (y : Vector n a) -> Vector (m + n) a

The '+' operator here is a type-level function: it takes the two lengths 'm' and 'n' and computes their arithmetic sum, so the result type records the length of the appended vector.
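The same idea can be sketched in Scala 3 using literal integer types and `scala.compiletime.ops` (the `Vec` wrapper below is hypothetical, not a standard library type):

```scala
import scala.compiletime.ops.int.+

// A list whose length N is tracked as a literal integer type.
// (Nothing here checks that elems really has N entries; a real
// library would hide the constructor behind checked builders.)
final case class Vec[N <: Int, A](elems: List[A]):
  def append[M <: Int](that: Vec[M, A]): Vec[N + M, A] =
    Vec(elems ++ that.elems)

val xs: Vec[2, Int] = Vec(List(1, 2))
val ys: Vec[3, Int] = Vec(List(3, 4, 5))
val zs: Vec[5, Int] = xs.append(ys)  // 2 + 3 reduces to 5 at compile time
```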


I haven’t done it myself, but I understand it as creatively using the type system to have the compiler already partially evaluate the program, instead of deferring all execution to runtime as with traditional programming.


No, it's about proving something about your program. Types are theorems about programs. If a compiler type-checks (well, if it is a sound type system) that your function returns a non-zero integer, that proves it can only return non-zero integers. And you can write that kind of constraint in types. Types can be very expressive! You can even check that all variables are within scope in a lambda expression by using an appropriate encoding (de Bruijn indices use integers instead of variable names) for the expression. Typically you provide hints or even proofs for the compiler; this shades into the territory of proof assistants.
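A small Scala sketch of this "types as proofs" idea: if the only way to build a `NonZero` is through the checked factory, then possession of a `NonZero` value is itself proof of the invariant (the names here are made up for illustration):

```scala
// The private constructor makes the checked factory the only way to
// build a NonZero, so holding one proves the value isn't zero.
final class NonZero private (val value: Int)
object NonZero:
  def apply(n: Int): Option[NonZero] =
    if n != 0 then Some(new NonZero(n)) else None

// Division can demand the proof instead of re-checking at runtime.
def safeDiv(a: Int, b: NonZero): Int = a / b.value

val ok  = NonZero(2).map(nz => safeDiv(10, nz))
val bad = NonZero(0)
```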

Scala looks hacky at this. If you want cleaner examples, look at Coq, Agda, Epigram, and Idris.


It's usually more about adding additional constraints on code that the type checker will accept than it is about partial evaluation.

That is, you're embedding additional logic in the type system. This additional logic lets you be more precise about what operations are allowed on values of particular types.

It's most useful when there are additional facilities in the language for type-directed code generation, at which point the complex types can also be used to generate behavior, not just restrict it. (Some examples of type-directed code generation include traits in Rust, implicits in Scala, and type classes in Haskell.)


I think you may be thinking more of typeful programming? (a.k.a. type-driven design)

At least to me type-level programming is just a mechanism to achieve more at compile time. Another mechanism would be e.g. C++ constexpr or dependent types as in e.g. Idris.


No, I definitely mean type-level programming. Sure, most dependent typing systems subsume type-level programming, but you can do interesting things without needing dependent types.

The only requirement for type-level programming is that you have program logic that functions at the type level, taking types as arguments and returning a type.

The most common (if overused) example is concatenation of length-indexed vector types. You need a type-level addition function for that to be expressible.

A more sophisticated example where type-driven code generation is required would be something like a type-safe printf that puts the format string at the type level and computes the required argument types from it at compile time.

Sure, you can stuff entire programs into the type level, but that's not where things are the most useful. The greatest value comes from using type-level functions to make the types describing your values more precise.


Alright, I can see where you're coming from...

I guess it does all get kind of fuzzy once you have a Turing-complete type system anyway. (Whether that's via full dependent types or the rather strange not-quite-terms-and-types world of not-fully-dependent typing in e.g. Haskell.)


A Turing-complete type system and a dependent type system are two different and independent things. That is, it's possible for a type system to be dependently typed and not Turing-complete, and it's possible for a type system to be Turing-complete and not dependently typed (even a little bit).

Dependent typing concerns whether your types can depend on values. So for example

  if a then Int else Bool
is a (single) dependent type if `a` is a runtime value.

The difference between the two can be illustrated with natural numbers with the classic natural-number-indexed vector example that always shows up with dependent typing.

If you have a type

  Vector 5 Character
representing a vector of 5 elements of characters, if that 5 is the same 5 as you have at runtime then you have dependent typing. Otherwise this is not an example of dependent types.

Put another way, if you're allowed to do something like the following

  val x: Int = 5
  val xs: Vector x Character = Vector('a', 'b', 'c', 'd', 'e')
in pseudo-Scala syntax then you have dependent types.

If on the other hand you have something like

  type Zero
  type One
  ...
  type Four
  type Five
  ...
  val xs: Vector Five Character = Vector('a', 'b', 'c', 'd', 'e')
where you have a special set of type-level numbers that are distinct from your run-time natural numbers then you have typelevel programming but do not have dependent typing.

Your type system might let you do arbitrary things with the type-level natural numbers

  type Zero
  type One
  ...
which would mean your type system is Turing-complete even if it's not dependently typed.

Note that in practice it's usually true that languages with typelevel programming tend to inch towards dependent types. For example, by allowing

  val xs: Vector Five Character = Vector('a', 'b', 'c', 'd', 'e')
to type-check but not allowing

  val xs: Vector Five Character = Vector('a', 'b', 'c', 'd', 'e', 'f')
to type-check, unless your language is special-casing behavior for collections, there's usually some way to back out some sort of connection between a type-level Five and a run-time 5 (although this connection often only works for numeric literals in e.g. current Scala and works via macros). There's also a considerable amount of social overlap between programmers interested in typelevel programming and programmers interested in dependent types which tends to drive development of language features along those lines.

However, in theory, the two are completely separate.


I see, so in a way it adds nothing more than convenience? In that you could manually go ahead and create VectorTwo, VectorThree, etc., but no one does because it's tedious. So now you could instead say the type is (Vector of Two), which would evaluate to VectorTwo.

I'm still unclear how the type system knows that a VectorOne fits in a VectorTwo, and how inference could work at that level.

Also, with dependent types, I'm unclear what you mean by runtime values. If it's still type-checked at compile time, wouldn't the value need to be there at compile time?


First off, general note to whom it may concern: Down-voting clarifying questions (e.g. the question I'm replying to) seems unhelpful.

It is indeed merely syntactic convenience if all you have are the labels

  type Zero
  type One
  type Two
  ...
If, however, you have operations on those labels that are checked by the compiler, then you have what most people think of when they say "type-level programming," which is more than a syntactic convenience. For example, if the compiler recognizes addition of type-level numbers, that is, there is a notion of Plus so that

  Plus One Two
unifies with

  Three
then all of a sudden you can do the classic length-indexed demo of length-checked append.

  appendInts: (Vector N Int, Vector M Int) => Vector (Plus N M) Int
again in pseudo-Scala syntax, which you can't do with simply labeling VectorOne, VectorTwo, etc.
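In Dotty/Scala 3 this kind of unification can be sketched with match types (a sketch of one possible encoding, not necessarily the one the linked post uses):

```scala
sealed trait Nat
sealed trait Zero extends Nat
sealed trait Succ[N <: Nat] extends Nat

// Type-level addition, evaluated by the compiler by recursing on
// the structure of the first argument.
type Plus[N <: Nat, M <: Nat] <: Nat = N match
  case Zero    => M
  case Succ[p] => Succ[Plus[p, M]]

type One   = Succ[Zero]
type Two   = Succ[One]
type Three = Succ[Two]

// This line compiles only because Plus[One, Two] reduces to Three.
val proof = summon[Plus[One, Two] =:= Three]
```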

Your question about dependent types is the classic one, if I understand it correctly. Namely, what does it mean for a type (a compile-time concept) to depend on a run-time value, if that run-time value almost always (except for literals) doesn't exist at compile time? The answer is that your type signatures usually end up being generic in the run-time value, and then different code paths allow you to add increasingly specific pieces of information to the run-time value. These code paths are fixed at compile time so the compiler is able to use these pieces of information when it type-checks the code.

An example is probably in order.

Let's say you're accepting user input in the form of a vector of integers (and let's ignore how we were able to format stdin into a vector for now). Again we'll use pseudo-Scala syntax.

  val userInput: Vector n Int
Note that here `n` is an Int (e.g. a 5) not a type-level Five since we're back in dependent type land.

Now we might have a function that only works for vectors of exactly length three and generates some string for those vectors.

  myFunction: Vector 3 Int => String
In order to use `myFunction` on `userInput`, we'll need some intermediate function that actually performs a check that `n` is indeed 3.

  checkIf3 : (n: Int) => Either[n != 3, n = 3]
Now we can do the following.

  checkIf3(n) match {
    case Right(nWasIndeed3) =>
      // Now the compiler knows that in this code path n is indeed 3
      myFunction(userInput)
    case Left(nWasNot3) =>
      // Throw some error as appropriate
  }
How you actually pass the information in the `Right` branch that `n = 3` to the compiler is implementation specific. Some implementations might decide to special-case equality and have the compiler automatically include that in the compilation process. Others might use some form of implicit arguments to `myFunction` (so that e.g. Vector 3 Int is actually syntactic sugar for (implicit n = 3, Vector n Int)). Hopefully, however, that serves as an understandable overview of what the general approach looks like.
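Plain Scala (no dependent types) can approximate this pattern with a smart constructor: the `Right` branch carries a value whose mere existence records that the check succeeded. All names below are made up for illustration:

```scala
// Holding a LengthIs3 is evidence the check succeeded: the private
// constructor means only the check itself can produce one.
final class LengthIs3 private (val vec: Vector[Int])
object LengthIs3:
  def check(v: Vector[Int]): Either[Vector[Int], LengthIs3] =
    if v.length == 3 then Right(new LengthIs3(v)) else Left(v)

// myFunction demands the evidence instead of re-checking the length.
def myFunction(v3: LengthIs3): String = v3.vec.mkString("-")

val userInput = Vector(1, 2, 3)
val result = LengthIs3.check(userInput) match
  case Right(evidence) => myFunction(evidence)
  case Left(_)         => "input was not length 3"
```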

Now if you're familiar with statically typed languages, you might rightly point out that what I've done in this example I could've done with just ordinary non-dependent types. E.g. if I used a type-level `n` instead of a runtime `n`. This is true for single-argument functions, but dependent types start pulling ahead in expressivity when you have multiple arguments that are related to one another.

For example indexing a vector might look like

  getElementAtIdx: (Int, Vector n A) => Option[A]
in a non-dependently typed language because we don't have a way of expressing a (run-time) relationship between the first and second arguments to our function, namely that the first integer must be less than `n`.

With dependent types we can do this by introducing a third argument.

  getElementAtIdx: (idx: Int, Vector n A, idx < n) => Option[A]
Note that we've also added the syntactic ability to bind a name to an argument in a type signature (idx), which helps us express relationships among different arguments.


> Consider the standard definition of typelevel Peano numbers.

> A function that maps positive integers to Peano numbers can be defined as follows

Really? That was the simplest example you could think of?


Peano numbers are kind of a classic first typelevel programming example. Maybe it's their unofficial Hello World. Booleans and natural numbers are usually the start, but natural numbers are complex enough to show the pain points.
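For the Boolean end of that spectrum, a minimal Scala 3 sketch using match types (hypothetical names, just to show the shape of the simplest case):

```scala
sealed trait TBool
sealed trait TTrue  extends TBool
sealed trait TFalse extends TBool

// Type-level negation as a match type; the compiler evaluates it
// during type-checking.
type Not[B <: TBool] <: TBool = B match
  case TTrue  => TFalse
  case TFalse => TTrue

// These compile only because Not reduces as expected.
val p1 = summon[Not[TTrue] =:= TFalse]
val p2 = summon[Not[TFalse] =:= TTrue]
```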


As a shapeless enthusiast I agree with you that the Peano examples (which seem to be the go-to for any dependent-types demo, not just for Scala... they must think they're helpful!) are terrible for people who don't already get dependent typing and logic programming. Which are the people you're supposed to be reaching with a demo.

I have a talk introducing typelevel proofs in Scala (with shapeless as the example target) that uses the kind of geometric proofs we all did in high school as an analogy. I think that's a more accessible metaphor, but it also seems like precisely what Odersky is trying to eliminate in Dotty: the use of inductive implicit derivation to form "proofs" of invariants that are also implementations. Admittedly, it's an approach that seems rather difficult for a lot of engineers to grasp, which means it's of questionable utility. But it always seemed really elegant to me - type systems are about proving invariants, and Scala just gives you powerful ways to prove things by unifying that concept into implicits.


Yep, it's hard for engineers to grasp; this kind of thing requires training in formal logic to get started. I think things won't make much sense until one gets that the type system can be a formal system equivalent to a logic.



