I’ve been reading some of Simon Peyton Jones’s papers on type families, and he mentions types representing Peano numerals, namely:

```haskell
data Zero
data Succ a
```

And then proceeds to implement arithmetic like this:

```haskell
type family Add a b :: *
type instance Add Zero     x = x
type instance Add (Succ x) y = Succ (Add x y)
```

Such things are quite useful, and are done quite often using templates in C++ (taking advantage of C++’s Turing-complete compile phase). This can be done in C++ as follows:

```cpp
struct Zero;
template<class T> struct Succ;

template<class T, class U> struct Add;
template<class U> struct Add<Zero, U> {
    typedef U Result;
};
template<class T, class U> struct Add<Succ<T>, U> {
    // Recurse through ::Result so the sum reduces to a plain Peano numeral,
    // mirroring the Haskell type instance above.
    typedef Succ<typename Add<T, U>::Result> Result;
};
```

One example is to give a type of fixed-size vectors, so that you can do, for example, a safe element-wise add. As someone who primarily builds domain-specific-language-like libraries, the main use I see for such types is creating type systems for DSLs which are not “almost Haskell”.

However, in both C++ and Haskell, this is very clunky. There is no guarantee that Add<X,Y>::Result exists (in C++), and likewise no guarantee that Add x y is defined (in Haskell). In addition, there is no guarantee that C++ will ever finish compiling; there is such a guarantee in Haskell, but it is made at the expense of not being able to do very much.

I want my compile time to be decidable. If I’m hacking on a huge project like Mozilla, I don’t want to start a rebuild, go get a cup of coffee for the usual 20-minute compile phase, and come back half an hour later just to realize that I’ve written a compile-time infinite loop. You can see how that would be a major productivity hindrance. On the other hand, I want numbers at compile time!

Meanwhile, I’ve been learning about languages like Agda and Epigram, which use *dependent types*. These languages have *extremely* powerful type systems, at the expense of essentially running the whole program at compile time. The distinguishing feature, though, is that an implementation of any program in Agda or Epigram comes with a proof that it will halt. That is, they are not Turing complete, but they can still do most things.

Then it occurred to me: would it be possible to have a fully dependently-typed language for the compile phase of a program, but not push the type system as far down into the semantics as Agda and Epigram do? That way you can have infinite loops at runtime (and, just as importantly, you don’t have to prove your program halts, since that is often pretty difficult). For type-level stuff, though, I’m okay with doing such proofs: my type-level stuff is infrastructure for my clients to prove their code correct, so why shouldn’t I have to prove mine is?

The question is whether that is reasonable. Can I really get the desired power without “true” dependent types going all the way to the value level? Dependently typed languages can use value-level functions in their types because those functions come with halting proofs. We might provide two sorts of functions, “total” functions and normal functions, where total functions must have a halting proof, with the benefit that you’re allowed to use them in your types. That doesn’t seem very clean, but it does seem practical.

Another practical issue with dependently typed languages is that many standard functions have more precise forms, and it would be easy for a smarter person to write a library that I couldn’t use because I’m not smart enough. Here is a passage of an Agda program demonstrating this:

```agda
-- returns a list containing all naturals between a and b
range : (a : Nat) -> (b : Nat) -> (a <= b) -> [ Nat ]
range a b pf = map (\x -> a + x) (upto (subtract a b pf))
```

Look at the signature of that function. It’s a range function, so it ought to take two arguments. But it looks like it takes three — and what is that last argument? It turns out to be a proof that the first argument is less than or equal to the second. That wasn’t really necessary to implement range: I could have just returned the empty list when a > b. But it felt cooler to require the proof.

That’s what feels unclean about this. There’s a time for extremely precise functions and a time for less precise ones, and mixing them is pretty tough (i.e. it’s hard to call a precise function from an imprecise one, because you don’t have the proofs you need). But it might be okay. Take Haskell’s compare function for example (after instantiating it to Int):

```haskell
data Ordering = LT | EQ | GT

compare :: Int -> Int -> Ordering
```

In such a half-dependently typed language, we might instead imagine that function working like this:

```agda
data Compare : Int -> Int -> Set where
  LT : (a < b) -> Compare a b
  EQ : Compare a a
  GT : (b < a) -> Compare a b

compare : (a : Int) -> (b : Int) -> Compare a b
```

So comparing two numbers not only tells you whether one is less than the other, it proves it. But of course, with every weakening of an assumption we have to strengthen a proof elsewhere: now, to implement an Ord instance, you have to prove your comparison result (and, if it’s implemented fully, also some transitivity and reflexivity properties), which can be really annoying.

Anyway, I think it would be good to have such a language (I’ll call it “practical dependent typing”) just to play with to see how these issues work themselves out. I think implementing such a language shouldn’t be too hard using Agda: basically I have to allow Agda’s type system to relax in certain places. I’m just afraid of hacking on Agda, because those guys are way too smart for me…

**UPDATE:** Woohoo. It seems that my work is already done for me: in Agda, a function that can’t be proven total just issues a warning, but will still execute.

Try the D language, it allows all the numbers you want at compile time.

Right, but D’s compile time is undecidable like C++’s.