# System IG Semantics

I don’t have much time, so here’s just a quick jot. I don’t think System IG can prove this rather obvious proposition:

```    |- GL(\a. Ga(Ka)) (\a x. x)
```

Or translated to more conventional notation: (\a x. x) : (a : Type) -> (a -> a), the type of the polymorphic identity function. In order to prove this, we need the axiom LL (conventionally, Type : Type). But that axiom is inconsistent. So I’ve tweaked the rules a bit to allow the above to be proven, without allowing LL. It’s possible that these tweaks are still consistent; they don’t admit Girard’s paradox, at least.

The rules I’m modifying can be found on page 8 of the above paper, in the IG box. My tweak is, changing rule Gi from:

```    Γ, Xx |- Yx(Zx)  ;  Γ |- LX  ;  x not in FV(Γ,X,Y,Z)   =>   Γ |- GXYZ
```

To:

```    Γ, Xx |- Yx(Zx)  ;  Γ |- L(GXY)  ; x not in FV(Γ,X,Y,Z)  =>  Γ |- GXYZ
```

I’m saying that you need to prove that GXY (conventionally, (x:X) -> Y x) is a type before you can say something has that type. Without more modifications, this is equivalent to the original. However, now I add two new rules:

```    Γ |- L(GL(KL))
Γ |- GL(KL)p  =>  Γ |- L(GLp)
```

These say: (Type -> Type) : Type, and if p : Type -> Type then ((x:Type) -> p x) : Type. The latter one is the one I’ve heard is still consistent, but you can’t really do anything with it without the former. So, it needs further investigation.

Trying to prove the consistency of this new system, I need a semantics for terms. If I can model these semantics in CIC (the calculus used by Coq), then if CIC is consistent, so is this system. My idea for semantics is as follows:

Bare lambda terms don’t really have semantics; they are modeled by their syntax. But typing proofs have the semantics of whatever they are proving as well-typed. If I use [-] as the meaning function, then the semantics of [Lx] is a type (because it is a proof that x is a type). Formally:

```    [Lx] : Type
[Gxyz] : (a:[Lx]) -> [Gx(KL)y] a
```

Simplified, the latter means that the semantics of a proof that x has type y -> z is a function from y to z. (Rather, it’s the dependent version of that). These semantics motivate the axioms I am adding.

Okay, that’s all the time I have!


# Update

It has been a little while since I posted anything. So here’s a general update about my work and other aspects of my life.

I’ve been passively thinking about Dana, but not doing very much significant. I have abandoned the RTS language on the grounds that it is too hard for my little mind right now. Hopefully I will come back to it, but it can be safely moved further down in the queue without hurting my progress.

That brings me back to the dependently typed core. I’ve switched directions and decided that it really ought to be total. In experimenting with how to bring back partiality to a total language, it finally sunk in how partiality is a monad in its essence. It is in the very same way Maybe is, though my practical ideas about the essence of nontermination prevented me from seeing that.
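For concreteness, here is one standard rendering of partiality as a monad, in the same shape as Maybe but with countably many layers of “not yet” instead of one. This is a sketch of the general idea, not anything committed to Dana:

```haskell
-- Partiality as a monad: a computation is either done, or one step
-- away from another partial computation.
data Partial a = Now a | Later (Partial a)

instance Functor Partial where
  fmap f (Now a)   = Now (f a)
  fmap f (Later p) = Later (fmap f p)

instance Applicative Partial where
  pure = Now
  Now f   <*> p = fmap f p
  Later q <*> p = Later (q <*> p)

instance Monad Partial where
  Now a   >>= f = f a
  Later p >>= f = Later (p >>= f)

-- Nontermination is the infinite tower of Laters:
never :: Partial a
never = Later never

-- Observe with a fuel bound; Nothing means "didn't finish yet".
runFor :: Int -> Partial a -> Maybe a
runFor _ (Now a)   = Just a
runFor 0 _         = Nothing
runFor n (Later p) = runFor (n - 1) p
```

Where Maybe has one failure value, Partial has a whole ordinal’s worth of delays, which is exactly the sense in which it captures nontermination.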

I’ve been fooling around with the semantics of system IG, and have gotten nowhere interesting. Whatever beautiful connection between application and typing there is, I haven’t seen it yet.

Since typechecking arbitrary terms in system IG is undecidable, I’ve been trying to write an “interactive typechecker”: essentially a library for typechecking. Various inference algorithms can be built on top, but their correctness follows from the underlying library. I am stuck on the representation of conversion proofs between lambda terms.

All of these are rather minor problems that can be worked through with a little thought. I’ve been distracted, however, because of a woman named Anna, with whom I’ve been having conversations of at least two hours almost every night. I had forgotten what it meant to have good conversations about nontechnical things (it’s really nice). We seem to like each other, despite my friend’s warnings about her being insane or something :-). I’m interested to see where it goes.

Oh and I’m starting school again in the summer, to finish my Bachelor’s in mathematics, the intention being to go on to a PhD program with Paul Hudak, if that can be arranged, or something similar.

More meat soon.

# RTS Research

Over the past week, I have been again exploring low-level functional systems, in which to write an RTS (notably a garbage collector) for high-level languages. The progress has been slim, but I’ll report on my findings.

I get the hunch that linear lambda calculus is unsuitable, but I haven’t yet convinced myself of that and it’s the closest I’ve got so far. So I have been trying to implement a minimal LLC runtime in C. Apparently Haskell has fried my brain and I can no longer write in C without segfaulting up the wazoo. Or maybe that’s what writing in C is like and I had just forgotten :-).

Said runtime is more complex than I’d like it to be. It is a fairly involved pattern-matching graph reduction machine. It needs a dedicated memory pool to allocate from and deallocate to, holding the graph currently under consideration (but when things fall off the graph, they are freed immediately, which is its charm). I was hoping that it would run in constant memory, but I don’t think that’s possible, so this is the best I could hope for (in some sense, this memory pool is the “stack”). I would like to find a more direct translation, say into a (parallel) sequence of instructions. But that seems to conflict with the graph reduction properties that I like, namely that code can be “compiled” on the fly.

I wonder if there is an incarnation which can generate short bursts of assembly and run them, thus taking the best of both worlds.

Still, I’m not sure it’s possible to implement an RTS in linear lambda calculus, so I may just be barking up a ditch.

Jeremy Siek pointed out that copying collectors are simple and composable; i.e. all they do is copy a root set into contiguous memory and throw away the rest. A copier for a composition is a composition of copiers. This is very promising, and is what I’m shooting for.

He also pointed me at the Categorical Abstract Machine (sorry I couldn’t find any public links), which I guess is fairly popular, and definitely a clever little thing. But alas, it appears to need garbage collection, so I’m not focusing on that level yet.

The Boulder functional programmers group is doing a two-session focus on CUDA, and in particular next meeting we will be talking about functional high-level (at least in comparison to C for CUDA) specification of parallel algorithms for CUDA. I’m hoping that what I find for the RTS language will also be applicable to this task.

Until next time.

# Dependent types are ridiculously easy

After an inquiry about a combinatory calculus for dependent types, somebody in #haskell — unfortunately I don’t remember who — recommended I look at Illative Combinatory Logic. After studying the paper a bit, all I can say is Wow! It implements a (conjecturally) complete logical system in but three simple typing rules, together with the untyped lambda calculus conversion rules. I was able to write a dependent typechecker in 100 lines.

The specific system I’m working with in the paper is called system IG. The paper also introduces three weaker systems, which are also very nice and worth understanding.

As an example, here is a derivation of the identity function and its type. This is a backward proof, so each step is implied by the step after it. But reading it this way makes the algorithm clear.

```    |- G L (\A. G A (\_. A)) (\A x. x)
L a |- (\A. G A (\_. A)) a ((\A x. x) a)     (Gi)
L a |- G a (\_. a) ((\A x. x) a)             (beta)
L a |- G a (\_. a) (\x. x)                   (beta)
L a, a b |- (\_. a) b ((\x. x) b)            (Gi)
L a, a b |- a ((\x. x) b)                    (beta)
L a, a b |- a b                              (beta)
```

The trick, which isn’t so much a trick as a beautiful deep connection I gather (although I don’t fully grok it), is that typing propositions are just applications. So if A is a type, then A b is the proposition “b has type A”. L is the type of types, and G is essentially Π (the product type constructor). So, for example, the fourth line in the above proof would be written as follows in a conventional system:

```    a : Type  |-  (\x. x) : a -> a
```
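To make the “propositions are applications” point concrete, here is a toy Haskell rendering of the term language. This is just the encoding, not the 100-line checker itself, and the names are mine:

```haskell
-- Toy syntax for IG terms. L and G are just constants; there is no
-- separate judgment form for typing.
data Term = Var String | L | G | Lam String Term | App Term Term
  deriving (Eq, Show)

-- "b has type a" is literally the application  a b.
hasType :: Term -> Term -> Term
hasType a b = App a b

-- The type of the polymorphic identity:  G L (\A. G A (\_. A))
idType :: Term
idType = App (App G L) (Lam "A" (App (App G (Var "A")) (Lam "_" (Var "A"))))

-- The goal judgment from the derivation above:
goal :: Term
goal = hasType idType (Lam "A" (Lam "x" (Var "x")))
```

Note that `goal` is an ordinary term; proving it is where the Gi and beta rules come in.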

I need to extend it with finite types and boxes, but I really don’t want to because it’s so pretty as is! :-)

# Dana update: Core execution language, dependent combinators

It’s been a little while since I gave an update about Dana, so here it is.

There has been very little code progress in the last week. Instead, because of the self-embedding issues I talked about last time, I have been exploring the use of a combinator calculus as an intermediate stage for typechecking. I’m on to something, though nothing is terribly concrete yet. The combinator basis I’m using is:

```    Sxyz = xz(yz)
Ktxy = x    -- with the constraint (y:t)
Bxyz = x(yz)
Cxyz = xzy
Itx = x   -- with the constraint (x:t)
```

Along with primitives for Pi, Arrow (which is always expressible as Pi, but this keeps term size down), and Type.

The point of the constraints is to provide type annotations. I added a type annotation to every combinator which “erases” a lambda (rather than just pushing it downward) during factorization, because when a lambda is erased so is its annotation, so this puts it back.
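As a concrete picture of the factorization, here is plain untyped bracket abstraction in Haskell. The K and I cases are the ones that erase a lambda outright, and hence the ones that would carry the annotation in the annotated basis. This is the textbook algorithm, not my actual implementation:

```haskell
data Term = Var String | App Term Term | Lam String Term
          | S | K | I | B | C
  deriving (Eq, Show)

freeIn :: String -> Term -> Bool
freeIn x (Var y)   = x == y
freeIn x (App f a) = freeIn x f || freeIn x a
freeIn x (Lam y b) = x /= y && freeIn x b
freeIn _ _         = False

-- Eliminate \x. t into combinators.  The I and K cases erase the
-- lambda entirely; in the annotated basis they would become It and Kt.
abstract :: String -> Term -> Term
abstract x (Var y) | x == y = I
abstract x t | not (x `freeIn` t) = App K t
abstract x (App f a)
  | not (x `freeIn` f) = App (App B f) (abstract x a)
  | not (x `freeIn` a) = App (App C (abstract x f)) a
  | otherwise          = App (App S (abstract x f)) (abstract x a)
abstract x (Lam y b) = abstract x (abstract y b)  -- innermost lambda first
abstract _ t = t  -- unreachable: combinators have no free variables
```

The B and C cases are the ones that merely push the lambda downward, which is why they need no annotation.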

My goal is for the typechecker not to use any supply of unique names.

Checking is not totally straightforward. Notice how terms like S have no unique most general type; they would have to take some type arguments to do that. So my current lead is that the type of a combinator term t is either a “real” type, or it’s a function which takes a term x and returns the type of t x (or another such function, etc.). This algorithm will be partial; i.e. there will be cases where I require an argument to have an actual type rather than a function. This is because attempting the “total” algorithm is undecidable, so it’s partial anyway.
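Sketching that lead in Haskell, with representations and names invented purely for illustration: the “type” of a term is either an actual type or a function awaiting the next argument term.

```haskell
data Type = Base String | Arrow Type Type
  deriving (Eq, Show)

-- Toy terms: a typed "hole" standing in for an argument of known type,
-- or an application.
data Term = Hole Type | App Term Term
  deriving (Eq, Show)

-- The "type" of a combinator term: either an actual type, or a
-- function taking a term x and giving the type of (t x).
data CType = Real Type | Awaiting (Term -> Maybe CType)

-- Example: the annotated K from the basis (K t x y = x, with y : t).
-- Applied to a term x of actual type a, K t x has type t -> a.
kType :: Type -> CType
kType t = Awaiting $ \x ->
  case typeOf x of
    Just (Real a) -> Just (Real (Arrow t a))
    _             -> Nothing  -- partial: we demanded an actual type here

-- Toy typing of terms: only holes carry a type.
typeOf :: Term -> Maybe CType
typeOf (Hole ty) = Just (Real ty)
typeOf _         = Nothing
```

The `Nothing` case in `kType` is exactly the partiality described above: sometimes an argument must have a “real” type rather than a function.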

The other thing I’ve been thinking about is a simple, composable “RTS language”, which does explicit memory management in some form, in which I could implement a garbage collector, etc. This has not been as fruitful as my combinator explorations. I’ve been looking at various forms of linear lambda calculus, which has been neat and enlightening — for example, there is a sense in which all computable functions are differentiable using LLC, which I might blog about soon — but unfortunately it does not look very promising as an RTS language. Any suggestions?

And today I started thinking about what I will need to write my higher-level languages, in particular PureHaskell (Haskell98 sans IO and FFI). It’s pretty annoying that I can’t use any Haskell libraries which use tricky GHC extensions, because those kill my bootstrap (i.e., my PureHaskell compiler needs to be in PureHaskell). There are a lot of libraries that use extensions. For once I wish the flagship compiler weren’t so capable!

# Dana needs a self-embedding of U-Box

My first major goal in Dana is to write a console session interpreter, in which you can construct, name, compose, and observe Dana objects in the console. Since this is a foundational project, I best say what it is I mean by “console session”:

```    newtype ConsoleSession = ConsoleSession (String -> (String, ConsoleSession))
```

In layman’s terms: a console session is a stateful function from input lines to output lines. Then the “hardware” will observe this by allowing you to type things in the console and showing you what the function returned.
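For instance, here is a toy session and a pure driver for it. The names (`echo`, `runSession`) are hypothetical, just to exercise the type:

```haskell
newtype ConsoleSession = ConsoleSession (String -> (String, ConsoleSession))

-- A session that echoes its input, numbering each line; the counter is
-- the session's state, threaded through the continuation.
echo :: ConsoleSession
echo = go (1 :: Int)
  where go n = ConsoleSession (\input -> (show n ++ "> " ++ input, go (n + 1)))

-- A pure stand-in for the "hardware": feed input lines, collect outputs.
runSession :: ConsoleSession -> [String] -> [String]
runSession _ [] = []
runSession (ConsoleSession f) (l : ls) =
  let (out, next) = f l in out : runSession next ls
```

Everything the session “remembers” lives in the closure it returns, so the whole thing stays denotational.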

Let’s say that this console session has a dictionary from strings to values and their types, and the user tries to apply one of those values to another. This might be illegal, since the types might not match up. Everything has to be well-typed, so we can’t just “try it and see”. If the Type of types is exposed (I’m not sure if it should be — it’s a question of whether the class of types is extensible), it looks like this:

```    data Type where
    Type :: Type
    Pi :: (x:Type) -> (x -> Type) -> Type
...
```

With some other stuff — those are the two most important. Notice that Pi has a function in it, so we cannot compare two Types while maintaining referential transparency and functional extensionality. These properties are very dear to me, so Dana had better have them too!

So how can we tell whether this application is well-typed? It doesn’t suffice to keep the types themselves in our dictionary; we need to keep quotations of them with decidable equality. I.e. we have to have:

```    TypeRep : Type
embed : TypeRep -> Type
equal : TypeRep -> TypeRep -> Bool
```

But then, when it is well-typed, we need to actually perform the application, which we can only do if it’s well typed. Bool is not a good enough return type for equal:

```    equal : (x:TypeRep) -> (y:TypeRep) -> Eq Type (embed x) (embed y)
```

Where Eq is the usual:

```    Eq : (A:Type) -> A -> A -> Type
Eq A x y = (P : A -> Type) -> P x -> P y
```
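This Eq is just Leibniz equality, which can already be written in Haskell with rank-2 types. Here is a sketch of why such a proof is enough to actually perform the application:

```haskell
{-# LANGUAGE RankNTypes #-}

-- Leibniz equality: a proof that a equals b is the ability to
-- transport any property P from a to b.
newtype Equal a b = Equal (forall p. p a -> p b)

refl :: Equal a a
refl = Equal id

newtype Id a = Id a  -- the identity property, picked as P below

-- Given a proof, we can convert values outright, which is exactly what
-- "actually perform the application" requires.
convert :: Equal a b -> a -> b
convert (Equal f) x = case f (Id x) of Id y -> y
```

A mere Bool could never justify `convert`; the proof-carrying return type is what makes the well-typed application possible.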

Then, if libraries wish to be accessed via the user, they should export TypeReps for every Type they export. For example:

```    Export : Type -> Type
Export T = (rep : TypeRep ; Eq Type (embed rep) T)

Integer : Type
IntQuote : Export Integer
```

I suspect that such a quotation scheme will be quite a challenge to create (it also corresponds to U-Box proving its own consistency, thus U-Box is inconsistent — but we already knew that :-). It could be fun to attempt if I’m in the mood for tricky mathematics. If I’m not in such a mood, I guess I could also cheat and provide TypeRep, embed, and equal as primitives, which might be the way to go just to get things rolling.

The necessity of quotation is the main reason that the core U-Box calculus should be as simple as possible, and most of my effort in this early stage should go toward simplifying it. Is there a combinator calculus for dependent types? That would make things easier.

# Dana update: System U-Box compiler is go

Over the past couple days, I have been working on a compiler for something between System U and ΠΣ, as experimentation for Dana’s core language. Here is the repo. I have just finished (I think — seems to work pretty well) the typechecker and it correctly typechecks all the programs I can think of. It is 339 lines: not beautiful and simple, but pretty nice nonetheless (and I haven’t given any thought to refactoring it yet).

The results are both positive and negative. For one, I am very happy with it, since unlike the PiSigma prototype I have only observed it die when there is actually a type error. In particular, unlike PiSigma, it correctly types this fixpoint combinator:

```    fix : (A:Type) -> (A -> A) -> A
= \A:Type. \f:A->A. let x:A = f x in x
```
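This is the same knot that Haskell’s laziness ties for us:

```haskell
-- Value-level fixpoint via a lazy let, exactly as in the typed term above.
fix :: (a -> a) -> a
fix f = let x = f x in x

-- Example: factorial by taking the fixpoint of its one-step unrolling.
factorial :: Integer -> Integer
factorial = fix (\rec n -> if n == 0 then 1 else n * rec (n - 1))
```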

I’m very pleased with the “box” dynamic introduced by the PiSigma paper; it seems to model the difference between data and codata in a nice uniform way. Boxes are in some way just explicit laziness.

Using the above fix, we can model equi-recursive types, by introducing a Box function, which “hides” its argument under a box.

```    Box : Type -> Type
= \A:Type. (tag:1) -> !case tag of { [A] }
```

(Numbers are finite types; eg. 3 has elements (0:3), (1:3), and (2:3), and case elimination is done by listing 3 cases in order). Then, for example, a Stream type looks like:

```    Stream : Type -> Type = \A:Type. fix Type (\S:Type. Pair A (Box S))
```

We had to hide the Stream in a box, otherwise we would end up with an infinite type at typechecking time. A high priority on my list is to characterize the semantics of boxes (hopefully they are sensible), to explain more deeply why you must box this infinite type.
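For comparison, in Haskell the box is implicit: every constructor field is already a thunk, which is why the analogous definition just works without any explicit laziness marker.

```haskell
-- A Haskell Stream; the recursive field is implicitly "boxed" by laziness.
data Stream a = Cons a (Stream a)

ones :: Stream Int
ones = Cons 1 ones

takeS :: Int -> Stream a -> [a]
takeS 0 _           = []
takeS n (Cons x xs) = x : takeS (n - 1) xs
```

In a language where everything is explicit, the Box has to appear in the type, which is part of what makes the box semantics worth pinning down.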

Also, a few days ago I had “unsafe compilation” working, which compiled ASTs into a very fast “native Haskell” representation, as described in The Monad.Reader issue 10. It takes advantage of the typechecker, compiling into untyped combinators with unsafeCoerce everywhere, for proper type erasure. I have since added features to the AST, so it’s broken, but it shouldn’t be hard to revive.

So, that’s the good. I’m convinced this “ΠΣ sans Σ” calculus is powerful and expressive, and will work just fine for Dana. I would like it to be simpler — the AST has 12 cases! — but as far as I can tell the only thing that can reasonably be removed is letrec (replaced with a fixpoint primitive).

The bad is that my typechecker is fully and earnestly a typechecker — it doesn’t even feign interest in inference — and it is quite annoying writing all those type annotations freaking everywhere! In particular, case expressions need to be annotated with their eliminator scheme; so the case tag of { [A] } above was a fib; I actually need to write:

```    case tag => \tag:1. \$Type of { [A] }
```

So my compiler as-is, unfortunately, can’t reasonably function as a proper bootstrapping language, because I would never write anything in it.

I’m not quite sure what I will do about bootstrapping now. Perhaps I can translate Yhc.Core to, um, let’s call it “System U-Box”, and then bootstrap in Haskell. That would be cool. But I foresee problems when I need to start writing user interface systems, when we need to do deep embeddings of U-Box into itself… which I will write about shortly.

# Fun with ΠΣ

I have somewhat irrationally chosen ΠΣ as the core language for Dana. I’ve been struggling to create a small, beautiful implementation. But I am getting closer and closer — it’s just about finding the right combination of representations so they fit together nicely — and my hope is that when I am finished the implementation will be under 300 lines or so (only for the bootstrap, once I get to unsafe compilation it will of course grow).

Meanwhile, I’ve been playing with the prototype implementation to get a feel for the language. It is very pleasant to work with, even without the “necessary” programmer niceties such as inference. The typechecker is partial, and it is occasionally annoying when it loops. But it is predictable, and as I gain more experience I am seeing how to avoid these loops, just as one avoids infinite loops when writing programs. The convenience gained by a partial typechecker more than makes up for this annoyance.

One of the things I played with is encoding sum types as product types with a Church encoding. In my experiment it worked, so it’s possible that sigma types are not necessary at all. To warm up, here is a non-dependent encoding for pairs in Haskell.

```type Pair a b = forall c. (a -> b -> c) -> c

pair :: a -> b -> Pair a b
pair x y cc = cc x y

-- Here's how we destruct
unpair :: Pair a b -> (a,b)
unpair p = p (\x y -> (x,y))
```
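A quick sanity check of the encoding, repeated here with the RankNTypes pragma it needs to compile:

```haskell
{-# LANGUAGE RankNTypes #-}

type Pair a b = forall c. (a -> b -> c) -> c

pair :: a -> b -> Pair a b
pair x y cc = cc x y

unpair :: Pair a b -> (a, b)
unpair p = p (\x y -> (x, y))

-- Round-tripping recovers the components.
example :: (Int, Char)
example = unpair (pair 1 'x')
```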

So the encoding I used was just the dependent analog of the same thing, which I will reveal shortly. For those familiar with Agda, you will feel right at home in ΠΣ’s syntax. For those not, the trick is the notation (x : A) -> B, which is just like the type A -> B in Haskell, but first names the argument of type A, so that B may depend on it.

```Sigma : (A : Type) -> (A -> Type) -> Type.
Sigma = \A -> \P -> (C : Type) -> ((x:A) -> P x -> C) -> C.

Pair : (A : Type) -> (P : A -> Type)
-> (x : A) -> P x -> Sigma A P.
```

Look for the resemblance of the “Sigma =” line with the “type Pair a b” line above.

The encoding of Nat is straightforward (exactly the same as the encoding with ΠΣ’s built in sigma type):

```Nat : Type.
Nat = Sigma {'Z,'S} (\tag ->
Case tag of
{ 'Z -> {'Unit}
| 'S -> Nat
}).
```

And using it is almost as convenient; there are a few extra necessary type annotations.

```iterate : (A : Type) -> Nat -> (A -> A) -> (A -> A).
iterate = \A -> \n -> \f -> \x0 ->
n A (\tag -> \ns ->
Case tag of
{ 'Z -> x0
| 'S -> f (iterate A ns f x0)
}).
```

Notably, the “A” after “n” is absent when using the built-in syntax.
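The non-dependent shadow of this encoding can be written in today’s Haskell, which shows the shape of the eliminator while ignoring the dependency (a sketch; the names are mine):

```haskell
{-# LANGUAGE RankNTypes #-}

-- A Nat is its own case eliminator over the 'Z / 'S tags.
newtype Nat = Nat (forall c. c -> (Nat -> c) -> c)

zero :: Nat
zero = Nat (\z _ -> z)

succN :: Nat -> Nat
succN n = Nat (\_ s -> s n)

-- The analog of the ΠΣ iterate above.
iterateN :: (a -> a) -> a -> Nat -> a
iterateN f x0 (Nat elim) = elim x0 (\m -> f (iterateN f x0 m))

toInt :: Nat -> Int
toInt = iterateN (+ 1) 0
```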

Meanwhile, Sigma types make a great encoding for modules. I will switch to the built-in syntax now, because it is a bit cleaner. It looks like this: (x : A ; B), which is just like the Haskell pair (A,B), except that it names the element of type A so that B may depend on it. Also, ; associates to the right, so we can make multi-field syntax.

So here is a first-class Nat module:

```NatModule : Nat : Type
; zero : Nat
; succ : Nat -> Nat
; iterate : (A : Type) -> Nat -> (A -> A) -> (A -> A)
; {'Unit}.
NatModule = Nat, zero, succ, iterate, 'Unit.
```

So the core calculus already supports first-class modules (this is true of any dependently typed language). Throw a little syntax sugar on that and I might start to prefer this to Haskell.

Which is good, because the plan is to bootstrap in Haskell and then switch to all-PiSigma, until I can build a rudimentary Haskellish interpreter that compiles down to PiSigma.

I’m doing all this work because everything has to be first-class in Dana, because there is no compilation phase. It’s all tied together in one big jumble.

# Existential Memoization

FRP has this essential trade-off of computation vs. garbage collection. I can come up with implementations which are lazy in their computation, but never forget anything; and I can come up with implementations which forget but are too eager in their computation (in which garbage collection has too much effect on runtime performance, because we need to kill threads with weak references and stuff). But I can’t come up with an implementation which has both. And my current theory is — I never thought I’d say this — Haskell isn’t good enough!

This language + RTS extension is still in its early stages, and I’ll have to think and prototype more to hash out the details and verify its feasibility. But since I have decided to embark on the Dana project, I am now comfortable with relying on new language features and runtime to get the job done, which previously has not been an option.

Here’s the motivating example. You have a bunch of behaviors — functions from Time to a — which are defined progressively, i.e. they use earlier results to compute later ones. In the best case, we could model these as plain functions of Time, but of course that is not practical in Haskell: modeled that way, the computation is lazy but never forgets anything. My idea is to have enough runtime support that it can forget.

The reason it is allowed to forget is that Time itself is also progressive; i.e. if you have a reference to some time t, you cannot get to any earlier times. So when the last reference to a time before t disappears, all behaviors can forget everything before time t.

But that’s not a very good solution by itself (unfortunate, because it can be accomplished in Haskell with weak references). The reason is that a single naughty value that holds on to an early time will ruin it for the rest of the program — no behaviors can forget anything after that time. How do we keep the effects more local? Well, when the garbage collector does its run, instead of marking objects as “alive” or “dead”, it can mark them as “needed after t”. So you might mark an object multiple times, with decreasing times each time you see it.

And of course there is nothing special about time. The extension ought to be general enough to handle “stuff like this”. I haven’t even thought about how to make a garbage collector that does this efficiently.

Alright, enough motivation. Onto the language design. I want to add support for ‘progressive folds’, and memoize them with garbage collection. Say you have this data type:

```  data U s where
    Succ :: U s -> U s
```

This looks inane, right? It is unpopulated. Or is it? We can fold a progressive function over this:

```  foldU :: (b -> b) -> b -> exists s. (U s, (U s -> b))
```

That is, we can define functions over differences of nodes, in terms of a starting point (which is kind of what the s type variable denotes).

Also, when a particular value U s disappears, we can purge that entry from the memo table and everything that goes along with it. We also know that references inside the scope of the existential eliminator can’t escape, which will end up giving us the time-like relationships I described.
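To make the types concrete, here is a toy model in today’s Haskell, with the existential rendered in CPS. It captures the interface only; the memoization and garbage-collection behavior that are the actual point are entirely absent, and the internal representation is made up:

```haskell
{-# LANGUAGE RankNTypes #-}

-- Toy model: a U s node is secretly a distance from an arbitrary
-- origin, hidden behind the phantom s.
newtype U s = U Int

succU :: U s -> U s
succU (U n) = U (n + 1)

-- CPS rendering of:  exists s. (U s, U s -> b)
-- The fold hands you an origin node and an evaluator for any node
-- reachable from it by Succ.
foldU :: (b -> b) -> b -> (forall s. U s -> (U s -> b) -> r) -> r
foldU f z k = k (U 0) (\(U n) -> iterate f z !! n)
```

Because the continuation is polymorphic in s, references to nodes cannot escape its scope, which is the time-like property described above.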

I’m still thinking about the mathematical foundations of this. I.e. what does such an “unrooted” data type mean? This fold also feels less general than it could be; is it possible to introduce two unrooted functions at the same time, or use different recursion schemes, … etc.

# The Third Virtue

Ever since working with the Anygma team, the idea of a purely functional operating system has been stirring in my head: An operating system that stays denotational and composable all the way down to the device drivers. In this world, we have to make no sacrifices like IO to interface with the impure ideas around us. We can still communicate with the outside world by modeling them in a pure way.

Just to be clear: I will not be accepting comments about the feasibility of the idea itself (eg. don’t say “but the idea of an operating system is necessarily stateful” or any of that bull). Comments and criticisms about the approach I describe are more than welcome.

So this has been stirring away. Then at Thursday’s Boulder functional programmers group, somebody mentioned Squeak. I had seen Squeak a long time ago, and was amazed and impressed by the concept. There is no distinction between compile time and runtime, between source code and program. Everything is just objects floating around in the world. And of course this matches perfectly with the purely functional operating system idea.

It sucks to program with text. Programs are not text. Ultimately, programs are their denotation, but we need a representation. I say that representation is abstract syntax. Our program editors output abstract syntax, which the compiler turns into real objects. Text doesn’t have to enter the equation anywhere.

I don’t think this idea fits Haskell’s type system very well. We need something closer to dependent types, because types have to be objects floating around in the world too. So, while I’m at it, I think I’m going to make a pure language — pretty close to, but not exactly Haskell — which is the core abstraction language. Maybe something both expressive and barebones like PiSigma. But I’ll need a real-world language to go on top of that.

Hubris indeed.

So! What’s the large-scale plan? I am going to assume nobody will help me (other than to offer suggestions) in these early stages. This is a Grand Vision (TM), and I doubt anybody but me is going to get excited enough to contribute until it starts coming together at least a little. But I am completely excited, as this is the culmination of an idea that has been developing for years. Oh, and that was at the beginning of the large-scale plan because excitement is the top point in my book — engineering is secondary to motivation.

I don’t really know anything about drivers or hardware or any of that, and this undertaking is already way too huge for me. Therefore, I’m going to go the route of Squeak and implement an image-based sandbox — an OS inside an OS. I’ll start by choosing a core language and making a really slow proof-of-concept interactive text-based interpreter for it. At that point, many options open up, depending on what I’m inspired to do at any moment: improve the language frontend, work on graphical hooks, start work on a faster interpreter or JIT, start designing the FRP essentials, ….

As for the FRP essentials, why do I think they will be more possible in this new concept than in Haskell? Well, I have a sophisticated RTS extension that I think will handle that, which I will blog about riiiight… now! (And there are a few ways to compromise in case that idea doesn’t pan out)

Oh, btw, I’m calling this project Dana, after da man Dana Scott.