Monthly Archives: April 2009

Dana (actual) progress

I have some very exciting news! I wrote some actual code in the Dana repository. It is the Certificate module. That is, it’s an abstract data type Proof, such that only valid proofs in IΞ can be constructed.

The certificate module (IXi.Term and IXi.Proof together) is about 280 lines, which isn’t fantastically small, but isn’t too bad. This is especially considering that I don’t expect it to grow—it is perfect, modulo bugs.

Right now, Proof is just a proof checker function, but it’s designed so that I could swap it out with some serializable format, or even (crosses fingers) a dependent certified type.

One interesting development in this implementation is the new conversion certificate for lambda calculus with De Bruijn notation. That is, objects of type Conversion represent a valid βη conversion between terms. Previously, I hadn’t been satisfied with my solution: the certificate was implemented as a pair of convertible terms. This led to far too many equality comparisons of terms when combining and using certificates, which is both inefficient and I suspect would be hard to prove things about. Also, it required you to know too much about the goal you were proving, bloating the proof terms and tangling them with the theorems they were trying to prove.

The hardest part is β expansion, for example turning X into (\x.\y.x) X (\x.x). That was the reason for the pair representation: to do beta expansion, you just did a beta reduction and reversed it.

The new implementation instead implements conversions as partial functions. I.e. you give a conversion a source term, and it gives you an equivalent term (or says it couldn’t convert). This means I had to separately model beta reduction and beta expansion, because you can’t easily reverse a conversion. However, the solution is quite clean. I chose a basis of expansion combinators, which can be composed to form any expansion. They are:

    Identity : A → (\x. x) A
    Constant : A → (\x. A) B    [x not free in A]
    Apply : (\x. A) C ((\x. B) C) → (\x. A B) C
    Lambda : \y. (\x. A) B → (\x. \y. A) B   [y not free in B]

This is in addition to the other combinators, which are needed to make this basis complete. They include β reduction, η expansion/contraction, and ways to focus a conversion on a subexpression. The key is that each combinator is correct by inspection, so we can be confident that the conversion algebra is sound.

I chose these combinators by thinking about what would be needed to construct the inverse conversion from bringing a term to normal form. If you’re familiar with SKI factorization, the process is pretty similar. Whenever you reduce an application (\x. A) B, you look at the structure of A and “push” B in by one level, applying one of these combinators. For example:

Term Conversion
(\f. \y. f (f y)) (\x. x) Lambda
\y. (\f. f (f y)) (\x. x) inLambda Apply
\y. (\f. f) (\x. x) ((\f. f y) (\x. x)) inLambda (inLeft Identity)
\y. (\x. x) ((\f. f y) (\x. x)) inLambda Identity
\y. (\f. f y) (\x. x) inLambda Apply
\y. (\f. f) (\x. x) ((\f. y) (\x. x)) inLambda (inLeft Identity)
\y. (\x. x) ((\f. y) (\x. x)) inLambda Identity
\y. (\f. y) (\x. x) inLambda (Constant (\x. x))
\y. y

The reverse composition of the conversions on the right will bring us from \y. y to (\f. \y. f (f y)) (\x. x).

But isn’t it an awful pain to write all those combinators when proving things? Of course not! I make a computer do it for me. I have a little algorithm which takes two terms and computes a conversion between them, by bringing them both to normal form, and using the forward conversions one way and the inverse conversions the other way. Of course, if I give it terms which have no normal form it won’t halt, but the idea is that these terms are static: I use dummy terms to explain the conversion I want, and then apply the conversion I got back to the real terms (which may have subterms without normal forms).

So I say: get me from (\x y. A x) A I to (\x. x x) A, where “A” and “I” are just strings, and then I apply the conversion I got back to, say, (\x. \y. WW x) (WW) (\x. x), where WW has no normal form. The conversion still succeeds.

The certificate pattern shines here: my constructors are easy to verify, then I have a fairly involved algorithm for constructing certificates that is easy to use, which is guaranteed (at least one sort of) correct by construction.

So that’s fun stuff.

Proofs are still pretty tedious, however. My next step is to make some smart “tactic” combinators (which of course generate the underlying certificates) to make proofs easier. It shouldn’t take too long to make it at least tolerable. Then I’ll build up a library of certified infrastructure necessary for typechecking Haskell--, and finally write the compiler to complete the bootstrap. There are plenty of dragons to be slain along the way.

Some Constructions in IΞ

Over the past couple months, I have been attempting to find a language to use as a core calculus for Dana, as anyone who follows this blog knows. I have been slowly converging on IΞ for its elegance and simplicity. It turns out to be very powerful, capable of modeling many structures while avoiding many of the paradoxes typically involved with those structures. In this post, I give an exposition of IΞ and construct some common structures.

The main idea is to use lambda calculus for all of our symbol manipulation needs, adding a constant Ξ for quantification. “Types” are unary predicates, so membership is application. For example, “Nat 0″ is a proposition saying that 0 has type Nat (where 0 and Nat are given previous definitions somewhere).

ΞXY means “forall x, if X x, then Y x”, or, if you think of predicates as sets, simply “X is a subset of Y”. So if we have types Nat and Bool, we can say that f has type Nat → Bool with “ΞNat(\x. Bool (f x))”, read “forall x in Nat, f x is in Bool”. Very direct, is it not?

System I

Interpreting these meanings directly, we arrive at Curry’s System I, whose rules follow. A proposition (sequent) has the form “Γ |- X”, which has the interpretation “assuming Γ, X is provable”.

Γ |- X   ⇐  X ∈ Γ
Γ |- Y   ⇐  Γ |- X  ;  X is βη convertible with Y
Γ |- YZ  ⇐  Γ |- XZ  ;  Γ |- ΞXY
Γ |- ΞXY ⇐  Γ,Xa |- Ya  ;  a does not occur free in X,Y,Γ 

I have stated the rules as “to prove X, you need to prove Y”, because that’s kinda how my brain works. Take a moment to internalize them. They are obvious given the above intuitions, and shouldn’t be hard to read.

On top of this, we can build propositional calculus. Define “K = \x. \y. x”, and write “X ⇒ Y” as shorthand for “Ξ(KX)(KY)”. This system has the properties you would expect of a simple propositional calculus.

Sadly, this system is inconsistent. We can prove any proposition X:

  Let Y = (\x. x x ⇒ X) (\x. x x ⇒ X)
  Observe that Y = Y ⇒ X.
  [1] Y |- Y
  [2] Y |- Y ⇒ X
  [3] Y |- X      -- modus ponens on [1],[2]
  [4] |- Y ⇒ X
  [5] |- Y
  [6] |- X        -- modus ponens on [4],[5]

Martin Bunder had a brilliant idea to block this paradox, and also many others, which brings us to the object of my infatuation:

System IΞ

The crucial step above was [3]-[4], where we abstracted over the infinite proposition “((… ⇒ X) ⇒ X) ⇒ X”. The way we will block this is to only allow abstraction over finite propositions. Introduce a new symbol, H, such that HX means “X is a finite proposition” (or simply “X is a proposition”). We will derive finiteness from the finiteness of proofs: to prove HX, we first have to prove H of each component of X. Our system becomes (the new additions are in bold):

Γ |- X      ⇐  X ∈ Γ
Γ |- Y      ⇐  Γ |- X  ;  X is βη convertible with Y
Γ |- YZ     ⇐  Γ |- XZ  ;  Γ |- ΞXY
Γ |- ΞXY    ⇐  Γ |- H(Xa)  ;  Γ,Xa |- Ya     ;  a does not occur free in X,Y,Γ 
Γ |- H(ΞXY) ⇐  Γ |- H(Xa)  ;  Γ,Xa |- H(Ya)  ;  a does not occur free in X,Y,Γ
Γ |- H(HX)

The final rule is an axiom, simply saying that “X is a proposition” is always a proposition.

Constructing the Naturals

Now I will embark on constructing the type of naturals. Since types are predicates, I also need to come up with a representation for naturals. It turns out that it doesn’t matter what representation I use, as long as it has zero and a 1-1 successor function. For the sake of discussion, let’s use the Church encoding.

  0 = \f. \x. x
  S n = \f. \x. f (n f x)

So a natural is an iteration function. For example, the number 3 iterates a function 3 times on its argument: 3 f x = f (f (f x)).

Coming up with a way to classify all of these, but not any others (such as infinity f x = f (f (f (f (f ...))))), was quite a challenge. You might try to classify these function on the property that they are an iteration function, but any sensible method of doing that ends up including infinity. I began thinking that IΞ was not strong enough, and looking for ways to enrich it by adding more axioms.

Fortunately, no new axioms are necessary! The encoding is obvious in retrospect. What is the first thing a mathematician thinks when you talk about the naturals: induction! Let’s define a natural as any object which you can do induction (with 0 and S) over.

To make this readable, we need to introduce a few more symbols:

    f . g = \x. f (g x)   -- composition
    A → B = \f. ΞA(B . f)   -- the type of functions from A to B
    True = ΞHH    -- the true proposition
    U = K True    -- the type of all objects (mnemonic: universe)
    L = U → H  -- the type of predicates/types

Now, for the naturals:

    Nat x = ΞL(\p. p 0  ⇒  ΞU(\n. p n ⇒ p (S n))  ⇒  p x)

Reading this in English: “x is a Natural if, for all predicates p, if p 0, and p n implies p (S n), then p x”. In other words, x is a natural if you can do induction up to it.

Exercise: prove |- Nat 0 and |- (Nat → Nat) S. Note that this can be done independent of our definitions of 0 and S.

More Inductive Constructions

Using this scheme, we can construct all sorts of things. For example, equality:

    Eq x y = ΞL(\p. p x ⇒ p y)

Lists:

    nil = \n. \c. n
    cons x xs = \n. \c. c x xs 
    List a x = ΞL(\p. p nil  ⇒  ΞU(\ys. p ys ⇒ Ξa(\y. p (cons y ys)))  ⇒  p x)

There is a classic paradox involving the inductive type Set = Set → H, which is definable using this scheme:

    Set x = ΞL(\p. Ξ(p → H)p ⇒ p x)

However, as I tried to prove the inconsistency, I was blocked by the H rules. This gives me hope.

Coinductive Constructions

It is also possible to construct coinductive types. Here are the “conaturals”, the naturals with infinity. We can’t use the constructor definition anymore; coinductive types are all about projections. So the conaturals have only one projection, onto a disjunction. So to eliminate a conatural n, you pass it what to do if n = 0, and what to do with n’ if n = S n’. For example, to check if a conatural is zero, you can use n True (K False).

   CoNat x = ΞL(\p. Ξp(\y. y True p) ⇒ p x)

In English: x is a conatural if for all predicates p, if p y implies y True p, then p x. y True p can be interpreted as y = 0 or y = S n and p n.

Isn’t IΞ cool? Simple, obvious, but powerful enough to model a wide class of data (and codata). The thing I like best about it is that it is untyped at its core. Functions are just pure lambda calculus functions that we are “later” proving properties about. Type erasure comes for free (well, sortof: encoding dependent types into this system will end up passing types as parameters at runtime, even though they are never used).

Future Concerns

Bunder proved his version of IΞ consistent and complete. But the rules I gave, and that I have been using, are not his, and in fact are more powerful than his. This makes it possible that my rules are inconsistent. My system can prove LL, while his cannot. This worries me, because LL in the typical interpretation means “Type : Type”, which gives rise to the Burali-Forti paradox. However, even though I was able to encode the infrastructure for Russell’s paradox, I was unable to complete the proof because of the H rules. Maybe the same thing will happen?

I’m spending a bit of time trying to understand the Burali-Forti paradox that gives rise to the inconsistency in Girard’s System U, so that I can try to carry it out in IΞ. If IΞ does turn out to be inconsistent, I am not worried. A few universe restrictions here and there (i.e. leave U unspecified, rather than defining as K T) should do the trick, at the expense of some convenience and beauty.

Also, since I intend to compile Haskell to IΞ, I need to talk about partiality somehow (eg. prove that if a function terminates, it has this type). This has been giving me trouble, but in a good way. I found that I don’t really understand how bottoms behave, and how to talk about them without implying their semantics. I’m confident there is a way, though, maybe after adding a few axioms as a compromise. But I need to think hard and understand bottoms better, so I know what ought to be true, before I try to prove them or add them as axioms.

Might as well face it, I’m addicted to logic

I’m trying really hard not to become a logician. Like my obsession with FRP, it would be very interesting and educational. But my FRP fancy came from a desire to make games more easily, and I have since lost interest in that endeavor, studying FRP for its own sake. Now I am trying to change the world with Dana, and getting caught up in the beauty and unity of different logical systems.

This happened when trying to choose a core calculus for Dana. I am now furiously interested in Martin Bunder’s work on combinatory logic (btw, if anybody has a copy of his PhD thesis, “a one axiom set theory based on combinatory logic”, please let me know). System IΞ — or rather, systems nearby it — strike me as amazingly beautiful. It is based on an untyped lambda calculus, in which you can prove things about untyped functions (which is a way of endowing them with types). For example, to say that f has type A → B, you say:

Ξ A (B ∘ f)

In English: for all x in A, f x is in B.

However, the core logic really isn’t that important; I’ve only been focusing on it because it’s interesting. In fact, a cool thing about Dana is that there is very little dependency between its parts. But I would really like to start making something rather than researching. How come math is so fascinating?

Anyway, I am not sure that IΞ is strong enough. Assuming a “big enough” universe, I’ve been able to construct an equality predicate (the construction is essentially “the smallest reflexive relation”). But I have had little success in constructing any inductive types, such as the naturals. That’s why I want to read Bunder’s thesis — to get ideas.

Not just system IΞ, but logic in general, is fascinating me. Large cardinals in set theory, universe levels in CIC, and related “stratification” ideas abound and unify to create some intuitive notion of truth. In some sense, truth is the strongest consistent such unverse — however there is provably no such thing. In what system should we then work? Is it essential that picking a system of axioms in which to work will always be a part of mathematics? How do you consolidate results which assume different axioms?

That is actually my current goal for Dana’s core. I think the core calculus will be very weak, and you add axioms as you need more (in line with a quote from Dr. Scott himself: “If you want more you have to assume more”). Such axioms will have the same pattern as e.g. the IO monad in Haskell; your assumptions bubble their way to the top. However, it’s a much richer system than “IO or not IO”; you know exactly what you are assuming to run any piece of code. If there is a top level “user OS”, its assumptions will be vast (or maybe there’s some clever way to incrementally add them?).

Anyway, if the itch to make something irritates me so, I can assume I have a strong core logic — whatever it is — and start building things at a higher level. It’s emotionally difficult for me to do so, because I like to feel like I am on a strong foundation (isn’t that the whole point of Dana, after all?).