Monthly Archives: May 2009

Downtime over, hopefully

It has been almost a month since I’ve had something to post about. So here is something to fill the awkward silence.

Me and Anna
Me and Anna

Most of my life has been filled by my new love, Anna. This may explain why my productivity has gone down, but not why it has stopped completely. A recent study may provide justification for the latter.

I’m running out of money, so I’ve been looking for a job. The state of my resume is mixed, so that I look unqualified to first-order employers (looking for specific trendy technologies, most of which I don’t have that much experience with) and qualified to the ones who realize that they’re looking for good programmers, not good encyclopedias. Unfortunately, the latter such employers tend to be sharp people, and put the pieces together to find that I’m more of a researcher than a developer. So… I’m having trouble.

I have been working on a computer game with my friend Max, codename Silver Seed. It is beautiful and innovative, so much so that I’m having trouble seeing where the gameplay is. It’s a gardening game, of sorts. I’m afraid I can’t say much about the game mechanics, because Max is afraid of people stealing our ideas. It will be a mind-blowing game once we know what it is. It is incredibly fun to work on, and it has been the sink of most of my creative energy recently.

Silver Seed screenshot

Of course, this means that Dana has been rather off my mind. The more I think about it, the less I know what Dana is, and what the next step should be. I’m continuing to work on the system, which I hope will be hackage-worthy soon. But once I’m happy with that… what do I do? Is IΞ executable, or is it just a calculus for reasoning –an abstract semantics? If the latter, what is executed? And who executes it?

I guess my first goal still remains, but it’s form keeps changing: I want a command line utility where you can program functions, run them, and see the results. Because I no longer have a code calculus which can be typechecked (rather, you need to manually prove that your programs are well-typed (don’t be afraid, this is a great benefit!)), I think I will start by accepting only simply-typed programs at said command line.

I want to get crackin’ on this, damnit! If only I didn’t need to eat…

Lazy Partial Evaluation

Inspired by Dan Piponi’s latest post, I have been looking into partial evaluation. In particular, I thought that a language which emphasizes currying really ought be good at partial evaluation. In this post I describe some ideas regarding partial evaluation in functional languages, and later sketch a partial evaluation machine I devised.

Supercombinator reduction machines, like GHC, do a limited form of partial evaluation. I.e. when you compile a program to supercombinators, you are optimizing it for specialization from left to right. So if f is defined in pointful form, “let a = f x in (a z, a z)” might be a better program than “(f x z, f x z)”. This is a nice property of combinator reduction. Unfortunately, it doesn’t generalize: “let a = flip f x in (a y, a y)” will never be better than “(f y x, f y x)”, because functions only specialize from left to right. I conjecture that this missing feature is more important than we realize.

Mogensen gives a very elegant partial evaluator in pure lambda calculus, which optimize as expected with the Futamura projections (see Dan’s post). This partial evaluator works on higher order abstract syntax, taking and returning descriptions of terms rather than the terms themselves. Essentially all it is is (very simple) machinery describing how to evaluate under a lambda.

The system in that paper takes many precautions to avoid unfolding dynamic arguments, because otherwise the partial evaluator might not terminate. Apparently he is not well-versed in our Haskell School of the Infinite, because the evaluator is compositional. So what he means by “not terminate” is “return an infinite program”. But an infinite program is fine if you interpret/compile it lazily!

In fact, I believe (I am intuiting — I have done no formal checking) that the simplest-minded of lazy partial evaluators is perfect: it maximally unfolds its static arguments, there is no need for the type inference machinery in that paper, and it will have the same termination properties as the program. I attribute the ease of this task with the built-in metacircularity of lambda calculus.

Cool as a self-embedded partial evaluator is, to do efficient partial evaluation you need to keep quotations of your programs everywhere, then compile them before you actually use them. Lambda calculus is hinting at something more: that simply by applying one of several arguments to a curried function, you are specializing it. Wouldn’t it be great if every time you did that, the program were maximally specialized automatically?

A partial evaluation reduction strategy

It turns out that such an ambitious wish is nothing more than an evaluation order for the lambda calculus. Admittedly, it’s not a very common one. You try to reduce the leftmost application, even under lambdas. We would also like to combine this with call-by-need, so when an argument is duplicated it is only reduced once.

Here’s an example program I’ve been working with, with the standard definitions of the combinators Ix = x and Kxy = x.

  flip (\x y. y I x) K K

It’s not much, but it gets the point across. Let’s look at it in call-by-name:

[1]  (\f a b. f b a) (\x y. y I x) K K
[2]  (\a b. (\x y. y I x) b a) K K
[3]  (\b. (\x y. y I x) b K) K
[4]  (\x y. y I x) K K
[5]  (\y. y I K) K
[6]  K I K
[6'] (\m n. m) I K
[7] (\n. I) K
[8]  I

Notice by the step [4] that we have lost the structure of flip (\x y. y I x) K, so any work we do from then on we will have to redo on subsequent applications of that function. Contrast this with the partial evaluation strategy:

[1]  (\f a b. f b a) (\x y. y I x) K K
[2]  (\a b. (\x y. y I x) b a) K K
[3]  (\a b. (\y. y I b) a) K K
[4]  (\a b. a I b) K K
[5]  (\b. K I b) K
[5'] (\b. (\m n. m) I b) K
[6]  (\b. (\n. I) b) K
[7]  (\b. I) K
[8]  I

We got the function all the way down to a constant function before it was finally applied.

One thing that’s interesting to notice about this strategy is that it seems stricter than call-by-name. That is, if you have a nonterminating term N, then reducing the application (\x y. Ny) z will loop, whereas it won’t in CBN. However, recall that in the domain theory, (\x. ⊥) = ⊥. The only thing you can do to a function to observe it is apply it, and whenever you apply this function you will loop. So you are bound to loop anyway if you are evaluating this application.

The lazy partial evaluation machine (sketch)

Here is a sketch of an efficient low-level machine for this evaluation strategy. It is simple stack machine code (“which I pull out of my bum for now, but I don’t think an algorithm to generate it will be any trouble”). The only tricky bit about it is that pieces are sometimes removed from the middle of the stack, so it can’t necessarily be represented linearly in memory. A singly linked representation should work (I realize this costs performance).

The values in this language look like [v1,v2,…] { instr1 ; instr2 ; … }. They are closures, where the vn are pushed on to the stack in that order before the instructions are executed. Values can also be “indirections”, which point to an absolute position in the stack. Indirections represent a logical reordering of the stack, and are used to model bound variables. When indirections are executed, they remove themselves and execute (and remove) the thing they point to. The instructions are as follows.

  pop n      -- delete the entry at position n
  dup n      -- push the entry at position n on the top of the stack
  (other standard stack ops here)
  abs n      -- push an indirection pointing to position n 
             -- (skipping other indirections) on top of the stack
  exec n     -- pop and execute the closure at position n
  closure [n1,n2,...] { instr1 ; instr2 } 
          -- remove positions n1,n2,... and add them
          -- to the closure with the given code, and push it

dup is the only instruction which duplicates a value; this is where laziness will be encoded.

Let’s look at the code for each piece of our example program:

    I: [] { exec 0 }
    K: [] { pop 1; exec 0 }

These are pretty straightforward. They both receive their arguments on the stack (the first argument in position 0, and downwards from there), reduce and continue execution. Recall the code is what happens when a value is forced, so every value ends with an exec to continue the execution.

    (\x y. y I x): [] { push I; exec 2 }

This one might be a little tricker. The stack comes in like this (position 0 on the left): x y, After push I, it looks like “I x y”, so y’s incoming stack will be “I x” just as it wants. Finally, the interesting one:

   (\f a b. f b a): [] { abs 2; exec 2 }

abs 2 pushes an indirection to the argument b onto the stack before calling f. This is how evaluation is pulled under binders; when an indirection is forced, it reduces the application at the appropriate level, and all below it. I am still not totally sure when to introduce an abs; my guess is you do it whenever a function would otherwise reorder its arguments. An example may demonstrate what I mean (but perhaps not; I haven’t totally wrapped my head around it myself).

Here’s an execution trace for the example above, with the stack growing to the left. I start with the instruction pointer on the first function and the three arguments on the stack. The stack shown is the state before the instruction on each row. An identifier followed by a colon marks a position in the stack for indirections:

1 abs 1 (\x y. y I x) K K
2 abs 3 a (\x y. y I x) a:K K
3 exec 2 b a (\x y. y I x) a:K b:K
4 closure I b a a:K b:K
5 exec 2 I b a a:K b:K
6 pop 1 I b b:K
7 exec 0 I

When an indirection is executed, as in step 5, that is evaluation under a lambda.

This machine still doesn’t support laziness (though it didn’t matter in this example). We can achieve laziness by allocating a thunk when we dup. To evaluate the thunk, we put a mark on the stack. Whenever an instruction tries to reach beyond the mark, we capture the current stack and instruction pointer and jam it in a closure, then write that closure to the thunk. Indirections get replaced by their offsets; i.e. the “abs” commands that would create them. After we have done that, remove the mark (point it to where it was previously) and continue where we left off.

There you have it: my nifty partial evaluation machine. I’m reasonably confident that it’s correct, but I’m still not totally happy with the implementation of indirections — mostly the fact that you have to skip other indirections when you are pushing them. I wonder if there is a better way to get the same effect.

Comments/criticisms requested! :-)